Microsoft Releases Magentic Marketplace to Test AI Agent Safety

Microsoft researchers, in collaboration with Arizona State University, have released an open-source simulation environment called the Magentic Marketplace to test how AI agents behave when interacting, competing, and collaborating in realistic scenarios. The platform is designed to let researchers run experiments at scale. One example scenario simulates a customer agent ordering dinner while multiple business agents compete to win the sale.

In initial experiments, the team ran 100 customer-side agents against 300 business-side agents, using a mix of leading models including GPT-4o, GPT-5, and Gemini-2.5-Flash. The results revealed two vulnerabilities: agents were susceptible to manipulation by adversarial business tactics, and they struggled as the number of options grew, which overwhelmed their attention and reduced decision efficiency.

Researchers also found that agent collaboration often broke down when roles and responsibilities weren’t explicitly defined. Performance improved with clearer instructions, but the experiments show today’s models lack robust default abilities for unsupervised coordination and negotiation.

Ece Kamar, Managing Director of Microsoft Research’s AI Frontiers Lab, emphasized the importance of studying these dynamics: “There is really a question about how the world is going to change by having these agents collaborating and talking to each other and negotiating. We want to understand these things deeply.”

Because the Magentic Marketplace and its code are open source, other teams can reproduce the experiments, explore attack vectors, and develop stronger safeguards, a step Microsoft believes is essential for building reliable, safe agentic systems before they see broad real-world deployment.