In today’s fast-evolving world of generative AI, agent creation has never been easier—or more popular. With tools like Copilot Studio, business users and developers alike are spinning up AI-powered agents to automate tasks, answer customer questions, and drive productivity. But with great power comes great responsibility—and increasingly, our customers are asking an important question:
“How do we know these agents are performing well and safely?”
This is where observability and evaluation enter the picture.
As outlined in the Azure AI Foundry documentation, observability isn’t just for DevOps pipelines—it’s essential for AI systems. It’s the practice of collecting, analyzing, and acting on signals from your AI applications to ensure they’re reliable, safe, and aligned with business goals.
Without observability, hallucinations or off-topic responses may go unnoticed until they cause customer frustration—or worse.
In fact, Microsoft recommends that AI solutions follow continuous monitoring and evaluation practices. This includes capturing runtime metrics, logging interactions, and continuously evaluating those interactions against key criteria. You can learn more about this approach in Microsoft’s documentation on AI observability and continuous evaluation.
To address this need, we built a solution that connects Copilot Studio agents with Azure AI Observability and Evaluation services. Here’s how it works:
Conversation Capture
Every conversation with your Copilot Studio agent is automatically stored in Dataverse tables (standard functionality).
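To make this concrete, here is a minimal sketch of how you might parse a transcript record once you have pulled it from Dataverse. The sample record shape (a `content` field holding a JSON list of `activities`) mirrors the standard ConversationTranscript table, but treat the exact field names as assumptions to verify against your own environment’s schema.

```python
import json

# Illustrative payload shaped like a Copilot Studio transcript record from
# Dataverse. Field names here are assumptions; verify against your schema.
SAMPLE_RECORD = {
    "conversationtranscriptid": "00000000-0000-0000-0000-000000000001",
    "content": json.dumps({
        "activities": [
            {"type": "message", "from": {"role": "user"},
             "text": "What is your return policy?"},
            {"type": "message", "from": {"role": "bot"},
             "text": "You can return items within 30 days."},
        ]
    }),
}

def extract_turns(record):
    """Pair each user message with the bot reply that follows it."""
    activities = json.loads(record["content"]).get("activities", [])
    turns, pending_query = [], None
    for activity in activities:
        if activity.get("type") != "message":
            continue
        role = activity.get("from", {}).get("role")
        if role == "user":
            pending_query = activity.get("text", "")
        elif role == "bot" and pending_query is not None:
            turns.append({"query": pending_query,
                          "response": activity.get("text", "")})
            pending_query = None
    return turns

print(extract_turns(SAMPLE_RECORD))
```

Pairing user and bot messages into query/response turns is the key step: most evaluation tooling works on turns, not raw activity streams.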
Evaluation Pipeline
The solution extracts these conversations and sends them to Azure AI Observability via the Evaluation API.
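Before the evaluation call itself, the extracted turns need to be serialized into a dataset. The sketch below writes them as JSONL, one record per line, which is the input format commonly expected by batch evaluation tools such as the azure-ai-evaluation SDK; the `query`/`response` field names are an assumption to check against the evaluators you use, and the actual API call is omitted here.

```python
import json
from pathlib import Path

def write_eval_dataset(turns, path):
    """Write one JSON object per line (JSONL) for batch evaluation.
    Field names are assumptions; align them with your evaluators."""
    with open(path, "w", encoding="utf-8") as f:
        for turn in turns:
            f.write(json.dumps({"query": turn["query"],
                                "response": turn["response"]}) + "\n")
    return path

# Illustrative turns, as produced by the conversation-capture step.
turns = [
    {"query": "What is your return policy?",
     "response": "You can return items within 30 days."},
    {"query": "Do you ship overseas?",
     "response": "Yes, we ship to most countries."},
]
dataset = write_eval_dataset(turns, "eval_dataset.jsonl")
print(Path(dataset).read_text(encoding="utf-8"))
```

From there, the dataset file can be handed to the evaluation service in a scheduled job, so every day’s conversations are scored without manual effort.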
Power BI Reporting
The evaluation results are surfaced in ready-made Power BI reports. No need to start from scratch—simply import the provided Power BI reports following these steps, and you’ll have instant visibility into your agents’ performance and risk areas.
In other words:
➡️ Extract Copilot Studio conversations → Evaluate in Azure → Visualize in Power BI.
This approach unlocks continuous improvement for your Copilot Studio agents, and it aligns with Microsoft’s responsible AI principles of safety, transparency, and accountability.
The full solution is open-source and available here on GitHub.
Whether you’re a seasoned AI developer or just beginning your Copilot Studio journey, this project will help you create better, safer, and more accountable agents.
Many people contributed to this solution! A big thanks to the team for their efforts.