Release v1.0.0 · microsoft/eval-guide

Initial release of the AI agent evaluation toolkit for Copilot Studio.

What's included

6 skills: /eval-guide, /eval-suite-planner, /eval-generator, /eval-result-interpreter, /eval-triage-and-improvement, /eval-faq
Interactive HTML dashboards for reviewing and editing eval artifacts at each stage
Architecture-aware eval scoping — automatically adjusts test depth for prompt-level, RAG, and agentic architectures
Single-response and conversation (multi-turn) test case generation
Copilot Studio CSV import — generated test cases import directly into Copilot Studio
Works in both Claude Code and GitHub Copilot (install commands for each follow)

Claude Code:

claude plugin add microsoft/eval-guide

GitHub Copilot:

npx skills add microsoft/eval-guide