Methodology · Curated marketplace
evaluation
This skill should be used when the user asks to "evaluate agent performance", "build test framework", "measure agent quality", "create evaluation rubrics", or mentions LLM-as-judge,…
Composite
C 3.9 · A 2.4
How we got there
1 source verified
- Best source skillsmp.com
- Authority tier Tier 2: Curated marketplace
- Stars ★ 15,635
- Source link https://skillsmp.com/skills/muratcankoylan-agent-skills-for-context-engineering-skills-evaluation-skill-md
- First published 2026-05-22
Use this skill
/plugin install evaluation
More in Methodology
claude-api
Reference for the Claude API / Anthropic SDK — model ids, pricing, params, streaming, tool use, MCP, agents, caching, token counting, model migration.
prompt-engineering
Universal prompt engineering techniques for any LLM.
github-swyxio-ai-notes
Notes for software engineers getting up to speed on new AI developments.
mcp-builder
Builds production MCP servers via 4-phase methodology: research, implement, test, evaluate. Triggers: build MCP, new MCP, MCP integration, MCP server scaffold.
Auto-indexed. Editorial review pending — score is based on the rubric only.