AI Engineering | intermediate

LLM evaluation rubric generator

Build a weighted rubric and adversarial test set without rewarding surface-level polish.

Prompt (240 characters)
Design an evaluation rubric for {{task}}.

Create 5-8 non-overlapping criteria with weights totaling 100%, anchored descriptions for scores 1, 3, and 5, ten normal cases, five adversarial cases, and an escalation rule for ambiguous results.
Tested: GPT-4.1, Claude Sonnet, Gemini 2.5 Pro
Tags: #evals #quality #testing
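
Once a model returns a rubric, the structural constraints in the prompt (5-8 criteria, weights totaling 100%, anchors at scores 1, 3, and 5) can be checked mechanically before anyone grades with it. The sketch below is one possible way to do that in Python, not part of the prompt itself. The `Criterion` class, the function names, the example criteria, and the `max_gap` escalation threshold are all assumptions made for illustration; in particular, the prompt asks for an escalation rule but does not define one, so the two-grader disagreement check is only a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    weight: float            # percentage points; all weights should total 100
    anchors: dict[int, str]  # anchored descriptions keyed by score 1, 3, and 5


def validate_rubric(criteria: list[Criterion]) -> None:
    """Raise ValueError if the rubric breaks the prompt's structural constraints."""
    if not 5 <= len(criteria) <= 8:
        raise ValueError(f"expected 5-8 criteria, got {len(criteria)}")
    names = [c.name for c in criteria]
    if len(set(names)) != len(names):
        raise ValueError("criterion names must be unique")
    total = sum(c.weight for c in criteria)
    if abs(total - 100) > 1e-6:
        raise ValueError(f"weights total {total}, expected 100")
    for c in criteria:
        if set(c.anchors) != {1, 3, 5}:
            raise ValueError(f"{c.name}: anchors must cover exactly scores 1, 3, 5")


def weighted_score(criteria: list[Criterion], scores: dict[str, int]) -> float:
    """Combine per-criterion scores (1-5) into one weighted score on the 1-5 scale."""
    for c in criteria:
        if not 1 <= scores[c.name] <= 5:
            raise ValueError(f"{c.name}: score must be between 1 and 5")
    return sum(c.weight * scores[c.name] for c in criteria) / 100


def needs_escalation(score_a: float, score_b: float, max_gap: float = 1.0) -> bool:
    """Hypothetical escalation rule: flag a case when two graders disagree widely."""
    return abs(score_a - score_b) > max_gap


# Illustrative rubric; the criteria and weights are made up for this example.
ANCHORS = {1: "fails the criterion", 3: "partially meets it", 5: "fully meets it"}
example_rubric = [
    Criterion("accuracy", 30, ANCHORS),
    Criterion("completeness", 25, ANCHORS),
    Criterion("instruction_following", 20, ANCHORS),
    Criterion("reasoning", 15, ANCHORS),
    Criterion("clarity", 10, ANCHORS),
]
example_scores = {
    "accuracy": 5,
    "completeness": 3,
    "instruction_following": 4,
    "reasoning": 3,
    "clarity": 5,
}

validate_rubric(example_rubric)
print(weighted_score(example_rubric, example_scores))
```

Keeping validation separate from scoring means a malformed rubric is rejected before any test case is graded. The check only catches structural problems; whether the criteria truly avoid overlap, or avoid rewarding surface-level polish, still needs human review.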