benchmark-models — Skillopedia

<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
<!-- Regenerate: bun run gen:skill-docs -->

When to invoke this skill

Runs the same prompt through Claude, GPT (via Codex CLI), and Gemini side by side, then compares latency, tokens, cost, and optionally quality via an LLM judge. Answers "which model is actually best for this skill?" with data instead of vibes. Separate from /benchmark, which measures web page performance.

Use when: "benchmark models", "compare models", "which model is best for X", "cross-model comparison", "model shootout".

Voice triggers (speech-to-text aliases): "co…