feat: #3583203 Add multi-model comparison support to compare.py (!8)

Add --models flag to run evals across multiple models in a single invocation (e.g., --models sonnet haiku). Prints per-model comparison tables followed by a cross-model summary showing which models benefit from guidance. Backwards-compatible: --model (singular) works as before.
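A minimal sketch of how the two flags could be wired up with argparse, for illustration only. The helper names `build_parser` and `resolve_models` are hypothetical and not taken from compare.py, and returning an empty list when neither flag is given is an assumption; the actual default is not stated in this merge request.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts --model or --models, but not both."""
    parser = argparse.ArgumentParser(prog="compare.py")
    # argparse rejects any invocation that supplies both flags of the group.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--model", help="single model (original behaviour)")
    group.add_argument(
        "--models",
        nargs="+",
        metavar="MODEL",
        help="one or more models to evaluate in a single run",
    )
    return parser


def resolve_models(argv: list[str]) -> list[str]:
    """Normalise both flags to a list of model names."""
    args = build_parser().parse_args(argv)
    if args.models:
        return list(args.models)
    if args.model:
        return [args.model]
    # Assumption: no flag means "no explicit model"; the real default may differ.
    return []
```

With this shape, `--models sonnet haiku` resolves to two names, `--model sonnet` resolves to one, and passing both flags makes argparse exit with a usage error.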

Includes duplicate model detection, mutual exclusion with --model, and branching JSON output (single-model uses old format, multi-model uses per_model + summary structure).
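The duplicate check and the output branching might look roughly like the sketch below. Everything except the `per_model` and `summary` keys is hypothetical: the function names, the `baseline`/`guided` score fields, the `benefits_from_guidance` key, and the choice to branch on the number of models are assumptions made for illustration, not the code in this merge request.

```python
import json


def find_duplicates(models: list[str]) -> list[str]:
    """Return model names that appear more than once, in first-repeat order."""
    seen: set[str] = set()
    dupes: list[str] = []
    for name in models:
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


def summarize(results_by_model: dict[str, dict]) -> dict:
    """Hypothetical summary: list models whose guided score beats baseline."""
    helped = [
        name
        for name, result in results_by_model.items()
        if result["guided"] > result["baseline"]
    ]
    return {"benefits_from_guidance": helped}


def build_output(results_by_model: dict[str, dict]) -> dict:
    """Keep the old flat format for one model; nest results for several."""
    if len(results_by_model) == 1:
        (only_result,) = results_by_model.values()
        return only_result
    return {
        "per_model": results_by_model,
        "summary": summarize(results_by_model),
    }


if __name__ == "__main__":
    # Made-up scores, used only to show the two output shapes.
    example = {
        "sonnet": {"baseline": 0.5, "guided": 0.75},
        "haiku": {"baseline": 0.5, "guided": 0.5},
    }
    print(json.dumps(build_output(example), indent=2))
```

Branching on the result count keeps existing consumers of the single-model JSON working unchanged, at the cost of callers needing to handle two shapes.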

By: zorz

Closes #3583203
