feat: #3583241 Add documentation evals and check_markdown_structure assertion

Adds eval suite for how-to-write-documentation skill:

  • 5 behavioral cases testing handbook writing, doc silo awareness, anti-pattern rewriting, over-documentation advice, and README generation
  • 12 static checks verifying skill file structure and key concepts
  • New check_markdown_structure assertion type for grading non-code output (headings, required sections, code blocks, paragraphs, forbidden patterns)
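
For reviewers unfamiliar with the idea, here is a minimal sketch of what a structure check over markdown output could look like. It is not the implementation in this MR: `StructureSpec`, its field names, and the list-of-failures return shape are all hypothetical, chosen only to mirror the five things listed above (headings, required sections, code blocks, paragraphs, forbidden patterns).

```python
import re
from dataclasses import dataclass, field


@dataclass
class StructureSpec:
    # Hypothetical field names; the real assertion's config keys may differ.
    min_headings: int = 0
    required_sections: list = field(default_factory=list)
    min_code_blocks: int = 0
    min_paragraphs: int = 0
    forbidden_patterns: list = field(default_factory=list)


# Matches a fenced block from an opening ``` line to the next closing ``` line.
FENCE_RE = re.compile(r"^```.*?^```[ \t]*$", re.MULTILINE | re.DOTALL)
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def check_markdown_structure(text, spec):
    """Return a list of failure messages; an empty list means the check passed."""
    failures = []

    code_blocks = FENCE_RE.findall(text)
    # Remove fenced code first so '# comment' lines are not counted as headings.
    prose = FENCE_RE.sub("", text)

    headings = HEADING_RE.findall(prose)
    if len(headings) < spec.min_headings:
        failures.append(f"expected >= {spec.min_headings} headings, found {len(headings)}")

    heading_titles = {h.lower() for h in headings}
    for section in spec.required_sections:
        if section.lower() not in heading_titles:
            failures.append(f"missing required section: {section}")

    if len(code_blocks) < spec.min_code_blocks:
        failures.append(f"expected >= {spec.min_code_blocks} code blocks, found {len(code_blocks)}")

    # A paragraph here is any non-empty block between blank lines that is not a heading.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", prose)]
    paragraphs = [b for b in blocks if b and not b.startswith("#")]
    if len(paragraphs) < spec.min_paragraphs:
        failures.append(f"expected >= {spec.min_paragraphs} paragraphs, found {len(paragraphs)}")

    # Forbidden patterns are searched in the full text, code blocks included.
    for pattern in spec.forbidden_patterns:
        if re.search(pattern, text, re.IGNORECASE):
            failures.append(f"forbidden pattern matched: {pattern}")

    return failures
```

Returning failure messages rather than a bare boolean is a design choice of this sketch: it lets a grader report which structural requirement the output missed.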

Fixes eval isolation: adds --setting-sources "" and --strict-mcp-config to claude -p invocations. Without these flags, pipe mode loads all user plugins, hooks, skills, and MCP servers from ~/.claude/settings.json, contaminating both the baseline and treatment configs with unmeasured context. Also adds a --cwd flag so evals run from a neutral directory.
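
As a rough illustration of the isolation described above, the sketch below assembles a `claude -p` command line with the two flags named in this MR and launches it from a temporary directory. The helper names (`build_isolated_command`, `run_isolated`) are hypothetical, and the neutral directory is set through `subprocess.run(cwd=...)` as an assumption of this sketch; how the runner's own --cwd flag is wired up is not shown here.

```python
import subprocess
import tempfile


def build_isolated_command(prompt):
    """Build argv for a pipe-mode run that skips user settings and extra MCP config."""
    return [
        "claude",
        "-p", prompt,
        "--setting-sources", "",   # empty value: load no settings sources
        "--strict-mcp-config",     # ignore MCP servers not passed explicitly
    ]


def run_isolated(prompt, timeout_s=300):
    """Run the command from a throwaway directory and return its stdout."""
    # Assumption: a fresh temp dir stands in for the 'neutral directory'.
    with tempfile.TemporaryDirectory() as neutral_dir:
        result = subprocess.run(
            build_isolated_command(prompt),
            cwd=neutral_dir,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,
        )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with the empty `""` value.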

Results (Sonnet, 3 runs, fully isolated):

  • Baseline (no skill): 15/15 (100%)
  • Treatment (with skill): 15/15 (100%)
  • Delta: 0% in pass rate; the skill reduces output length by 36% (-413 tokens avg)

Updates CONTRIBUTING.md with check_markdown_structure docs and isolation guidance.
