Chapter 07 · The honest picture

The brutally honest pros & cons.

Skills can genuinely make your AI agent behave like a teammate. They can also silently burn money, degrade output quality, and introduce bugs you'll blame on the model. The evidence is here.

The real wins

What skills deliver

Consistency, reusability, team standardization
Token efficiency when done correctly
Composability
More predictable results for document workflows

Where skills harm

What skills cost you

The always-apply token tax. Cursor's alwaysApply: true rules and Copilot's copilot-instructions.md are injected into every single message.
Context rot, the invisible degradation. Chroma Research's 2025 paper tested 18 frontier models, including GPT-4.1, Claude 4, Gemini 2.5, and Qwen3, and every single one degrades as context grows.
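To make the always-apply tax concrete, here is a minimal sketch of the arithmetic. The function name and the figures in the example (a 2,000-token rules file, a 50-message session) are hypothetical and chosen only for illustration; the one thing taken from the text is that an always-on rule is re-sent with every message.

```python
def always_apply_tax(rule_tokens: int, messages: int) -> int:
    """Total input tokens spent re-sending an always-on rules file.

    Assumes the full file is injected once per message, as described above.
    """
    if rule_tokens < 0 or messages < 0:
        raise ValueError("rule_tokens and messages must be non-negative")
    return rule_tokens * messages


# Hypothetical session: a 2,000-token rules file over 50 messages.
total = always_apply_tax(rule_tokens=2_000, messages=50)
print(total)  # 100000
```

The cost scales linearly with both the file size and the conversation length, which is why a rule that is rarely relevant is a poor candidate for always-on injection.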

When skills aren't worth it

One striking data point.

Vercel's published agent evaluations found that a compressed-docs index embedded directly in AGENTS.md reached a 100% pass rate, while skills maxed out at 79%, even with explicit instructions, and performed no better than baseline when left to trigger naturally.

💸

Token usage: hard numbers

Claude Skills metadata costs ~100 tokens per skill; the body stays under 5K tokens when triggered. Context windows range from 128K (most Cursor models) to 200K (Claude standard) to 1M (Claude beta).
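A rough sketch of how those numbers combine, assuming ~100 tokens of metadata per installed skill and the 5K upper bound for each triggered body. The function names and the example of 20 installed skills with 2 triggered are hypothetical, and "K" is treated as 1,000 tokens for simplicity.

```python
METADATA_TOKENS = 100  # per installed skill (approximate figure from the text)
BODY_TOKENS = 5_000    # upper bound per triggered skill body (from the text)

# Window sizes quoted in the text, with K taken as 1,000 (an assumption).
WINDOWS = {"128K": 128_000, "200K": 200_000, "1M": 1_000_000}


def skill_footprint(installed: int, triggered: int) -> int:
    """Worst-case tokens used by skill metadata plus triggered bodies."""
    if triggered > installed:
        raise ValueError("cannot trigger more skills than are installed")
    return installed * METADATA_TOKENS + triggered * BODY_TOKENS


def window_share(tokens: int, window: int) -> float:
    """Fraction of the context window consumed by `tokens`."""
    return tokens / window


# Hypothetical setup: 20 skills installed, 2 triggered in one turn.
footprint = skill_footprint(installed=20, triggered=2)
for label, size in WINDOWS.items():
    print(f"{label}: {window_share(footprint, size):.1%} of window")
```

Metadata alone is cheap; the triggered bodies dominate the footprint, so the number of skills that fire in a single turn matters more than the number installed.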

📉

Degradation starts early

Chroma's context-rot work shows measurable degradation starting at 10–25% window fill for complex tasks. Copilot docs recommend capping instruction files at ~1,000 lines.
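As an illustration, the 10–25% range can be turned into token counts for a given window. This is a sketch under a simplifying assumption: it treats the range as a fixed band, whereas the actual onset depends on the task and the model. The function names are hypothetical.

```python
def degradation_band(window_tokens: int) -> tuple[int, int]:
    """Token counts at 10% and 25% fill of a context window."""
    return window_tokens * 10 // 100, window_tokens * 25 // 100


def fill_status(used_tokens: int, window_tokens: int) -> str:
    """Classify current usage relative to the 10-25% band."""
    low, high = degradation_band(window_tokens)
    if used_tokens < low:
        return "below band"
    if used_tokens <= high:
        return "inside band"
    return "above band"


print(degradation_band(128_000))     # (12800, 32000)
print(fill_status(15_000, 128_000))  # inside band
```

On a 128K window the band opens at 12,800 tokens, so a few large instruction files plus conversation history can reach it well before the window looks full.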

⚖️

The community consensus

Every rule should earn its tokens, and anything that costs more than it saves is lighting money on fire while degrading output quality.