
Extended Thinking Mode

Control model reasoning depth with the --thinking flag for supported providers.

Overview

Extended thinking reserves a token budget for step-by-step reasoning before the model generates its response. The model “thinks out loud” internally, working through complex problems systematically. Use it for complex architecture decisions, multi-step reasoning, debugging obscure issues, mathematical proofs, and strategic planning.

Two Modes

Budget Mode (Token Count)

Specify an exact token budget for the thinking phase.
ccs gemini --thinking 8192      # Allocate 8K tokens
ccs agy --thinking 24576         # Deep analysis

Level Mode (Named Levels)

Use predefined levels for simplified control.
ccs codex --thinking low         # Quick (1K tokens)
ccs codex --thinking medium      # Standard (8K tokens)
ccs codex --thinking high        # Deep (24K tokens)
ccs codex --thinking xhigh       # Maximum (32K tokens)
Level mappings: minimal=512, low=1024, medium=8192, high=24576, xhigh=32768
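
The two input forms can be told apart mechanically. The following is a minimal Python sketch, not CCS's actual implementation: the function name `parse_thinking` and the returned tuple shape are hypothetical, while the level table is copied from the mapping above and the `auto` and `off` keywords come from the usage examples on this page.

```python
# Level-to-budget table copied from the "Level mappings" line above.
LEVEL_BUDGETS = {
    "minimal": 512,
    "low": 1024,
    "medium": 8192,
    "high": 24576,
    "xhigh": 32768,
}


def parse_thinking(value: str) -> tuple:
    """Classify a --thinking argument (hypothetical helper, for illustration).

    Returns ("budget", int), ("level", str), or ("keyword", str).
    """
    text = value.strip().lower()
    if text in ("auto", "off"):
        return ("keyword", text)
    if text.isascii() and text.isdigit():
        return ("budget", int(text))
    if text in LEVEL_BUDGETS:
        return ("level", text)
    raise ValueError(f"unrecognized thinking value: {value!r}")


print(parse_thinking("8192"))   # ('budget', 8192)
print(parse_thinking("high"))   # ('level', 'high')
```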

Provider Support Matrix

Provider     Model                       Type    Range                Dynamic
Antigravity  Claude Opus 4.5 Thinking    budget  1024-100000
Antigravity  Claude Sonnet 4.5 Thinking  budget  1024-100000
Gemini       Gemini 2.5 Pro              budget  128-32768
Gemini       Gemini 3 Pro                levels  low, high
Codex        GPT-5.2 Codex               levels  medium, high, xhigh
Codex        GPT-5 Mini                  levels  medium, high
  • Type: budget (numeric) vs. levels (named presets)
  • Dynamic: Supports auto mode (model decides dynamically)
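
For scripting against the matrix, it can help to hold the rows as data. This is a sketch in Python with hypothetical names (`ThinkingSupport`, `SUPPORT`, `native_kind`); the values are copied from the matrix, and the Dynamic column is left out because the matrix does not list a value for any row.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ThinkingSupport:
    """One row of the support matrix (hypothetical structure)."""
    provider: str
    kind: str                                      # "budget" or "levels"
    budget_range: Optional[Tuple[int, int]] = None  # (min, max) for budget models
    levels: Tuple[str, ...] = ()                    # valid presets for level models


# Rows copied from the Provider Support Matrix above.
SUPPORT = {
    "Claude Opus 4.5 Thinking": ThinkingSupport("Antigravity", "budget", (1024, 100000)),
    "Claude Sonnet 4.5 Thinking": ThinkingSupport("Antigravity", "budget", (1024, 100000)),
    "Gemini 2.5 Pro": ThinkingSupport("Gemini", "budget", (128, 32768)),
    "Gemini 3 Pro": ThinkingSupport("Gemini", "levels", levels=("low", "high")),
    "GPT-5.2 Codex": ThinkingSupport("Codex", "levels", levels=("medium", "high", "xhigh")),
    "GPT-5 Mini": ThinkingSupport("Codex", "levels", levels=("medium", "high")),
}


def native_kind(model: str) -> str:
    """Return whether a model natively takes a numeric budget or a named level."""
    return SUPPORT[model].kind


print(native_kind("Gemini 2.5 Pro"))   # budget
print(SUPPORT["GPT-5 Mini"].levels)    # ('medium', 'high')
```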

Usage

Auto Mode

ccs agy --thinking auto           # Dynamic budget selection
ccs gemini --thinking auto        # Model optimizes cost/quality

Custom Budgets

ccs gemini --thinking 1024        # Light thinking
ccs gemini --thinking 32768       # Deep analysis
ccs agy --thinking 100000         # Maximum (Antigravity only)

Named Levels

ccs codex --thinking off          # Disable thinking
ccs codex --thinking medium       # Balanced
ccs codex --thinking xhigh        # Maximum effort

Cross-Type Compatibility

CCS automatically converts between budgets and levels.
ccs gemini --thinking high        # → 24576 tokens
ccs codex --thinking 8192         # → "medium" level
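
One way such a conversion could work is sketched below in Python. The level table is the one listed on this page. Mapping a level to its budget is a direct lookup; mapping an arbitrary budget back to a level by picking the nearest preset is an assumption, since the page only shows the exact match 8192 → "medium". The function names are hypothetical.

```python
# Level-to-budget table from this page.
LEVEL_BUDGETS = {
    "minimal": 512,
    "low": 1024,
    "medium": 8192,
    "high": 24576,
    "xhigh": 32768,
}


def level_to_budget(level: str) -> int:
    """Named level -> token budget (direct lookup)."""
    return LEVEL_BUDGETS[level]


def budget_to_level(budget: int) -> str:
    """Token budget -> nearest named level.

    Assumption: "nearest" means smallest absolute difference in tokens.
    On a tie, min() keeps the first candidate, which is the lower level
    because the table is listed in ascending order.
    """
    return min(LEVEL_BUDGETS, key=lambda name: abs(LEVEL_BUDGETS[name] - budget))


print(level_to_budget("high"))   # 24576
print(budget_to_level(8192))     # medium
```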

Auto-Capping Behavior

CCS validates the value you pass and adjusts out-of-range or unsupported inputs to the nearest valid setting.

Budget Clamping

ccs gemini --thinking 50000       # → Clamped to 32768 (max)
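
Clamping is a one-line operation. This Python sketch uses a hypothetical `clamp_budget` helper; the upper bound matches the example above, while raising too-small values up to the minimum is an assumption, since the page only demonstrates the upper limit.

```python
def clamp_budget(budget: int, minimum: int, maximum: int) -> int:
    """Force a requested budget into the model's [minimum, maximum] range."""
    return max(minimum, min(budget, maximum))


# Gemini 2.5 Pro range from the support matrix: 128-32768.
print(clamp_budget(50000, 128, 32768))   # 32768
print(clamp_budget(8192, 128, 32768))    # 8192 (already in range)
```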

Level Capping

ccs codex --thinking xhigh        # → Capped to "high" (GPT-5 Mini)
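
Level capping can be pictured as choosing the closest supported level on an ordered scale. The Python sketch below assumes the ordering minimal < low < medium < high < xhigh and measures distance in steps along that scale. The helper name `cap_level` and the tie-break rule are hypothetical.

```python
# Assumed ordering of levels, lowest to highest effort.
LEVEL_ORDER = ["minimal", "low", "medium", "high", "xhigh"]


def cap_level(requested: str, supported: list) -> str:
    """Return the requested level if supported, else the closest supported one.

    Assumption: distance is counted in steps along LEVEL_ORDER. On a tie,
    min() keeps whichever candidate appears first in `supported`.
    """
    if requested in supported:
        return requested
    rank = LEVEL_ORDER.index(requested)
    return min(supported, key=lambda level: abs(LEVEL_ORDER.index(level) - rank))


# GPT-5 Mini accepts medium and high (see the support matrix).
print(cap_level("xhigh", ["medium", "high"]))           # high
# GPT-5.2 Codex accepts medium, high, xhigh.
print(cap_level("low", ["medium", "high", "xhigh"]))    # medium
```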

Fuzzy Matching

ccs codex --thinking hi           # → "high"
ccs codex --thinking med          # → "medium"
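
The page does not say how fuzzy matching is implemented. A simple approach consistent with both examples is unique-prefix matching, sketched here in Python with a hypothetical `fuzzy_level` helper. Returning `None` for ambiguous input is also an assumption.

```python
LEVELS = ["minimal", "low", "medium", "high", "xhigh"]


def fuzzy_level(text: str):
    """Resolve an abbreviated level name by unique prefix.

    Returns the full level name, or None when the input matches
    no level or more than one level.
    """
    key = text.strip().lower()
    if key in LEVELS:
        return key
    matches = [level for level in LEVELS if level.startswith(key)]
    return matches[0] if len(matches) == 1 else None


print(fuzzy_level("hi"))    # high
print(fuzzy_level("med"))   # medium
print(fuzzy_level("m"))     # None ("minimal" and "medium" both match)
```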

Cost Implications

Higher budgets consume more tokens and therefore cost more.
  • low (1K): Minimal cost, fast
  • medium (8K): Moderate cost, balanced
  • high (24K): Higher cost, deep analysis
  • xhigh (32K): Maximum cost, maximum depth
  • Custom (100K): Very high cost (Antigravity only)
Best practices: use auto to let the model choose the budget, start with low or medium for routine tasks, and reserve high budgets for complex problems.
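
To make the scaling concrete, the Python sketch below compares level budgets and computes an upper bound on thinking spend. It assumes a flat per-token price that you supply yourself: `price_per_million` is a placeholder parameter, not a real rate, and actual billing depends on the provider. Both function names are hypothetical.

```python
# Level-to-budget table from this page.
LEVEL_BUDGETS = {
    "minimal": 512,
    "low": 1024,
    "medium": 8192,
    "high": 24576,
    "xhigh": 32768,
}


def relative_budget(level: str, baseline: str = "low") -> float:
    """How many times larger a level's budget is than the baseline level's."""
    return LEVEL_BUDGETS[level] / LEVEL_BUDGETS[baseline]


def max_thinking_cost(budget_tokens: int, price_per_million: float) -> float:
    """Upper bound on thinking spend if the whole budget is consumed.

    price_per_million is a caller-supplied placeholder, not a published rate.
    """
    return budget_tokens * price_per_million / 1_000_000


print(relative_budget("medium"))   # 8.0
print(relative_budget("xhigh"))    # 32.0
```

The budget is a ceiling, so a model that stops thinking early spends less than this bound.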

Troubleshooting

Model Doesn’t Support Thinking

Error: Model gemini-claude-sonnet-4-5 does not support extended thinking
Solution: Use a thinking-enabled variant (e.g., gemini-claude-sonnet-4-5-thinking) or the GLMT profile for GLM/Kimi.

Budget Exceeds Maximum

Warning: Thinking budget 50000 exceeds maximum. Clamped to 32768.
Solution: Use a budget within the model's range, or switch to a model with a higher limit.

Level Not Supported

Warning: Level "xhigh" not valid for gpt-5-mini. Mapped to "high".
Solution: CCS auto-maps to the closest valid level. Check the support matrix for the levels each model accepts.

Dynamic Thinking Unavailable

Warning: Model does not support dynamic/auto thinking
Solution: Specify an explicit level or budget. Check the “Dynamic” column in the support matrix.