# Extended Thinking Mode
Control model reasoning depth with the `--thinking` flag for supported providers.

> Available since v7.38.0: extended context window support with the `--1m` flag for 1M-token context.

## Overview
Extended thinking allocates compute tokens for step-by-step reasoning before generating responses. Models "think out loud" internally to solve complex problems through systematic analysis.

Use it for: complex architecture, multi-step reasoning, debugging obscure issues, mathematical proofs, and strategic planning.

## Priority Order
Thinking settings are resolved in this order (highest wins):

1. `--thinking` CLI flag
2. `CCS_THINKING` environment variable
3. `config.yaml` `thinking` section
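The `config.yaml` `thinking` section is the lowest-priority source. Its exact schema isn't reproduced here; a hypothetical fragment (key names are illustrative assumptions, loosely mirroring the Dashboard's mode and level settings) might look like:

```yaml
# Hypothetical fragment -- key names are assumptions, not the documented schema
thinking:
  mode: auto      # auto / off / manual, as in the Dashboard mode selector
  level: medium   # default named level applied when mode is manual
```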
## `CCS_THINKING` Environment Variable
Override thinking per-session without changing config:
- Named levels: `minimal`, `low`, `medium`, `high`, `xhigh`, `auto`
- Off values: `off`, `none`, `disabled`, `0` (all equivalent — disable thinking)
- Integer budget: `0`–`100000`
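For example, to pin the current session to a named level (only the environment setup is shown; the subsequent `ccs` invocation is unchanged):

```shell
# Session-scoped override: CCS_THINKING beats config.yaml but loses to --thinking
export CCS_THINKING=high       # named level
# export CCS_THINKING=16000    # or an explicit token budget
# export CCS_THINKING=off      # or disable thinking entirely
```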
## Two Modes

### Budget Mode (Token Count)

Specify an exact token budget for the thinking phase.

### Level Mode (Named Levels)

Use predefined levels for simplified control.

## Provider Support Matrix
| Provider | Model | Type | Range | Dynamic |
|---|---|---|---|---|
| Antigravity | Claude Opus 4.6 | budget | 1024-128000 | ✓ |
| Antigravity | Claude Opus 4.5 Thinking | budget | 1024-100000 | ✓ |
| Antigravity | Claude Sonnet 4.5 Thinking | budget | 1024-100000 | ✓ |
| Gemini | Gemini 2.5 Pro | budget | 128-32768 | ✓ |
| Gemini | Gemini 3 Pro | levels | low, high | ✓ |
| Codex | GPT-5.2 Codex | levels | medium, high, xhigh | ✗ |
| Codex | GPT-5 Mini | levels | medium, high | ✗ |
- Type: budget (numeric) vs. levels (named presets)
- Dynamic: supports `auto` mode (the model decides dynamically)
## Usage
### Auto Mode (Recommended)
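A sketch of an auto-mode invocation (the prompt text is illustrative):

```shell
# Let the model decide how much thinking the task needs
ccs --thinking auto "Why does this deadlock only appear under load?"
```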
### Custom Budgets
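A sketch with an explicit token budget (the value is illustrative and must fall within the provider's range from the matrix above):

```shell
# Allocate a fixed 16K-token thinking budget
ccs --thinking 16000 "Design a migration plan for the billing schema"
```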
### Named Levels
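A sketch using named levels (prompts are illustrative):

```shell
# Predefined depth presets instead of raw token counts
ccs --thinking medium "Summarize the failure modes of this retry loop"
ccs --thinking high "Prove this invariant holds across all branches"
```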
## Cross-Type Compatibility
CCS automatically converts between budgets and levels.

## Auto-Capping Behavior
CCS validates values and auto-adjusts invalid inputs.

### Budget Clamping
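The clamping behavior can be sketched as a small function, assuming simple min/max clamping to the ranges in the support matrix (the function name is hypothetical, not part of CCS):

```shell
# Clamp a requested thinking budget into a provider's [min, max] range
clamp_budget() {
  value=$1; min=$2; max=$3
  if [ "$value" -lt "$min" ]; then
    echo "$min"
  elif [ "$value" -gt "$max" ]; then
    echo "$max"
  else
    echo "$value"
  fi
}

clamp_budget 50000 128 32768   # Gemini 2.5 Pro range: prints 32768
```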
### Level Capping
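Level capping can be sketched similarly, assuming levels are ordered minimal &lt; low &lt; medium &lt; high &lt; xhigh and requests above a model's maximum are mapped down (the function name is hypothetical):

```shell
# Map a requested level down to the model's highest supported level
cap_level() {
  requested=$1; max=$2; seen_max=0
  for lvl in minimal low medium high xhigh; do
    [ "$lvl" = "$max" ] && seen_max=1
    if [ "$lvl" = "$requested" ]; then
      if [ "$seen_max" -eq 1 ]; then echo "$max"; else echo "$requested"; fi
      return
    fi
  done
}

cap_level xhigh high   # gpt-5-mini caps at "high": prints high
```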
### Fuzzy Matching
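The exact matching rules aren't specified here; a plausible sketch, assuming unique-prefix resolution (entirely an assumption about how CCS resolves abbreviated level names):

```shell
# Resolve an abbreviated level name by unique prefix, e.g. "hi" -> "high"
match_level() {
  input=$1; hits=""
  for lvl in minimal low medium high xhigh auto off; do
    case $lvl in "$input"*) hits="$hits $lvl" ;; esac
  done
  set -- $hits
  if [ $# -eq 1 ]; then echo "$1"; else echo "unknown or ambiguous: $input" >&2; fi
}

match_level hi   # prints high
```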
## Cost Implications
Higher budgets mean more tokens and higher cost.

- low (1K): minimal cost, fast
- medium (8K): moderate cost, balanced
- high (24K): higher cost, deep analysis
- xhigh (32K): maximum cost, maximum depth
- Custom (100K): very high cost (Antigravity only)
Use `auto` for optimization; reserve high budgets for complex problems, and start with low/medium for routine tasks.
## Troubleshooting
### Model Doesn't Support Thinking

**Error:** `Model gemini-claude-sonnet-4-5 does not support extended thinking`

**Solution:** Use thinking-enabled variants (e.g., `gemini-claude-sonnet-4-5-thinking`) or switch to a supported reasoning-first profile such as `ccs km` when you need Kimi API reasoning. Legacy `ccs glmt` remains compatibility-only.
### Budget Exceeds Maximum

**Warning:** `Thinking budget 50000 exceeds maximum. Clamped to 32768.`

**Solution:** Use a budget within the supported range or switch to a model with a higher limit.
### Level Not Supported

**Warning:** `Level "xhigh" not valid for gpt-5-mini. Mapped to "high".`

**Solution:** CCS auto-maps to the closest valid level. Check the support matrix.
### Dynamic Thinking Unavailable

**Warning:** `Model does not support dynamic/auto thinking`

**Solution:** Specify an explicit level or budget. Check the "Dynamic" column in the matrix.
## `ccs config thinking` Command
Manage thinking configuration interactively or via flags:
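An invocation sketch (the subcommand's flags aren't enumerated here):

```shell
# Open the interactive thinking configuration
ccs config thinking
```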
## Dashboard Thinking Settings

The CCS Dashboard includes a Thinking settings panel with:

- Mode selector — auto / off / manual
- Persistent Override panel — set a global level that overrides tier defaults
- Tier defaults — configure opus/sonnet/haiku default levels
- Provider Overrides section — per-provider tier level customization
## Related

- CLI Flags Reference — `--thinking` flag syntax, `CCS_THINKING` env var
- Antigravity Provider — budget mode, 100K max
- Gemini Provider — budget/level hybrid
- Codex Provider — level-based, `maxLevel` caps
- GLMT Deprecation — legacy compatibility and migration guidance
## Extended Context Window

> Available since v7.38.0 via the `--1m` flag.
### Usage
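An invocation sketch (the prompt is illustrative):

```shell
# Enable the 1M-token extended context for this run
ccs --1m "Analyze how the modules in this repository depend on each other"
```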
### How It Works
The `--1m` flag appends a `[1m]` suffix to model names, routing requests to extended-context variants:
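The rewrite can be illustrated as a plain string operation (the model name is just an example):

```shell
# The [1m] suffix selects the extended-context variant of a model
model="gemini-3-pro"
extended="${model}[1m]"
echo "$extended"   # prints gemini-3-pro[1m]
```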
### Provider Support
**Auto-Enabled:**

- Native Gemini models (always use 1M context by default)
- Claude models via Gemini proxy
- Antigravity models
- Codex models
- Settings-based profiles (GLM, KM, custom APIs)
- Local models (Ollama)
### Best Practices
When to use `--1m`:
- Large codebase analysis
- Multi-file refactoring
- Documentation generation across many files
- Complex architectural planning
When to avoid `--1m`:

- Simple queries (wastes quota)
- Short prompts (no benefit)
- Rate-limited scenarios (uses more quota faster)
### Cost Implications
Extended context consumes quota faster. Use it selectively for tasks requiring large context windows.

### Disable Extended Context

Omit the `--1m` flag when:

- Quota conservation is needed
- Faster response time is preferred
- The task doesn't require a large context
