# Extended Thinking

> Configure reasoning budgets across providers with the unified `--thinking` flag

Control model reasoning depth with the cross-provider `--thinking` flag. Native
Claude launches also accept a session-scoped `--effort` override when you want
Claude's own effort control without rewriting saved settings.

<Note>Extended context window support (the `--1m` flag, for a 1M-token context) is available since v7.38.0.</Note>

## Overview

Extended thinking reserves a token budget for step-by-step reasoning before the model generates its response. The model reasons internally first, which helps on problems that need systematic analysis.

**Use for:** Complex architecture, multi-step reasoning, debugging obscure issues, mathematical proofs, strategic planning.

## Priority Order

Thinking settings are resolved in this order, and the first source that is set wins:

1. `--thinking` CLI flag
2. `CCS_THINKING` environment variable
3. `config.yaml` thinking section
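
The precedence above can be pictured with a small shell sketch. It is an illustration only, not CCS's implementation: `resolve_thinking` and its two arguments are hypothetical, and the config value is passed in as a plain string rather than read from `config.yaml`.

```bash
# Hypothetical sketch of the precedence rule: flag > CCS_THINKING > config.
# Usage: resolve_thinking "<flag value or empty>" "<config value or empty>"
resolve_thinking() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"                 # 1. --thinking flag wins
  elif [ -n "${CCS_THINKING:-}" ]; then
    printf '%s\n' "$CCS_THINKING"      # 2. environment variable
  else
    printf '%s\n' "$2"                 # 3. config.yaml value (may be empty)
  fi
}

CCS_THINKING=low
resolve_thinking "high" "medium"   # prints: high
resolve_thinking ""     "medium"   # prints: low
unset CCS_THINKING
resolve_thinking ""     "medium"   # prints: medium
```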

## Native Claude `--effort`

When a launch stays on native Claude, CCS also accepts Claude's effort flag
directly:

```bash theme={null}
ccs --effort high "debug this regression"
ccs work --effort max "review this architecture"
```

Behavior:

* accepted values: `low`, `medium`, `high`, `xhigh`, `max`
* CCS validates the value before spawn and normalizes case
* the flag stays session-scoped; CCS does not rewrite `~/.claude/settings.json`
  or `~/.ccs/config.yaml`
* CLIProxy-backed thinking flows still treat `--effort` as the CCS alias for
  `--thinking`

Use `--thinking` when you want portable CCS semantics across providers. Use
`--effort` when you specifically want a native Claude one-session override.
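
As a rough picture of "validates the value before spawn and normalizes case", the sketch below lowercases the input and accepts only the values listed above. The function name `normalize_effort` and its error message are assumptions made for illustration; CCS's real validation code and wording may differ.

```bash
# Hypothetical sketch: normalize case, then accept only the documented values.
normalize_effort() {
  effort=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$effort" in
    low|medium|high|xhigh|max) printf '%s\n' "$effort" ;;
    *) printf 'invalid effort: %s\n' "$1" >&2; return 1 ;;
  esac
}

normalize_effort "HIGH"                                # prints: high
normalize_effort "turbo" || echo "rejected before launch"
```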

## `CCS_THINKING` Environment Variable

Override thinking per-session without changing config:

```bash theme={null}
# Named levels
CCS_THINKING=high ccs claude "analyze this"
CCS_THINKING=auto ccs claude "complex task"

# Disable thinking
CCS_THINKING=off ccs codex "quick task"
CCS_THINKING=none ccs codex "quick task"
CCS_THINKING=disabled ccs codex "quick task"
CCS_THINKING=0 ccs codex "quick task"

# Integer budget (0-100000)
CCS_THINKING=24576 ccs claude "deep analysis"
```

**Accepted values:**

* Named levels: `minimal`, `low`, `medium`, `high`, `xhigh`, `max`, `auto`
* Off values: `off`, `none`, `disabled`, `0` (all equivalent — disable thinking)
* Integer budget: `0`–`100000`
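
The three groups of accepted values can be told apart with a simple classifier. This is a sketch under assumptions: `classify_thinking` is a hypothetical helper, and how CCS itself treats integers above the documented range is not stated in this section, so the sketch only flags them.

```bash
# Hypothetical sketch: classify a CCS_THINKING value by the groups listed above.
classify_thinking() {
  case "$1" in
    off|none|disabled|0)                    echo "off" ;;
    minimal|low|medium|high|xhigh|max|auto) echo "level" ;;
    ''|*[!0-9]*)                            echo "invalid"; return 1 ;;
    *)
      # All digits: a budget if it falls inside the documented 0-100000 range.
      if [ "$1" -le 100000 ]; then
        echo "budget"
      else
        echo "out-of-range"; return 1
      fi ;;
  esac
}

classify_thinking disabled   # prints: off
classify_thinking auto       # prints: level
classify_thinking 24576      # prints: budget
```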

## Two Modes

### Budget Mode (Token Count)

Specify an exact token budget for the thinking phase.

```bash theme={null}
ccs claude --thinking 8192      # Allocate 8K tokens
ccs claude --thinking 24576     # Deep analysis
```

### Level Mode (Named Levels)

Use predefined levels for simplified control.

```bash theme={null}
ccs codex --thinking low         # Quick (1K tokens)
ccs codex --thinking medium      # Standard (8K tokens)
ccs codex --thinking high        # Deep (24K tokens)
ccs codex --thinking xhigh       # Maximum (32K tokens)
ccs claude --thinking max        # Adaptive ceiling where supported
```

**Level mappings:** minimal=512, low=1024, medium=8192, high=24576, xhigh=32768, max=adaptive ceiling
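
The mapping above, written as a lookup. The numbers come from the list above; the function name `level_to_budget` is hypothetical, and `max` is left out because it has no fixed token count.

```bash
# Hypothetical lookup mirroring the documented level-to-token mapping.
level_to_budget() {
  case "$1" in
    minimal) echo 512 ;;
    low)     echo 1024 ;;
    medium)  echo 8192 ;;
    high)    echo 24576 ;;
    xhigh)   echo 32768 ;;
    *)       return 1 ;;   # "max" is an adaptive ceiling, not a fixed number
  esac
}

level_to_budget high     # prints: 24576
level_to_budget minimal  # prints: 512
```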

## Provider Support Matrix

| Provider    | Model             | Type   | Range                         | Dynamic |
| ----------- | ----------------- | ------ | ----------------------------- | ------- |
| Claude      | Claude Fable 5    | levels | low, medium, high, xhigh, max | ✓       |
| Claude      | Claude Opus 4.8   | levels | low, medium, high, xhigh, max | ✓       |
| Claude      | Claude Opus 4.7   | levels | low, medium, high, xhigh, max | ✓       |
| Claude      | Claude Opus 4.6   | budget | 1024-128000                   | ✓       |
| Claude      | Claude Sonnet 4.6 | budget | 1024-128000                   | ✓       |
| Antigravity | Claude Opus 4.6   | budget | 1024-128000                   | ✓       |
| Antigravity | Claude Sonnet 4.6 | budget | 1024-64000                    | ✓       |
| Gemini      | Gemini 2.5 Pro    | budget | 128-32768                     | ✓       |
| Gemini      | Gemini 3 Pro      | levels | low, high                     | ✓       |
| Codex       | GPT-5.4           | levels | low, medium, high, xhigh      | ✗       |
| Codex       | GPT-5.4 Mini      | levels | low, medium, high             | ✗       |
| Codex       | GPT-5.2           | levels | low, medium, high, xhigh      | ✗       |

* **Type:** budget (numeric) vs. levels (named presets)
* **Dynamic:** Supports `auto` mode (model decides dynamically)

## Usage

### Auto Mode (Recommended)

```bash theme={null}
ccs claude --thinking auto        # Dynamic budget selection
ccs codex --thinking medium       # Fixed level selection
```

### Custom Budgets

```bash theme={null}
ccs claude --thinking 1024        # Light thinking
ccs claude --thinking 32768       # Deep analysis
ccs claude --thinking 64000       # Very deep analysis
```

### Named Levels

```bash theme={null}
ccs codex --thinking off          # Disable thinking
ccs codex --thinking medium       # Balanced
ccs codex --thinking xhigh        # Maximum effort
ccs claude --thinking max         # Adaptive max on Fable 5 / Opus 4.8
```

### Cross-Type Compatibility

CCS automatically converts between budgets and levels.

```bash theme={null}
ccs claude --thinking high        # → 24576 tokens
ccs codex --thinking 8192         # → "medium" level
```
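
The reverse direction (budget to level) needs a rounding rule, which this page does not spell out. The sketch below assumes the simplest one: pick the highest level whose mapped budget does not exceed the requested budget. Both `budget_to_level` and the round-down rule are assumptions, so treat the in-between case as illustrative only.

```bash
# Hypothetical sketch: map a numeric budget to the highest level it covers.
# Thresholds reuse the documented level mappings; rounding down is an assumption.
budget_to_level() {
  if   [ "$1" -ge 32768 ]; then echo xhigh
  elif [ "$1" -ge 24576 ]; then echo high
  elif [ "$1" -ge 8192  ]; then echo medium
  elif [ "$1" -ge 1024  ]; then echo low
  else                          echo minimal
  fi
}

budget_to_level 8192    # prints: medium
budget_to_level 20000   # prints: medium (rounded down under this assumption)
```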

## Auto-Capping Behavior

CCS validates the requested value and adjusts out-of-range or unsupported inputs instead of failing.

### Budget Clamping

```bash theme={null}
ccs claude --thinking 200000      # → Clamped to 128000 (max)
```
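
Clamping is a plain bounds check. In this sketch the range is passed in explicitly (for example `1024` and `128000` from the support matrix); `clamp_budget` is a hypothetical helper, and raising values below the minimum is an assumption, since only the upper clamp is shown above.

```bash
# Hypothetical sketch. Usage: clamp_budget <requested> <min> <max>
clamp_budget() {
  if   [ "$1" -lt "$2" ]; then echo "$2"   # below range: raise to min (assumed)
  elif [ "$1" -gt "$3" ]; then echo "$3"   # above range: lower to max
  else                         echo "$1"   # already in range
  fi
}

clamp_budget 200000 1024 128000   # prints: 128000
clamp_budget 24576  1024 128000   # prints: 24576
```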

### Level Capping

```bash theme={null}
ccs codex --thinking xhigh        # → Capped to "high" (GPT-5.4 Mini)
ccs codex --thinking max          # → Mapped to "xhigh" when max is not native
```
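
Level capping can be modeled by ranking the levels and comparing the request against the model's highest supported level. The helpers `level_rank` and `cap_level` are hypothetical, and the ordering is taken from the level list on this page.

```bash
# Hypothetical sketch: rank levels, then cap a request at the model's maximum.
level_rank() {
  case "$1" in
    minimal) echo 0 ;;
    low)     echo 1 ;;
    medium)  echo 2 ;;
    high)    echo 3 ;;
    xhigh)   echo 4 ;;
    max)     echo 5 ;;
    *)       return 1 ;;
  esac
}

# Usage: cap_level <requested level> <highest level the model supports>
cap_level() {
  requested_rank=$(level_rank "$1") || return 1
  max_rank=$(level_rank "$2") || return 1
  if [ "$requested_rank" -gt "$max_rank" ]; then echo "$2"; else echo "$1"; fi
}

cap_level xhigh high   # prints: high
cap_level max xhigh    # prints: xhigh
cap_level low high     # prints: low
```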

### Fuzzy Matching

```bash theme={null}
ccs codex --thinking hi           # → "high"
ccs codex --thinking med          # → "medium"
```
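
One way to get the behavior shown above is unique-prefix matching. That choice is an assumption, as is the helper name `fuzzy_level`; the matching algorithm CCS uses is not documented here. Under this rule an ambiguous prefix such as `m` (minimal, medium, max) is rejected rather than guessed.

```bash
# Hypothetical sketch: resolve an abbreviation to the single level it prefixes.
fuzzy_level() {
  [ -n "$1" ] || return 1
  match=""
  count=0
  for level in minimal low medium high xhigh max auto; do
    case "$level" in
      "$1")  echo "$level"; return 0 ;;            # exact match wins
      "$1"*) match=$level; count=$((count + 1)) ;; # prefix match
    esac
  done
  # Succeed only when exactly one level starts with the input.
  [ "$count" -eq 1 ] && echo "$match"
}

fuzzy_level hi    # prints: high
fuzzy_level med   # prints: medium
fuzzy_level m || echo "ambiguous"
```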

## Cost Implications

A higher budget lets the model spend more thinking tokens, which raises cost.

* **low (1K):** Minimal cost, fast
* **medium (8K):** Moderate cost, balanced
* **high (24K):** Higher cost, deep analysis
* **xhigh (32K):** Highest cost among the fixed named levels, maximum depth
* **Custom (32K-128K):** Very high cost, provider/model dependent

**Best practices:** Use `auto` for optimization, reserve high budgets for complex problems, start with `low`/`medium` for routine tasks.

## Troubleshooting

### Model Doesn't Support Thinking

**Error:** `Model gemini-claude-sonnet-4-5 does not support extended thinking`

**Solution:** Use thinking-enabled variants (e.g., `gemini-claude-sonnet-4-5-thinking`) or switch to a supported reasoning-first profile such as `ccs km` when you need Kimi API reasoning. Legacy `ccs glmt` remains compatibility-only.

### Budget Exceeds Maximum

**Warning:** `Thinking budget 200000 exceeds maximum. Clamped to 128000.`

**Solution:** Use a budget within the model's range, or switch to a model with a higher limit.

### Level Not Supported

**Warning:** `Level "xhigh" not valid for gpt-5-mini. Mapped to "high".`

**Solution:** No action is required; CCS maps the request to the closest valid level. Check the support matrix for the levels each model accepts.

### Dynamic Thinking Unavailable

**Warning:** `Model does not support dynamic/auto thinking`

**Solution:** Specify an explicit level or budget. Check the "Dynamic" column in the support matrix.

## `ccs config thinking` Command

Manage thinking configuration interactively or via flags:

```bash theme={null}
# Show current thinking config
ccs config thinking

# Set thinking mode
ccs config thinking --mode auto           # Dynamic tier-based defaults
ccs config thinking --mode off            # Disable thinking entirely
ccs config thinking --mode manual         # Use explicit override

# Persistent override (applies to all providers)
ccs config thinking --override high       # Set persistent override level
ccs config thinking --override 24576      # Set persistent override budget
ccs config thinking --clear-override      # Remove persistent override

# Per-tier defaults
ccs config thinking --tier opus high      # Set opus tier to "high"
ccs config thinking --tier sonnet medium  # Set sonnet tier to "medium"
ccs config thinking --tier haiku low      # Set haiku tier to "low"

# Per-provider overrides
ccs config thinking --provider-override gemini opus xhigh
ccs config thinking --provider-override agy sonnet high
ccs config thinking --clear-provider-override gemini        # Clear all gemini overrides
ccs config thinking --clear-provider-override gemini opus   # Clear gemini opus only
```

### Dashboard Thinking Settings

The CCS Dashboard includes a **Thinking** settings panel with:

* **Mode selector** — auto / off / manual
* **Persistent Override** panel — set a global level that overrides tier defaults
* **Tier defaults** — configure opus/sonnet/haiku default levels
* **Provider Overrides** section — per-provider tier level customization

## Related

* [CLI Flags Reference](/reference/cli-flags) - `--thinking` flag syntax, `CCS_THINKING` env var
* [Antigravity Provider](/providers/oauth/agy) - Budget mode, 100K max
* [Gemini Provider](/providers/oauth/gemini) - Budget/level hybrid
* [Codex Provider](/providers/oauth/codex) - Level-based, maxLevel caps
* [GLMT Deprecation](/features/ai/glmt-controls) - Legacy compatibility and migration guidance

## Extended Context Window

<Note>Available since v7.38.0</Note>

Enable a 1M-token context window on supported models with the `--1m` flag.

### Usage

```bash theme={null}
# Enable 1M context
ccs claude --1m "analyze this large codebase"

# Disable (use default context)
ccs claude --no-1m "quick prompt"

# Combine with thinking mode
ccs claude --thinking auto --1m "deep analysis of large project"
```

### How It Works

The `--1m` flag appends a `[1m]` suffix to the model name, which routes the request to the extended-context variant:

```bash theme={null}
# Without --1m
gemini-claude-sonnet-4-5

# With --1m
gemini-claude-sonnet-4-5[1m]
```
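
The renaming step amounts to a string append. The sketch below is illustrative: `with_1m` is a hypothetical helper, and leaving an already-suffixed name unchanged is an assumption rather than documented behavior.

```bash
# Hypothetical sketch: append the [1m] suffix unless it is already present.
with_1m() {
  case "$1" in
    *'[1m]') printf '%s\n' "$1" ;;        # already an extended-context name
    *)       printf '%s[1m]\n' "$1" ;;
  esac
}

with_1m "gemini-claude-sonnet-4-5"       # prints: gemini-claude-sonnet-4-5[1m]
with_1m "gemini-claude-sonnet-4-5[1m]"   # prints: gemini-claude-sonnet-4-5[1m]
```

Quote the model name whenever it passes through a shell, since unquoted square brackets are treated as a glob pattern.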

### Provider Support

**Auto-Enabled:**

* Native Gemini models (always use 1M context by default)

**Opt-In:**

* Claude models via Gemini proxy
* Antigravity models
* Codex models

**Not Supported:**

* Settings-based profiles (GLM, KM, custom APIs)
* Local models (Ollama)

### Best Practices

**When to use `--1m`:**

* Large codebase analysis
* Multi-file refactoring
* Documentation generation across many files
* Complex architectural planning

**When NOT to use:**

* Simple queries (wastes quota)
* Short prompts (no benefit)
* Rate-limited scenarios (uses more quota faster)

### Cost Implications

Extended context consumes quota faster. Use selectively for tasks requiring large context windows.

### Disable Extended Context

```bash theme={null}
# Force standard context
ccs claude --no-1m "your prompt"
```

Useful when:

* Quota conservation needed
* Faster response time preferred
* Task doesn't require large context