OpenAI-Compatible Provider Routing

CCS can bridge Claude Code into OpenAI-compatible providers through a local Anthropic-compatible proxy. This is useful for Hugging Face Inference Providers, OpenRouter, Ollama, llama.cpp, OpenAI-compatible self-hosted gateways, and similar APIs.

Quick Start

Create an API profile, then launch it normally:
ccs api create --preset hf
ccs hf
For Claude-target launches, CCS detects compatible OpenAI-style endpoints, starts a local proxy on 127.0.0.1, translates Anthropic /v1/messages requests into OpenAI chat-completions requests, and translates streaming responses back into Anthropic SSE.
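The request translation step can be sketched as follows. This is an illustrative sketch only, not CCS's actual implementation; the field names follow the public Anthropic and OpenAI request shapes, and the `anthropic_to_openai` helper is hypothetical:

```python
# Illustrative sketch of the Anthropic -> OpenAI request translation the
# proxy performs. Field names follow the two public APIs; the helper itself
# is hypothetical, not CCS's real code.

def anthropic_to_openai(body: dict) -> dict:
    """Map an Anthropic /v1/messages body to an OpenAI chat-completions body."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first chat message.
    if body.get("system"):
        messages.append({"role": "system", "content": body["system"]})
    for msg in body.get("messages", []):
        content = msg["content"]
        # Anthropic content may be a list of typed blocks; flatten text blocks.
        if isinstance(content, list):
            content = "".join(
                b.get("text", "") for b in content if b.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": body["model"],
        "messages": messages,
        "max_tokens": body.get("max_tokens", 4096),
        "stream": body.get("stream", False),
    }
```

The reverse direction (OpenAI streaming chunks back into Anthropic SSE events) follows the same shape-mapping idea on the response side.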

Manual Proxy Lifecycle

Use the ccs proxy command when you want to start, inspect, activate, or stop the local proxy explicitly:
ccs proxy start hf
eval "$(ccs proxy activate hf)"
ccs proxy status hf
ccs proxy stop hf
Useful variants:
ccs proxy start hf --host 127.0.0.1
ccs proxy start hf --port 3460
ccs proxy activate --fish
ccs proxy activate prints the local runtime contract for the selected profile, including ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, model defaults, timeout settings, telemetry suppression, and NO_PROXY.

Adaptive Ports

CCS stores local proxy state per profile, so multiple compatible profiles can run at the same time. Port selection precedence:
  1. CLI --port
  2. proxy.profile_ports[profile]
  3. shared preferred proxy.port
  4. adaptive per-profile fallback
A legacy shared proxy.port: 3456 value is treated as unset, so older configs migrate onto adaptive ports instead of staying pinned to the contended legacy default. Pin 3456 explicitly with --port or proxy.profile_ports only if you really need that exact binding.
Example configuration:
proxy:
  port: 45000
  profile_ports:
    hf: 3460
    openrouter: 3461

Request-Time Routing

The proxy can route a request to another compatible profile or model at request time:
Selector          Example                        Behavior
profile:model     deepseek:deepseek-reasoner     route to the exact profile and model
profile           openrouter                     route to that profile's default model
plain model id    deepseek-chat                  match configured model slots exactly
Scenario routing is configured under proxy.routing:
proxy:
  routing:
    default: "deepseek:deepseek-chat"
    background: "ollama:qwen2.5-coder:0.5b"
    think: "deepseek:deepseek-reasoner"
    longContext: "openrouter:google/gemini-2.5-pro"
    longContextThreshold: 60000
    webSearch: "openrouter:perplexity/sonar-pro"
Current scenario detection covers background models, thinking-enabled requests, long-context requests, and web-search tool usage. claude-code-router (CCR) is a standalone router whose transformer architecture informed this CCS flow: use CCR when you want a standalone router; use CCS when you want routing integrated with CCS profiles, runtime bridges, and the ccs command surface.
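Scenario selection over the proxy.routing keys shown above can be sketched as follows. The detection signals (boolean flags, an input token count) and the precedence order among scenarios are simplified assumptions for illustration, not CCS's actual detection logic:

```python
# Sketch of scenario routing over the proxy.routing config keys.
# Detection signals and scenario precedence are assumptions.
def pick_route(routing: dict, *, is_background: bool = False, thinking: bool = False,
               input_tokens: int = 0, uses_web_search: bool = False) -> str:
    threshold = routing.get("longContextThreshold", 60000)
    if uses_web_search and "webSearch" in routing:
        return routing["webSearch"]
    if input_tokens > threshold and "longContext" in routing:
        return routing["longContext"]
    if thinking and "think" in routing:
        return routing["think"]
    if is_background and "background" in routing:
        return routing["background"]
    return routing["default"]
```

With the example config above, a thinking-enabled request routes to deepseek:deepseek-reasoner, a 70k-token request routes to the Gemini long-context profile, and everything else falls back to deepseek:deepseek-chat.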