# Ollama Provider

> Local and cloud Ollama models with zero API costs and full privacy

Run local open-source models via Ollama with zero API costs and complete privacy, or use Ollama Cloud for hosted models.

## Quick Start

```bash theme={null}
# Local Ollama (no API key needed)
ccs ollama "explain this code"

# Ollama Cloud (requires API key)
ccs ollama-cloud "refactor this function"
```

## Variants

### Ollama (Local)

Run models on your local machine with complete privacy and zero API costs.

**Configuration:**

* Base URL: `http://localhost:11434`
* Default model: `qwen3-coder`
* API key: Not required
* Context: 32K+ tokens

**Prerequisites:** Ollama must be installed and running locally.

### Ollama Cloud

Access Ollama's hosted models via their cloud API.

**Configuration:**

* Base URL: `https://ollama.com`
* Default models: `glm-4.7:cloud`, `minimax-m2.1:cloud`
* API key: Required from ollama.com
* Context: Varies by model

## Prerequisites

### Installing Ollama (Local)

<Steps>
  <Step title="Download Ollama">
    Visit [ollama.com](https://ollama.com) and download the installer for your platform
  </Step>

  <Step title="Install">
    Follow the installation instructions for your platform
  </Step>

  <Step title="Verify Installation">
    ```bash theme={null}
    ollama --version
    ```
  </Step>

  <Step title="Pull Model">
    ```bash theme={null}
    ollama pull qwen3-coder
    ```
  </Step>
</Steps>

### Ollama Cloud Setup

<Steps>
  <Step title="Create Account">
    Sign up at [ollama.com](https://ollama.com)
  </Step>

  <Step title="Get API Key">
    Navigate to API settings and generate your API key
  </Step>

  <Step title="Configure CCS">
    ```bash theme={null}
    ccs setup --preset ollama-cloud
    # Enter your API key when prompted
    ```
  </Step>
</Steps>

## Configuration

### Local Ollama Setup

```bash theme={null}
# Via API profile preset
ccs api create --preset ollama

# Or direct shortcut
ccs ollama "test local model"
```

Or configure the profile manually in `~/.ccs/config.yaml`:

```yaml theme={null}
# ~/.ccs/config.yaml
profiles:
  ollama:
    env:
      ANTHROPIC_BASE_URL: "http://localhost:11434"
      ANTHROPIC_MODEL: "qwen3-coder"
      ANTHROPIC_AUTH_TOKEN: "ollama"
```

### Ollama Cloud Setup

```yaml theme={null}
# ~/.ccs/config.yaml
profiles:
  ollama-cloud:
    env:
      ANTHROPIC_BASE_URL: "https://ollama.com"
      ANTHROPIC_MODEL: "glm-4.7:cloud"
      ANTHROPIC_AUTH_TOKEN: "YOUR_OLLAMA_CLOUD_API_KEY"
```
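The profile above stores the API key in plain text, so it may be worth restricting the file to your own user. A minimal sketch, assuming a POSIX system; `lock_down_config` is a hypothetical helper for illustration, not a CCS command.

```bash
# Hypothetical helper (not part of CCS): make the config file readable and
# writable by its owner only, since it holds the API key in plain text.
lock_down_config() {
  local config_file="${1:-$HOME/.ccs/config.yaml}"
  if [ ! -f "$config_file" ]; then
    echo "config not found: $config_file" >&2
    return 1
  fi
  chmod 600 "$config_file"
}
```

Calling `lock_down_config` with no argument targets `~/.ccs/config.yaml`; pass a path to target a different file.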

## Model Selection

### Popular Local Models

| Model            | Size | Context | Use Case             |
| ---------------- | ---- | ------- | -------------------- |
| `qwen3-coder`    | 30B  | 256K    | Coding (recommended) |
| `deepseek-coder` | 6.7B | 16K     | Code completion      |
| `codellama`      | 7B   | 16K     | Code generation      |
| `mistral`        | 7B   | 8K      | General purpose      |

### Pulling Models

```bash theme={null}
# Install recommended coding model
ollama pull qwen3-coder

# List installed models
ollama list

# Remove unused models
ollama rm model-name
```

### Cloud Models

| Model                | Description              |
| -------------------- | ------------------------ |
| `glm-4.7:cloud`      | GLM via Ollama Cloud     |
| `minimax-m2.1:cloud` | Minimax via Ollama Cloud |

## Usage Examples

### Local Ollama

```bash theme={null}
# Basic usage
ccs ollama "explain this function"

# Switch models
ANTHROPIC_MODEL=deepseek-coder ccs ollama "review this code"

# Custom temperature
ANTHROPIC_TEMPERATURE=0.7 ccs ollama "generate unit tests"
```
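When comparing models, the per-command `ANTHROPIC_MODEL` override shown above can be wrapped in a loop. This is a rough sketch: `compare_models` is a hypothetical helper, and it assumes every model named has already been pulled.

```bash
# Hypothetical helper: send one prompt to several local models in turn.
# Assumes each model has already been pulled with `ollama pull`.
compare_models() {
  local prompt="$1"
  shift
  local model
  for model in "$@"; do
    echo "=== $model ==="
    # Same per-command override as in the example above.
    ANTHROPIC_MODEL="$model" ccs ollama "$prompt"
  done
}
```

For example, `compare_models "review this code" qwen3-coder deepseek-coder` runs the prompt once per model and prints a header before each response.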

### Ollama Cloud

```bash theme={null}
# Use cloud variant
ccs ollama-cloud "debug this error"

# Specific cloud model
ANTHROPIC_MODEL=minimax-m2.1:cloud ccs ollama-cloud "optimize performance"
```

## Troubleshooting

### Connection Refused

**Symptom:** `Error: connect ECONNREFUSED 127.0.0.1:11434`

**Cause:** Ollama service not running

**Solution:**

```bash theme={null}
# Start Ollama service
ollama serve

# On macOS or Windows, launching the Ollama app also starts the service
```
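If you start the service from a script, the first request can still hit `ECONNREFUSED` while Ollama is coming up. One way to handle this is to poll the API before calling CCS. A minimal sketch, assuming `curl` is installed; `wait_for_ollama` is a hypothetical helper, and the retry count and delay are arbitrary defaults.

```bash
# Hypothetical helper: poll the local Ollama API until it answers or we give up.
# /api/version is an Ollama REST endpoint; retry count and delay are arbitrary.
wait_for_ollama() {
  local base_url="${1:-http://localhost:11434}"
  local attempts="${2:-10}"
  local delay="${3:-1}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS --max-time 2 "$base_url/api/version" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Ollama did not respond at $base_url after $attempts attempts" >&2
  return 1
}
```

A typical use would be `wait_for_ollama && ccs ollama "test local model"`, so the CCS call only runs once the server responds.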

### Model Not Found

**Symptom:** `Error: model 'qwen3-coder' not found`

**Cause:** Model not pulled locally

**Solution:**

```bash theme={null}
# Pull the model first
ollama pull qwen3-coder

# Verify installation
ollama list
```
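The two steps above can be combined so that a model is pulled only when it is missing. This sketch assumes the first column of `ollama list` is the model name in `name:tag` form; `ensure_model` is a hypothetical helper, not an Ollama or CCS command.

```bash
# Hypothetical helper: pull a model only when `ollama list` does not show it.
# Assumes the first column of `ollama list` is NAME in the form name:tag.
ensure_model() {
  local model="$1"
  # A bare name such as "qwen3-coder" is listed with the default ":latest" tag.
  case "$model" in
    *:*) ;;
    *) model="$model:latest" ;;
  esac
  # Skip the header row, then look for an exact match on the name column.
  if ollama list | awk 'NR > 1 { print $1 }' | grep -Fxq "$model"; then
    echo "$model is already installed"
  else
    ollama pull "$model"
  fi
}
```

Running `ensure_model qwen3-coder` before `ccs ollama` avoids the error without downloading the model again on every run.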

### Slow Responses

**Symptom:** Long response times

**Causes & Solutions:**

* **CPU-only inference:** Use a smaller model, or run Ollama on a machine with a supported GPU
* **Large model:** Switch to a smaller model, or to a smaller tag of the same model if one is published
* **Insufficient RAM:** Close other apps, or use a quantized model

**Optimize performance:**

```bash theme={null}
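# Pull a smaller quantized tag. Tag names vary by model, so check the
# model's page in the Ollama library for the tags that actually exist.
ollama pull qwen3-coder:q4_0

# Use that tag for a single run
ANTHROPIC_MODEL=qwen3-coder:q4_0 ccs ollama "test"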
```
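To find out whether CPU-only inference is the cause, you can inspect where the loaded model is running. A minimal sketch, assuming `ollama ps` prints a PROCESSOR column with values such as `100% GPU` or `100% CPU`; `uses_cpu` is a hypothetical helper.

```bash
# Hypothetical helper: succeed if any loaded model runs fully or partly on CPU.
# Assumes `ollama ps` prints a PROCESSOR column such as "100% GPU" or "100% CPU".
# Only models currently loaded in memory are listed, so send a request first.
uses_cpu() {
  ollama ps | awk 'NR > 1' | grep -q 'CPU'
}
```

For example, `uses_cpu && echo "model is at least partly on CPU"` right after a request gives a quick hint about whether a smaller model is worth trying.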

### Ollama Cloud API Errors

**Symptom:** `401 Unauthorized` or `403 Forbidden`

**Solution:**

```bash theme={null}
# Verify API key is correct
ccs config
# Navigate to ollama-cloud profile
# Re-enter API key
```

## Performance Tuning

### Context Length

```yaml theme={null}
# ~/.ccs/config.yaml
profiles:
  ollama:
    env:
      ANTHROPIC_MAX_TOKENS: "32768"  # Adjust based on model
```

### Concurrency

Ollama queues concurrent requests. To let it process several requests at once, at the cost of more memory, raise `OLLAMA_NUM_PARALLEL` when starting the server:

```bash theme={null}
# Increase parallel requests (Ollama config)
OLLAMA_NUM_PARALLEL=4 ollama serve
```

## Cost Information

| Variant        | Cost                    | Privacy                                |
| -------------- | ----------------------- | -------------------------------------- |
| Ollama (Local) | **\$0** (hardware only) | Complete - data never leaves machine   |
| Ollama Cloud   | Varies by usage         | Depends on Ollama Cloud privacy policy |

## Storage Locations

| Path                          | Description                            |
| ----------------------------- | -------------------------------------- |
| `~/.ollama/models/`           | Downloaded model files                 |
| `~/.ccs/config.yaml`          | CCS profile configuration              |
| `~/.ccs/ollama.settings.json` | Model preferences (if using Dashboard) |
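
Model files are usually the largest item in this list. The sketch below reports how much disk space they use. It assumes the default path from the table unless Ollama's `OLLAMA_MODELS` variable points elsewhere; `models_disk_usage` is a hypothetical helper.

```bash
# Hypothetical helper: print the disk space used by downloaded models.
# OLLAMA_MODELS is Ollama's override for the model directory; otherwise the
# default path from the table above is used.
models_disk_usage() {
  local dir="${1:-${OLLAMA_MODELS:-$HOME/.ollama/models}}"
  if [ ! -d "$dir" ]; then
    echo "no model directory at $dir" >&2
    return 1
  fi
  du -sh "$dir"
}
```

If the total is larger than expected, `ollama list` shows per-model sizes and `ollama rm model-name` frees the space.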

## Ollama vs llama.cpp

| Feature             | Ollama                  | llama.cpp                          |
| ------------------- | ----------------------- | ---------------------------------- |
| **Model format**    | Ollama format           | GGUF (raw)                         |
| **Setup**           | Easier (install + pull) | More manual                        |
| **Model selection** | Built-in model library  | Any GGUF file                      |
| **Performance**     | Good                    | Better (more optimization options) |
| **Community**       | Large, many models      | Smaller but growing                |
| **Best for**        | Getting started quickly | Fine-tuned control                 |

Use **Ollama** for quick setup with curated models. Use **llama.cpp** if you need specific GGUF models or advanced tuning.

## Next Steps

<CardGroup cols={2}>
  <Card title="API Profiles" icon="server" href="/providers/concepts/api-profiles">
    Configure custom Ollama endpoints
  </Card>

  <Card title="llama.cpp Provider" icon="code-branch" href="/providers/api/llamacpp">
    Alternative GGUF-based local inference
  </Card>

  <Card title="Dashboard" icon="gauge" href="/features/dashboard/overview">
    Manage models via web interface
  </Card>

  <Card title="Remote Proxy" icon="network-wired" href="/features/proxy/remote-proxy">
    Run Ollama on remote server
  </Card>
</CardGroup>