Koda — Run open models on your machine

Run models locally

Pull and run Llama, Mistral, Gemma, and more with a single command. No cloud, no API keys, no Docker.

$ koda pull llama3.2
Downloading llama3.2 (Q4_K_M)...
████████████████████████████ 100%
Saved to ~/.koda/models/

$ koda run llama3.2
Loading llama3.2...

› What is the capital of France?
The capital of France is Paris.

› /bye
✓ Session ended

Drop-in API compatibility

Koda speaks both the Ollama and OpenAI protocols. Point any compatible client at it by changing the base URL; the rest of your code stays the same.

koda serve

from openai import OpenAI

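# Point the standard OpenAI client at the local Koda server.
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="koda"  # placeholder: the client requires a non-empty value
)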

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{
        "role": "user",
        "content": "Hello"
    }]
)
print(response.choices[0].message.content)
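
The Ollama-style side of the same server could be exercised without any SDK. The sketch below is a guess at how that might look, not Koda's documented API: it assumes Koda mirrors Ollama's `/api/chat` route and response shape on the same port as the example above. `build_chat_request` and `chat` are hypothetical helper names used only for illustration.

```python
import json
import urllib.request

# Assumption: Koda mirrors Ollama's /api/chat route on the same port as the
# OpenAI-style example (11434). This page does not confirm either detail.
KODA_URL = "http://localhost:11434"


def build_chat_request(model, prompt, base_url=KODA_URL):
    """Build (but do not send) an Ollama-style chat request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, prompt, base_url=KODA_URL):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(model, prompt, base_url)
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    # Ollama-style responses carry the reply under message.content.
    return body["message"]["content"]
```

With `koda serve` running, `print(chat("llama3.2", "Hello"))` would be the equivalent of the OpenAI-client call above, provided the assumed route exists.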

Your data stays yours

Everything runs on your hardware
No data ever leaves your machine
Run fully offline for sensitive work
GPU accelerated — CUDA and Metal supported
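
One way to hold a client to the "nothing leaves your machine" promise is to check that its base URL points at the loopback interface before sending anything. The sketch below is illustrative only and not part of Koda: `is_local_endpoint` is a hypothetical helper, and it deliberately treats every DNS name other than `localhost` as remote rather than resolving it.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(base_url):
    """Return True if base_url targets this machine (loopback only)."""
    host = urlparse(base_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and the IPv6 loopback address ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A DNS name other than localhost: treat it as remote without resolving.
        return False
```

A client wrapper could call `is_local_endpoint("http://localhost:11434/v1")` at startup and refuse to run when it returns `False`.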

Ask from your terminal

Get answers without leaving your shell. Koda reads your directory and shell history for context-aware help.

$ koda ask how do I undo the last git commit

To undo the last commit but keep changes:

  git reset --soft HEAD~1

To undo and discard changes:

  git reset --hard HEAD~1

Get up and running with open models.

Get started with Koda