One interface for all AI coding assistants — auto-routes to the best model for every task
You have API keys for Claude, GPT, and Gemini, plus local models through Ollama. Each model is best at different things, but you don't want to pick one by hand every time — you just want the best answer.

omni-coder picks the right model automatically.
```
┌──────────────────────────────────────────────────────────┐
│                        omni-coder                        │
│                                                          │
│  "Review this code"   ───────→  Claude Opus (deep)       │
│  "Write a component"  ───────→  Claude Sonnet (quality)  │
│  "Fix this typo"      ───────→  Haiku (fast + cheap)     │
│  "Creative blog post" ───────→  GPT-4o (creative)        │
│  "Analyze 50k lines"  ───────→  Gemini Pro (1M context)  │
│  "Private code"       ───────→  Ollama (local, free)     │
│                                                          │
│          All through ONE command: omni <prompt>          │
└──────────────────────────────────────────────────────────┘
```
```bash
# Set at least one API key
export ANTHROPIC_API_KEY=sk-ant-xxxxx

# Auto-routes to the best model
npx omni-coder "Write a rate limiter middleware in Express"

# Force a specific model
npx omni-coder -m opus "Review this auth code for vulnerabilities"

# Include a file as context
npx omni-coder -f src/api.js "Find bugs in this file"

# Compare models side-by-side
npx omni-coder --compare "Implement binary search in Python"

# See which model would be used (no API call)
npx omni-coder route "Fix the button styling"

# Dry run — show routing + cost estimate
npx omni-coder --dry-run "Generate a full CRUD API"
```

omni-coder analyzes your prompt and routes to the optimal model:
| Task Type | Routes To | Why |
|---|---|---|
| Code review, security audit | Claude Opus | Deepest analysis, catches subtle bugs |
| Code generation, implementation | Claude Sonnet | Best code quality at reasonable cost |
| Quick edits, typo fixes | Claude Haiku | Fast and cheap for simple tasks |
| Math, algorithms, reasoning | Claude Opus / o1 | Strongest reasoning capability |
| Creative writing, marketing | GPT-4o | Strong creative output |
| Large codebase analysis | Gemini 1.5 Pro | 1M token context window |
| Private/sensitive code | Ollama | Data never leaves your machine |
| Tests, specs | Claude Sonnet | Reliable test generation |
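The routing table above could be approximated with a simple keyword heuristic. A minimal TypeScript sketch (hypothetical: the `route` function, keyword lists, and context threshold are illustrative, not omni-coder's actual implementation):

```typescript
// Sketch of prompt-based routing (hypothetical; the real heuristics
// in omni-coder are more involved). First matching rule wins.
type Model = "opus" | "sonnet" | "haiku" | "gpt-4o" | "gemini-pro" | "ollama";

const RULES: Array<{ pattern: RegExp; model: Model }> = [
  { pattern: /review|audit|vulnerab|security/i, model: "opus" },    // deep analysis
  { pattern: /typo|rename|quick fix/i,          model: "haiku" },   // cheap + fast
  { pattern: /blog|marketing|creative/i,        model: "gpt-4o" },  // creative work
  { pattern: /private|sensitive|confidential/i, model: "ollama" },  // stays local
];

function route(prompt: string, contextTokens = 0): Model {
  // Very large contexts go to the long-context model regardless of keywords.
  if (contextTokens > 200_000) return "gemini-pro";
  for (const { pattern, model } of RULES) {
    if (pattern.test(prompt)) return model;
  }
  return "sonnet"; // default: general code generation
}
```

For example, `route("Fix this typo")` returns `haiku`, while a prompt with no keyword match falls through to the code-generation default.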
```bash
# Force a model with -m
omni -m gpt-4o "Write a haiku about coding"
omni -m ollama "Explain this function"  # stays local
```

| Provider | Models | API Key |
|---|---|---|
| Anthropic | opus, sonnet, haiku | ANTHROPIC_API_KEY |
| OpenAI | gpt-4o, gpt-4o-mini, o1 | OPENAI_API_KEY |
| Google | gemini-pro, gemini-flash | GOOGLE_API_KEY |
| Ollama | Any local model | None (free, local) |
```bash
omni models  # Show all models with pricing
```

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|
| Claude Opus | $15.00 | $75.00 | Complex analysis |
| Claude Sonnet | $3.00 | $15.00 | Code generation |
| Claude Haiku | $0.25 | $1.25 | Quick tasks |
| GPT-4o | $2.50 | $10.00 | Creative work |
| GPT-4o Mini | $0.15 | $0.60 | Bulk operations |
| Gemini Pro | $1.25 | $5.00 | Large context |
| Gemini Flash | $0.075 | $0.30 | Speed + cost |
| Ollama | Free | Free | Privacy |
omni-coder saves you money by routing cheap tasks to cheap models and only using expensive models when the task demands it.
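A dry-run cost estimate falls directly out of the pricing table. A sketch, with prices copied from the table above (the `estimateCost` function and token counts are illustrative assumptions, not omni-coder's API):

```typescript
// Sketch: estimate a request's cost in USD from per-1M-token prices.
// Prices below are from the pricing table; token counts are assumed
// inputs (a real CLI would tokenize the prompt to get them).
const PRICES: Record<string, { input: number; output: number }> = {
  sonnet:         { input: 3.0,   output: 15.0 },
  haiku:          { input: 0.25,  output: 1.25 },
  "gemini-flash": { input: 0.075, output: 0.3 },
};

function estimateCost(model: string, inTokens: number, outTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (inTokens / 1_000_000) * p.input + (outTokens / 1_000_000) * p.output;
}

// A 2,000-token prompt with a 500-token reply on sonnet:
// 2000/1M * $3.00 + 500/1M * $15.00 = $0.006 + $0.0075 = $0.0135
```

The same request on haiku would cost roughly twelve times less, which is why routing trivial tasks downmarket adds up.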
See how different models handle the same prompt:
```bash
omni --compare "Implement a debounce function in TypeScript"
```

Output:

```
Comparing across: sonnet, gpt-4o, gemini-flash

─── sonnet ───
function debounce<T extends (...args: any[]) => void>(fn: T, ms: number)...
[1.2s | $0.0003]

─── gpt-4o ───
const debounce = (func: Function, delay: number)...
[0.9s | $0.0002]

─── gemini-flash ───
function debounce(func: (...args: any[]) => void, wait: number)...
[0.4s | $0.0001]
```
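Compare mode fans the same prompt out to every model concurrently rather than sequentially, so total wall time is roughly the slowest model's latency. Conceptually (`ask` is a hypothetical stand-in for the real provider calls, not omni-coder's API):

```typescript
// Sketch of compare mode: query several models in parallel, timing each.
// `ask` is a placeholder for real provider API calls (assumption).
async function ask(model: string, prompt: string): Promise<string> {
  return `[${model}] response to: ${prompt}`; // stub response
}

async function compare(models: string[], prompt: string) {
  // Promise.all preserves input order, so results line up with `models`.
  return Promise.all(
    models.map(async (model) => {
      const start = Date.now();
      const text = await ask(model, prompt);
      return { model, text, ms: Date.now() - start };
    })
  );
}
```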
```bash
# Add to .bashrc / .zshrc
export ANTHROPIC_API_KEY=sk-ant-xxxxx
export OPENAI_API_KEY=sk-xxxxx
export GOOGLE_API_KEY=AIzaxxxxx

# Optional
export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODEL=qwen2.5-coder:7b
```

| Flag | Description |
|---|---|
| `-m, --model <name>` | Force specific model |
| `-f, --file <path>` | Include file as context |
| `-s, --system <prompt>` | Custom system prompt |
| `--max-tokens <n>` | Max output tokens (default: 4096) |
| `--dry-run` | Show routing + cost without API call |
| `--compare` | Compare responses from multiple models |
| Feature | omni-coder | LiteLLM | Raw APIs |
|---|---|---|---|
| Smart routing | Yes | No | No |
| Cost estimation | Yes | No | Manual |
| Compare mode | Yes | No | No |
| Zero config | Yes | Setup needed | Manual |
| Offline/local | Yes (Ollama) | Limited | Manual |
| Single binary | Yes (npx) | pip install | N/A |
- claude-enchant — Auto-triggering behavioral rules
- spellbook — CLI skills framework
- claude-agent-kit — Multi-agent orchestration
- eve-ai — Animated AI companion with EVE eyes
MIT
Built by Anuar AX