AtomicBot-ai/Atomic-Chat

Atomic Chat

Local AI app and inference engine for agents. Run open-weight LLMs locally — private, on your machine.

Getting Started · Hugging Face · Discord · X / Twitter · Bug Reports

Atomic Chat — local AI chat in action


📦 Download

Desktop

Download for macOS  Download for Windows  Download for Linux

Mobile

Download for iOS  Download for Android


🔌 Use It as an API

Atomic Chat runs an OpenAI-compatible server at http://localhost:1337/v1 — a drop-in replacement for the OpenAI SDK. Load a model in the app, then point any client at it:

curl http://localhost:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-id-loaded-in-atomic-chat>",
    "messages": [{ "role": "user", "content": "Say hello in one word" }]
  }'

The same request with the OpenAI Python SDK:

from openai import OpenAI

# Atomic Chat is OpenAI API-compatible — only the base_url changes.
client = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="<model-id-loaded-in-atomic-chat>",
    messages=[{"role": "user", "content": "Say hello in one word"}],
)
print(resp.choices[0].message.content)

The server binds to 127.0.0.1 by default; set host: 0.0.0.0 to expose it on your LAN. It works with any agent, CLI, or IDE plugin that speaks the OpenAI API; see Launch With below.
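If you would rather not install the OpenAI SDK, the same call can be sketched with only the Python standard library. The helper names `build_chat_request` and `chat` are illustrative and not part of Atomic Chat; the response shape is assumed to follow the OpenAI chat-completions format used in the SDK example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1337/v1"  # default address given in this README


def build_chat_request(model: str, prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response: choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(model, prompt), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

With a model loaded in the app, `chat("<model-id-loaded-in-atomic-chat>", "Say hello in one word")` should return the reply as a string. Nothing is sent until `chat` is called, so `build_chat_request` can be inspected or logged on its own.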


✨ Features

Local models

  • Run open-weight LLMs locally from HuggingFace — Llama, Gemma, Qwen, Mistral, Phi, and others
  • Multi-Token Prediction (MTP) speculative decoding — 30–70% throughput boost on supported models, up to 3× on Gemma 4
  • DFlash block-diffusion decoding — up to 6× faster on Qwen 3.6, Gemma 4, Kimi K2.5
  • Flash Attention toggle (on / off / auto)
  • Automatic reasoning-context tracking for chain-of-thought models
  • Auto context-window expansion with overflow notifications
  • EAGLE-3 speculative decoding for Gemma 4 on Apple Silicon (MLX)
  • MTP on MLX for Qwen 3.5 / 3.6 and DeepSeek V4
  • TurboQuant KV cache (turbo3 / turbo4) on llama.cpp, now on Windows and Linux as well as macOS: up to 4.3× smaller KV cache footprint on CPU and GPU (CUDA / Vulkan)
  • TurboQuant KV cache on MLX-VLM — smaller memory footprint via RHT-correct fast paths
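To put the quoted KV-cache reduction of roughly 4.3× in context, here is a back-of-the-envelope estimate. It uses the standard size formula (keys and values × layers × KV heads × head dimension × context length × bytes per element). The model dimensions are illustrative, and treating fp16 as the baseline is an assumption, since this README does not say what the figure is measured against.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float = 2.0) -> float:
    """Size of the KV cache: one K and one V tensor per layer."""
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem


# Illustrative dimensions (hypothetical model): 32 layers, 8 KV heads,
# head dimension 128, 8192-token context, fp16 (2 bytes per element).
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, context_len=8192)

# Apply the README's best-case factor; real savings depend on model and settings.
compressed = baseline / 4.3

print(f"fp16 KV cache:   {baseline / 2**20:.0f} MiB")    # 1024 MiB
print(f"at 4.3x smaller: {compressed / 2**20:.0f} MiB")  # 238 MiB
```

The cache grows linearly with context length, so the saving matters most for long conversations and large context windows.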

Cloud models

  • Built-in providers: OpenAI, Anthropic, Mistral, Groq, MiniMax, Qwen, Moonshot
  • Bring your own key, switch model per chat, mix local and cloud freely

Tools & integrations

  • One-click agent launch — launch coding agents like Claude Code, Codex CLI, Cline, OpenCode, Droid, Goose, OpenHands, Copilot CLI, Kilo Code and Zed in one click from the Integrations tab
  • Artifacts — live preview panel for HTML/CSS/JS code with copy, download and print
  • Connect multiple MCP servers — bring your own tools, file access, web search
  • Custom assistants with per-assistant system prompts
  • Projects with conversation tree view in the sidebar

Local API

  • OpenAI-compatible server at http://localhost:1337/v1 — drop-in replacement for the OpenAI SDK
  • Works with any agent, CLI, or IDE plugin that speaks the OpenAI API
  • Bound to 127.0.0.1 by default; set host: 0.0.0.0 to expose on LAN
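Requests need the id of a model the server knows about. OpenAI-compatible servers usually list these at `GET /v1/models`; the sketch below assumes Atomic Chat follows that convention, which this README does not confirm. `extract_model_ids` and `list_models` are illustrative names.

```python
import json
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Pull ids out of an OpenAI-style {"object": "list", "data": [...]} body."""
    return [item["id"] for item in payload.get("data", []) if "id" in item]


def list_models(base_url: str = "http://localhost:1337/v1", timeout: float = 10.0) -> list[str]:
    """Assumption: the server exposes GET /models like the OpenAI API does."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))
```

If the endpoint is available, `list_models()` returns the ids you can pass as `model` in a chat request.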

Privacy

  • Everything runs locally when you want it to — local server is loopback-only by default
  • Your conversations and keys stay on your machine

⚙️ Inference Engines

Three engines under the hood, all exposed through one OpenAI-compatible API at http://localhost:1337/v1:

  • atomic-llama-cpp-turboquant — our llama.cpp fork with TurboQuant KV-cache optimizations (turbo3 / turbo4) for faster, lower-memory quantized inference. Now a selectable second provider ("Atomic Llama.cpp Turboquant") on all three desktops — macOS, Windows, and Linux — CPU and GPU (CUDA / Vulkan).
  • Upstream llama.cpp — official ggml-org build, the default engine on Windows and Linux for the widest hardware coverage and MTP support.
  • MLX-VLM — Apple Silicon-native engine for vision-language models, built on Apple's MLX framework and running on the GPU with unified memory. Faster than llama.cpp on M-series chips for supported models.

Speculative-decoding features available across backends:

  • MTP (Multi-Token Prediction) — a draft model predicts ahead, the full model verifies in one pass. Available on macOS and Windows.
  • DFlash — block-diffusion speculative decoding for Qwen 3.6, Gemma 4, Kimi K2.5 and others. Apple Silicon only; can't be enabled together with MTP.
  • Flash Attention — Settings → on / off / auto.
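The draft-then-verify idea behind the speculative-decoding options above can be illustrated with a toy greedy version. This is a generic sketch of the technique, not Atomic Chat's implementation: the "models" are plain functions mapping a token list to the next token, and the per-position loop stands in for the single batched verification pass a real engine would run.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # greedy next-token function


def speculative_step(prefix: List[int], draft_next: NextToken,
                     target_next: NextToken, k: int = 4) -> List[int]:
    """One greedy speculative-decoding step; returns the newly accepted tokens."""
    # 1. The cheap draft proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft_next(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target checks every proposed position. A real engine scores all
    #    k positions in one forward pass; this loop is for clarity only.
    accepted: List[int] = []
    ctx = list(prefix)
    for token in proposed:
        expected = target_next(ctx)
        if expected != token:
            # First mismatch: keep the target's token and drop the rest.
            accepted.append(expected)
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. Every draft token matched, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted
```

Each step yields between 1 and k + 1 tokens, and the output is identical to what the target would have produced decoding greedily on its own. The speedup comes from the target validating several tokens per pass when the draft guesses well.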

Tools talking to http://localhost:1337/v1 don't need to know which backend is running underneath — switch engines without reconfiguring clients.


🚀 Launch With

Atomic Chat runs an OpenAI-compatible server at http://localhost:1337/v1, so any agent, CLI, IDE plugin, or app that speaks the OpenAI API can run on top of your local models — no extra glue needed. Just point its base URL at Atomic Chat and you're done.
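For tools that take their configuration from the environment, one way to point them at the local server is to override the OpenAI variables before launching them. The official OpenAI Python SDK reads `OPENAI_BASE_URL` and `OPENAI_API_KEY`; whether a particular agent or plugin honors them is an assumption to check in its own documentation. `local_openai_env` and `launch` are illustrative helpers.

```python
import os
import subprocess


def local_openai_env(base_url: str = "http://localhost:1337/v1",
                     api_key: str = "not-needed") -> dict[str, str]:
    """Copy of the current environment with OpenAI variables aimed at Atomic Chat."""
    env = dict(os.environ)
    env["OPENAI_BASE_URL"] = base_url
    env["OPENAI_API_KEY"] = api_key  # placeholder value, as in the SDK example
    return env


def launch(command: list[str]) -> int:
    """Run a tool with the overridden environment and return its exit code."""
    return subprocess.run(command, env=local_openai_env()).returncode
```

For example, `launch(["my-agent", "--help"])` would start a hypothetical `my-agent` CLI with both variables set, leaving your shell environment untouched.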

A few projects already ship first-class support with their own setup docs:

| Tool | What it is | Setup |
| --- | --- | --- |
| OpenCode | Open-source TUI coding agent. Add Atomic Chat as a local provider in opencode.json. | Setup guide → |
| Goose | Open-source extensible AI agent (CLI, desktop, API). | Setup guide → |
| nanobot | Ultra-lightweight personal AI agent with chat channels, MCP, and WebUI. | Repo → |
| nanoclaw | Containerized agent runtime that calls Atomic Chat as an MCP tool. | Skill guide → |
| OpenClaude | Open-source coding-agent CLI for cloud and local models. Lists Atomic Chat as a supported provider. | Providers list → |
| Kilo Code | Open-source AI coding agent for VS Code, JetBrains, and CLI. Ships with first-class Atomic Chat provider support and auto-discovery. | Setup guide → |
| Hermes Desktop | Native desktop companion for Hermes Agent. Includes an Atomic Chat local preset at http://localhost:1337/v1. | Repo → |
| Hermes Workspace | Local-first agent workspace built on Nous Research's Hermes. Uses Atomic Chat as its inference backend. | Repo → |

Built something that runs on Atomic Chat? Open a PR and we'll add it here.


🛠️ Build from Source

Prerequisites

  • Node.js ≥ 20.0.0
  • Yarn ≥ 4.5.3
  • Make ≥ 3.81
  • Rust (for Tauri)
  • (Apple Silicon) MetalToolchain, installed with: xcodebuild -downloadComponent MetalToolchain
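Before running the build, it can help to confirm that the versioned prerequisites above are met. The script below is a minimal sketch: it assumes each tool prints a dotted version number in response to `--version`, and the helper names are illustrative. Rust and the Metal toolchain are not checked because no minimum version is given for them.

```python
import re
import shutil
import subprocess

# Minimum versions taken from the prerequisites list.
MINIMUMS = {"node": (20, 0, 0), "yarn": (4, 5, 3), "make": (3, 81, 0)}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted version, e.g. 'v20.11.1' -> (20, 11, 1)."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    # A missing patch component (as in 'GNU Make 3.81') counts as 0.
    return tuple(int(part or 0) for part in match.groups())


def check_tool(name: str, minimum: tuple[int, ...]) -> bool:
    """True if `name` is on PATH and `name --version` reports at least `minimum`."""
    if shutil.which(name) is None:
        return False
    out = subprocess.run([name, "--version"], capture_output=True, text=True).stdout
    try:
        return parse_version(out) >= minimum
    except ValueError:
        return False
```

Looping over `MINIMUMS.items()` and calling `check_tool(name, minimum)` for each entry gives a quick pass/fail report before `make dev`.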

Run with Make

git clone https://github.com/AtomicBot-ai/Atomic-Chat
cd Atomic-Chat
make dev

This handles everything: installs dependencies, builds core components, and launches the app.

Available make targets:

  • make dev — full development setup and launch
  • make build — production build
  • make test — run tests and linting
  • make clean — delete everything and start fresh

Manual Commands

yarn install
yarn build:tauri:plugin:api
yarn build:core
yarn build:extensions
yarn dev

💻 System Requirements

  • macOS: 13.6+ (8GB RAM for 3B models, 16GB for 7B, 32GB for 13B)
  • Windows: 10/11 x64 (same RAM recommendations as macOS)
  • Linux: x86_64, glibc ≥ 2.35 (Ubuntu 22.04+, Debian 12+, Fedora 40+, Arch, Mint, Pop!_OS — same RAM recommendations as macOS). Optional: a Vulkan loader (vulkan-1 package, or mesa-vulkan-drivers / proprietary NVIDIA driver) for GPU acceleration.
  • iOS: download from App Store
  • Android: download from Google Play

🐧 Running on Linux

Atomic Chat ships as a single self-contained .AppImage — no installer, no root:

chmod +x Atomic.Chat_*_amd64.AppImage
./Atomic.Chat_*_amd64.AppImage

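If prompted about FUSE on first launch, install it: sudo apt install libfuse2 (Debian/Ubuntu; the package is named libfuse2t64 on Ubuntu 24.04 and later, and on Ubuntu 22.04+ the separate fuse package should not be installed because it conflicts with fuse3) or sudo dnf install fuse fuse-libs (Fedora).

GPU acceleration (Vulkan) is auto-detected on first launch. Only GGUF models run on Linux.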


🧯 Troubleshooting

If something isn't working:

  1. Copy your error logs and system specs
  2. Open an issue on GitHub
  3. Or ask for help in our Discord

👥 Contributors

Atomic Chat is built by a small core team and 140+ contributors — including everyone who shaped the project from its earliest days. Pull requests welcome — see CONTRIBUTING.md for how to get started.

Vect0rM dtorey-d danyurkin MaxKoshJob Albert-Atomic yanalialiuk corevibe555 claytonlin1110 urmauur hohieuai Vanalite Minh141120 hiento09 hahuyhoang411 hiro-v qnixsynapse namchuai dan-menlo freelerobot ramonpzg ux-han aindrajaya dinhlongviolin1 louis-jan LazyYuuki eckartal david-menloai Van-QA gau-nernst github-roushan tikikun markmehere samhvw8 danielcwq bob-ros2 dev-miro26 shmutalov drakehere dataCenter430 lugnicca ethanova thewulf7 linhtran174 avb-is-me vansangpfiev cmppoon Ssstars fredatgithub px100 sharunkumar atoz96 since-2017-hub bytrangle SuperCowProducts bxdoan gabrielle-ong trilh-dev gary149 DistractionRectangle marknguyen1302 cuhong mykh-hailo DESU-CLUB 0xgokuz new5558 linuxid10t 0rzech Kuzmich55 Crystora mmngn statxc vikram761 MrAlaminH Lokimorty copyhold STRRL Dexterity104 QuentinMacheda Gri-ffin eltociear jamesdam razzeee metaspartan locnguyen1986 irfanpena cs-cat theproductiveprogrammer Diane0111 GenkaOk Helloyunho janpio kamal Louis454545 tuananhlai MauroDruwel zwpaper Realmbird reneleonhardt RONNCC SamPatt mesaugat 0saurabh0 sesajad sdhrt lucido-simon Haleshot vabatista volodya-lombrozo ynshung cashcon57 ddri hooray804 ldebs oolokioo7 phoval theishangoswami utenadev zhhanging mishrababhishek sr-albert gdmka deining Angelopgit anebot B0sh chindris-mihai-alexandru EndlessLucky mooncool Jasper-256 trunghaiy niesink maxx-ukoo myakura matthewbcool MichalZem Marco-9456 eren-karakus0 thunhuanh Fieldnote-Echo Eruis2579 akaMrNagar


📄 License

Apache 2.0 — see LICENSE for details.

🙏 Acknowledgements

Built on the shoulders of giants:


🌱 Heritage

Atomic Chat began as a fork of Jan by Menlo Research — an excellent open-source local-AI app. We're grateful to the Jan team and its contributors for the foundation they built. Atomic Chat has since grown its own direction, engines, and roadmap, but we tip our hat to where it started. 🙏


© 2026 Atomic Chat · Built with ❤️ · atomic.chat