Agent Zero + BitNet b1.58 (Hybrid Architecture)

High-Performance Local Intelligence on Windows ARM64

This repository contains the configuration and scripts to run Agent Zero backed by Microsoft's BitNet b1.58 inference engine. This setup uses a Hybrid Architecture to maximize performance on Windows ARM64 devices (like Surface Laptop 7 / Copilot+ PCs):

Inference Server: Native Windows C++ (bitnet.cpp) for raw performance (No Docker overhead).
Agent Framework: Agent Zero running in Docker, connected via local networking.

🚀 Features

True 1.58-bit Quantization: Uses the official bitnet.cpp implementation for maximum speed and energy efficiency.
Native ARM64 Support: Fully compiled for Windows ARM64 with NEON/DotProd/i8mm optimizations.
OpenAI-Compatible API: Provides a standard endpoint (/v1) that Agent Zero (or any other tool) can consume easily.
Zero-Config Startup: Includes run_bitnet_server.ps1 to launch the server instantly.

🛠️ Prerequisites

Windows 11 on ARM64 (e.g., Snapdragon X Elite)
Visual Studio 2022 with C++ ARM64 Build Tools
Docker Desktop (for running Agent Zero)

📦 Installation & Build

(If you haven't built the binaries yet)

Clone this repository

Build BitNet:

cmake -B build -G Ninja -DCMAKE_SYSTEM_PROCESSOR=ARM64 ... (see docs)
cmake --build build --config Release --target llama-server

(Note: This repo assumes you have already compiled llama-server.exe into build/bin/)

Download Model: Ensure ggml-model-i2_s.gguf is present in models/BitNet-b1.58-2B-4T/.

🏃 Usage

1. Start the Inference Server

Run the included PowerShell script to launch the high-performance native server:

.\run_bitnet_server.ps1

URL: http://localhost:8080/v1
Documentation: http://localhost:8080/docs

2. Connect Agent Zero

Run Agent Zero (in Docker) and point it to your host's native server:

docker run -it --rm `
  -p 50001:80 `
  -v "${HOME}/.agent-zero:/app/work_dir" `
  --add-host=host.docker.internal:host-gateway `
  frdel/agent-zero:latest

Agent Zero Config (agent-zero/config/chat_models.json):

{
  "bitnet": {
    "provider": "openai",
    "name": "bitnet-b1.58-2B-4T",
    "kwargs": {
      "base_url": "http://host.docker.internal:8080/v1",
      "api_key": "bitnet-local"
    }
  }
}

📂 Repository Structure

run_bitnet_server.ps1 - Start Here. Launches the native inference server.
utils/ - Helper scripts for quantization and codegen.
src/ & include/ - Core BitNet C++ implementation.

🏗️ Architecture

graph LR
    subgraph "Windows Host (Native)"
        SERVER["BitNet Server (llama-server.exe)"]
        MODEL["BitNet-b1.58-2B-4T"]
        SERVER --> MODEL
    end
    
    subgraph "Docker Container"
        AGENT["Agent Zero"]
    end
    
    AGENT -->|"HTTP :8080/v1"| SERVER

Verified on Surface Laptop 7 (Snapdragon X Elite)

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
3rdparty		3rdparty
assets		assets
docs		docs
gpu		gpu
include		include
media		media
patches		patches
preset_kernels		preset_kernels
src		src
utils		utils
.gitignore		.gitignore
.gitmodules		.gitmodules
CMakeLists.txt		CMakeLists.txt
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
requirements.txt		requirements.txt
run_bitnet_server.ps1		run_bitnet_server.ps1
run_inference.py		run_inference.py
run_inference_server.py		run_inference_server.py
setup_env.py		setup_env.py
start_all.ps1		start_all.ps1
test_completions.ps1		test_completions.ps1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Agent Zero + BitNet b1.58 (Hybrid Architecture)

🚀 Features

🛠️ Prerequisites

📦 Installation & Build

🏃 Usage

1. Start the Inference Server

2. Connect Agent Zero

📂 Repository Structure

🏗️ Architecture

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Agent Zero + BitNet b1.58 (Hybrid Architecture)

🚀 Features

🛠️ Prerequisites

📦 Installation & Build

🏃 Usage

1. Start the Inference Server

2. Connect Agent Zero

📂 Repository Structure

🏗️ Architecture

About

Topics

Resources

License

Code of conduct

Security policy

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages