A production-ready reference app demonstrating the RunAnywhere Flutter SDK capabilities for on-device AI. This app showcases how to build privacy-first, offline-capable AI features with LLM chat, speech-to-text, text-to-speech, and a complete voice assistant pipeline—all running locally on your device.
Important: This sample app consumes the RunAnywhere Flutter SDK through local path dependencies. A clean clone needs Flutter packages plus the Android JNI libraries and iOS XCFrameworks staged into the Flutter plugin packages.
Prerequisites:
- Flutter 3.24+ and Dart 3.5+ on PATH.
- Android Studio with Android SDK 24+, platform tools, CMake, and NDK; export ANDROID_HOME and ANDROID_NDK_HOME.
- Xcode 15+ and CocoaPods for iOS simulator builds.
- JDK 17 and enough disk for native artifacts and downloaded AI models.
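Before building, it can help to confirm that the Android environment variables listed above are actually exported. The following is a minimal Dart sketch, not part of the repository: the helper name missingEnv is hypothetical, and it only checks that the variables are non-empty, not that they point at valid SDK or NDK installs.

```dart
import 'dart:io';

/// Hypothetical helper (not part of the repo): returns the names in
/// [required] that are unset or blank in [env].
List<String> missingEnv(Map<String, String> env, List<String> required) {
  return [
    for (final name in required)
      if ((env[name] ?? '').trim().isEmpty) name,
  ];
}

void main() {
  // The two variables the prerequisites list asks you to export.
  final missing =
      missingEnv(Platform.environment, ['ANDROID_HOME', 'ANDROID_NDK_HOME']);
  if (missing.isEmpty) {
    print('Android environment variables are set.');
  } else {
    print('Missing environment variables: ${missing.join(', ')}');
    exitCode = 1; // Signal failure to the calling shell.
  }
}
```

Saved to a file of your choosing, the script can be run with dart run before the native build scripts.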
From a fresh checkout:
cd examples/flutter/RunAnywhereAI
flutter pub get
# Build or refresh local native artifacts when the checkout has no staged binaries.
cd ../../..
./scripts/build/build-core-android.sh arm64-v8a
./sdk/runanywhere-swift/scripts/build-core-xcframework.sh
cd examples/flutter/RunAnywhereAI
flutter analyze
flutter build apk --debug
flutter build ios --simulator --debug
Notes:
- scripts/build/build-core-android.sh stages JNI libraries into sdk/runanywhere-flutter/packages/*/android/src/main/jniLibs.
- sdk/runanywhere-swift/scripts/build-core-xcframework.sh stages RACommons.xcframework, RABackendLLAMACPP.xcframework, RABackendONNX.xcframework, and RABackendSherpa.xcframework into the Flutter plugin ios/Frameworks directories.
- runanywhere_genie is Android/Snapdragon-only; iOS builds do not expect a Genie XCFramework.
- If the iOS build reports stale Pods or generated Flutter config, run the following after flutter pub get:
  cd ios && pod install && cd ..
- scripts/verify.sh runs pub get, analysis, APK build, and optional iOS/native artifact refresh gates.
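To see whether the Android staging step described in the notes has run, one option is to look for .so files under each plugin package's jniLibs directory. The sketch below is an illustration only: packagesMissingJniLibs is a hypothetical helper, and it assumes every package with an android/ folder is expected to carry staged libraries, which may not hold for every package in the SDK.

```dart
import 'dart:io';

/// Hypothetical helper: lists plugin packages under [packagesDir] that have
/// an android/ folder but no .so files staged for [abi].
List<String> packagesMissingJniLibs(Directory packagesDir, String abi) {
  final missing = <String>[];
  for (final pkg in packagesDir.listSync().whereType<Directory>()) {
    if (!Directory('${pkg.path}/android').existsSync()) continue;
    final abiDir = Directory('${pkg.path}/android/src/main/jniLibs/$abi');
    final hasLibs = abiDir.existsSync() &&
        abiDir.listSync().any((entry) => entry.path.endsWith('.so'));
    if (!hasLibs) {
      // Directory URIs end with '/', so skip the empty trailing segment.
      missing.add(pkg.uri.pathSegments.lastWhere((s) => s.isNotEmpty));
    }
  }
  return missing..sort();
}

void main() {
  // Path taken from the notes above; run from the repository root.
  final packages = Directory('sdk/runanywhere-flutter/packages');
  if (!packages.existsSync()) {
    print('Run this from the repository root.');
    return;
  }
  final missing = packagesMissingJniLibs(packages, 'arm64-v8a');
  print(missing.isEmpty
      ? 'JNI libraries are staged for arm64-v8a.'
      : 'No arm64-v8a libraries staged in: ${missing.join(', ')}');
}
```

If any package is reported, rerunning ./scripts/build/build-core-android.sh arm64-v8a from the repository root is the likely next step.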
This sample app's pubspec.yaml uses path dependencies to reference the local Flutter SDK packages:
This Sample App → Local Flutter SDK packages (sdk/runanywhere-flutter/packages/)
↓
Local XCFrameworks/JNI libs (in each package's ios/Frameworks/ and android/src/main/jniLibs/)
↑
Built by: ./sdk/runanywhere-swift/scripts/build-core-xcframework.sh + ./scripts/build/build-core-android.sh
Repo-root native build scripts (called from project root):
- ./sdk/runanywhere-swift/scripts/build-core-xcframework.sh: builds iOS XCFrameworks and stages them into sdk/runanywhere-flutter/packages/*/ios/Frameworks/.
- ./scripts/build/build-core-android.sh <ABI>: builds Android .so libraries and stages them into sdk/runanywhere-flutter/packages/*/android/src/main/jniLibs/<ABI>/.
Local consumption is enabled by the runanywhere.useLocalNatives=true Gradle property (default for development checkouts).
- Dart SDK code changes: Run flutter run again (hot reload works for most changes).
- C++ code changes (in runanywhere-commons):
  # From repo root
  ./scripts/build/build-core-android.sh arm64-v8a
  ./sdk/runanywhere-swift/scripts/build-core-xcframework.sh
Try the native iOS and Android apps to experience on-device AI capabilities immediately. The Flutter sample app demonstrates the same features using the cross-platform Flutter SDK.
This sample app demonstrates the full power of the RunAnywhere Flutter SDK:
| Feature | Description | SDK Integration |
|---|---|---|
| AI Chat | Interactive LLM conversations with streaming responses | RunAnywhere.llm.generateStream() |
| Thinking Mode | Support for models with <think>...</think> reasoning | Thinking tag parsing |
| Real-time Analytics | Token speed, generation time, inference metrics | MessageAnalytics |
| Speech-to-Text | Voice transcription with batch & live modes | RunAnywhere.stt.transcribe() |
| Text-to-Speech | Neural voice synthesis with Piper TTS | RunAnywhere.tts.synthesize() |
| Voice Assistant | Full STT to LLM to TTS pipeline with auto-detection | RunAnywhere.voice |
| Model Management | Download, load, and manage multiple AI models | RunAnywhere.models / RunAnywhere.downloads |
| Storage Management | View storage usage and delete models | RunAnywhere.downloads.getStorageInfo() |
| Offline Support | All features work without internet | On-device inference |
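The Thinking Mode row above refers to separating <think>...</think> reasoning from the visible answer. A minimal sketch of one way to do that is shown here; splitThinking and ThinkingSplit are hypothetical names, not the sample app's actual parser, and the handling of an unclosed tag is an assumption made for streaming output.

```dart
/// Hypothetical result type: reasoning text and user-visible answer.
class ThinkingSplit {
  const ThinkingSplit(this.thinking, this.answer);
  final String thinking;
  final String answer;
}

/// Splits the first `<think>...</think>` section out of [raw].
/// An unclosed `<think>` (possible mid-stream) is treated as reasoning
/// that is still being generated.
ThinkingSplit splitThinking(String raw) {
  const open = '<think>';
  const close = '</think>';
  final start = raw.indexOf(open);
  if (start < 0) return ThinkingSplit('', raw.trim());

  final end = raw.indexOf(close, start + open.length);
  if (end < 0) {
    return ThinkingSplit(
      raw.substring(start + open.length).trim(),
      raw.substring(0, start).trim(),
    );
  }
  final thinking = raw.substring(start + open.length, end).trim();
  final answer =
      (raw.substring(0, start) + raw.substring(end + close.length)).trim();
  return ThinkingSplit(thinking, answer);
}

void main() {
  final r = splitThinking('<think>2+2 is 4</think>The answer is 4.');
  print(r.thinking); // 2+2 is 4
  print(r.answer); // The answer is 4.
}
```

Calling the function on the accumulated response after each streamed token keeps the reasoning and the answer in separate UI regions without tracking parser state between tokens.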
The app follows Flutter best practices with a clean architecture pattern:
┌─────────────────────────────────────────────────────────────────────┐
│ Flutter/Material UI │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ Chat │ │ STT │ │ TTS │ │ Voice │ │ Settings │ │
│ │Interface │ │ View │ │ View │ │Assistant │ │ View │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └─────┬──────┘ │
├───────┼────────────┼────────────┼────────────┼─────────────┼────────┤
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Feature ViewModels + UI State │ │
│ │ (SDK facades, Services, ListenableBuilder) │ │
│ └──────────────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ RunAnywhere Flutter SDK │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Core API (generate, transcribe, synthesize) │ │
│ │ Model Management (download, load, unload, delete) │ │
│ │ Voice Session (STT → LLM → TTS pipeline) │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌──────────────────┴──────────────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ LlamaCpp │ │ ONNX Runtime │ │
│ │ (LLM/GGUF) │ │ (STT/TTS) │ │
│ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
- Feature-Local State: Screens keep ephemeral UI state local and call SDK facades directly
- Feature-First Structure: Each feature is self-contained with its own views and logic
- Shared Core Services: AudioRecordingService, AudioPlayerService, persistence and device helpers
- Design System: Consistent AppColors, AppTypography, AppSpacing tokens
- SDK Integration: Direct SDK calls with async/await and Stream support
RunAnywhereAI/
├── lib/
│ ├── main.dart # App entry point
│ │
│ ├── app/
│ │ ├── runanywhere_ai_app.dart # SDK initialization, model registration
│ │ └── content_view.dart # Main tab navigation (5 tabs)
│ │
│ ├── core/
│ │ ├── design_system/
│ │ │ ├── app_colors.dart # Color palette with dark mode support
│ │ │ ├── app_spacing.dart # Spacing constants
│ │ │ └── typography.dart # Text styles
│ │ │
│ │ ├── models/
│ │ │ └── app_types.dart # Shared type definitions
│ │ │
│ │ ├── services/
│ │ │ ├── audio_recording_service.dart # Microphone capture
│ │ │ ├── audio_player_service.dart # TTS playback
│ │ │ ├── permission_service.dart # Permission handling
│ │ │ ├── conversation_store.dart # Chat history persistence
│ │ │ └── device_info_service.dart # Device capabilities
│ │ │
│ │ └── utilities/
│ │ ├── constants.dart # Preference keys, defaults
│ │ └── keychain_helper.dart # Secure storage wrapper
│ │
│ ├── features/
│ │ ├── chat/
│ │ │ └── chat_interface_view.dart # LLM chat with streaming
│ │ │
│ │ ├── voice/
│ │ │ ├── speech_to_text_view.dart # Batch & live STT
│ │ │ ├── text_to_speech_view.dart # TTS synthesis & playback
│ │ │ └── voice_assistant_view.dart # Full STT→LLM→TTS pipeline
│ │ │
│ │ ├── models/
│ │ │ ├── models_view.dart # Model browser
│ │ │ ├── model_selection_sheet.dart # Model picker bottom sheet
│ │ │ ├── model_list_view_model.dart # Model list logic
│ │ │ ├── model_components.dart # Reusable model UI widgets
│ │ │ ├── model_status_components.dart # Status badges, indicators
│ │ │ ├── model_types.dart # Framework enums, model info
│ │ │ └── add_model_from_url_view.dart # Import custom models
│ │ │
│ │ └── settings/
│ │ └── combined_settings_view.dart # Storage & logging config
│ │
│ └── helpers/
│ └── adaptive_layout.dart # Responsive layout utilities
│
├── pubspec.yaml # Dependencies, SDK references
├── android/ # Android platform config
├── ios/ # iOS platform config
└── README.md # This file
- Flutter 3.24 or later (install guide)
- Dart 3.5 or later (included with Flutter)
- iOS: Xcode 15+ (for iOS builds)
- Android: Android Studio + SDK 24+ (for Android builds)
- ~2GB free storage for AI models
- Device — Physical device recommended for best performance
# Clone the repository
git clone https://github.com/RunanywhereAI/runanywhere-sdks.git
cd runanywhere-sdks/examples/flutter/RunAnywhereAI
# Install dependencies
flutter pub get
# Run on connected device
flutter run
- Open the project in VS Code or Android Studio
- Wait for Flutter dependencies to resolve
- Select a physical device (iOS or Android)
- Press F5 (VS Code) or Run (Android Studio)
# Android APK
flutter build apk --release
# Android App Bundle
flutter build appbundle --release
# iOS (requires Xcode)
flutter build ios --release
The SDK is initialized in runanywhere_ai_app.dart:
import 'package:runanywhere/runanywhere.dart';
import 'package:runanywhere_llamacpp/runanywhere_llamacpp.dart';
import 'package:runanywhere_onnx/runanywhere_onnx.dart';
// 1. Initialize SDK in development mode
await RunAnywhere.initialize();
// 2. Register LlamaCpp module for LLM models (GGUF)
await LlamaCpp.register();
RunAnywhere.models.register(
id: 'smollm2-360m-q8_0',
name: 'SmolLM2 360M Q8_0',
url: Uri.parse('https://huggingface.co/prithivMLmods/SmolLM2-360M-GGUF/resolve/main/SmolLM2-360M.Q8_0.gguf'),
framework: InferenceFramework.INFERENCE_FRAMEWORK_LLAMA_CPP,
memoryRequirement: 500000000,
);
// 3. Register ONNX module for STT/TTS models
await Onnx.register();
RunAnywhere.models.register(
id: 'sherpa-onnx-whisper-tiny.en',
name: 'Sherpa Whisper Tiny (ONNX)',
url: Uri.parse('https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/sherpa-onnx-whisper-tiny.en.tar.gz'),
framework: InferenceFramework.INFERENCE_FRAMEWORK_SHERPA,
modality: ModelCategory.MODEL_CATEGORY_SPEECH_RECOGNITION,
memoryRequirement: 75000000,
);
// Download with progress tracking
final progressStream = RunAnywhere.downloads.start('smollm2-360m-q8_0');
await for (final p in progressStream) {
if (p.stage == DownloadStage.DOWNLOAD_STAGE_COMPLETED) break;
}
// Load LLM model
await RunAnywhere.llm.load('smollm2-360m-q8_0');
// Check if model is loaded
final isLoaded = RunAnywhere.isLLMModelLoaded;
// Generate with streaming (real-time tokens)
final stream = RunAnywhere.llm.generateStream(prompt, options);
await for (final event in stream) {
if (event.isFinal) break;
if (event.token.isNotEmpty) {
setState(() {
_responseText += event.token;
});
}
}
// Or non-streaming
final result = await RunAnywhere.llm.generate(prompt, options);
print('Response: ${result.text}');
print('Speed: ${result.tokensPerSecond} tok/s');
// Load STT model
await RunAnywhere.stt.load('sherpa-onnx-whisper-tiny.en');
// Transcribe audio bytes
final result = await RunAnywhere.stt.transcribe(audioBytes);
print('Transcription: ${result.text}');
// Load TTS voice
await RunAnywhere.tts.loadVoice('vits-piper-en_US-lessac-medium');
// Synthesize speech with options
final result = await RunAnywhere.tts.synthesize(
text,
TTSOptions(rate: 1.0, pitch: 1.0, volume: 1.0),
);
// Play audio (result.audio is Uint8List PCM16)
await audioPlayer.play(result.audio, result.sampleRate);
// Subscribe to the voice agent event stream
final sub = RunAnywhere.voice.eventStream().listen((event) {
if (event.hasUserSaid()) {
print('User said: ${event.userSaid.text}');
} else if (event.hasAssistantToken()) {
print('Token: ${event.assistantToken.text}');
}
});
// Initialize pipeline with loaded models
await RunAnywhere.voice.initializeWithLoadedModels();
// Cancel when done
await sub.cancel();
What it demonstrates:
- Streaming text generation with real-time token display
- Thinking mode support (<think>...</think> tags)
- Message analytics (tokens/sec, generation time)
- Conversation history with Markdown rendering
- Model selection bottom sheet integration
Key SDK APIs:
- RunAnywhere.llm.generateStream(): Streaming generation
- RunAnywhere.llm.generate(): Non-streaming generation
- RunAnywhere.currentLLMModel: Get loaded model info
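The tokens/sec figure shown in the chat analytics is, in general terms, a token count divided by elapsed generation time. The sketch below illustrates that arithmetic only; tokensPerSecond is a hypothetical helper and is not the sample's MessageAnalytics implementation.

```dart
/// Hypothetical helper: throughput from a token count and elapsed time.
/// Returns 0 when no time has elapsed, to avoid dividing by zero.
double tokensPerSecond(int tokenCount, Duration elapsed) {
  if (elapsed.inMicroseconds <= 0) return 0;
  return tokenCount * Duration.microsecondsPerSecond / elapsed.inMicroseconds;
}

void main() {
  // 120 tokens generated over 4 seconds.
  print(tokensPerSecond(120, const Duration(seconds: 4))); // 30.0
}
```

When streaming, a Stopwatch started before the first token and read after the final event would supply the elapsed duration.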
What it demonstrates:
- Batch mode: Record full audio, then transcribe
- Live mode: Real-time streaming transcription (when supported)
- Audio level visualization
- Mode selection (batch vs. live)
Key SDK APIs:
- RunAnywhere.stt.load(): Load Whisper model
- RunAnywhere.stt.transcribe(): Batch transcription
- RunAnywhere.isSTTModelLoaded: Check model status
What it demonstrates:
- Neural voice synthesis with Piper TTS
- Speed and pitch controls with sliders
- Audio playback with progress indicator
- Audio metadata display (duration, sample rate, size)
Key SDK APIs:
- RunAnywhere.tts.loadVoice(): Load TTS model
- RunAnywhere.tts.synthesize(): Generate speech audio
- RunAnywhere.isTTSVoiceLoaded: Check voice status
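The audio metadata on this screen (duration, sample rate, size) can be derived from the raw PCM16 bytes. As a rough sketch: PCM16 uses two bytes per sample, so duration follows from byte length and sample rate. pcm16Duration is a hypothetical helper, and the mono default is an assumption, since the channel count is not stated here.

```dart
import 'dart:typed_data';

/// Hypothetical helper: playback duration of raw PCM16 audio.
/// PCM16 stores 2 bytes per sample per channel; mono is assumed by default.
Duration pcm16Duration(Uint8List audio, int sampleRate, {int channels = 1}) {
  if (sampleRate <= 0 || channels <= 0) {
    throw ArgumentError('sampleRate and channels must be positive');
  }
  final frames = audio.lengthInBytes ~/ (2 * channels);
  return Duration(
    microseconds: frames * Duration.microsecondsPerSecond ~/ sampleRate,
  );
}

void main() {
  // One second of silent mono audio at 22.05 kHz.
  final oneSecond = Uint8List(22050 * 2);
  print(pcm16Duration(oneSecond, 22050).inMilliseconds); // 1000
}
```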
What it demonstrates:
- Complete voice AI pipeline (STT to LLM to TTS)
- Model configuration for all 3 components
- Audio level visualization during recording
- Conversation turn management
- Session state machine (connecting, listening, processing, speaking)
Key SDK APIs:
- RunAnywhere.voice.eventStream(): Voice agent event stream
- RunAnywhere.voice.initializeWithLoadedModels(): Initialize pipeline
- VoiceEvent: Proto-typed voice session events
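The session state machine mentioned above (connecting, listening, processing, speaking) can be pictured as an enum plus a table of allowed transitions. The sketch below is illustrative only: the idle state and the specific transitions are assumptions, and none of these names come from the SDK or the sample app.

```dart
/// States named in this README, plus a hypothetical idle state.
enum VoiceSessionState { idle, connecting, listening, processing, speaking }

/// Assumed transition table: each turn goes listening -> processing ->
/// speaking -> listening, and any state may drop back to idle.
const _allowed = <VoiceSessionState, Set<VoiceSessionState>>{
  VoiceSessionState.idle: {VoiceSessionState.connecting},
  VoiceSessionState.connecting: {
    VoiceSessionState.listening,
    VoiceSessionState.idle,
  },
  VoiceSessionState.listening: {
    VoiceSessionState.processing,
    VoiceSessionState.idle,
  },
  VoiceSessionState.processing: {
    VoiceSessionState.speaking,
    VoiceSessionState.idle,
  },
  VoiceSessionState.speaking: {
    VoiceSessionState.listening,
    VoiceSessionState.idle,
  },
};

/// Whether moving from [from] to [to] is permitted by the table above.
bool canTransition(VoiceSessionState from, VoiceSessionState to) =>
    _allowed[from]?.contains(to) ?? false;

void main() {
  print(canTransition(
      VoiceSessionState.listening, VoiceSessionState.processing)); // true
  print(canTransition(
      VoiceSessionState.listening, VoiceSessionState.speaking)); // false
}
```

Keeping the transitions in a table makes it straightforward to reject out-of-order events from the stream rather than letting the UI drift into an inconsistent state.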
What it demonstrates:
- Storage usage overview (total, available, model storage)
- Downloaded model list with details
- Model deletion with confirmation dialog
- Analytics logging toggle
Key SDK APIs:
- RunAnywhere.downloads.getStorageInfo(): Get storage details
- RunAnywhere.downloads.list(): List models
- RunAnywhere.downloads.delete(): Remove model
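A storage overview usually needs byte counts rendered in readable units. A minimal sketch follows, assuming decimal units (1 MB = 1,000,000 bytes) to match figures such as the 500000000-byte memory requirement described as 500MB elsewhere in this README; formatBytes is a hypothetical helper, not an SDK function.

```dart
/// Hypothetical helper: formats a byte count using decimal units.
String formatBytes(int bytes) {
  const units = ['B', 'KB', 'MB', 'GB', 'TB'];
  var value = bytes.toDouble();
  var unit = 0;
  while (value >= 1000 && unit < units.length - 1) {
    value /= 1000;
    unit++;
  }
  // Whole numbers for plain bytes, one decimal place otherwise.
  final digits = unit == 0 ? 0 : 1;
  return '${value.toStringAsFixed(digits)} ${units[unit]}';
}

void main() {
  print(formatBytes(500000000)); // 500.0 MB
  print(formatBytes(75000000)); // 75.0 MB
}
```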
| Model | Size | Memory | Description |
|---|---|---|---|
| SmolLM2 360M Q8_0 | ~400MB | 500MB | Fast, lightweight chat |
| Qwen 2.5 0.5B Q6_K | ~500MB | 600MB | Multilingual, efficient |
| LFM2 350M Q4_K_M | ~200MB | 250MB | LiquidAI, ultra-compact |
| LFM2 350M Q8_0 | ~350MB | 400MB | Higher quality version |
| Llama 2 7B Chat Q4_K_M | ~4GB | 4GB | Powerful, larger model |
| Mistral 7B Instruct Q4_K_M | ~4GB | 4GB | High quality responses |
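The Memory column can be used to hide models that will not fit on a given device. The sketch below filters the table's entries against a memory budget; ModelEntry and modelsThatFit are hypothetical names, the byte values assume decimal units, and the 1 GB budget is an arbitrary example rather than a value read from the device.

```dart
/// Hypothetical catalog entry: model name and the Memory column in bytes.
class ModelEntry {
  const ModelEntry(this.name, this.memoryBytes);
  final String name;
  final int memoryBytes;
}

/// Values copied from the table above (decimal MB/GB assumed).
const llmCatalog = [
  ModelEntry('SmolLM2 360M Q8_0', 500000000),
  ModelEntry('Qwen 2.5 0.5B Q6_K', 600000000),
  ModelEntry('LFM2 350M Q4_K_M', 250000000),
  ModelEntry('LFM2 350M Q8_0', 400000000),
  ModelEntry('Llama 2 7B Chat Q4_K_M', 4000000000),
  ModelEntry('Mistral 7B Instruct Q4_K_M', 4000000000),
];

/// Returns the entries whose memory requirement is within [budgetBytes].
List<ModelEntry> modelsThatFit(List<ModelEntry> catalog, int budgetBytes) =>
    catalog.where((m) => m.memoryBytes <= budgetBytes).toList();

void main() {
  // Example budget of 1 GB; a real app would query the device instead.
  for (final model in modelsThatFit(llmCatalog, 1000000000)) {
    print(model.name);
  }
}
```

With a 1 GB budget this prints the four sub-1B models and omits the two 7B models.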
| Model | Size | Description |
|---|---|---|
| Sherpa Whisper Tiny (EN) | ~75MB | Fast English transcription |
| Sherpa Whisper Small (EN) | ~250MB | Higher accuracy |
| Model | Size | Description |
|---|---|---|
| Piper US English (Medium) | ~65MB | Natural American voice |
| Piper British English (Medium) | ~65MB | British accent |
# Run all tests
flutter test
# Run with coverage
flutter test --coverage
# Run specific test file
flutter test test/widget_test.dart
# Analyze code quality
flutter analyze
# Format code
dart format lib/ test/
# Fix issues automatically
dart fix --apply
The app uses debugPrint() extensively. Filter logs by:
# Flutter logs
flutter logs | grep -E "RunAnywhere|SDK"
| Log Prefix | Description |
|---|---|
| SDK | SDK initialization |
| SUCCESS | Success operations |
| ERROR | Error conditions |
| MODULE | Module registration |
| LOADING | Loading/processing |
| AUDIO | Audio operations |
| RECORDING | Recording operations |
- Run app in profile mode: flutter run --profile
- Open DevTools: press the v key in the terminal running flutter run
- Navigate to Memory tab
- Expected: ~300MB-2GB depending on model size
The app chooses the SDK environment from the build mode:
// Development mode (default)
if (kDebugMode) {
await RunAnywhere.initialize();
}
// Production mode
else {
await RunAnywhere.initialize(
apiKey: 'your-api-key',
baseURL: 'https://api.runanywhere.ai',
environment: SDKEnvironment.SDK_ENVIRONMENT_PRODUCTION,
);
}
User preferences are stored via SharedPreferences:
| Key | Type | Default | Description |
|---|---|---|---|
| useStreaming | bool | true | Enable streaming generation |
| defaultTemperature | double | 0.7 | LLM temperature |
| defaultMaxTokens | int | 500 | Max tokens per generation |
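Reading these preferences safely means falling back to the documented default whenever a key is missing or holds an unexpected type. The sketch below shows that fallback logic over a plain map so it runs without Flutter; GenerationPrefs is a hypothetical class, and in the app the raw values would come from SharedPreferences rather than a literal map.

```dart
/// Hypothetical value object for the three preferences in the table.
class GenerationPrefs {
  const GenerationPrefs({
    this.useStreaming = true,
    this.defaultTemperature = 0.7,
    this.defaultMaxTokens = 500,
  });

  /// Builds prefs from raw stored values, using the documented default
  /// when a key is missing or has an unexpected type.
  factory GenerationPrefs.fromStored(Map<String, Object?> stored) {
    final streaming = stored['useStreaming'];
    final temperature = stored['defaultTemperature'];
    final maxTokens = stored['defaultMaxTokens'];
    return GenerationPrefs(
      useStreaming: streaming is bool ? streaming : true,
      defaultTemperature: temperature is num ? temperature.toDouble() : 0.7,
      defaultMaxTokens: maxTokens is int ? maxTokens : 500,
    );
  }

  final bool useStreaming;
  final double defaultTemperature;
  final int defaultMaxTokens;
}

void main() {
  // Only the temperature was saved; the other two fall back to defaults.
  final prefs = GenerationPrefs.fromStored({'defaultTemperature': 0.2});
  print(prefs.useStreaming); // true
  print(prefs.defaultTemperature); // 0.2
  print(prefs.defaultMaxTokens); // 500
}
```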
- ARM64 Recommended — Native libraries optimized for arm64 (x86 emulators may be slow)
- Memory Usage — Large models (7B+) require devices with 6GB+ RAM
- First Load — Initial model loading takes 1-3 seconds (cached afterward)
- Live STT — Best with Sherpa-ONNX streaming models (limited in plain ONNX)
- Platform Channels — Some SDK features use FFI/platform channels
The iOS example app is the canonical reference. This app mirrors its tab
structure, model catalog (lib/core/services/model_catalog_bootstrap.dart),
model-picker filtering, generated solutions YAML, ViewModel layering, hybrid
STT, and benchmarks. Intentionally unsupported iOS-only surfaces:
- Voice Keyboard: depends on the iOS app-extension targets (RunAnywhereKeyboard, RunAnywhereActivityExtension) and Live Activities; there is no Flutter analogue for a system keyboard extension.
- FoundationModels smart conversation titles: Apple-platform-gated (iOS 26 FoundationModels); the Flutter app uses a deterministic first-user-message title fallback instead.
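A deterministic first-user-message title fallback, as mentioned in the last item, might look roughly like the sketch below. This is a guess at the general shape rather than the app's code: fallbackTitle, the 40-character limit, and the 'New Chat' placeholder are all assumptions.

```dart
/// Hypothetical fallback: derives a conversation title from the first user
/// message. The length limit and placeholder text are assumptions.
String fallbackTitle(List<String> userMessages, {int maxLength = 40}) {
  const placeholder = 'New Chat';
  if (userMessages.isEmpty) return placeholder;

  // Collapse runs of whitespace so multi-line prompts fit on one line.
  final text = userMessages.first.trim().replaceAll(RegExp(r'\s+'), ' ');
  if (text.isEmpty) return placeholder;
  if (text.length <= maxLength) return text;

  // Note: substring works on UTF-16 code units, so a production version
  // would need care not to split emoji or other surrogate pairs.
  return '${text.substring(0, maxLength).trimRight()}...';
}

void main() {
  print(fallbackTitle(['  Explain   quantization\nin simple terms ']));
  // Explain quantization in simple terms
}
```

Because the result depends only on the message text, the same conversation always gets the same title, with no model call required.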
We welcome contributions! See CONTRIBUTING.md for guidelines.
# Fork and clone
git clone https://github.com/YOUR_USERNAME/runanywhere-sdks.git
cd runanywhere-sdks/examples/flutter/RunAnywhereAI
# Create feature branch
git checkout -b feature/your-feature
# Make changes and test
flutter pub get
flutter analyze
flutter test
# Commit and push
git commit -m "feat: your feature description"
git push origin feature/your-feature
# Open Pull Request
This project is licensed under the Apache License 2.0 - see LICENSE for details.
- Discord: Join our community
- GitHub Issues: Report bugs
- Email: san@runanywhere.ai
- Twitter: @RunanywhereAI
- RunAnywhere Flutter SDK — Full SDK documentation
- iOS Example App — iOS counterpart
- Android Example App — Android counterpart
- React Native Example — React Native option
- Main README — Project overview
