Under the hood · how it works
How OpenAlly actually works
The technical half of the story — without the marketing gloss. What it does with your files, where the thinking happens, how your data stays yours, and the Rust core that runs it all.
Hand it your docs. It stops guessing.
Drop in your files and your AI answers from them — your contracts, your prices, your handbook — instead of guessing.
A regular AI knows a little about everything and nothing about you. Hand your agent your own documents and that flips. It reads your files, remembers them, and answers from what's actually true for your business — not from a guess. Drop in your menu, your contracts, your handbook — and it quotes the real thing, every time. And it all happens on your phone, so your documents never leave your hands.
Finds the right answer
keyword + meaning · on-device
It understands what you mean and also catches the exact words you half-remember.
Just drop things in
PDFs, web pages, notes — add them and they're instantly searchable.
Each agent, its own memory
Give every agent its own knowledge, and let them share when you say so.
Share a base, keep control
Hand a knowledge base to your team — they can read and ask, but only you can change what's in it.
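For the curious, here is one way "keyword + meaning" search can be sketched. This is an illustration under assumptions, not OpenAlly's actual code: the `Chunk` type, the `hybrid_search` function, the toy two-number embeddings, and the choice of reciprocal rank fusion (with the commonly used constant 60) are all hypothetical. The idea is to rank documents once by exact word matches, once by embedding similarity, and then merge the two rankings.

```rust
use std::collections::HashMap;

/// A stored piece of a document with a precomputed embedding (toy values in practice here).
pub struct Chunk {
    pub id: u32,
    pub text: String,
    pub embedding: Vec<f32>,
}

/// Keyword score: fraction of query words that appear verbatim in the chunk text.
fn keyword_score(query: &str, text: &str) -> f32 {
    let text = text.to_lowercase();
    let words: Vec<String> = query.split_whitespace().map(|w| w.to_lowercase()).collect();
    if words.is_empty() {
        return 0.0;
    }
    let hits = words.iter().filter(|w| text.contains(w.as_str())).count();
    hits as f32 / words.len() as f32
}

/// Cosine similarity between two vectors; 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Chunk ids ordered best-first by the given scoring function.
fn ranked(chunks: &[Chunk], score: impl Fn(&Chunk) -> f32) -> Vec<u32> {
    let mut scored: Vec<(u32, f32)> = chunks.iter().map(|c| (c.id, score(c))).collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().map(|(id, _)| id).collect()
}

/// Merge the keyword ranking and the meaning ranking with reciprocal rank fusion.
/// Assumption: k = 60; a real system may weight or fuse differently.
pub fn hybrid_search(chunks: &[Chunk], query: &str, query_embedding: &[f32]) -> Vec<u32> {
    let by_keyword = ranked(chunks, |c| keyword_score(query, &c.text));
    let by_meaning = ranked(chunks, |c| cosine(query_embedding, &c.embedding));
    let mut fused: HashMap<u32, f32> = HashMap::new();
    for ranking in [&by_keyword, &by_meaning] {
        for (rank, id) in ranking.iter().enumerate() {
            *fused.entry(*id).or_insert(0.0) += 1.0 / (60.0 + rank as f32 + 1.0);
        }
    }
    let mut out: Vec<(u32, f32)> = fused.into_iter().collect();
    // Best fused score first; ties broken by id so the order is deterministic.
    out.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    out.into_iter().map(|(id, _)| id).collect()
}
```

A chunk that wins on either signal tends to float to the top, which is why a half-remembered exact word and a loosely phrased question can both land on the right file.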
The smart parts run on your phone
OpenAlly bundles real AI models that work without the cloud — so the most sensitive reading and listening happens on the device in your hand.
A local language model
Gemma · coming soon
Run a small model entirely on your phone, so everyday prompts never leave it.
Speech-to-text, offline
Android
Whisper turns your voice notes and recordings into text right on the device.
On-device text understanding
Android
A small classifier reads your SMS for spending, bills, and scams without uploading a word.
Search that works from first launch
Shipped
Bundled embeddings power your knowledge-base search offline, with no setup and no round-trip.
Run real code, locally
Android
An on-device Python runtime lets your AI compute and crunch data without a server.
What ships on-device depends on your phone and is Android-only today; packaged local language models (Gemma) are coming soon. Everything here runs without sending your prompts, audio, or texts to us.
No round-trip, no eavesdropping middleman — your phone does the thinking.
Built so we genuinely can't peek
Most apps say they care about your privacy. We'd rather not be able to break it in the first place.
Here's the deadpan version. The private stuff (your messages, your notes, your numbers) is read and figured out right on your phone, not shipped off to some server. Your passwords and keys live in the same locked hardware your banking app trusts. And your backups are sealed on your phone before they ever leave it, then stored in a drive only you control; only the 12-word recovery phrase you alone hold can reopen them. None of that asks you to take our word for it. That's the whole point.
And the honest small print: OpenAlly is free and ad-supported (Google AdMob), with anonymous crash reports that never include your conversations. When you pick a cloud model or connect a chat app, those messages go to the provider you chose, on your key, never through us. And what counts as risky isn't an AI guess: a fixed rule engine decides, so banking and payments stay off-limits.
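To make "a fixed rule engine decides" concrete, here is a minimal sketch of what a deterministic rule table could look like. Everything in it is an assumption for illustration: the `Verdict` enum, the `RULES` table, the category names, and the deny-by-default fallback are hypothetical, not OpenAlly's real rules. The point is only that the answer comes from a lookup, never from a model's opinion.

```rust
/// What the engine decides for a requested action.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Verdict {
    Allow,
    Deny,
}

/// A fixed rule: if the action's target category matches, the verdict applies.
struct Rule {
    category: &'static str,
    verdict: Verdict,
}

/// Hypothetical hard-coded rule table. The first matching rule wins.
const RULES: &[Rule] = &[
    Rule { category: "banking", verdict: Verdict::Deny },
    Rule { category: "payments", verdict: Verdict::Deny },
    Rule { category: "notes", verdict: Verdict::Allow },
    Rule { category: "calendar", verdict: Verdict::Allow },
];

/// Deterministic decision: no model is consulted.
/// Assumption: categories that match no rule are denied.
pub fn decide(category: &str) -> Verdict {
    RULES
        .iter()
        .find(|r| r.category.eq_ignore_ascii_case(category))
        .map(|r| r.verdict)
        .unwrap_or(Verdict::Deny)
}
```

Because the table is plain data, the same input always produces the same verdict, and a clever prompt has nothing to argue with.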
Your data stays on your phone
on-device AI
The sensitive stuff (your messages, your notes, your numbers) is read and understood right on your device, not shipped off to a server.
Your keys never leave
hardware-sealed · AES-256
Your passwords and API keys are locked in your phone's secure hardware, the same vault your banking app trusts. Not in a file, not in the cloud, not visible to us.
Backups only you can open
end-to-end encrypted · your recovery phrase
Your backup is scrambled on your phone before it ever leaves, and saved to your own Google Drive. We can't read it. Neither can anyone else.
Your AI only does what you allow
capability-gated · signed
Every agent is boxed in to the exact powers you grant it, and every add-on is checked before it's trusted. No rogue behaviour.
We don't have a "trust us" button. We have math.
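If "capability-gated" sounds abstract, this small sketch shows the general shape of the idea. It is an illustration under assumptions, not the real kernel API: the `Capability` variants, the `Agent` type, and the `invoke` method are hypothetical names, and the signature check on add-ons is left out because it would need a cryptography library.

```rust
use std::collections::HashSet;

/// Powers an agent can be granted (hypothetical examples).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Capability {
    ReadFiles,
    WebAccess,
    ReadContacts,
    SendMessages,
}

/// An agent together with the exact set of capabilities the user granted it.
pub struct Agent {
    pub name: String,
    granted: HashSet<Capability>,
}

impl Agent {
    pub fn new(name: &str, granted: &[Capability]) -> Self {
        Agent {
            name: name.to_string(),
            granted: granted.iter().copied().collect(),
        }
    }

    /// Run `action` only if `needed` was granted; otherwise refuse without running it.
    pub fn invoke<T>(&self, needed: Capability, action: impl FnOnce() -> T) -> Result<T, String> {
        if self.granted.contains(&needed) {
            Ok(action())
        } else {
            Err(format!("{} lacks {:?}", self.name, needed))
        }
    }
}
```

An agent created with only `ReadFiles` can read files and nothing else: a call that needs `SendMessages` returns an error before the action ever runs.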
A Rust core, with Go where it earns its place
For the people who read the architecture diagram before the pricing page. Hi.
The whole device-side runtime is the embedded Rust cortex-kernel — a large modular workspace handling the agent loop, providers, channels, memory, and security — dispatched into the app over React Native JSI. It's the sole runtime on the device. No background Node process, no IPC hop — just Rust, doing the work. Go shows up exactly where it's the right tool, and nowhere it isn't: the WhatsApp engine (whatsmeow, statically linked via cgo), the app build toolchain (an esbuild bridge), and the tsgo type-checker. The agent, provider, and channel stacks are pure Rust. We like Go; we just don't pretend it runs the show.
By the numbers
- native chat engines
- built right into the core
- native tool crates
- files, web, contacts & more
- Rust crates, one binary
- no background processes
- models, one switch
- every major provider
Why it feels quick — and stays solid
Native Rust on its own thread: no garbage-collector pauses, no frozen UI, memory-safe by design, with token streaming coalesced for smooth output.
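"Token streaming coalesced" roughly means that tiny model tokens are grouped into bigger chunks before the screen is asked to redraw. A minimal sketch, assuming a simple character threshold: the `Coalescer` type and its `min_chars` parameter are hypothetical, and a real implementation would likely also flush on a timer, which is omitted here to keep the example deterministic.

```rust
/// Buffers small streamed tokens and emits them in larger batches.
pub struct Coalescer {
    buffer: String,
    min_chars: usize,
}

impl Coalescer {
    pub fn new(min_chars: usize) -> Self {
        Coalescer { buffer: String::new(), min_chars }
    }

    /// Add one token; returns a batch once at least `min_chars` characters are buffered.
    pub fn push(&mut self, token: &str) -> Option<String> {
        self.buffer.push_str(token);
        if self.buffer.chars().count() >= self.min_chars {
            // Hand the buffered text to the caller and leave an empty buffer behind.
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }

    /// Emit whatever is left when the stream ends.
    pub fn finish(&mut self) -> Option<String> {
        if self.buffer.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffer))
        }
    }
}
```

With a threshold of 8, the tokens "Hel", "lo, ", "wor", "ld" reach the UI as two updates instead of four.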
On-device runtimes
The runtimes that ship inside the app and execute entirely on your phone — no round-trip required.
Always on, reachable anywhere
The quiet 24/7 backbone — it keeps running on your phone, and reaches the wider internet only when, and how, you decide.
It keeps running, quietly
A lightweight background service keeps your AI alive and survives a reboot — with a live status notification so you always know it's on.
Reachable anywhere, on your terms
When you want it reachable from the internet, it opens a tunnel you control — ngrok, Cloudflare, Tailscale, or your own — with an honest up/down status.
That's the how. Want the what?
It's free to start, runs on your phone, and works the way you do. The hardest part is deciding which agent to build first.
On Android today — more platforms on the way.