Early access open

Your hardware.
Your AI.
Finally easy.

Qalarc AI-OS is a pre-configured operating system and curated model marketplace for AI mini PCs. Stop spending months on configuration. Get running in hours.

50× faster first token
90% cost reduction
100% data privacy
Unlimited requests / day
inference speed
36 tok/s
Llama 3 70B · local · no cloud
monthly cost
$0 after setup
vs $3,000–50,000 cloud
latency
10ms
first token · vs 500ms cloud
Private by design · No subscriptions · Works offline · HIPAA ready · GDPR compliant · 256 GB unified memory · Llama · Qwen · DeepSeek · Mixtral · Setup in hours, not months

Everything you need
for production-ready local AI

One package. Hardware, OS, models, and knowledge — assembled and tested, ready to deploy.

⚙️

Qalarc AI-OS

Pre-configured Linux optimised for AI workloads. All dependencies installed, security hardened, ready on boot.

🤖

Curated Model Marketplace

Tested, optimised models ready to run. Llama, CodeLlama, Mixtral, DeepSeek. One-click install, automatic updates.

🔧

Setup Services

Remote or on-site configuration. We handle the complexity — you get running systems in hours, not months.

📚

Offline Knowledge Base

Full Wikipedia, technical documentation, curated datasets. No internet required for inference.

📊

Management Dashboard

Web UI and CLI tools for monitoring, updates, model management. Full system control at a glance.

🖥️

Hardware Agnostic

Works with GMKTEC, Framework, Minisforum and other unified memory systems. Bring your own or buy through our guide.

$ qalarc model list
llama3.3-70b-q4 loaded · 36 tok/s
qwen2.5-32b-q4 ready · 9 tok/s
codellama-13b-q4 ready · 20 tok/s
deepseek-r1-70b-q4 install available
$ qalarc serve llama3.3-70b-q4
↑ API running on http://localhost:11434
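Once a model is serving, any HTTP client can hit the local endpoint. A minimal sketch in Python, assuming the API is Ollama-compatible (port 11434 matches the Ollama convention, but the exact route and payload shape are assumptions, not documented Qalarc behaviour):

```python
import json
from urllib import request

API_URL = "http://localhost:11434/api/generate"  # assumed Ollama-style route

def build_payload(model: str, prompt: str) -> bytes:
    """Serialise a single non-streaming generation request."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON object back instead of a token stream
    }).encode("utf-8")

def generate(prompt: str, model: str = "llama3.3-70b-q4") -> str:
    """Send a prompt to the locally served model and return its reply."""
    req = request.Request(
        API_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # requires `qalarc serve` to be running
        return json.loads(resp.read())["response"]
```

Because inference is local, the request never leaves the machine; swap `model` for any entry shown by `qalarc model list`.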

Numbers that
change the decision

Cloud AI can cost

$50K
per month at enterprise scale

Qalarc: one-time cost, zero per-token fees.
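The economics can be sanity-checked with simple arithmetic. A sketch with illustrative numbers: the $3,000–50,000/month cloud range and the $299 software licence are from this page, while the $2,500 mini-PC price is a hypothetical figure, not a quoted one:

```python
# Breakeven sketch: days until a one-time local setup cost equals
# a recurring cloud bill. All hardware prices here are illustrative.
def breakeven_days(cloud_monthly_usd: float, one_time_usd: float) -> float:
    """Days of cloud spend that match the one-time local cost."""
    return one_time_usd / (cloud_monthly_usd / 30)

setup = 2_500 + 299  # hypothetical mini-PC price + Qalarc software licence

low = breakeven_days(3_000, setup)    # ~28 days at the low end of the range
high = breakeven_days(50_000, setup)  # under 2 days at enterprise scale
```

After breakeven, every additional token is free apart from electricity.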

10ms
first token latency

Cloud averages 500ms+. Local inference is 50× faster for the first token.

∞
requests / day

No rate limits. No throttling. Ever.

100%
data stays local

HIPAA, GDPR, and air-gap ready by design.

256GB
unified memory

Run 70B models with headroom. Mac Studio caps at 192GB.

Real speeds.
Real hardware.

These are measured tokens-per-second on AMD Strix Halo hardware with Q4 quantised models. No cloud, no tricks.

Llama 3.3 7B · 36 tok/s
Llama 3.3 13B · 20 tok/s
Qwen 2.5 32B · 9–52 tok/s*
Llama 3.3 70B · 25–30 tok/s

* 32B range: 9 tok/s dense · 52 tok/s MoE w/ Q4 quantisation

Q4 Quantisation

Reduces model size by 75% with minimal quality loss. Llama 405B fits in 200GB instead of 800GB.
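The arithmetic behind that claim is straightforward: weight memory is parameter count times bits per weight. A quick sketch (4.0 bits/weight is the idealised Q4 figure that gives exactly the 75% reduction; real Q4 formats carry a little per-block scale overhead on top):

```python
# Back-of-envelope weight-memory estimate for the 405B example above.
# FP16 = 16 bits/weight; idealised Q4 = 4 bits/weight. KV cache and
# activations need extra headroom beyond these figures.
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weight memory in decimal GB: 1e9 params x bits / 8 bits-per-byte."""
    return params_billions * bits_per_weight / 8

fp16_gb = model_size_gb(405, 16)  # 810 GB, roughly the "800GB" above
q4_gb = model_size_gb(405, 4)     # 202.5 GB, roughly the "200GB" above
```

The ratio 4/16 is where the 75% size reduction comes from, independent of model size.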

System RAM > GPU VRAM

Large models run efficiently on system RAM. Our 256GB systems outperform $7K Mac Studios on 405B models.

Local Inference

10ms first token, unlimited throughput, zero network dependency. Data never leaves your building.

The models you know.
Running locally.

Tested and optimised for unified memory architecture. One-click install from the Qalarc marketplace.

Llama 3.3 7B
Llama 3.3 13B
Llama 3.3 70B
Llama 405B
CodeLlama
Qwen 2.5 32B
DeepSeek R1
Mixtral 8×7B
+ more coming

Works with the
best AI mini PCs

Qalarc AI-OS runs on leading unified memory systems. Bring your own hardware or follow our guide.

★ Recommended
GMKTEC EVO X2 AI
AMD Ryzen AI Max+ 395
128 GB unified memory (LPDDR5X-8000)
126 TOPS · 40-core RDNA 3.5 GPU
Compact desktop form factor
View on GMKTEC
Enterprise
NVIDIA DGX Spark
Professional workstation AI
Higher memory configurations
Enterprise support & warranty
Data centre deployment ready
Learn more
Modular
Framework Desktop
Modular, upgradeable design
Right-to-repair philosophy
128 GB ships immediately
Strong community support
View desktop

Simple, honest
pricing

One-time purchase or software licence. No subscriptions, no per-token fees, no surprise bills.

Software
$299 one-time

Install on your own hardware. Full Qalarc AI-OS + model marketplace + setup guide.

  • Qalarc AI-OS licence
  • Model marketplace access
  • Remote setup support
  • Offline knowledge base
  • 6 months updates
Enterprise
Talk to us

Custom deployments. White-glove service. Multiple units, custom models, integrations.

  • Everything in Turnkey
  • Multi-unit deployments
  • Custom model fine-tuning
  • API + workflow integration
  • Dedicated support SLA
  • White-label option
Qalarc
  • Qal: Hebrew (קל) for "lightweight" or "easy"
  • QAL: Quantised Agents Local, our core technology
  • Arc: a secure collection, like Noah's Arc protecting precious cargo

Together: "Lightweight Quantised Agents Local, secured like an Arc."

Cloud AI is broken for serious use

Expensive, slow, and your data leaves your network. Mac Studio costs $7,000 and still can't run 405B models. DIY local AI takes months.

Our solution is complete systems

We don't just sell hardware. We build it, configure the OS, load the models, integrate knowledge bases, test everything, and ship it ready to deploy.

"From unboxing to production
deployment in hours."

Ready to deploy
AI locally?

Join the waitlist for early access and priority deployment. We respond within 24 hours.

Or email us directly at team@qalarc.com
