Early access open

Your hardware.
Your AI.
Finally easy.

Qalarc AI-OS is a pre-configured operating system and curated model marketplace for AI mini PCs. Stop spending months on configuration. Get running in hours.

50× faster first token
90% cost reduction
100% data privacy
Unlimited requests / day
inference speed
36 tok/s
Llama 3 70B · local · no cloud
monthly cost
$0 after setup
vs $3,000–50,000 cloud
latency
10ms
first token · vs 500ms cloud
Private by design · No subscriptions · Works offline · HIPAA ready · GDPR compliant · 256 GB unified memory · Llama · Qwen · DeepSeek · Mixtral · Setup in hours, not months

Everything you need
for production-ready local AI

One package. Hardware, OS, models, and knowledge — assembled and tested, ready to deploy.

⚙️

Qalarc AI-OS

Pre-configured Linux optimised for AI workloads. All dependencies installed, security hardened, ready on boot.

🤖

Curated Model Marketplace

Tested, optimised models ready to run. Llama, CodeLlama, Mixtral, DeepSeek. One-click install, automatic updates.

🔧

Setup Services

Remote or on-site configuration. We handle the complexity — you get running systems in hours, not months.

📚

Offline Knowledge Base

Full Wikipedia, technical documentation, curated datasets. No internet required for inference.

📊

Management Dashboard

Web UI and CLI tools for monitoring, updates, model management. Full system control at a glance.

🖥️

Hardware Agnostic

Works with GMKTEC, Framework, Minisforum and other unified memory systems. Bring your own or buy through our guide.

$ qalarc model list
llama3.3-70b-q4 loaded · 36 tok/s
qwen2.5-32b-q4 ready · 9 tok/s
codellama-13b-q4 ready · 20 tok/s
deepseek-r1-70b-q4 install available
$ qalarc serve llama3.3-70b-q4
↑ API running on http://localhost:11434
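Once a model is serving, any HTTP client can hit the local endpoint. A minimal sketch in Python, assuming the API is Ollama-compatible (port 11434 matches the Ollama convention, but the exact route and payload shape are assumptions, not documented Qalarc behaviour):

```python
import json
from urllib import request

API_URL = "http://localhost:11434/api/generate"  # assumed Ollama-style route

def build_payload(model: str, prompt: str) -> bytes:
    """Serialise a single non-streaming generation request."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON object back instead of a token stream
    }).encode("utf-8")

def generate(prompt: str, model: str = "llama3.3-70b-q4") -> str:
    """Send a prompt to the locally served model and return its reply."""
    req = request.Request(
        API_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # requires `qalarc serve` to be running
        return json.loads(resp.read())["response"]
```

Because inference is local, the request never leaves the machine; swap `model` for any entry shown by `qalarc model list`.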

Numbers that
change the decision

Cloud AI can cost

$50K
per month at enterprise scale

Qalarc: one-time cost, zero per-token fees.
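The economics can be sanity-checked with simple arithmetic. A sketch with illustrative numbers: the $3,000–50,000/month cloud range and the $299 software licence are from this page, while the $2,500 mini-PC price is a hypothetical figure, not a quoted one:

```python
# Breakeven sketch: days until a one-time local setup cost equals
# a recurring cloud bill. All hardware prices here are illustrative.
def breakeven_days(cloud_monthly_usd: float, one_time_usd: float) -> float:
    """Days of cloud spend that match the one-time local cost."""
    return one_time_usd / (cloud_monthly_usd / 30)

setup = 2_500 + 299  # hypothetical mini-PC price + Qalarc software licence

low = breakeven_days(3_000, setup)    # ~28 days at the low end of the range
high = breakeven_days(50_000, setup)  # under 2 days at enterprise scale
```

After breakeven, every additional token is free apart from electricity.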

10ms
first token latency

Cloud averages 500ms+. Local inference is 50× faster for the first token.

∞
requests / day

No rate limits. No throttling. Ever.

100%
data stays local

HIPAA, GDPR, and air-gap ready by design.

256GB
unified memory

Run 70B models with headroom. Mac Studio caps at 192GB.

Real speeds.
Real hardware.

These are measured tokens-per-second on AMD Strix Halo hardware with Q4 quantised models. No cloud, no tricks.

Llama 3.3 7B · 36 tok/s
Llama 3.3 13B · 20 tok/s
Qwen 2.5 32B · 9–52 tok/s*
Llama 3.3 70B · 25–30 tok/s

* 32B range: 9 tok/s dense · 52 tok/s MoE w/ Q4 quantisation

Q4 Quantisation

Reduces model size by 75% with minimal quality loss. Llama 405B fits in 200GB instead of 800GB.
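The arithmetic behind that claim is straightforward: weight memory is parameter count times bits per weight. A quick sketch (4.0 bits/weight is the idealised Q4 figure that gives exactly the 75% reduction; real Q4 formats carry a little per-block scale overhead on top):

```python
# Back-of-envelope weight-memory estimate for the 405B example above.
# FP16 = 16 bits/weight; idealised Q4 = 4 bits/weight. KV cache and
# activations need extra headroom beyond these figures.
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weight memory in decimal GB: 1e9 params x bits / 8 bits-per-byte."""
    return params_billions * bits_per_weight / 8

fp16_gb = model_size_gb(405, 16)  # 810 GB, roughly the "800GB" above
q4_gb = model_size_gb(405, 4)     # 202.5 GB, roughly the "200GB" above
```

The ratio 4/16 is where the 75% size reduction comes from, independent of model size.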

System RAM > GPU VRAM

Large models run efficiently on system RAM. Our 256GB systems outperform $7K Mac Studios on 405B models.

Local Inference

10ms first token, unlimited throughput, zero network dependency. Data never leaves your building.

The models you know.
Running locally.

Tested and optimised for unified memory architecture. One-click install from the Qalarc marketplace.

Llama 3.3 7B
Llama 3.3 13B
Llama 3.3 70B
Llama 405B
CodeLlama
Qwen 2.5 32B
DeepSeek R1
Mixtral 8×7B
+ more coming

Works with the
best AI mini PCs

Qalarc AI-OS runs on leading unified memory systems. Bring your own hardware or follow our guide.

★ Recommended
GMKTEC EVO X2 AI
AMD Ryzen AI Max+ 395
128 GB unified memory (LPDDR5X-8000)
126 TOPS · 40-core RDNA 3.5 GPU
Compact desktop form factor
View on GMKTEC
Enterprise
NVIDIA DGX Spark
Professional workstation AI
Higher memory configurations
Enterprise support & warranty
Data centre deployment ready
Learn more
Modular
Framework Desktop
Modular, upgradeable design
Right-to-repair philosophy
128 GB ships immediately
Strong community support
View desktop

Simple, honest
pricing

One-time purchase or software licence. No subscriptions, no per-token fees, no surprise bills.

Software
$299 one-time

Install on your own hardware. Full Qalarc AI-OS + model marketplace + setup guide.

  • Qalarc AI-OS licence
  • Model marketplace access
  • Remote setup support
  • Offline knowledge base
  • 6 months updates
Enterprise
Talk to us

Custom deployments. White-glove service. Multiple units, custom models, integrations.

  • Everything in Turnkey
  • Multi-unit deployments
  • Custom model fine-tuning
  • API + workflow integration
  • Dedicated support SLA
  • White-label option
Qalarc
  • Qal: Hebrew (קל) for "lightweight" or "easy"
  • QAL: Quantised Agents Local, our core technology
  • Arc: a secure collection, like Noah's Arc protecting precious cargo

Together: "Lightweight Quantised Agents Local, secured like an Arc."

Cloud AI is broken for serious use

Expensive, slow, and your data leaves your network. Mac Studio costs $7,000 and still can't run 405B models. DIY local AI takes months.

Our solution is complete systems

We don't just sell hardware. We build it, configure the OS, load the models, integrate knowledge bases, test everything, and ship it ready to deploy.

"From unboxing to production
deployment in hours."

Ready to deploy
AI locally?

Join the waitlist for early access and priority deployment. We respond within 24 hours.

Or email us directly at team@qalarc.com
