Choose a model - Isaree Docs

Picking an on-device model used to mean juggling parameter counts, quantization levels, and RAM tables. The model picker on the Community Hub now does that work for you: tell it which device you use, and it shows what actually runs well on it. This guide shows how to use the picker and explains the reasoning behind its recommendations — including why they look conservative.

Set your device once

When you pick a model while building an Agent or a Scribe Agent, the picker asks which device you use (iPhone, iPad, or Mac). Every recommendation is then tailored to it:

Each model shows how much RAM it needs and a “fits your device” badge. Models that fit float to the top; ones that don’t are dimmed.
LLMs and VLMs live in one unified list, so there is no tab-hopping.
Start typing to search Hugging Face directly. Results show the organization, task, license, and usage stats inline, and carry the same “Runs” / “Needs X GB” badge as the curated models.
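
To make the ordering concrete, here is a minimal Python sketch of how a list like this could be sorted and badged. It is not Isa's actual implementation: the `Model` class, the `badge` and `order_for_device` helpers, and the per-model RAM figures are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    needs_gb: int  # illustrative figure, not an official requirement


def badge(model: Model, device_ram_gb: int) -> str:
    """Label a model in the style of the badge described above."""
    if model.needs_gb <= device_ram_gb:
        return "Runs"
    return f"Needs {model.needs_gb} GB"


def order_for_device(models: list[Model], device_ram_gb: int) -> list[Model]:
    """Put models that fit first; sorted() is stable, so ties keep their order."""
    return sorted(models, key=lambda m: m.needs_gb > device_ram_gb)


# Hypothetical catalog entries used only to exercise the helpers.
catalog = [
    Model("Qwen 3.5 4B", 12),
    Model("Qwen 3 1.7B", 4),
    Model("Qwen 3.5 2B", 6),
]

for m in order_for_device(catalog, device_ram_gb=8):
    print(f"{m.name}: {badge(m, 8)}")
```

On a hypothetical 8 GB device, the two smaller models sort to the top with a "Runs" badge and the larger one drops to the bottom with "Needs 12 GB".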

What the badge guarantees

The RAM number next to a model means “runs well,” not “barely loads.” A model that fits your device is guaranteed three things:

A full 8K-token context window — enough working memory to get through a real session.
The model runs solo — Isa never stacks multiple models in memory at once.
Headroom for the operating system — enough margin that iOS won’t force-quit Isa mid-session.

The last one surprises people: an “8 GB” iPhone gives an app only about 4.8 GB of usable memory before iOS starts force-quitting it. The picker budgets against what’s actually available, not the number on the box.
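
The budgeting idea can be sketched in a few lines of Python. This is only an illustration under a loud assumption: the 60% usable fraction is extrapolated from the single data point above (8 GB nominal, about 4.8 GB usable). Real limits vary by device and OS version, and the names `USABLE_FRACTION`, `usable_gb`, and `fits` are hypothetical.

```python
# Assumption: usable memory is roughly 60% of nominal RAM, extrapolated from
# the one figure in this guide (an "8 GB" iPhone gives about 4.8 GB).
USABLE_FRACTION = 0.6


def usable_gb(nominal_gb: float) -> float:
    """Estimate the memory an app can use before the OS steps in."""
    return nominal_gb * USABLE_FRACTION


def fits(peak_gb: float, nominal_gb: float) -> bool:
    """Compare a model's peak memory with the usable budget, not the box number."""
    return peak_gb <= usable_gb(nominal_gb)


# A model peaking at 5.2 GB looks fine against "8 GB" but misses the real budget.
print(fits(5.2, 8))   # False
print(fits(5.2, 12))  # True
```

The point of the sketch is the comparison itself: the check runs against the estimated usable budget, so a model can be smaller than the nominal RAM and still not fit.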

Why the recommendations look conservative

You might see a ~3 GB model asking for a 12 GB iPhone and think something’s off. It’s not — the context window drives the RAM number, not the weights. Here is the same model, Qwen 3.5 4B (about 3 GB on disk), under three different context guarantees:

Context guarantee	Peak memory	Needs
2K tokens	~4.3 GB	8 GB (iPhone 15 Pro / 16 / 17)
8K tokens	~5.2 GB	12 GB (iPhone 17 Pro / iPhone Air)
32K tokens	~8.8 GB	16 GB (iPad Pro only)

The weights stay the same size; what grows is the model’s working memory (the key-value cache), because every extra token of context takes more RAM to keep track of. The picker guarantees 8K tokens because that’s what a real session needs. Promising 2K would make the requirements look friendlier, but a longer consultation would run out of memory and crash mid-visit. That’s the philosophy behind the picker: the best model is the one that runs well on your device, fast and reliable, not the biggest one that technically loads.
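
The three rows in the table happen to lie on a straight line, which makes the growth easy to see. The sketch below is a simple linear fit to those published figures, not the picker's actual formula. The constants `BASE_GB` and `GB_PER_1K_TOKENS` and the function `estimated_peak_gb` are derived here for illustration, and reading the intercept as "weights plus fixed overhead" is an assumption.

```python
# Fitted to the table above: (2K, 4.3 GB), (8K, 5.2 GB), (32K, 8.8 GB).
# Slope: (5.2 - 4.3) / 6 = 0.15 GB per 1K tokens; (8.8 - 5.2) / 24 gives the same.
BASE_GB = 4.0            # intercept; presumably weights plus fixed overhead
GB_PER_1K_TOKENS = 0.15  # context-dependent working memory


def estimated_peak_gb(context_k_tokens: float) -> float:
    """Estimate peak memory for a context length given in thousands of tokens."""
    return BASE_GB + GB_PER_1K_TOKENS * context_k_tokens


for k in (2, 8, 32):
    print(f"{k}K tokens -> ~{estimated_peak_gb(k):.1f} GB")
```

Under this fit, going from 2K to 8K tokens adds about 0.9 GB, which is enough to push the same model from the 8 GB tier to the 12 GB tier.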

What fits which device

The picker is the source of truth — set your device and read the badges. As a rough map of where things land today:

RAM	Devices	Recommended model
4 GB	iPhone 13, iPhone SE (3rd gen)	Qwen 3 1.7B
6 GB	iPhone 13 Pro, iPhone 14, iPhone 15	Qwen 3.5 2B (vision)
8 GB	iPhone 15 Pro, iPhone 16 / 16 Pro, iPhone 17	Qwen 3 4B, at a shorter context window
12 GB	iPhone Air, iPhone 17 Pro / Pro Max	Qwen 3.5 4B, at the full 8K (vision)

iPads and Apple Silicon Macs ship with more RAM — 16 GB and up is common — so larger variants and longer context windows fit there. The picker covers the latest iPhones, iPads, and Macs.
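
For quick reference, the rough map can be expressed as a small lookup. This is a hedged sketch that copies the table's rows as they stand today. The `ROUGH_MAP` dictionary and `recommended_for` helper are hypothetical names, and the picker itself remains the source of truth.

```python
from typing import Optional

# Rows copied from the rough map above; keys are nominal device RAM in GB.
ROUGH_MAP = {
    4: "Qwen 3 1.7B",
    6: "Qwen 3.5 2B (vision)",
    8: "Qwen 3 4B, at a shorter context window",
    12: "Qwen 3.5 4B, at the full 8K (vision)",
}


def recommended_for(ram_gb: int) -> Optional[str]:
    """Return the row for the largest tier that does not exceed ram_gb."""
    tiers = [tier for tier in ROUGH_MAP if tier <= ram_gb]
    return ROUGH_MAP[max(tiers)] if tiers else None


print(recommended_for(8))   # Qwen 3 4B, at a shorter context window
print(recommended_for(16))  # falls back to the 12 GB row
```

For devices above 12 GB the helper falls back to the 12 GB row, which is only a lower bound, since larger variants and longer context windows fit on those devices.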

If nothing fits

If your device can’t run the model you need, there are two ways out:

Move the Primary Agent to the cloud. A cloud model doesn’t consume device RAM, but you bring your own API key and your data leaves the device. See On-device vs. cloud.
Use Isa on a Mac. Macs have the most usable memory of the supported devices — see Hardware requirements.

Agents remain on-device only, so they always need a model that fits.

Next

Build an agent

Put the picker to work — build and publish an Agent on the Community Hub.

RAM and device memory

Why RAM is the constraint that decides which models your device can run.

Quantization

How a 4B-parameter model fits in about 3 GB.