VRAM COMPATIBILITY // LOCAL AI

What Can I Run?

Select your GPU to see every Ollama-compatible local AI model that fits in its VRAM, from tiny 1B chat models to frontier-class 70B reasoners.

How VRAM Compatibility Works

01

Quantization: Models are compressed using Q4_K_M quantization, which stores weights at roughly 4.8 bits each, cutting VRAM usage to about a quarter of full FP16 with minimal quality loss.
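
As a back-of-the-envelope check, weight memory is just parameter count times bits per weight. Here is a minimal sketch, assuming ~4.85 effective bits per weight for Q4_K_M (a commonly cited llama.cpp figure); the function name and numbers are illustrative, not part of the tool:

```python
def est_weight_vram_gib(params_billion: float, bits_per_weight: float = 4.85) -> float:
    """Rough weight-only VRAM footprint of a model, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1024**3

# An 8B model: ~4.5 GiB at Q4_K_M vs ~14.9 GiB at FP16 (16 bits/weight).
print(f"Q4_K_M: {est_weight_vram_gib(8):.1f} GiB")
print(f"FP16:   {est_weight_vram_gib(8, bits_per_weight=16):.1f} GiB")
```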

02

KV Cache: Our recommendations include ~1–3 GB of VRAM headroom for the KV cache needed during inference. Tighter fits may limit context length.
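
To see where that headroom goes: the KV cache stores one key and one value vector per layer per token, so it grows linearly with context length. Below is a rough sketch assuming Llama-3.1-8B-style geometry (32 layers, 8 KV heads via grouped-query attention, head dimension 128) and an FP16 cache; Ollama's actual cache layout and precision may differ:

```python
def est_kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                     context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GiB: one K and one V vector per layer per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * context_len / 1024**3

# Llama-3.1-8B-style geometry: 32 layers, 8 KV heads (GQA), head_dim 128.
print(f"{est_kv_cache_gib(32, 8, 128, 8192):.2f} GiB at 8K context")    # ~1.0
print(f"{est_kv_cache_gib(32, 8, 128, 32768):.2f} GiB at 32K context")  # ~4.0
```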

03

One Command: All models shown can be launched via Ollama with a single terminal command: no CUDA setup, no driver headaches.
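
That command is just `ollama run <model>`. If you'd rather script it, a minimal Python wrapper might look like the sketch below; the model tag `llama3.2:1b` is one example from the Ollama library, and any tag that fits your VRAM works the same way:

```python
import subprocess

# Pulls the model on first run, sends one prompt, prints the reply, and exits.
# "llama3.2:1b" is an example tag; swap in any model that fits your VRAM.
subprocess.run(
    ["ollama", "run", "llama3.2:1b", "Explain the KV cache in one sentence."],
    check=True,
)
```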
