THE_LEXICON_v2.0

Local AI Glossary

16 essential terms for local AI hardware and LLM deployment, from VRAM to quantization to CUDA.

VRAM

The on-GPU memory that stores model weights. Determines which AI models you can run.

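A quick rule of thumb: the VRAM needed for the weights alone is parameter count times bytes per weight. A minimal sketch (the function name `weight_vram_gb` is our own, for illustration):

```python
def weight_vram_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough VRAM needed just for the weights, excluding KV cache and runtime overhead."""
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight  # 1e9 params and 1e9 bytes/GB cancel out

# A 7B model at 16-bit precision needs roughly 14 GB for weights alone:
print(round(weight_vram_gb(7, 16), 1))  # -> 14.0
```

In practice budget another 1 to 2 GB on top for the KV cache and framework overhead.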

Quantization

Compressing model weights from 16-bit floating point down to 8-bit or 4-bit precision, cutting VRAM usage by up to roughly 4x with modest quality loss.

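The savings are easy to estimate: size scales linearly with bit width. A sketch (our own helper name; the 4.5 bits/weight figure is an assumption reflecting that 4-bit schemes carry some per-block metadata):

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size at a given bit width (1e9 params * bits/8 bytes, in GB)."""
    return params_billions * bits_per_weight / 8

fp16 = quantized_size_gb(7, 16)   # ~14.0 GB
q4   = quantized_size_gb(7, 4.5)  # ~3.9 GB
print(f"{fp16:.1f} GB -> {q4:.1f} GB, a {fp16 / q4:.1f}x reduction")
```

This is why a 7B model that needs a 16 GB card at full precision fits comfortably on an 8 GB card once quantized.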

FP8 / FP4

Next-generation precision formats that accelerate inference in hardware: FP8 on NVIDIA Ada and Hopper GPUs, FP4 on Blackwell.


Tokens per Second (TPS)

The standard speed metric for LLMs: how many tokens (word fragments, roughly three-quarters of an English word each) your GPU generates per second.

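To get a feel for what a TPS number means in practice, convert it to reading speed. A minimal sketch, assuming the common rough average of about 0.75 English words per token:

```python
TOKENS_PER_WORD = 4 / 3  # rough English average: ~0.75 words per token

def words_per_minute(tokens_per_second: float) -> float:
    """Convert an LLM's token rate into an approximate words-per-minute figure."""
    return tokens_per_second / TOKENS_PER_WORD * 60

print(round(words_per_minute(20)))  # -> 900, far faster than most people read
```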

Memory Bandwidth

How fast your GPU can read model weights from VRAM; usually the main bottleneck for token generation speed, more so than raw compute.

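Because every generated token requires re-reading all the weights from VRAM, bandwidth divided by model size gives a hard ceiling on decode speed. A back-of-envelope sketch (the ~1008 GB/s figure is the RTX 4090's published bandwidth; real-world speeds land below this ceiling due to KV cache reads and compute):

```python
def max_tps(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound ceiling on decode speed: each new token re-reads all weights."""
    return bandwidth_gb_s / model_size_gb

# An RTX 4090 (~1008 GB/s) running a 4-bit 7B model (~4 GB of weights):
print(round(max_tps(1008, 4)))  # -> 252 tokens/s, as a theoretical upper bound
```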

KV Cache

The memory that stores the model's 'conversation history' (cached attention keys and values) during generation; it lives in VRAM and grows with context length.

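The KV cache's size follows directly from the model's shape: two tensors (keys and values) per layer, per KV head, per token. A sketch using the published Llama 3 8B configuration (32 layers, 8 KV heads via grouped-query attention, head dimension 128) at fp16; the function name is our own:

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_elem: int = 2) -> float:
    """KV cache size: 2 tensors (K and V) per layer, per KV head, per token."""
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_elem
    return total_bytes / 2**30

# Llama-3-8B-like shape at an 8k context:
print(kv_cache_gib(32, 8, 128, 8192))  # -> 1.0 GiB on top of the weights
```

Double the context and you double this figure, which is why long contexts eat VRAM.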

Context Window

How much text the AI can 'remember' and process at once; directly tied to VRAM through the KV cache.


GGUF

The de facto standard file format for distributing and running quantized LLMs locally via llama.cpp and Ollama.


Ollama

One of the easiest ways to run open-source LLMs locally with a single command; often described as Docker for AI models.


LM Studio

A polished desktop GUI for discovering, downloading, and chatting with local AI models.


TDP (Thermal Design Power)

The power your GPU is designed to sustain under full load; nominally a cooling spec, but a good proxy for draw and critical for PSU selection.


LLM (Large Language Model)

AI models such as Llama, Mistral, and DeepSeek that generate human-like text; the software your GPU runs.


CUDA

NVIDIA's proprietary parallel computing platform; a key reason NVIDIA GPUs dominate AI, since most AI software targets CUDA first.


Inference

Running a trained AI model to generate outputs; what your local GPU does when you chat with an LLM.


ROCm

AMD's open-source answer to CUDA; enables AI inference on Radeon GPUs.


llama.cpp

The open-source C/C++ inference engine that, starting in 2023, made running large models (even 70B) on consumer hardware practical.


Ready to Build?

Now that you speak the language, use our AI Computer Builder to spec out your hardware with precision.