Key components

RAM vs. VRAM

Feature              | RAM (System Memory)                             | VRAM (GPU Memory)
---------------------+-------------------------------------------------+--------------------------------------------
Location             | Motherboard, accessed by CPU                    | On the GPU card
Speed and bandwidth  | Slower, lower bandwidth                         | Much faster, high bandwidth
Primary use in LLMs  | Loading model data, OS and CPU tasks            | Storing model weights, gradients, KV cache
Critical for         | Model loading, data pipelines, background tasks | Model training and inference performance
Size consideration   | Should be at least as large as model size       | Must be sufficient to fit model + context

Without enough VRAM, large models cannot run effectively, while insufficient RAM can bottleneck model loading and undermine system stability. Both types of memory are essential, but VRAM is the key limiting factor for training and running inference on large models.
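To make the "model + context" sizing rule concrete, here is a rough back-of-the-envelope sketch for inference. It assumes a standard transformer whose KV cache stores one key and one value vector per layer, per KV head, per token. The function names, the example configuration (7 billion parameters, 32 layers, 32 KV heads, head dimension 128, 16-bit values), and the 90% headroom factor are all illustrative assumptions, not figures from any particular model or tool. Activations and framework overhead are ignored, so treat the result as a lower bound.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def weight_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Memory for model weights (2 bytes per parameter for 16-bit formats)."""
    return num_params * bytes_per_param


def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   context_len: int, batch_size: int = 1,
                   bytes_per_value: int = 2) -> int:
    """Memory for the KV cache: one key and one value vector per layer,
    per KV head, per token, per sequence in the batch."""
    return (2 * num_layers * num_kv_heads * head_dim
            * context_len * batch_size * bytes_per_value)


def fits_in_vram(required_bytes: int, vram_bytes: int,
                 headroom: float = 0.9) -> bool:
    """True if the estimate fits within a fraction of total VRAM.
    The 0.9 headroom factor is an assumed safety margin."""
    return required_bytes <= vram_bytes * headroom


if __name__ == "__main__":
    # Hypothetical 7B-parameter configuration, 16-bit weights and cache.
    weights = weight_bytes(7_000_000_000)
    cache = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                           context_len=4096)
    total = weights + cache
    print(f"weights:  {weights / GIB:.2f} GiB")
    print(f"KV cache: {cache / GIB:.2f} GiB")
    print(f"total:    {total / GIB:.2f} GiB")
    for vram_gib in (16, 24):
        print(f"fits in {vram_gib} GiB card:", fits_in_vram(total, vram_gib * GIB))
```

Under these assumptions, the cache grows linearly with both context length and batch size, which is why a model that loads successfully can still run out of VRAM on long prompts.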