Speed depends on how much of the model fits in VRAM, and on whether the model is dense or MoE. The RAM's benefit is more about being able to run the model at all. In any case, a dense Qwen3.6 27b would take up roughly 27-33 GB of memory, plus whatever context size you set.
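As a rough sketch of where those numbers come from: weight memory is just parameter count times bytes per weight (which depends on quantization), and the context adds a KV cache on top. The architecture numbers below (layers, KV heads, head dim) are made-up placeholders, not Qwen's actual config.

```python
def model_weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB. bytes_per_param: 2.0 for fp16,
    ~1.0 for 8-bit, ~0.55 for a 4-bit quant. Ignores runtime overhead."""
    return params_billion * 1e9 * bytes_per_param / 1e9


def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_elem: float = 2.0) -> float:
    """Approximate KV cache size in GB: K and V tensors (hence the 2x)
    per layer, per KV head, per token."""
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_elem / 1e9


# A dense 27B model at different quantizations:
print(model_weight_gb(27, 2.0))   # fp16: 54.0 GB
print(model_weight_gb(27, 1.0))   # 8-bit: 27.0 GB

# KV cache with assumed (hypothetical) architecture values:
print(round(kv_cache_gb(layers=48, kv_heads=8, head_dim=128,
                        context_tokens=32768), 2))
```

The 27-33 GB figure in practice lands somewhere between an 8-bit quant and fp16 once you fold in overhead, which is why the context setting matters: a long context can add several GB of KV cache on its own.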
The upcoming implementation of MTP (multi-token prediction) will increase model size, but in exchange they'll also run faster: roughly a 30% boost for dense models, and a bit less for Mixture of Experts varieties, from the looks of it.
Ollama and llama.cpp allow it too, but it's super slow in my experience.