• SabinStargem@lemmy.today
    4 hours ago

    I am using a 5950X with 128 GB of DDR4-3600 memory. The GPUs are a 3060 and a 4090, totaling 36 GB of VRAM. IMO, being bottlenecked by the CPU is definitely a thing; it just comes third, after the VRAM and RAM considerations.

    With a 35b+3a MoE at Q8 with KV8, I get…

    [11:54:32] CtxLimit:18858/262144, Init:0.18s, Processed:17294 in 7.66s (2259.18T/s), Generated:1564/32768 in 29.01s (53.91T/s), Total:36.85s
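
For anyone who wants to sanity-check those figures, here is a rough Python sketch that parses the log line above and recomputes the throughput from the raw token counts and timings. The regex and the `summarize` helper are hypothetical and written only against this one line, so the field layout is an assumption and may not match other versions of the backend's output.

```python
import re

# The log line quoted above, copied verbatim.
LOG = ("[11:54:32] CtxLimit:18858/262144, Init:0.18s, "
       "Processed:17294 in 7.66s (2259.18T/s), "
       "Generated:1564/32768 in 29.01s (53.91T/s), Total:36.85s")

# Hypothetical pattern, assumed from the single line shown here.
PATTERN = re.compile(
    r"Init:(?P<init>[\d.]+)s, "
    r"Processed:(?P<prompt>\d+) in (?P<prompt_s>[\d.]+)s .*?"
    r"Generated:(?P<gen>\d+)/\d+ in (?P<gen_s>[\d.]+)s .*?"
    r"Total:(?P<total>[\d.]+)s"
)


def summarize(line):
    """Recompute tokens/s and the time split from one log line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("unrecognized log line")
    f = {k: float(v) for k, v in m.groupdict().items()}
    return {
        "prompt_tps": f["prompt"] / f["prompt_s"],   # prompt processing speed
        "gen_tps": f["gen"] / f["gen_s"],            # generation speed
        "phase_sum_s": f["init"] + f["prompt_s"] + f["gen_s"],
        "total_s": f["total"],
        "gen_share": f["gen_s"] / f["total"],        # fraction of time generating
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value:.2f}")
```

The recomputed prompt speed lands slightly below the logged 2259.18T/s, presumably because the logged time is rounded to two decimals. The three phases add up to the logged total, and generation accounts for most of the wall-clock time in this run.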