SuspiciousCarrot78@aussie.zone to Selfhosted@lemmy.world · English · 12 hours ago
Do you host your own AI?
e0qdk@reddthat.com · English · 3 hours ago

If you just pulled the default version of qwen3.5 from ollama's repo, you downloaded a mediocre one that only uses ~6GB.

Check `ollama show qwen3.5` and see if you get something like this in the result:

  Model
    architecture        qwen35
    parameters          9.7B
    context length      262144
    embedding length    4096
    quantization        Q4_K_M

This is the default version I got when I first tried using ollama without any experience. It worked, but it's a heavily quantized, lower-parameter version of the model (i.e. it's pretty dumb) compared to what you can actually run on your hardware.
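For anyone who wants to sanity-check the ~6GB figure, here is a rough Python sketch that pulls the parameter count and quantization out of text shaped like the sample above and estimates the size of the weights. The helper names and the bits-per-weight table are my own assumptions for illustration, not anything ollama provides, and the real output layout may differ between ollama versions.

```python
import re

# Assumed, approximate average bits per weight for a few GGUF quantizations.
# Real files vary by model, so treat these as ballpark values only.
APPROX_BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}


def parse_show_output(text):
    """Extract (parameters in billions, quantization) from `ollama show`-style text.

    Hypothetical helper: it only matches the layout quoted in the comment above.
    """
    params = re.search(r"parameters\s+([\d.]+)B", text)
    quant = re.search(r"quantization\s+(\S+)", text)
    return (
        float(params.group(1)) if params else None,
        quant.group(1) if quant else None,
    )


def estimate_weights_gb(params_billions, quant):
    """Rough weight size in GB: billions of parameters times bytes per parameter."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return params_billions * bits / 8


# Sample text copied from the comment above.
sample = """
Model
  architecture        qwen35
  parameters          9.7B
  context length      262144
  embedding length    4096
  quantization        Q4_K_M
"""

params_b, quant = parse_show_output(sample)
size_gb = estimate_weights_gb(params_b, quant)
print(f"{params_b}B parameters at {quant}: roughly {size_gb:.1f} GB of weights")
```

With the sample values this comes out just under 6 GB, which lines up with the ~6GB download. To try it on your own machine you could capture the output of `ollama show qwen3.5` (for example with `subprocess.run`) and pass that text to `parse_show_output` instead of `sample`. Note that this only estimates the weights; context (KV cache) memory comes on top.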