Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • e0qdk@reddthat.com
    3 hours ago

    If you just pulled the default version of qwen3.5 from ollama’s repo, you downloaded a mediocre one that only uses ~6GB.

    Run ollama show qwen3.5 and see whether the result includes something like this:

      Model
        architecture        qwen35    
        parameters          9.7B      
        context length      262144    
        embedding length    4096      
        quantization        Q4_K_M 
    

    This is the default version I got when I first tried ollama with no prior experience. It worked, but it’s a heavily quantized, lower-parameter version of the model (i.e. it’s pretty dumb) compared to what you can actually run on your hardware.
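As a rough sanity check on the ~6GB figure, the parameter count and quantization shown above can be turned into an approximate weight size. This is only a sketch: parse_show and estimate_gb are hypothetical helper names, the bits-per-weight figure for Q4_K_M is an approximate average I'm assuming (k-quants mix block types, so the real number varies by model), and the estimate covers the weights only, not the context cache or runtime overhead.

```python
import re

# Approximate average bits per weight for a few GGUF quantization types.
# ASSUMPTION: Q4_K_M is a rough average; it varies from model to model.
APPROX_BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5, "F16": 16.0}

# The relevant part of the `ollama show qwen3.5` output quoted in the post.
SAMPLE = """
  Model
    architecture        qwen35
    parameters          9.7B
    context length      262144
    embedding length    4096
    quantization        Q4_K_M
"""

def parse_show(text):
    """Collect 'key   value' rows (separated by 2+ spaces) into a dict."""
    fields = {}
    for line in text.splitlines():
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1].strip()
    return fields

def estimate_gb(parameters, quantization):
    """Rough size of the weights in GB (10^9 bytes)."""
    match = re.fullmatch(r"([\d.]+)([BM])", parameters)
    if match is None:
        raise ValueError(f"unrecognized parameter count: {parameters!r}")
    count = float(match.group(1)) * {"B": 1e9, "M": 1e6}[match.group(2)]
    bits = APPROX_BITS_PER_WEIGHT[quantization]
    return count * bits / 8 / 1e9

if __name__ == "__main__":
    info = parse_show(SAMPLE)
    size = estimate_gb(info["parameters"], info["quantization"])
    print(f"{info['parameters']} at {info['quantization']}: ~{size:.1f} GB of weights")
```

The same helper gives a quick feel for whether a less aggressive quantization of the same model would still fit in your memory, e.g. estimate_gb("9.7B", "Q8_0").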