RAM constraints make running LLMs on phones difficult, as do the more restrictive quantization schemes NPUs require. 1B–8B LLMs are shockingly good when backed with RAG, but still somewhat limited.
It seemed like BitNet would solve all that, but the big model trainers have unfortunately ignored it, or at least haven't told anyone about their experiments with it.
Microsoft are dragging their feet with BitNet for sure, and no one else seems to be cooking. They were supposed to have released 8B and 70B models by now (according to source files in the repo). Here's hoping.
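To see why BitNet is so appealing for phones, here's a rough back-of-envelope sketch of weight-memory footprints. This is my own arithmetic, not from any BitNet paper: it counts weights only (ignoring activations, KV cache, and embedding/output layers, which BitNet keeps at higher precision), and uses log2(3) ≈ 1.58 bits per ternary weight as the theoretical floor.

```python
import math

def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Memory for model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30

PARAMS_8B = 8e9
TERNARY_BITS = math.log2(3)  # {-1, 0, +1} -> ~1.58 bits/weight (information-theoretic floor)

fp16 = weight_memory_gib(PARAMS_8B, 16)           # ~14.9 GiB
int4 = weight_memory_gib(PARAMS_8B, 4)            # ~3.7 GiB
ternary = weight_memory_gib(PARAMS_8B, TERNARY_BITS)  # ~1.5 GiB

print(f"8B fp16:    {fp16:5.1f} GiB")
print(f"8B int4:    {int4:5.1f} GiB")
print(f"8B ternary: {ternary:5.1f} GiB")
```

Under these assumptions, an 8B ternary model's weights would fit comfortably in a phone's RAM where the fp16 version wouldn't, and even a 70B ternary model (~14 GiB of weights) starts to look plausible on high-end devices.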