The future of AI has to be local and self-hosted. Soon enough you’ll have super powerful models that can run on your phone. There’s zero reason to give those horrible businesses any power or control over your data.
No thanks, I’m good
Not to mention the one that I run locally on my GPU is trained on ethically-sourced data without breaking any copyright or data licensing laws, and yet it somehow works BETTER than ChatGPT for coding.
Please enlighten me as to how that would work. Even if you only use open-source code, that would still mean: if it’s under a permissive licence, you would have to give proper attribution (which AI can’t do), and if it’s copyleft, all your code would have to be under the same licence as the original, and you would also have to give proper attribution.
Edit: I just looked your model up; apparently they ensure “ethically sourced training data” by only using publicly available data and “respecting machine-readable opt-outs”, which is not how copyright works.
I agree with you that it needs to be local and self-hosted… I currently have an incredible AI assistant running locally using Qwen3-Coder-Next. It is fast, smart and very capable. However, I could not have gotten it set up as well as I have without the help of Claude Code… and even now, as great as my local model is, it still isn’t to the point that it can handle modifying its own code as well as Claude. The future is local, but to help us get there, a powerful cloud-based AI adds a lot of value.
Thank you for honestly stating that. I am in a similar position myself.
How do you like Qwen3 Next? With only 8GB of VRAM I’m limited in what I can self-host (maybe the Easter bunny will bring me a Strix, lol).
Yeah, some communities on Lemmy don’t like it when you have a nuanced take on something so I’m pleasantly surprised by the upvotes I’ve gotten.
I’m running a Framework Desktop with a Strix Halo and 128GB of RAM, and up until Qwen3 Next I was having a hard time running a useful local LLM, but this model is very fast, smart and capable. I’m currently building a frontend for it to give it some structure and make it a bit autonomous, so it can monitor my systems and network and help keep everything healthy. I’ve also integrated it into my Home Assistant and it does great there as well.
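For anyone curious what that monitoring loop looks like, the core pattern is tiny: gather some stats, hand them to the local model, read back its assessment. Here’s a minimal sketch, assuming an OpenAI-compatible local server on localhost:8080 (the port and model name are placeholders for whatever your stack actually runs):

```python
# Minimal sketch: feed system health data to a local LLM for triage.
# Assumes an OpenAI-compatible server (llama.cpp, LM Studio, etc.)
# listening on localhost:8080; the model name is a placeholder.
import json
import shutil
import urllib.request

def gather_stats():
    # Collect a trivially small health snapshot; extend as needed.
    total, used, free = shutil.disk_usage("/")
    return {"disk_used_pct": round(used / total * 100, 1)}

def ask_local_llm(prompt):
    payload = {
        "model": "local-model",  # placeholder; use whatever your server loaded
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

stats = gather_stats()
print(ask_local_llm(f"Here are my system stats: {stats}. Anything concerning?"))
```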
I’m having difficulty getting off the ground with these. Primarily, I don’t trust the companies or individuals involved. I’m hoping for something open source and local, with a GUI for desktop use and an API for automation.
What model do you use? And in what kind of framework?
Huggingface lists thousands of open source models. Each one has a page telling you what base model it’s based on, what other models are merged into it, what data it’s fine-tuned on, etc.
You can search by number of parameters, you can find quantized versions, you can find datasets to fine-tune your own model on.
I don’t know about GUIs, but I’m sure there are some out there. There are definitely options for APIs too.
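If you’d rather script the search than browse, the official huggingface_hub Python package exposes the same filters. A quick sketch (the search string is just an example):

```python
# Quick sketch: search the Hugging Face Hub from Python.
# Requires: pip install huggingface_hub
from huggingface_hub import HfApi

api = HfApi()
# The search term is just an example; tweak it to taste.
for model in api.list_models(search="qwen gguf", sort="downloads", limit=5):
    print(model.id, "-", model.downloads, "downloads")
```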
Huggingface is an absolutely great resource.
Yeah, more people should know about it. There’s really no reason to pay for API access to these giant 200-billion-parameter commercial models sucking up intense resources in data centers.
A quantized 24-32 billion parameter model works just fine, can be self-hosted, and can be fine-tuned on ethically-sourced datasets to suit your specific purposes. Bonus points for running your home lab on solar power.
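For anyone wanting to try this, here’s roughly what loading a quantized model looks like with llama-cpp-python; the model path is a placeholder for whichever GGUF quant you download:

```python
# Rough sketch: run a quantized GGUF model locally with llama-cpp-python.
# Requires: pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/my-model-Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload every layer to the GPU if it fits
    n_ctx=8192,       # context window; raise or lower to fit your VRAM
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what quantization does."}]
)
print(out["choices"][0]["message"]["content"])
```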
Not only are the commercial models trained on stolen data, but they’re so generalized that they’re basically worthless for any specialized purpose. A 12 billion parameter model with Retrieval-Augmented Generation is far less likely to hallucinate.
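In case anyone hasn’t seen the basic shape of RAG: embed your documents, retrieve the ones closest to the question, and put them in the prompt so the model answers from your data instead of guessing. A minimal sketch using sentence-transformers (the embedding model and the example documents are just illustrative):

```python
# Minimal RAG sketch: retrieve relevant text, then prompt the model with it.
# Requires: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "The backup job runs nightly at 02:00 and writes to the NAS.",
    "Grafana alerts page the admin when disk usage exceeds 90%.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, common choice
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

question = "When do backups run?"
q_vec = embedder.encode([question], normalize_embeddings=True)[0]

# Cosine similarity (vectors are normalized, so a dot product suffices).
best = docs[int(np.argmax(doc_vecs @ q_vec))]

prompt = f"Answer using only this context:\n{best}\n\nQuestion: {question}"
# Feed `prompt` to whatever local model you run; grounding the answer in
# retrieved text is what cuts down the hallucinations.
print(prompt)
```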
R1, last I checked, seemed decent enough for a local model, and customizable, but that was a while ago. Its release temporarily crashed Nvidia’s stock because it showed how smart software design trumps mass spending on cutting-edge hardware.
At the end of the day, it’s all of our data. We should own the means, especially since we built it just by existing on the internet, without consent.
If we wish to do this, it’s crucial that we do everything in our power to dismantle the “profit” structure and investment hype. Sooner or later someone will leak the data, and we will have access to locally run versions we can train ourselves. As long as we don’t allow them to monopolize hardware, we can have both the brain and the body of it run locally.
That’s the only time it will be remotely ethical to use, unless it’s in pursuit of attaining these goals.
Right now you can use a Qwen3-4B fine-tuned model (Jan-v1-4B) with a search tool and get even better results than Perplexity Pro, and this was 6 months ago.
How is it both 6 months ago and right now?
“I used to do drugs. I still do drugs but I used to too” - Mitch Hedberg
Still the same. I wrote a post that explains why they suck: https://lemmy.zip/post/58970686
No need to leak the data, it’s open source. https://arxiv.org/abs/2211.15533
More like a reclamation of data, if anything.
Self-hosting is already an option, go have a look around huggingface
I use the Apertus model in the LM Studio software. It’s all open source:
https://github.com/swiss-ai/apertus-tech-report/blob/main/Apertus_Tech_Report.pdf
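LM Studio also exposes a local OpenAI-compatible server (default port 1234), so you can script against the model. A quick sketch using the openai client package; the model identifier is a placeholder for whatever LM Studio shows you:

```python
# Quick sketch: talk to LM Studio's local OpenAI-compatible server.
# Requires: pip install openai, and the local server enabled in LM Studio.
from openai import OpenAI

# The api_key can be any string; the local server doesn't check it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
resp = client.chat.completions.create(
    model="apertus",  # placeholder; use the identifier LM Studio shows
    messages=[{"role": "user", "content": "What data were you trained on?"}],
)
print(resp.choices[0].message.content)
```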
RAM constraints make running on phones difficult, as do the more restricted quantization schemes NPUs require. 1B-8B LLMs are shockingly good when backed with RAG, but still kind of limited.
It seemed like Bitnet would solve all that, but the big model trainers have ignored it, unfortunately. Or at least not told anyone about their experiments with it.
M$ are dragging their feet with BitNet for sure, and no one else seems to be cooking. They were meant to have released 8B and 70B models by now (according to source files in the repo). Here’s hoping.