There is no magically reliable source of data that will make LLMs consistently accurate, because their underlying design requires some randomization to reflect human conversation.
Dedicated models built for specific purposes, where the terminology is well defined and the design is deterministic, would be far better for actual use. We have had those models for years, just without the pretending-to-be-conversational crap, and they were constantly improving and actually useful.
“because their underlying design requires some randomization to reflect human conversation.”
That’s just false. Although the first step of creating an LLM from scratch is to initialize its weight matrices with random values drawn from a Gaussian distribution, those matrices get overwritten many times over the course of pre-training and fine-tuning, as the parametric weights are finely adjusted to fit the training data.
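To make that concrete, here’s a minimal PyTorch sketch (the toy layer and fake training data are mine, not from any real model): the weights start out as Gaussian noise, and gradient updates overwrite them with values driven entirely by the data.

```python
import torch
import torch.nn as nn

# A single linear layer: its weight matrix starts as Gaussian noise.
layer = nn.Linear(16, 16)
nn.init.normal_(layer.weight, mean=0.0, std=0.02)
initial_weights = layer.weight.detach().clone()

# Stand-ins for a batch of training inputs and targets.
x = torch.randn(8, 16)
target = torch.randn(8, 16)

# Every optimization step adjusts the weights to better fit the data.
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(layer(x), target)
    loss.backward()
    optimizer.step()

# The initial randomness has been overwritten by training.
print(torch.allclose(initial_weights, layer.weight))  # False
```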
During inference, tokens pass through the model’s layers as embedding vectors weighted for relevance. It’s not random at all. It’s non-deterministic, but that’s not the same thing as random.
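A toy sketch of where the randomness actually lives (the logits here are made up): the forward pass is a fixed function of the weights and the input, and any randomness is injected only at the decoding step, if at all.

```python
import torch

# Made-up model output: logits over a 5-token vocabulary. A real forward
# pass computes these deterministically from the weights and the input.
logits = torch.tensor([2.0, 1.0, 0.5, 0.1, -1.0])
probs = torch.softmax(logits, dim=-1)

# Greedy decoding: same logits in, same token out, every time.
greedy_token = torch.argmax(probs).item()

# Sampled decoding: this is the step that injects randomness, not the
# forward pass itself.
sampled_token = torch.multinomial(probs, num_samples=1).item()
```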
If the training data all came from JSTOR or DevDocs or even Wikipedia, it’s going to make much more accurate inferences than if it was trained on Reddit, Quora, and Yahoo Answers.
I’m not defending AI here, but let’s keep our criticisms factual.
Except that if you set the sampling temperature too low, the model has a higher tendency to get stuck in repetition loops and the like. A little bit of actual randomness is important.
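To make the failure mode concrete, here’s a toy sketch with made-up logits: temperature just rescales the logits before the softmax, and as it approaches zero the distribution collapses onto a single token, which is what produces those loops.

```python
import torch

def sample_with_temperature(logits: torch.Tensor, temperature: float) -> int:
    # Lower temperature sharpens the distribution toward the top token.
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

# Two nearly tied candidate tokens and a distant third (made-up numbers).
logits = torch.tensor([2.0, 1.9, 0.5])

# Near-zero temperature: almost all mass on one token (~[1.00, 0.00, 0.00]),
# so the model keeps emitting the same continuation and can loop.
cold_token = sample_with_temperature(logits, 0.01)

# Moderate temperature: close alternatives keep real probability mass
# (~[0.47, 0.43, 0.10]), so the output can vary.
warm_token = sample_with_temperature(logits, 1.0)
```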
That’s just adding noise; it’s not unique to AI. It’s also used in audio and visual design, and even in cryptography.

It’s not unique to AI, no, but no one said it was. My point is that the noise is important to the functioning of the AI, and it makes the output even less deterministic, which in turn makes it poorly suited to automation in critical systems.