



Tokenization is a lossless conversion; you can convert the tokens back to the original text.
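Here's a minimal sketch of that roundtrip with tiktoken (cl100k_base is just one example encoding; other models use others):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Tokens are a lossless conversion."
tokens = enc.encode(text)       # text -> list of integer token IDs
roundtrip = enc.decode(tokens)  # token IDs -> text

assert roundtrip == text        # exact recovery of the original string
print(tokens)
```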


You are mixing two kinds of AI: LLMs and diffusion models.
It's much harder for a diffusion model not to change the rest of the image, because its first step is a lossy compression that turns the picture into a soup of numbers the model can work with.
You will never guess who already did that.
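A rough sketch of that lossy first step, using the Stable Diffusion VAE from the diffusers library (the checkpoint name is just one publicly available example, picked for illustration):

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae.eval()

image = torch.rand(1, 3, 512, 512) * 2 - 1  # fake image scaled to [-1, 1]

with torch.no_grad():
    # 1x4x64x64 latent: ~48x fewer values than the 1x3x512x512 input
    latents = vae.encode(image).latent_dist.mean
    recon = vae.decode(latents).sample

# nonzero: the encode/decode roundtrip loses information
print((recon - image).abs().mean())
```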


Consider that a new power-efficient CPU may end up cheaper overall by consuming less electricity over a few years!
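A back-of-the-envelope calculation; the wattages, usage, and electricity price below are made-up illustrative assumptions, not measurements:

```python
old_watts, new_watts = 95, 35  # average draw under typical load (assumed)
hours_per_day, years = 8, 3
price_per_kwh = 0.30           # EUR/kWh, hypothetical

saved_kwh = (old_watts - new_watts) / 1000 * hours_per_day * 365 * years
print(f"Savings over {years} years: {saved_kwh * price_per_kwh:.2f} EUR")
# ~158 EUR with these numbers: enough to offset part of the new CPU's price
```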
To affirm or refute such a claim, we'd first need to define "understanding".
The example you gave can be explained in other ways than "it doesn't understand".
Take "how many 'r's in strawberry": LLMs see tokens, and the datasets they're trained on don't contain much data about which letters are present in each token.
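A quick illustration with tiktoken (the exact split depends on the tokenizer):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# The model receives subword token IDs, not the ten individual letters,
# so counting 'r's requires knowledge the IDs don't expose directly.
for token_id in enc.encode("strawberry"):
    print(token_id, repr(enc.decode([token_id])))
```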