• jj4211@lemmy.world
    2 days ago

    One thing that is enlightening is why the seahorse LLM confusion happens.

    The model first has one thing to predict: can it produce the specified emoji, yes or no? Well, some reddit thread swore there was a seahorse emoji (among others), so it decided “yes”, and then easily predicted the next words to be “here it is:”. At that point, and not an instant before, it actually tries to generate the indicated emoji, and here, and only here, it fails to find something of sufficient confidence. But the preceding words demand an emoji, so it generates the wrong emoji. Then, seeing that the previous token wasn’t a match, it generates a sequence of words to try again and again…

    It has no idea what it is building toward; it is building results one token at a time. It is wild how well that works, but it frequently lands in territory where the previously generated tokens have backed it into a corner and the best fit for the subsequent tokens is garbage.
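
    To make that concrete, here is a rough toy sketch of picking the single most likely next token at each step. Everything in it is made up for illustration: the `TOY_NEXT` table, its probabilities, and the `:tropical_fish:` style shortcodes standing in for emoji are all assumptions, and a real LLM conditions on the whole context rather than just the previous token. It only shows how committing to "Yes, here it is:" forces an emoji pick even when no good candidate exists.

```python
# Hypothetical toy "model": for each previous token, invented probabilities
# for the next token. None of these numbers come from a real LLM.
TOY_NEXT = {
    "<start>": {"Yes,": 0.7, "No,": 0.3},
    "Yes,": {"here": 1.0},
    "here": {"it": 1.0},
    "it": {"is:": 1.0},
    # There is no seahorse entry, so the probability is spread thinly
    # over near misses and the top pick is a weak guess.
    "is:": {":tropical_fish:": 0.4, ":horse:": 0.35, ":shrimp:": 0.25},
    # After a wrong emoji, the most likely continuation is a retry.
    ":tropical_fish:": {"wait,": 0.8, "<end>": 0.2},
    "wait,": {"here": 1.0},
    "No,": {"<end>": 1.0},
}


def greedy_decode(max_tokens=12):
    """Pick the highest-probability next token, one token at a time."""
    tokens = []
    prev = "<start>"
    for _ in range(max_tokens):
        candidates = TOY_NEXT[prev]
        prev = max(candidates, key=candidates.get)
        if prev == "<end>":
            break
        tokens.append(prev)
    return tokens


if __name__ == "__main__":
    print(" ".join(greedy_decode()))
```

    In this toy, the "yes" is locked in at the first step, long before the emoji itself is attempted, and the retry loop only stops because of the `max_tokens` cap.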