A screenshot of this question was making the rounds last week, but this article covers testing against all the well-known models out there.

It also includes outtakes on the ‘reasoning’ models.

  • Snot Flickerman@lemmy.blahaj.zone · 7 hours ago

    Part of a properly functioning LLM is absolutely its ability to understand implicit instructions. That’s a huge aspect of data annotation work in helping LLMs become better tools: grading them on whether they understand implicit instructions or fail to. I would say more than half of the work I have done in that arena has focused on training them to understand implicit instructions more clearly.

    So sure, if you explain it like the LLM is a five-year-old human, you’ll get a better response. But the whole point is that if we’re dumping so much money and so many resources into these tools, destroying the environment and the consumer-electronics market along the way, you shouldn’t have to explain it like it’s five.

    Seriously, what is the point of trashing the planet for this shit if you have to talk to it like it’s the most oblivious person alive and practically hold its hand for it to understand implicit concepts?