Advanced AI models suffer a near-total collapse on classic psychology test as cognitive demands increase

sanitation@lemmy.today · 1 day ago

Advanced AI models suffer a near-total collapse on classic psychology test as cognitive demands increase

expr@programming.dev · 23 hours ago

That’s not lying. There’s nothing linguistic about numerical computation.

Communist@lemmy.frozeninferno.xyz · 16 hours ago

No.

https://www.nature.com/articles/d41586-025-02343-x

It’s lying

zbyte64@awful.systems · 13 hours ago

You know the “DeepMind and OpenAi models” is the hint that the LLM model is not the one doing the math. The LLM provides a hypothesis and the DeepMind model provides grounding or feedback on whether the hypothesis even makes sense or works.

Communist@lemmy.frozeninferno.xyz · 5 hours ago

It is totally irrelevant that the model calls tools to do the math. That is still a success.

Advanced AI models suffer a near-total collapse on classic psychology test as cognitive demands increase

Advanced AI models suffer a near-total collapse on classic psychology test as cognitive demands increase

Just a moment...