This week we discuss The Illusion of Thinking, a new paper from researchers at Apple that challenges today’s evaluation methods and introduces a new benchmark: synthetic puzzles with controllable complexity and clean logic. Their findings? Large Reasoning Models (LRMs) show surprising failure modes, including a complete collapse on high-complexity tasks and a decline in reasoning effort as problems get harder. Dylan and Parth dive into the paper's findings as well as the debate around ...

Self-Adapting Language Models: Paper Authors Discuss Implications

Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.