Large language models are increasingly used to turn complex study output into plain-English summaries. But how do we know which models are safest and most reliable for healthcare?  In this most recent community AI research paper reading, Arjun Mukerji, PhD – Staff Data Scientist at Atropos Health – walks us through RWESummary, a new benchmark designed to evaluate LLMs on summarizing real-world evidence from structured study output — an important but often under-tested scenario compared t...

Georgia Tech's Santosh Vempala Explains Why Language Models Hallucinate, His Research With OpenAI

Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.&nbsp;