Best AI papers explained

A podcast by Enoch H. Kang

550 Episodes

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Published: 7/22/2025
The Invisible Leash: Why RLVR May Not Escape Its Origin
Published: 7/20/2025
Language Model Personalization via Reward Factorization
Published: 7/20/2025
Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions
Published: 7/18/2025
Do We Need to Verify Step by Step? Rethinking Process Supervision from a Theoretical Perspective
Published: 7/17/2025
Soft Best-of-n Sampling for Model Alignment
Published: 7/16/2025
On Temporal Credit Assignment and Data-Efficient Reinforcement Learning
Published: 7/15/2025
Bradley–Terry and Multi-Objective Reward Modeling Are Complementary
Published: 7/15/2025
Probing Foundation Models for World Models
Published: 7/15/2025
GenAI-Powered Statistical Inference (with Unstructured Data)
Published: 7/14/2025
Interpretable Reward Modeling with Active Concept Bottlenecks
Published: 7/14/2025
PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications
Published: 7/14/2025
A Collectivist, Economic Perspective on AI
Published: 7/14/2025
Textual Bayes: Quantifying Uncertainty in LLM-Based Systems
Published: 7/12/2025
The Winner's Curse in Data-Driven Decisions
Published: 7/11/2025
SPIRAL: Self-Play for Reasoning Through Zero-Sum Games
Published: 7/11/2025
Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence
Published: 7/11/2025
Aligning Learning and Endogenous Decision-Making
Published: 7/11/2025
Reliable Statistical Inference with Synthetic Data from Large Language Models
Published: 7/11/2025
Multi-Turn Reinforcement Learning from Human Preference Feedback
Published: 7/10/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
