546 Episodes

  1. Natural emergent misalignment from reward hacking in production RL

    Published: 11/25/2025
  2. Evolution Strategies at the Hyperscale

    Published: 11/25/2025
  3. The Path Not Taken: RLVR Provably Learns Off the Principals

    Published: 11/23/2025
  4. Back to Basics: Let Denoising Generative Models Denoise

    Published: 11/23/2025
  5. LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization

    Published: 11/22/2025
  6. Black-Box On-Policy Distillation of Large Language Models

    Published: 11/20/2025
  7. Solving a million-step LLM task with zero errors

    Published: 11/20/2025
  8. Not All Thoughts Matter: Selective Attention for Efficient Reasoning

    Published: 11/19/2025
  9. Sample-Efficient Parametric Learning from Natural Language

    Published: 11/19/2025
  10. Bayesian Optimization in Language Space: An Eval-Efficient AI Self-Improvement Framework

    Published: 11/18/2025
  11. Context Engineering: Sessions, Memory

    Published: 11/16/2025
  12. The Era of Agentic Organization: Learning to Organize with Language Models

    Published: 11/15/2025
  13. Understanding neural networks through sparse circuits

    Published: 11/14/2025
  14. Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

    Published: 11/14/2025
  15. Multi-Agent Evolve: LLM Self-Improvement Through Co-Evolution

    Published: 11/14/2025
  16. LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics

    Published: 11/14/2025
  17. PREFDISCO: Evaluating Proactive Personalization through Interactive Preference Discovery

    Published: 11/12/2025
  18. Reusing pre-training data at test time is a compute multiplier

    Published: 11/10/2025
  19. Scaling Agent Learning via Experience Synthesis

    Published: 11/9/2025
  20. Continuous Autoregressive Language Models

    Published: 11/8/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.