427 Episodes

  1. Reward Shaping from Confounded Offline Data

    Published: 5/25/2025
  2. Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning

    Published: 5/25/2025
  3. Understanding Best-of-N Language Model Alignment

    Published: 5/25/2025
  4. Maximizing Acquisition Functions for Bayesian Optimization - and its relation to Gradient Descent

    Published: 5/24/2025
  5. Bayesian Prompt Ensembles: Model Uncertainty Estimation for Black-Box Large Language Models

    Published: 5/24/2025
  6. Prompting Strategies for Enabling Large Language Models to Infer Causation from Correlation

    Published: 5/24/2025
  7. The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

    Published: 5/24/2025
  8. FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch

    Published: 5/24/2025
  9. Automated Social Science: A Structural Causal Model-Based Approach

    Published: 5/24/2025
  10. Causal Interpretation of Transformer Self-Attention

    Published: 5/24/2025
  11. A Causal World Model Underlying Next Token Prediction: Exploring GPT in a Controlled Environment

    Published: 5/24/2025
  12. Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs

    Published: 5/24/2025
  13. Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation

    Published: 5/24/2025
  14. Prompts from Reinforcement Learning (PRL)

    Published: 5/24/2025
  15. Logits are All We Need to Adapt Closed Models

    Published: 5/24/2025
  16. Large Language Models Are (Bayesian) Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning

    Published: 5/23/2025
  17. Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

    Published: 5/23/2025
  18. From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

    Published: 5/23/2025
  19. LLM In-Context Learning as Kernel Regression

    Published: 5/23/2025
  20. Personalizing LLMs via Decode-Time Human Preference Optimization

    Published: 5/23/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.