What’s the Magic Word? A Control Theory of LLM Prompting

Best AI papers explained - A podcast by Enoch H. Kang

This paper investigates the mathematical basis of large language model (LLM) prompting by framing LLMs as discrete stochastic dynamical systems and analyzing them with tools from control theory. The authors formalize the notion of an LLM system and define controllability and reachability in this setting: roughly, which output tokens some prompt can induce the model to produce. They present a Self-Attention Control Theorem that upper-bounds how far control inputs can steer a self-attention layer's outputs, expressed in terms of the singular values of its parameter matrices. Empirical results demonstrate that short prompts can substantially alter LLM output likelihoods, even making initially low-probability tokens highly probable. The work highlights the powerful and often poorly understood role of input sequences in steering LLM behavior and offers a foundation for more principled control of LLM systems.
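
As a rough sketch of the control-theoretic framing (the symbols below are chosen for illustration and are not necessarily the authors' exact notation): treat the model as a map from a control prompt $u$ prepended to an imposed state sequence $x_0$, and call a token reachable if some bounded-length prompt makes it the model's top output:

\[
\mathcal{R}_k(x_0) \;=\; \Bigl\{\, y \in \mathcal{V} \;:\; \exists\, u \in \mathcal{V}^{\le k} \ \text{such that}\ y = \arg\max_{y'}\, P_{\mathrm{LM}}\bigl(y' \mid u \oplus x_0\bigr) \,\Bigr\},
\]

where $\mathcal{V}$ is the token vocabulary, $\mathcal{V}^{\le k}$ is the set of prompts of length at most $k$, and $\oplus$ denotes concatenation. Controllability then asks how large $\mathcal{R}_k(x_0)$ is for small $k$.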
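
The singular-value limit can be motivated by a minimal single-head calculation (a sketch for intuition only; the paper's Self-Attention Control Theorem is a more precise statement and is not reproduced here). The attention output at a fixed position $i$ is a convex combination of value vectors, so the contribution of the control-token positions $U$ is bounded by the attention mass they capture and the largest singular value of the value matrix $W_V$:

\[
\xi_i \;=\; \sum_{j} A_{ij}\, W_V x_j, \qquad A_{ij} \ge 0, \quad \sum_j A_{ij} = 1,
\]
\[
\Bigl\|\sum_{j \in U} A_{ij}\, W_V x_j\Bigr\| \;\le\; \Bigl(\sum_{j \in U} A_{ij}\Bigr)\, \sigma_{\max}(W_V)\, \max_{j \in U} \|x_j\|.
\]

However many control tokens a prompt adds, their influence on a fixed position's output cannot exceed what the spectra of the parameter matrices allow.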