About me
I am a lecturer in Artificial Intelligence at Queen Mary University of London. Before that, I was a postdoctoral researcher at the Swiss AI Lab IDSIA, working on reinforcement learning under the supervision of Jürgen Schmidhuber.
I believe that intelligence should be defined as a measure of the ability of an agent to achieve goals in a wide range of environments (Legg and Hutter, 2007), which makes reinforcement learning an excellent framework to study many challenges that intelligent agents are bound to face.
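Legg and Hutter make this definition precise as a universal intelligence measure, which averages an agent's expected return across all computable environments, weighted toward simpler ones. A sketch of their measure:

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)}\, V_{\mu}^{\pi}
```

Here $\pi$ is the agent's policy, $E$ is the set of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V_{\mu}^{\pi}$ is the expected return of $\pi$ in $\mu$. Simpler environments thus contribute more to an agent's measured intelligence.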
I am currently interested in unlocking the potential of formalization to accelerate the development of machine learning theory.
Formalization is the process of translating mathematical statements and their proofs into a formal language so that their correctness can be verified algorithmically. For this purpose, the mathematical community has largely adopted the open-source programming language and proof assistant Lean, whose mathematical library, mathlib, contains more than a million lines of code. Several organizations are also developing reliable problem solvers that combine general-purpose large language models with Lean. As these systems improve and become more widely available, they may support the development of provably safe artificial intelligence.
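As a minimal illustration of what formalization looks like in practice, here is a statement and its proof in Lean 4. The proof term is checked mechanically by the Lean kernel; a wrong proof would simply fail to compile.

```lean
-- Formalized statement: addition of natural numbers is commutative.
-- The proof appeals to a lemma from Lean's standard library,
-- and the kernel verifies that it establishes exactly this statement.
example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Libraries such as mathlib are built from many thousands of such machine-checked statements, which is what allows their correctness to be trusted at scale.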
PhD students
- Michelangelo Conserva (now at Google Research).
- Remo Sasso (now at Amazon).
If you would like to work under my supervision, please send me a message with your curriculum vitae and a brief description of your goals after reading this.
Selected papers
- R. Sasso, M. Conserva, D. Jeurissen, P. Rauber. "Exploration with Foundation Models: Capabilities, Limitations, and Hybrid Approaches", 2025.
- M. Conserva, R. Sasso, P. Rauber. "On the Limits of Tabular Hardness Metrics for Deep RL: A Study with the Pharos Benchmark", 2025.
- R. Sasso, M. Conserva, D. Jeurissen, P. Rauber. "Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds", 2025.
- R. Sasso, M. Conserva, P. Rauber. "Posterior Sampling for Deep Reinforcement Learning", International Conference on Machine Learning (ICML), 2023.
- M. Conserva, P. Rauber. "Hardness in Markov Decision Processes: Theory and Practice", Conference on Neural Information Processing Systems (NeurIPS), 2022.
- A. Ramesh*, P. Rauber*, M. Conserva, J. Schmidhuber. "Recurrent Neural-Linear Posterior Sampling for Non-Stationary Contextual Bandits", Neural Computation, 2022.
- P. Rauber, A. Ummadisingu, F. Mutz, J. Schmidhuber. "Hindsight Policy Gradients", International Conference on Learning Representations (ICLR), 2019.
More work is available here.