Download my Curriculum Vitae.
Publications
International Conferences
[C1] Cristiano Migali, Marco Mussi, Gianmarco Genalti and Alberto Maria Metelli. Tightening Regret Lower and Upper Bounds in Restless Rising Bandits. Advances in Neural Information Processing Systems (NeurIPS). 2025.
[Link - To Appear]
[Paper - To Appear]
[Poster - To Appear]
[Slides - To Appear]
[C2] Simone Drago*, Marco Mussi* and Alberto Maria Metelli. Sleeping Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear]
[Paper]
[Poster]
[C3] Simone Drago, Marco Mussi and Alberto Maria Metelli. Towards Theoretical Understanding of Sequential Decision Making with Preference Feedback. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear]
[Paper]
[Poster]
[C4] Alessandro Montenegro, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Convergence Analysis of Policy Gradient Methods with Dynamic Stochasticity. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear]
[Paper]
[Poster]
[C5] Simone Drago, Marco Mussi and Alberto Maria Metelli. Position: Constants are Critical in Regret Bounds for Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear]
[Paper]
[Poster]
[C6] Alessandro Montenegro, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning. Advances in Neural Information Processing Systems (NeurIPS). 2024.
[Link]
[Paper]
[arXiv]
[Poster]
[Slides]
[C7] Marco Mussi*, Simone Drago*, Marcello Restelli and Alberto Maria Metelli. Factored-Reward Bandits with Intermediate Observations. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link]
[Paper]
[Poster]
[Slides]
[C8] Marco Mussi, Alessandro Montenegro, Francesco Trovò, Marcello Restelli and Alberto Maria Metelli. Best Arm Identification for Stochastic Rising Bandits. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link]
[Paper]
[arXiv]
[Poster]
[C9] Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli and Matteo Papini. Learning Optimal Deterministic Policies with Stochastic Policy Gradients. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link]
[Paper]
[arXiv]
[Poster]
[C10] Gianmarco Genalti, Marco Mussi, Nicola Gatti, Marcello Restelli, Matteo Castiglioni and Alberto Maria Metelli. Graph-Triggered Rising Bandits. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link]
[Paper]
[Poster]
[C11] Francesco Bacchiocchi*, Gianmarco Genalti*, Davide Maran*, Marco Mussi*, Marcello Restelli, Nicola Gatti and Alberto Maria Metelli. Autoregressive Bandits. Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS). 2024.
[Link]
[Paper]
[arXiv]
[Poster]
[Slides]
[C12] Marco Mussi, Alberto Maria Metelli and Marcello Restelli. Dynamical Linear Bandits. Proceedings of the 40th International Conference on Machine Learning (ICML). 2023.
[Link]
[Paper]
[arXiv]
[Poster]
[Slides]
[C13] Marco Mussi*, Gianmarco Genalti*, Alessandro Nuara, Francesco Trovò, Marcello Restelli and Nicola Gatti. Dynamic Pricing with Volume Discounts in Online Settings. Proceedings of the 35th Conference on Innovative Applications of Artificial Intelligence (IAAI). 2023. AAAI. Innovative Application of AI Award.
[Link]
[Paper]
[arXiv]
[Poster]
[Slides]
[Award]
[C14] Marco Mussi, Gianmarco Genalti, Francesco Trovò, Alessandro Nuara, Nicola Gatti and Marcello Restelli. Pricing the Long Tail by Explainable Product Aggregation and Monotonic Bandits. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022.
[Link]
[Paper]
[Poster]
[Slides]
Journals
[J1] Marco Mussi, Alberto Maria Metelli, Marcello Restelli, Gianvito Losapio, Ricardo Jorge Bessa, Daniel Boos, Clark Borst, Giulia Leto, Alberto Castagna, Ricardo Chavarriaga, Duarte Dias, Adrian Egli, Andrina Eisenegger, Yassine El Manyari, Anton Fuxjäger, Joaquim Geraldes, Samira Hamouche, Mohamed Hassouna, Bruno Lemetayer, Milad Leyli-Abadi, Roman Liessner, Jonas Lundberg, Antoine Marot, Maroua Meddeb, Viola Schiaffonati, Manuel Schneider, Thilo Stadelmann, Julia Usher, Herke van Hoof, Jan Viebahn, Toni Waefler and Giacomo Zanotti. Human-AI Interaction in Safety-Critical Network Infrastructures. iScience. 2025.
[Link]
[Paper]
[J2] Marco Mussi*, Simone Drago*, Marcello Restelli and Alberto Maria Metelli. Factored-Reward Bandits with Intermediate Observations: Regret Minimization and Best Arm Identification. Artificial Intelligence. 2025.
[Link]
[Paper]
[J3] Marco Mussi and Alberto Maria Metelli. Generalizing the Regret: an Analysis of Lower and Upper Bounds. Journal of Artificial Intelligence Research. 2025.
[Link]
[Paper]
[J4] Marco Mussi, Luigi Pellegrino, Oscar Francesco Pindaro, Marcello Restelli and Francesco Trovò. A Reinforcement Learning Controller Optimizing Costs and Battery State of Health in Smart Grids. Journal of Energy Storage. 2024.
[Link]
[Paper]
[J5] Marco Mussi, Davide Lombarda, Alberto Maria Metelli, Francesco Trovò and Marcello Restelli. ARLO: A Framework for Automated Reinforcement Learning. Expert Systems with Applications. 2023.
[Link]
[Paper]
[arXiv]
[J6] Marco Mussi, Luigi Pellegrino, Marcello Restelli and Francesco Trovò. An Online State of Health Estimation Method for Lithium-Ion Batteries based on Time Partitioning and Data-Driven Model Identification. Journal of Energy Storage. 2022.
[Link]
[Paper]
[J7] Marco Mussi, Luigi Pellegrino, Marcello Restelli and Francesco Trovò. A voltage dynamic-based state of charge estimation method for batteries storage systems. Journal of Energy Storage. 2021.
[Link]
[Paper]
Workshops
[W1] Federico Corso, Marco Mussi and Alberto Maria Metelli. Trading-off Reward Maximization and Stability in Sequential Decision Making. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link]
[Paper]
[Poster]
[W2] Simone Drago, Marco Mussi and Alberto Maria Metelli. A Theoretical Perspective on Sequential Decision Making with Preference Feedback. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link]
[Paper]
[Poster]
[W3] Alberto Maria Metelli, Simone Drago and Marco Mussi. A Novel Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds for Generalized Kernelized Bandits. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link]
[Paper]
[Poster]
[W4] Davide Salaorni, Vincenzo De Paola, Samuele Delpero, Giovanni Dispoto, Paolo Bonetti, Alessio Russo, Giuseppe Calcagno, Francesco Trovò, Matteo Papini, Alberto Maria Metelli, Marco Mussi and Marcello Restelli. Gym4ReaL: A Benchmark Suite for Evaluating Reinforcement Learning in Realistic Domains. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link]
[Paper]
[Poster]
[W5] Carlo Fabrizio*, Gianvito Losapio*, Marco Mussi, Alberto Maria Metelli and Marcello Restelli. Power Grid Control with Graph-Based Distributed Reinforcement Learning. Workshop on Machine Learning for Sustainable Power Systems at the European Conference on Machine Learning (ECML). 2025.
[Paper]
[arXiv]
[Poster]
[W6] Gianvito Losapio, Davide Beretta, Marco Mussi, Alberto Maria Metelli and Marcello Restelli. State and Action Factorization in Power Grids. Workshop on Machine Learning for Sustainable Power Systems at the European Conference on Machine Learning (ECML). 2024.
[Paper]
[arXiv]
[Poster]
[Slides]
[W7] Simone Drago and Marco Mussi. Open Problem: Tight Bounds for Bernoulli Rewards in Kernelized Multi-Armed Bandits. Workshop on Aligning Reinforcement Learning Experimentalists and Theorists at the International Conference on Machine Learning (ICML). 2024.
[Link]
[Paper]
[Poster]
[W8] Simone Drago, Marco Mussi, Marcello Restelli and Alberto Maria Metelli. Intermediate Observations in Factored-Reward Bandits. Adaptive and Learning Agents Workshop at the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS). 2024.
[Link]
[Paper]
[Slides]
[W9] Francesco Bacchiocchi*, Gianmarco Genalti*, Davide Maran*, Marco Mussi*, Marcello Restelli, Nicola Gatti and Alberto Maria Metelli. Online Learning in Autoregressive Dynamics. European Workshop on Reinforcement Learning (EWRL). 2023.
[Link]
[Paper]
[Poster]
[W10] Alessandro Montenegro, Marco Mussi, Francesco Trovò, Marcello Restelli and Alberto Maria Metelli. Stochastic Rising Bandits: A Best Arm Identification Approach. European Workshop on Reinforcement Learning (EWRL). 2023.
[Link]
[Paper]
[Poster]
[W11] Alessandro Montenegro, Marco Mussi, Francesco Trovò, Marcello Restelli and Alberto Maria Metelli. A Best Arm Identification Approach for Stochastic Rising Bandits. Workshop on New Frontiers in Learning, Control, and Dynamical Systems at International Conference on Machine Learning (ICML). 2023.
[Link]
[Paper]
[Poster]
[W12] Gianmarco Genalti, Marco Mussi, Alessandro Nuara and Nicola Gatti. Dynamic Pricing with Online Data Aggregation and Learning. European Workshop on Reinforcement Learning (EWRL). 2022.
[Link]
[Paper]
[Poster]
[Slides]
[W13] Marco Mussi, Alberto Maria Metelli and Marcello Restelli. Dynamical Linear Bandits for Long-Lasting Vanishing Rewards. Complex Feedback in Online Learning Workshop at International Conference on Machine Learning (ICML). 2022.
[Link]
[Paper]
[Poster]
Book Chapters
[B1] Marco Mussi. Multi-Armed Bandits Algorithms for Pricing and Advertising. Special Topics in Information Technology. 2025.
[Link - To Appear]
[Paper]
Preprints
[P1] Alberto Maria Metelli, Simone Drago and Marco Mussi. Generalized Kernelized Bandits: Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds. arXiv preprint, arXiv:2508.01681. 2025.
[Paper]
[arXiv]
[P2] Davide Salaorni, Vincenzo De Paola, Samuele Delpero, Giovanni Dispoto, Paolo Bonetti, Alessio Russo, Giuseppe Calcagno, Francesco Trovò, Matteo Papini, Alberto Maria Metelli, Marco Mussi and Marcello Restelli. Gym4ReaL: A Suite for Benchmarking Real-World Reinforcement Learning. arXiv preprint, arXiv:2507.00257. 2025.
[Paper]
[arXiv]
[P3] Alessandro Montenegro, Federico Mansutti, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Reusing Trajectories in Policy Gradients Enables Fast Convergence. arXiv preprint, arXiv:2506.06178. 2025.
[Paper]
[arXiv]
[P4] Alessandro Montenegro, Leonardo Cesani, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes. arXiv preprint, arXiv:2506.05953. 2025.
[Paper]
[arXiv]
[P5] Simone Drago, Marco Mussi and Alberto Maria Metelli. A Refined Analysis of UCBVI. arXiv preprint, arXiv:2502.17370. 2025.
[Paper]
[arXiv]
[P6] Gianmarco Genalti, Marco Mussi, Nicola Gatti, Marcello Restelli, Matteo Castiglioni and Alberto Maria Metelli. Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting. arXiv preprint, arXiv:2409.05980. 2024.
[Paper]
[arXiv]
[P7] Marco Mussi, Simone Drago and Alberto Maria Metelli. Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards. arXiv preprint, arXiv:2407.06321. 2024.
[Paper]
[arXiv]
Technical Reports
[T1] Marco Mussi, Gianvito Losapio, Alberto Maria Metelli, Marcello Restelli, Ricardo Bessa, Antoine Marot, Daniel Boos, Clark Borst, Alberto Castagna, Duarte Dias, Adrian Egli, Andrina Eisenegger, Yassine El Manyari, Anton Fuxjäger, Samira Hamouche, Mohamed Hassouna, Bruno Lemetayer, Roman Liessner, Jonas Lundberg, Manuel Schneider, Irene Sturm, Julia Usher, Herke van Hoof, Jan Viebahn and Toni Wäfler. Position paper on AI for the operation of critical energy and mobility network infrastructures. AI4REALNET. 2024.
[Link]
[Report]
Education
Ph.D. in Information Technology - Politecnico di Milano (Nov 2020 - Jun 2024)
Focus on Reinforcement Learning and Online Learning.
Supervisor: Prof. Marcello Restelli
[Link]
[Thesis]
[Slides]
M.Sc. in Computer Science and Engineering - Politecnico di Milano (Sep 2017 - Dec 2019)
Main focus: Artificial Intelligence and Machine Learning
Relevant coursework: Machine Learning, Artificial Intelligence, Game Theory, Autonomous Agents and Multi-agent Systems, Foundations of Operational Research, Software Engineering, Principles of Programming Languages, Data Bases II
B.Sc. in Engineering of Computing Systems - Politecnico di Milano (Sep 2014 - Jul 2017)
Relevant coursework: Software Engineering, Theoretical Computer Science, Communication Networks and Internet, Information Systems, Data Bases I, Computer Architecture and Operating Systems, Automatic Control, Calculus I, Calculus II, Linear Algebra and Geometry, Logics and Algebra, Statistics and Probability, Physics, Applied Physics
High School Diploma in Computer Science - IIS Galileo Galilei Crema (Sep 2008 - Jul 2014)
Main focus: C, Java, HTML, CSS, JavaScript
Experience
Postdoctoral Researcher - Politecnico di Milano (Jun 2024 - now)
Supervisor: Prof. Marcello Restelli
Research Scientist - ML cube (Nov 2020 - Jun 2024)
Goal: develop algorithms for dynamic pricing and advertising optimization
Research Assistant - Politecnico di Milano (Jan 2020 - Oct 2020)
Supervisor: Prof. Marcello Restelli
Contacts
Email
marco DOT mussi AT polimi DOT it
Office
Office 19, First Floor of Building 21
Dipartimento di Elettronica, Informazione e Bioingegneria
Politecnico di Milano
Via Ponzio 34/5, Milan, 20133, Italy