Marco Mussi

I am a Postdoctoral Researcher with the Dipartimento di Elettronica, Informazione e Bioingegneria, in the Artificial Intelligence and Robotics Laboratory of Politecnico di Milano. I received the Doctor of Philosophy in Information Technology (with honors) from Politecnico di Milano in June 2024, supervised by Professor Marcello Restelli. My research centers on artificial intelligence and machine learning, with a focus on online learning and reinforcement learning.

Download my Curriculum Vitae.

Publications

International Conferences

[C1] Simone Drago*, Marco Mussi* and Alberto Maria Metelli. Sleeping Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear] [Paper] [Poster]

[C2] Simone Drago, Marco Mussi and Alberto Maria Metelli. Towards Theoretical Understanding of Sequential Decision Making with Preference Feedback. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear] [Paper] [Poster]

[C3] Alessandro Montenegro, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Convergence Analysis of Policy Gradient Methods with Dynamic Stochasticity. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear] [Paper] [Poster]

[C4] Simone Drago, Marco Mussi and Alberto Maria Metelli. Position: Constants are Critical in Regret Bounds for Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning (ICML). 2025.
[Link - To Appear] [Paper] [Poster]

[C5] Alessandro Montenegro, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning. Advances in Neural Information Processing Systems (NeurIPS). 2024.
[Link] [Paper] [arXiv] [Poster] [Slides]

[C6] Marco Mussi*, Simone Drago*, Marcello Restelli and Alberto Maria Metelli. Factored-Reward Bandits with Intermediate Observations. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link] [Paper] [Poster] [Slides]

[C7] Marco Mussi, Alessandro Montenegro, Francesco Trovò, Marcello Restelli and Alberto Maria Metelli. Best Arm Identification for Stochastic Rising Bandits. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link] [Paper] [arXiv] [Poster]

[C8] Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli and Matteo Papini. Learning Optimal Deterministic Policies with Stochastic Policy Gradients. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link] [Paper] [arXiv] [Poster]

[C9] Gianmarco Genalti, Marco Mussi, Nicola Gatti, Marcello Restelli, Matteo Castiglioni and Alberto Maria Metelli. Graph-Triggered Rising Bandits. Proceedings of the 41st International Conference on Machine Learning (ICML). 2024.
[Link] [Paper] [Poster]

[C10] Francesco Bacchiocchi*, Gianmarco Genalti*, Davide Maran*, Marco Mussi*, Marcello Restelli, Nicola Gatti and Alberto Maria Metelli. Autoregressive Bandits. Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS). 2024.
[Link] [Paper] [arXiv] [Poster] [Slides]

[C11] Marco Mussi, Alberto Maria Metelli and Marcello Restelli. Dynamical Linear Bandits. Proceedings of the 40th International Conference on Machine Learning (ICML). 2023.
[Link] [Paper] [arXiv] [Poster] [Slides]

[C12] Marco Mussi*, Gianmarco Genalti*, Alessandro Nuara, Francesco Trovò, Marcello Restelli and Nicola Gatti. Dynamic Pricing with Volume Discounts in Online Settings. Proceedings of the Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence (IAAI). 2023. AAAI. Innovative Application of AI Award.
[Link] [Paper] [arXiv] [Poster] [Slides] [Award]

[C13] Marco Mussi, Gianmarco Genalti, Francesco Trovò, Alessandro Nuara, Nicola Gatti and Marcello Restelli. Pricing the Long Tail by Explainable Product Aggregation and Monotonic Bandits. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022.
[Link] [Paper] [Poster] [Slides]

Journals

[J1] Marco Mussi, Alberto Maria Metelli, Marcello Restelli, Gianvito Losapio, Ricardo Jorge Bessa, Daniel Boos, Clark Borst, Alberto Castagna, Ricardo Chavarriaga, Duarte Dias, Adrian Egli, Andrina Eisenegger, Yassine El Manyari, Anton Fuxjäger, Joaquim Geraldes, Samira Hamouche, Mohamed Hassouna, Bruno Lemetayer, Milad Leyli-Abadi, Roman Liessner, Jonas Lundberg, Antoine Marot, Maroua Meddeb, Viola Schiaffonati, Manuel Schneider, Thilo Stadelmann, Julia Usher, Herke van Hoof, Jan Viebahn, Toni Waefler and Giacomo Zanotti. Human-AI Interaction in Safety-Critical Network Infrastructures. iScience. 2025.
[Link - To Appear] [Paper - To Appear]

[J2] Marco Mussi*, Simone Drago*, Marcello Restelli and Alberto Maria Metelli. Factored-Reward Bandits with Intermediate Observations: Regret Minimization and Best Arm Identification. Artificial Intelligence. 2025.
[Link] [Paper]

[J3] Marco Mussi and Alberto Maria Metelli. Generalizing the Regret: an Analysis of Lower and Upper Bounds. Journal of Artificial Intelligence Research. 2025.
[Link] [Paper]

[J4] Marco Mussi, Luigi Pellegrino, Oscar Francesco Pindaro, Marcello Restelli and Francesco Trovò. A Reinforcement Learning Controller Optimizing Costs and Battery State of Health in Smart Grids. Journal of Energy Storage. 2024.
[Link] [Paper]

[J5] Marco Mussi, Davide Lombarda, Alberto Maria Metelli, Francesco Trovò and Marcello Restelli. ARLO: A Framework for Automated Reinforcement Learning. Expert Systems with Applications. 2023.
[Link] [Paper] [arXiv]

[J6] Marco Mussi, Luigi Pellegrino, Marcello Restelli and Francesco Trovò. An Online State of Health Estimation Method for Lithium-Ion Batteries based on Time Partitioning and Data-Driven Model Identification. Journal of Energy Storage. 2022.
[Link] [Paper]

[J7] Marco Mussi, Luigi Pellegrino, Marcello Restelli and Francesco Trovò. A voltage dynamic-based state of charge estimation method for batteries storage systems. Journal of Energy Storage. 2021.
[Link] [Paper]

Workshops

[W1] Federico Corso, Marco Mussi and Alberto Maria Metelli. Trading-off Reward Maximization and Stability in Sequential Decision Making. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link - To Appear] [Paper - To Appear] [Poster - To Appear]

[W2] Simone Drago, Marco Mussi and Alberto Maria Metelli. A Theoretical Perspective on Sequential Decision Making with Preference Feedback. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link - To Appear] [Paper - To Appear] [Poster - To Appear]

[W3] Alberto Maria Metelli, Simone Drago and Marco Mussi. A Novel Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds for Generalized Kernelized Bandits. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link - To Appear] [Paper - To Appear] [Poster - To Appear]

[W4] Davide Salaorni, Vincenzo De Paola, Samuele Delpero, Giovanni Dispoto, Paolo Bonetti, Alessio Russo, Giuseppe Calcagno, Francesco Trovò, Matteo Papini, Alberto Maria Metelli, Marco Mussi and Marcello Restelli. Gym4ReaL: A Benchmark Suite for Evaluating Reinforcement Learning in Realistic Domains. European Workshop on Reinforcement Learning (EWRL). 2025.
[Link - To Appear] [Paper - To Appear] [Poster - To Appear]

[W5] Carlo Fabrizio*, Gianvito Losapio*, Marco Mussi, Alberto Maria Metelli and Marcello Restelli. Power Grid Control with Graph-Based Distributed Reinforcement Learning. Workshop on Machine Learning for Sustainable Power Systems at the European Conference on Machine Learning (ECML). 2025.
[Link - To Appear] [Paper - To Appear] [Poster - To Appear] [Slides - To Appear]

[W6] Gianvito Losapio, Davide Beretta, Marco Mussi, Alberto Maria Metelli and Marcello Restelli. State and Action Factorization in Power Grids. Workshop on Machine Learning for Sustainable Power Systems at the European Conference on Machine Learning (ECML). 2024.
[Paper] [arXiv] [Poster] [Slides]

[W7] Simone Drago and Marco Mussi. Open Problem: Tight Bounds for Bernoulli Rewards in Kernelized Multi-Armed Bandits. Workshop on Aligning Reinforcement Learning Experimentalists and Theorists at the International Conference on Machine Learning (ICML). 2024.
[Link] [Paper] [Poster]

[W8] Simone Drago, Marco Mussi, Marcello Restelli and Alberto Maria Metelli. Intermediate Observations in Factored-Reward Bandits. Adaptive and Learning Agents Workshop at the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS). 2024.
[Link] [Paper] [Slides]

[W9] Francesco Bacchiocchi*, Gianmarco Genalti*, Davide Maran*, Marco Mussi*, Marcello Restelli, Nicola Gatti and Alberto Maria Metelli. Online Learning in Autoregressive Dynamics. European Workshop on Reinforcement Learning (EWRL). 2023.
[Link] [Paper] [Poster]

[W10] Alessandro Montenegro, Marco Mussi, Francesco Trovò, Marcello Restelli and Alberto Maria Metelli. Stochastic Rising Bandits: A Best Arm Identification Approach. European Workshop on Reinforcement Learning (EWRL). 2023.
[Link] [Paper] [Poster]

[W11] Alessandro Montenegro, Marco Mussi, Francesco Trovò, Marcello Restelli and Alberto Maria Metelli. A Best Arm Identification Approach for Stochastic Rising Bandits. Workshop on New Frontiers in Learning, Control, and Dynamical Systems at International Conference on Machine Learning (ICML). 2023.
[Link] [Paper] [Poster]

[W12] Gianmarco Genalti, Marco Mussi, Alessandro Nuara and Nicola Gatti. Dynamic Pricing with Online Data Aggregation and Learning. European Workshop on Reinforcement Learning (EWRL). 2022.
[Link] [Paper] [Poster] [Slides]

[W13] Marco Mussi, Alberto Maria Metelli and Marcello Restelli. Dynamical Linear Bandits for Long-Lasting Vanishing Rewards. Complex Feedback in Online Learning Workshop at International Conference on Machine Learning (ICML). 2022.
[Link] [Paper] [Poster]

Book Chapters

[B1] Marco Mussi. Multi-Armed Bandits Algorithms for Pricing and Advertising. Special Topics in Information Technology. 2025. To Appear.
[Link - To Appear] [Paper]

Preprints

[P1] Davide Salaorni, Vincenzo De Paola, Samuele Delpero, Giovanni Dispoto, Paolo Bonetti, Alessio Russo, Giuseppe Calcagno, Francesco Trovò, Matteo Papini, Alberto Maria Metelli, Marco Mussi and Marcello Restelli. Gym4ReaL: A Suite for Benchmarking Real-World Reinforcement Learning. arXiv preprint, arXiv:2507.00257. 2025.
[Paper] [arXiv]

[P2] Alessandro Montenegro, Federico Mansutti, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Reusing Trajectories in Policy Gradients Enables Fast Convergence. arXiv preprint, arXiv:2506.06178. 2025.
[Paper] [arXiv]

[P3] Alessandro Montenegro, Leonardo Cesani, Marco Mussi, Matteo Papini and Alberto Maria Metelli. Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes. arXiv preprint, arXiv:2506.05953. 2025.
[Paper] [arXiv]

[P4] Simone Drago, Marco Mussi and Alberto Maria Metelli. A Refined Analysis of UCBVI. arXiv preprint, arXiv:2502.17370. 2025.
[Paper] [arXiv]

[P5] Gianmarco Genalti, Marco Mussi, Nicola Gatti, Marcello Restelli, Matteo Castiglioni and Alberto Maria Metelli. Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting. arXiv preprint, arXiv:2409.05980. 2024.
[Paper] [arXiv]

[P6] Marco Mussi, Simone Drago and Alberto Maria Metelli. Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards. arXiv preprint, arXiv:2407.06321. 2024.
[Paper] [arXiv]

Technical Reports

[T1] Marco Mussi, Gianvito Losapio, Alberto Maria Metelli, Marcello Restelli, Ricardo Bessa, Antoine Marot, Daniel Boos, Clark Borst, Alberto Castagna, Duarte Dias, Adrian Egli, Andrina Eisenegger, Yassine El Manyari, Anton Fuxjäger, Samira Hamouche, Mohamed Hassouna, Bruno Lemetayer, Roman Liessner, Jonas Lundberg, Manuel Schneider, Irene Sturm, Julia Usher, Herke Van Hoof, Jan Viebahn and Toni Wäfler. Position paper on AI for the operation of critical energy and mobility network infrastructures. AI4REALNET. 2024.
[Link] [Report]

Education

Ph.D. in Information Technology - Politecnico di Milano (Nov 2020 - Jun 2024)
Focus on Reinforcement Learning and Online Learning.
Supervisor: Prof. Marcello Restelli
[Link] [Thesis] [Slides]

M.Sc. in Computer Science and Engineering - Politecnico di Milano (Sep 2017 - Dec 2019)
Main focus: Artificial Intelligence and Machine Learning
Scholarship: Tuition waiver for high academic performance
Relevant coursework: Machine Learning, Artificial Intelligence, Game Theory, Autonomous Agents and Multi-agent Systems, Foundations of Operational Research, Software Engineering, Principles of Programming Languages, Data Bases II

B.Sc. in Engineering of Computing Systems - Politecnico di Milano (Sep 2014 - Jul 2017)
Relevant coursework: Software Engineering, Theoretical Computer Science, Communication Networks and Internet, Information Systems, Data Bases I, Computer Architecture and Operating Systems, Automatic Control, Calculus I, Calculus II, Linear Algebra and Geometry, Logics and Algebra, Statistics and Probability, Physics, Applied Physics

High School Diploma in Computer Science - IIS Galileo Galilei Crema (Sep 2008 - Jul 2014)
Main focus: C, Java, HTML, CSS, JavaScript

Experience

Postdoctoral Researcher - Politecnico di Milano (Jun 2024 - now)
Supervisor: Prof. Marcello Restelli

Research Scientist - ML cube (Nov 2020 - Jun 2024)
Goal: develop algorithms for dynamic pricing and advertising optimization

Research Assistant - Politecnico di Milano (Jan 2020 - Oct 2020)
Supervisor: Prof. Marcello Restelli

Industrial Projects

AD cube Marketing Mix Model - ML cube (Nov 2022 - Oct 2023)
Focus: Budget optimization in advertising, accounting for interactions among advertising campaigns

Data-driven Optimization Marketing Mix Models for Advertising - WebRanking (Feb 2022 - Aug 2022)
Focus: Implementation of an MMM to solve the attribution problem in digital advertising in contexts with scarce and noisy data

Dynamic Pricing for E-commerce - Euroffice (Feb 2021 - May 2022)
Focus: Implementation of a dynamic pricing model for an e-commerce store with over 20,000 products

AD cube Product Release - ML cube (Nov 2020 - Feb 2022)
Focus: Release of AD cube, a product for advertising optimization in online campaigns

Last-mile Delivery Optimization - PaxMile (May 2020 - Oct 2020)
Focus: Delivery allocation using Reinforcement Learning and biker load estimation using Supervised Learning techniques

Reinforcement Learning in Smart-grids - Ricerca Sistema Energetico (Feb 2020 - Oct 2022)
Focus: Exploit Reinforcement Learning solutions to preserve the battery State of Health in smart-grids, optimizing economic variables

European Projects

AI4REALNET - Fundamental Research Work Package (Oct 2023 - now)
Focus: AI4REALNET covers AI-based solutions for critical systems (electricity, railway, and air traffic management) that are modeled as networks, can be simulated, and are traditionally operated by humans, with AI systems complementing and augmenting human abilities.

Academic Activities

M.Sc. Students Co-supervision

[1] Gianmarco Genalti - "A Multi-Armed Bandit Approach to Dynamic Pricing". Co-supervision. Supervisor: Prof. Nicola Gatti (M.Sc. in Mathematical Engineering, Dec 2021)

[2] Amedeo Cavallo - "A Combinatorial Multi-Armed Bandit Approach to Online Advertising Budget Optimisation". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Dec 2021)

[3] Oscar Francesco Pindaro - "Controlling Lithium-Ion Batteries Through Reinforcement Learning". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Apr 2022)

[4] Davide Lombarda - "Towards Automated Reinforcement Learning". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Mathematical Engineering, Apr 2022)

[5] Thomas Petrone - "Hidden Markov Model for Single User Response Prediction in Digital Advertising Campaigns". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Mathematical Engineering, Jul 2022)

[6] Alessandro Montenegro - "Best Model Selection via Stochastic Rising Bandits". Co-supervision. Supervisor: Prof. Alberto Maria Metelli (M.Sc. in Computer Science and Engineering, May 2023)

[7] Andrea d'Silva - "Integrating Behavioral Cloning into a Reinforcement Learning pipeline". Co-supervision. Supervisor: Prof. Francesco Trovò (M.Sc. in Computer Science and Engineering, May 2023)

[8] Francesco Gonzales - "Stochastic Linear Bandit with Global-Local Structure". Co-supervision. Supervisor: Prof. Francesco Trovò (M.Sc. in Computer Science and Engineering, May 2023)

[9] Vittorio Arianna - "Multi-Armed Bandits for Joint Pricing and Advertising". Co-supervision. Supervisor: Prof. Nicola Gatti (M.Sc. in Computer Science and Engineering, Oct 2023)

[10] Marco Bonalumi - "An Online Learning Algorithm for Real-time Bidding". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Dec 2023)

[11] Alessandro Contù - "Budget Optimization in Marketing Mix Models". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Dec 2023)

[12] Andrea Cerasani - "An Online Dynamic Pricing Algorithm for Complementary Products". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Dec 2023)

[13] Federico Corso - "Smoothed OMD: an Algorithm for No-regret Learning in Adversarial MDPs with Revealed Transitions". Co-supervision. Supervisor: Prof. Alberto Maria Metelli (M.Sc. in Automation and Control Engineering, Jul 2024)

[14] Davide Beretta - "Distributed Reinforcement Learning for Power Grid Operations". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Oct 2024)

[15] Valentina Abbattista - "Online Learning for PID Controller Tuning". Co-supervision. Supervisor: Prof. Alberto Maria Metelli (M.Sc. in Computer Science and Engineering, Oct 2024)

[16] Giacomo Cartechini - "Distributed Reinforcement Learning for Large-Scale Networks". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Dec 2024)

[17] Fabio Patella - "Reinforcement Learning for Digital Advertising Cross-Channel Budget Optimization". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Apr 2025)

[18] Leonardo Cesani - "Learning Deterministic Policies in Constrained Markov Decision Processes with Policy Gradients". Co-supervision. Supervisor: Prof. Matteo Papini (M.Sc. in Computer Science and Engineering, Apr 2025)

[19] Federico Mansutti - "Trajectory Reuse in Policy Gradients". Co-supervision. Supervisor: Prof. Alberto Maria Metelli (M.Sc. in Computer Science and Engineering, Jul 2025)

[20] Cristiano Migali - "Towards Closing the Gap in Restless Rising Bandits". Co-supervision. Supervisor: Prof. Alberto Maria Metelli (M.Sc. in Computer Science and Engineering, Jul 2025)

[21] Carlo Fabrizio - "Graph-Based Multi-Agent Reinforcement Learning for Power Grid Control". Co-supervision. Supervisor: Prof. Marcello Restelli (M.Sc. in Computer Science and Engineering, Jul 2025)

[22] Leonardo Bianconi. Co-supervision. (M.Sc. in Computer Science and Engineering, in progress)

[23] Andrea Fondacaro. Co-supervision. (M.Sc. in Computer Science and Engineering, in progress)

Contacts

Email
marco DOT mussi AT polimi DOT it

Office

Office 19, First Floor of Building 21
Dipartimento di Elettronica, Informazione e Bioingegneria
Politecnico di Milano
Via Ponzio 34/5, Milan, 20133, Italy