The ODE Method for Algorithm Design in Reinforcement Learning

Abstract
On July 10th, 2025 at 2:30 p.m., Federico Corso, PhD student in Information Technology, will give a seminar on "The ODE Method for Algorithm Design in Reinforcement Learning" in the DEIB Seminar Room "Alessandra Alario" (Building 21).
In Reinforcement Learning and Optimal Control, an algorithm is a finite sequence of computer-implementable instructions designed to compute or approximate a policy, its performance, a value function, or related quantities. In algorithm design, it can be helpful to set aside the constraints of real computers and imagine machines that operate with infinite clock speed.
In such an idealized setting, an algorithm can be viewed as an ordinary differential equation (ODE). The richer stability theory of ODEs can then be used to analyze convergence and to design algorithms more easily than in discrete time. An implementable, discrete-time recursive rule can subsequently be recovered through suitable discretization techniques.
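As a minimal sketch of this correspondence (the symbols θ, f, αₙ, ξₙ below are generic illustration, not notation from the talk): a stochastic-approximation recursion of the form

$$\theta_{n+1} = \theta_n + \alpha_n \bigl( f(\theta_n) + \xi_{n+1} \bigr)$$

can be read as a noisy Euler discretization of the mean-field ODE

$$\dot{\theta}(t) = f\bigl(\theta(t)\bigr),$$

where ξₙ₊₁ is a zero-mean noise term and αₙ is a vanishing step size. If the ODE has a globally asymptotically stable equilibrium θ* and the step sizes satisfy the usual conditions ($\sum_n \alpha_n = \infty$, $\sum_n \alpha_n^2 < \infty$), then under standard assumptions on the noise the iterates track the ODE trajectories and converge to θ*.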
In this seminar, the ODE method will be surveyed in the context of Stochastic Approximation and then applied to tame the slow and potentially unstable dynamics of Watkins' Q-learning algorithm, yielding faster convergence and improved numerical stability.
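For concreteness, here is a minimal tabular sketch of Watkins' Q-learning in Python; the environment interface (reset()/step() returning state, reward, done) and all hyperparameters are illustrative assumptions, not details from the seminar:

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    """Tabular Q-learning (Watkins, 1989), written as a
    stochastic-approximation recursion on the Q-table.
    The env interface and hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()          # assumed: returns an integer state
        done = False
        while not done:
            # epsilon-greedy exploration
            if rng.random() < epsilon:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[s]))
            # assumed: step returns (next state, reward, terminal flag)
            s_next, r, done = env.step(a)
            # Watkins' update: move Q(s, a) toward the Bellman target
            target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```

Viewed through the ODE lens, this update is a stochastic-approximation recursion whose mean-field ODE has the optimal action-value function Q* as its equilibrium; it is the dynamics of recursions of this kind that the seminar's techniques aim to reshape.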