A03 – Sequential and adaptive learning under dependence and non-standard objective functions

This project is concerned with the problem of learning sequentially and adaptively, under partial information, in an uncertain environment. In this setting, the data is not available beforehand in batch form; instead, the learner collects it sequentially and actively. The process is as follows: at each time t, the learner chooses an action and receives a data point that depends on the chosen action. The learner collects data in order to learn about the system, but also to achieve a goal (characterized by an objective function) that depends on the application. In this project, we aim to solve this problem under general objective functions and under dependence in the data collection process, exploring variations of the so-called bandit setting, which corresponds to this problem with a specific objective function.
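To make the interaction protocol concrete, here is a minimal sketch of the classical bandit loop, in which the objective is cumulative reward. It uses the standard UCB1 index as a stand-in; it is not an algorithm from this project. The function name run_ucb1, the Bernoulli reward model and the arm means are assumptions chosen purely for illustration.

```python
import math
import random


def run_ucb1(means, horizon, seed=0):
    """Simulate UCB1 on a Bernoulli bandit with hypothetical arm means."""
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms   # number of times each action was chosen
    sums = [0.0] * n_arms   # cumulative reward observed per action
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every action once so each empirical mean is defined.
            action = t - 1
        else:
            # Choose the action with the largest optimistic index:
            # empirical mean plus an exploration bonus.
            action = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Partial information: only the chosen action's reward is observed.
        reward = 1.0 if rng.random() < means[action] else 0.0
        counts[action] += 1
        sums[action] += reward
        total_reward += reward

    return counts, total_reward


# Illustrative run with three hypothetical actions.
counts, total_reward = run_ucb1(means=[0.2, 0.5, 0.8], horizon=2000)
print(counts, total_reward)
```

The sketch assumes independent, stationary rewards and a cumulative-reward objective. These are exactly the two assumptions the project proposes to relax.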

As a motivating example, consider the problem of sequential and active attention detection through an eye tracker. A human user is looking at a screen, and the objective of an automated monitor (the learner) is to identify, through an eye tracker, zones of this screen where the user is not paying sufficient attention. To do so, the monitor is allowed at each time t to flash a small zone a_t of the screen, e.g. light a pixel (the action), and the eye tracker detects from the eye movement whether the user has observed this flash. Ideally, the monitor should focus on these difficult zones and flash more often there (i.e. choose more often the specific actions corresponding to less well-identified zones). Sequential and adaptive learning methods are therefore expected to improve the performance of the monitor.

The PhD candidate will focus on developing sequential learning algorithms with mathematical guarantees for learning on given non-stationary processes that are relevant in the context of recommendation systems, and on implementing the algorithms that are developed. S/he will collaborate with a second PhD student at the University of Potsdam on the eye-tracker-based application. A degree in machine learning, or in mathematics with an interest in theoretical computer science, is preferred.

  • G. Blanchard and O. Zadorozhnyi (2017). Concentration of weakly dependent Banach-valued sums and applications to kernel learning methods. arXiv:1712.01934

  • J. Achdou, J. Lam, A. Carpentier, G. Blanchard (2019). A minimax near-optimal algorithm for adaptive rejection sampling. Accepted for ALT 2019. arXiv:1810.09390

  • G. Blanchard, A. Carpentier, M. Gutzeit (2018). Minimax Euclidean Separation Rates for Testing Convex Hypotheses in R^d. Electron. J. Statist. 12 (2): 3713-3735. Open Access
