Significance and Coverage in Group Testing on the Social Web
Idir Benouaret
MIAI institute, Grenoble Alpes (France)
DEIB - Conference Room "E. Gatti" (Building 20)
Online via Webex
April 8th, 2022
3.00 pm
Contacts:
Letizia Tanca
Research Line:
Data, web, and society
Abstract
On April 8th, 2022 at 3.00 pm, Idir Benouaret (MIAI institute, Grenoble Alpes) will hold a seminar on "Significance and Coverage in Group Testing on the Social Web" in the DEIB Conference Room and in live streaming via Webex.
This talk discusses the longstanding question of checking hypotheses on the social Web. In particular, we discuss the challenges that arise in the context of testing an input hypothesis on many data samples, in our case, user groups.
This is referred to as Multiple Hypothesis Testing, a method of choice for data-driven discoveries. Ensuring sound discoveries in large datasets poses two challenges: the likelihood of accepting a hypothesis by chance, i.e., returning false discoveries, and the pitfall of not being representative of the input data. We present our framework GroupTest, which takes as input a query and returns groups that are both statistically significant and that are representative of the underlying data.
We formulate a generic top-n problem that seeks n user groups satisfying one-sample, two-sample, or multiple-sample tests while maximizing data coverage. We show the hardness of the problem and present two algorithms: a greedy algorithm with a provable approximation guarantee, and a faster heuristic algorithm based on the alpha-investing principle. We validate our results on real-world datasets in terms of significance, coverage, and response time.
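For readers unfamiliar with the alpha-investing principle mentioned above, the following is a minimal Python sketch of a generic alpha-investing rule for sequential hypothesis testing. It is not the GroupTest algorithm presented in the talk: the function name `alpha_investing`, the parameters `initial_wealth`, `payout`, and `spend_fraction`, and the spending policy are all illustrative assumptions.

```python
def alpha_investing(p_values, initial_wealth=0.05, payout=0.05, spend_fraction=0.5):
    """Generic alpha-investing sketch (illustrative, not the GroupTest algorithm).

    Hypotheses are tested one at a time. Each test stakes part of the
    remaining "alpha wealth"; a rejection earns a payout, a failure to
    reject loses the stake. All parameter names and defaults here are
    assumptions chosen for illustration.
    """
    wealth = initial_wealth
    rejected = []
    for index, p_value in enumerate(p_values):
        # Amount of wealth put at risk for this test.
        stake = wealth * spend_fraction
        # Test level chosen so that alpha / (1 - alpha) equals the stake,
        # which is the amount lost when the hypothesis is not rejected.
        alpha = stake / (1.0 + stake)
        if p_value <= alpha:
            rejected.append(index)
            wealth += payout   # a discovery replenishes the budget
        else:
            wealth -= stake    # a non-discovery costs the stake
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, one per candidate user group.
    print(alpha_investing([0.001, 0.03]))
    print(alpha_investing([0.03]))
```

In this sketch the outcome depends on the order in which hypotheses are tested: an early rejection increases the budget available to later tests, so the same p-value may be rejected in one sequence and not in another.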
Short Bio
Idir Benouaret graduated in computer science engineering from ESI (Ecole Nationale Supérieure d’Informatique) in Algiers (2012), and obtained a Master's degree in complex systems from Université Paris-Est Créteil (2013). He defended his PhD at Université de Technologie de Compiègne (UTC) in 2017 under the supervision of Prof. Dominique Lenne. He was then a temporary teaching and research assistant at UTC and Université Jean Monnet Saint-Etienne.
In late 2018, he joined the Slide team at Université Grenoble Alpes and worked under the supervision of Sihem Amer-Yahia in an industrial collaboration with TOTAL.
Currently, he is a postdoctoral researcher at the Multidisciplinary Institute in Artificial Intelligence in Grenoble.
His research interests include recommendation systems, data mining and data exploration.
The event will be streamed online via Webex at link
Registration is required only for in-person attendance, via the following FORM.
Please note: a Green Pass is required to attend the event in person.