Deep Learning Models For Text Mining And Analysis
Mark Carman
Politecnico di Milano, DEIB
This event will be held online via Microsoft Teams
June 24th, 2020
2.30 pm - 6.30 pm
Contacts:
Giacomo Boracchi
Cesare Alippi
Matteo Matteucci
Abstract
On June 24th, 2020 from 2.30 pm to 6.30 pm, the “Deep Learning Models for Text Mining and Analysis” seminar will take place online within the PhD Course on Machine Learning for Non-Matrix Data, organized by Profs. Giacomo Boracchi, Cesare Alippi, and Matteo Matteucci.
Deep Learning has recently revolutionised text processing. Up until a few years ago, it was inconceivable that one might train a classifier on text in one language and then apply it directly to text in another language (without any form of training on the latter). Now it is commonplace to do so. This is possible through the use of powerful language models that have been pre-trained on large multilingual corpora. The application of sophisticated unsupervised pre-training thus provides the ability to easily transfer knowledge from one domain (or natural language) to another.
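To make the idea concrete, here is a minimal sketch (not part of the seminar materials) of zero-shot cross-lingual classification with the Hugging Face transformers library; the checkpoint named below is an illustrative choice of an XLM-RoBERTa model fine-tuned on English NLI data only:

```python
# Minimal sketch: cross-lingual transfer with a multilingual pre-trained model.
# Assumes the Hugging Face `transformers` library and the illustrative
# checkpoint "joeddav/xlm-roberta-large-xnli" (XLM-RoBERTa fine-tuned on
# English NLI data).
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",
)

labels = ["sport", "politics", "technology"]

# English input ...
print(classifier("The match ended in a draw.", candidate_labels=labels))

# ... and Italian input, with no Italian-specific training at all:
# the multilingual pre-training carries the classifier across languages.
print(classifier("La partita è finita in pareggio.", candidate_labels=labels))
```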
In this talk I'll run through a brief history of language and sequence modelling techniques. I'll describe the state-of-the-art transformer architectures used to build famous models like GPT-2 and BERT. We'll discuss how these models can be used for various types of prediction problems, and describe some interesting applications to problems in multilingual classification, image question answering, data integration, and bioinformatics.
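As a small taste of what such pre-trained transformers expose, the sketch below (again assuming the Hugging Face transformers library and the standard "bert-base-uncased" checkpoint) queries BERT's masked-language-modelling head, the self-supervised objective it was pre-trained on:

```python
# Minimal sketch: querying BERT's masked-language-modelling head.
# Assumes the Hugging Face `transformers` library and the standard
# "bert-base-uncased" checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the token hidden behind [MASK] from its bidirectional context.
for prediction in fill_mask("Deep learning has [MASK] text processing."):
    print(f'{prediction["token_str"]!r}: {prediction["score"]:.3f}')
```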