Short Seminars by prospective young researchers in Computer Science and Engineering - DEIB
Methods and Systems for Genomic Data Integration, Storage and Search
Arif Canakoglu
Research Assistant - DEIB
Politecnico di Milano (this seminar will be held online)
September 25th, 2020
1.00 pm
Abstract
In the last decade, rapid progress in sequencing technologies has caused the volume of genomic data to grow tremendously, with relevant applications. Because genomic information is so large, this massive data flow creates new challenges in method and system development: first data integration and storage, then the availability of search interfaces, and finally the development of data analytics. Data integration is a crucial problem: as genomic data is spread across disparate data sources, the main difficulties are i) heterogeneity in data structure and values, ii) low data quality, and iii) textual data that lacks precise annotations and contains many ambiguities. This talk discusses methods and systems in the bioinformatics domain, mostly focused on human genomics; recent work has also addressed the genomic sequences of SARS-CoV-2, the virus responsible for the COVID-19 pandemic. Throughout the presentation I will show several systems that I have been developing during my Ph.D. and postdoctoral activity; these systems are publicly available and effectively used. From a methodological point of view, I will discuss data integration and curation. From a system development point of view, I will illustrate the approaches used for the detailed design and optimization of the underlying relational databases and cloud-based solutions; I will also illustrate the recent extension of a domain-specific query language for genomics, which is now supported on distributed, heterogeneous cloud systems. Finally, as all these systems are publicly available through user-friendly interfaces, I will show examples of how biologists and virologists can use them effectively.
Short Bio
Arif Canakoglu is a postdoctoral researcher at the Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB) of Politecnico di Milano. His research interests include data-driven genomic computing, cloud computing, big data analysis and processing, data integration, artificial intelligence applications (machine/deep learning), and statistical analysis.
The seminar will be held online. Please follow the instructions below:
Meeting number: 121 710 1631
Password: 7WtjZm3DEd7
https://politecnicomilano.webex.com/
Join by video system
Dial 1217101631@politecnicomilano.webex.com
You can also dial 62.109.219.4 and enter your meeting number.
Join by phone +44-20-7660-8149
United Kingdom Toll
Access code: 121 710 1631