Ph.D. in Information Technology: Final Dissertations
DEIB - PT1 Room
April 29th, 2016
2.30 pm
Abstract
On April 29th, 2016, the final dissertations of the candidates for the Ph.D. in Information Technology will be presented in the DEIB PT1 Room, starting at 2.30 pm:
Vahid JALILI – XXVIII Cycle
"Efficient Data Structures for Cross-Sample Inferences on Genomic Data"
Advisors: Prof. Matteo Matteucci, Prof. Marco Masseroli
Abstract:
Advances in next-generation sequencing (NGS), also known as high-throughput sequencing, have made DNA sequencing a ubiquitous and flexible tool for genome exploration. NGS has opened the possibility of a comprehensive characterization of the genomic and epigenomic landscapes, giving answers to fundamental questions for biological and clinical research, e.g., how DNA-protein interactions and chromatin structure affect gene activity, how cancer develops, and how much complex diseases such as diabetes or cancer depend on personal (epi)genomic traits, thus opening the road to personalized and precision medicine.
A distinguishing aspect of NGS-based experiments is the large amount of data they produce. The generated data are broadly applicable and facilitate various functional analyses, including DNA-protein interaction or histone modification (using chromatin immunoprecipitation followed by massively parallel DNA sequencing, ChIP-seq), transcriptional regulation (using RNA-seq), and long-range chromatin interactions explained by the de novo spatial structure of the genome (using Hi-C). Recent studies combine these individual analyses into larger assays for in-depth interpretation of sequencing data. Yet such interpretation, and making sense of the data, demands complex computation and large-scale data retrieval systems. The present dissertation focuses on sense-making, e.g., discovering how heterogeneous DNA regions concur to determine particular biological processes or phenotypes. Towards such discovery, the characteristic operations to be performed on region data involve identifying co-occurrences of regions, from different biological tests and/or of distinct semantic types, possibly within a certain distance from each other and/or from DNA regions with known structural or functional properties.
The present dissertation describes Di4 (1D Interval Incremental Inverted Index) and its predecessor Di3 (1D Interval Inverted Index). Di4 and Di3 are single-dimension (1D) multi-resolution indexing frameworks, designed to be comprehensive, generic, extensible, and scalable back-end data structures for information retrieval on NGS interval-based data. Di4 and Di3 are defined at the data access layer, independent of the data layer, business logic layer, and presentation layer; this design makes them adaptable to any underlying persistence technology based on key-value pairs, ranging from the classical B+ tree to LevelDB and Apache HBase, and makes them suitable for different business logic and presentation layer scenarios. Benchmarks of Di4 and Di3 on real and simulated datasets, and a comparison with common bioinformatics tools, demonstrate the effectiveness of Di4 and Di3 both as a general-purpose back-end tool for genomic region manipulation, with a console-level interface, and as a software component used within MuSERA, a graphical tool for comparative analysis of region data replicates from NGS ChIP-seq and DNase-seq tests.
Fabio MARFIA – XXVII Cycle
"On the Use of Description Logics for Generating Policy Decisions and Explanations"
Advisor: Prof. Marco Colombetti
Abstract:
In this Ph.D. thesis I present an XACML Policy Framework implementation that makes use of Description Logics and reasoning technologies. Reasoning makes it easy to generate policy decisions for expressive policies in complex environments, while satisfying the framework's requirements of reliability and consistency. Furthermore, logical representations are a valid substratum for tackling advanced, complex tasks, such as providing the user with an explanation of a generated authorization response, with a complete rationale.
I describe a method for generating authorization decisions that relies on an expressive and advanced logical description of the normative state and context, and that also produces a readable explanation of the internal inference procedures that led the security framework to permit or deny an act to the user.
The scalability problems observed in performance are tackled in the present work for Policy Decision and for most of the Policy Explanation procedures, with specific solutions for minimizing the number of logical axioms on which reasoning is applied.