
Speaker: Alessandro Cacciatore
December 2, 2025 | 2:15 pm
DEIB, Seminar Room "A. Alario" (Bld. 21)
For further information please contact: Silvia Cascianelli | silvia.cascianelli@polimi.it
Abstract
On Tuesday, December 2, 2025, at 2:15 pm, Alessandro Cacciatore (Student in Computer Science and Engineering) will hold a seminar titled "An integrative bioinformatics and machine learning approach for non-coding RNA-based signatures in Amyotrophic Lateral Sclerosis" in the DEIB "Alessandra Alario" Seminar Room (Building 21), organized by the Data Science for Bioinformatics group.
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disorder that affects motor neurons, leading to a progressive loss of voluntary muscle control. Depending on the symptoms at onset, ALS can be classified as either spinal (affecting the limbs) or bulbar (affecting speech and swallowing). These two subtypes exhibit distinct histopathological, anatomical, and prognostic features, but their underlying biological differences remain poorly characterized, limiting precise diagnosis and treatment.
Epigenetic alterations are increasingly recognized as key modulators of disease mechanisms in neurodegeneration. In particular, non-coding RNAs (ncRNAs), such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), play essential regulatory roles in neuronal function, stress response, and inflammation, and their dysregulation has been associated with ALS pathogenesis. Investigating these molecules may therefore uncover epigenetic signatures that support patient stratification within a precision medicine framework.
This project aims to develop a reproducible computational framework for the identification and validation of RNA-based biomarkers. The study initially focused on discovering differentially expressed ncRNAs and subsequently explored their potential to yield discriminative signatures capable of distinguishing between the two ALS subtypes.
The proposed workflow for the analysis of RNA biomarkers integrates several steps: data pre-processing and normalization, missing value imputation, ensemble feature selection, feature orthogonalization, model optimization and validation with class imbalance correction, and biological validation and interpretation of findings. In ensemble feature selection, the outputs of four independent algorithms (Random Forest, Recursive Feature Elimination, LASSO, and K-Best) were combined to construct a robust and stable ranking of features, which was then filtered by an orthogonalization step to keep only those features that provide non-redundant information.
Multi-Omics Factor Analysis (MOFA) was used as a comparison with a common integration technique in this field. Although the MOFA-derived factors clearly separated ALS patients from controls, they mainly reflected the global disease-related variance rather than the molecular distinctions specific to subtypes, which supports the proposed approach as the more suitable one in this case. In fact, in the five-component Partial Least Squares (PLS) model developed in this study, the second component showed a limited yet noticeable ability to differentiate between bulbar and spinal patients.
In conclusion, the results confirm that the proposed computational workflow is a trustworthy and biologically interpretable tool for the discovery of RNA biomarkers in ALS, combining statistical robustness and biological relevance. The framework offers a firm basis for subsequent developments, such as experimental confirmation and the integration of further omics layers for a deeper understanding of the molecular heterogeneity of ALS.
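To make the ensemble feature selection and orthogonalization steps mentioned in the abstract more concrete, the following is a minimal Python sketch and not the speaker's actual implementation. It assumes that each of the four selectors has already produced a ranked feature list, that the rankings are combined by mean rank, and that "orthogonalization" can be approximated by a greedy filter on absolute Pearson correlation. The function names, the 0.9 threshold, the feature names, and the toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # treat a constant feature as uncorrelated
    return cov / sqrt(vx * vy)


def aggregate_rankings(rankings):
    """Order features by their mean position (0 = best) across all rankings."""
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(features, key=lambda f: (mean_rank[f], f))


def drop_redundant(ranked, values, max_abs_corr=0.9):
    """Walk the ranking top-down and keep a feature only if it is not
    strongly correlated with any feature that was already kept.
    (Assumed stand-in for the orthogonalization step.)"""
    kept = []
    for f in ranked:
        if all(abs(pearson(values[f], values[k])) < max_abs_corr for k in kept):
            kept.append(f)
    return kept


# Hypothetical toy data: expression of four ncRNAs across six samples.
values = {
    "miR-a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "miR-b": [2.1, 4.0, 6.2, 8.1, 9.9, 12.0],  # nearly a multiple of miR-a
    "lnc-c": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    "lnc-d": [0.5, 0.4, 0.9, 0.1, 0.3, 0.8],
}

# Hypothetical rankings, one per selector (best feature first).
rankings = [
    ["miR-a", "miR-b", "lnc-c", "lnc-d"],  # e.g. Random Forest importance
    ["miR-a", "lnc-c", "miR-b", "lnc-d"],  # e.g. Recursive Feature Elimination
    ["miR-b", "miR-a", "lnc-c", "lnc-d"],  # e.g. LASSO coefficients
    ["miR-a", "miR-b", "lnc-d", "lnc-c"],  # e.g. K-Best scores
]

consensus = aggregate_rankings(rankings)
selected = drop_redundant(consensus, values)
print(consensus)  # ['miR-a', 'miR-b', 'lnc-c', 'lnc-d']
print(selected)   # ['miR-a', 'lnc-c', 'lnc-d']
```

In this toy case the second-ranked feature is dropped because it is almost perfectly correlated with the top-ranked one, which illustrates how a redundancy filter can shorten a consensus ranking without discarding independent signals.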
