Article

Autosomal dominantly inherited Alzheimer disease: Analysis of genetic subgroups by machine learning

Castillo-Barnes, Diego; Su, Li; Ramirez, Javier; Salas-Gonzalez, Diego; Martinez-Murcia, Francisco J.; Illan, Ignacio A.; Segovia, Fermin; Ortiz, Andres; Cruchaga, Carlos; Farlow, Martin R.; Xiong, Chengjie; Graff-Radford, Neil R.; Schofield, Peter R.; Masters, Colin L.; Salloway, Stephen; Jucker, Mathias; Mori, Hiroshi; Levin, Johannes; Gorriz, Juan M.

Computer Science

Publication: INFORMATION FUSION

2020

Volume 58, pages 153-167

Abstract

Although subjects with Dominantly-Inherited Alzheimer's Disease (DIAD) represent less than 1% of all Alzheimer's Disease (AD) cases, the Dominantly Inherited Alzheimer Network (DIAN) initiative has had a strong impact on the understanding of the AD course, with special emphasis on the presymptomatic disease phase. Until now, the 3 genes involved in DIAD pathogenesis (PSEN1, PSEN2 and APP) have commonly been merged into one group (Mutation Carriers, MC) and studied using conventional statistical analysis. Comparisons between groups using null-hypothesis testing or longitudinal regression procedures, such as linear mixed-effects models, have been reported in the extant literature. Within this context, the work presented here compares different groups of subjects by considering the 3 genes, either jointly or separately, using tools based on Machine Learning (ML). This involves a feature selection step that uses ANOVA followed by Principal Component Analysis (PCA) to determine which features are reliable for further comparison purposes. The selected predictors are then classified using a Support-Vector-Machine (SVM) in a nested k-Fold cross-validation, resulting in maximum classification rates of 72-74% using PiB PET features, especially when comparing asymptomatic Non-Carrier (NC) subjects with asymptomatic PSEN1 Mutation-Carriers (PSEN1-MC). Results obtained from these experiments led to the idea that PSEN1-MC might be considered a mixture of two different subgroups: a first group whose patterns were very close to those of NC subjects, and a second group that differed much more in terms of imaging patterns. Thus, both subgroups were identified using a k-Means clustering algorithm, and a new classification scenario was conducted to validate this process. The comparison of each subgroup vs. NC subjects resulted in classification rates around 80%, underscoring the importance of considering DIAN as a heterogeneous entity.
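
The workflow outlined in the abstract (ANOVA-based feature selection, PCA, an SVM evaluated in nested k-fold cross-validation, and a k-Means split of the carrier group) could be sketched roughly as below. This is only an illustration using scikit-learn, not the authors' implementation: the synthetic data, the function names, the number of selected features and components, the linear kernel, the fold counts and the C grid are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_data(n_per_group=60, n_features=40, seed=0):
    """Hypothetical stand-in for imaging features (NOT DIAN data).

    Class 0 plays the role of NC; class 1 plays the role of MC and is built
    as a mixture: half resembles NC, half is shifted in a few features.
    """
    rng = np.random.default_rng(seed)
    nc = rng.normal(0.0, 1.0, size=(n_per_group, n_features))
    mc = rng.normal(0.0, 1.0, size=(n_per_group, n_features))
    mc[n_per_group // 2:, :5] += 2.0  # the "distinct" carrier subgroup
    X = np.vstack([nc, mc])
    y = np.array([0] * n_per_group + [1] * n_per_group)
    return X, y


def nested_cv_accuracy(X, y, seed=0):
    """Mean outer-fold accuracy of ANOVA -> PCA -> SVM with nested k-fold CV."""
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("anova", SelectKBest(score_func=f_classif, k=10)),  # k is assumed
        ("pca", PCA(n_components=3)),                        # assumed
        ("svm", SVC(kernel="linear")),                       # kernel assumed
    ])
    # Inner loop tunes C; outer loop estimates generalization accuracy.
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(pipe, {"svm__C": [0.1, 1.0, 10.0]},
                          cv=inner, scoring="accuracy")
    # Feature selection and PCA are refit inside every training fold,
    # so no information from held-out subjects leaks into them.
    return cross_val_score(search, X, y, cv=outer, scoring="accuracy").mean()


def split_carriers(X_mc, seed=0):
    """Assign each carrier to one of two k-Means clusters."""
    km = KMeans(n_clusters=2, n_init=10, random_state=seed)
    return km.fit_predict(StandardScaler().fit_transform(X_mc))


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print("NC vs. all MC:", nested_cv_accuracy(X, y))
    labels = split_carriers(X[y == 1])
    for c in (0, 1):
        # Re-run the classification with NC against one carrier subgroup.
        X_sub = np.vstack([X[y == 0], X[y == 1][labels == c]])
        y_sub = np.array([0] * int((y == 0).sum()) + [1] * int((labels == c).sum()))
        print(f"NC vs. MC subgroup {c}:", nested_cv_accuracy(X_sub, y_sub))
```

Wrapping the selection and projection steps in a single pipeline keeps them inside the cross-validation loop, which is the main reason for the nested design. The accuracies printed for the synthetic data have no bearing on the figures reported in the abstract.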

63rd in Computer Science

18 InfluRatio