4.3 Article

Genetic signature of differentiated thyroid carcinoma susceptibility: a machine learning approach

Journal

EUROPEAN THYROID JOURNAL
Volume 11, Issue 5, Pages -

Publisher

BIOSCIENTIFICA LTD
DOI: 10.1530/ETJ-22-0058

Keywords

differentiated thyroid cancer; machine learning; single nucleotide polymorphism

Funding

  1. Horizon 2020 Program of the European Union [856620]


This study investigates the genetic predisposition to differentiated thyroid carcinoma (DTC) by analyzing single nucleotide polymorphisms (SNPs) with a polygenic risk score (PRS), Bayesian statistics, and a machine learning (ML) classifier. The findings indicate that a selection of 15 DTC-associated SNPs allows a fair prediction of case or control status (maximum AUC of about 0.7) from the individual genetic background alone.
To identify a distinctive genetic combination predisposing to differentiated thyroid carcinoma (DTC), we selected a set of single nucleotide polymorphisms (SNPs) associated with DTC risk and used a polygenic risk score (PRS), Bayesian statistics and a machine learning (ML) classifier to describe cases and controls in three different datasets. Dataset 1 (649 DTC, 431 controls) had previously been genotyped in a genome-wide association study (GWAS) on Italian DTC. Dataset 2 (234 DTC, 101 controls) and dataset 3 (404 DTC, 392 controls) were genotyped. Associations of 171 SNPs reported to predispose to DTC in candidate studies were extracted from the GWAS of dataset 1, and the SNPs associated with DTC risk (P < 0.05) were then tested for replication in dataset 2. The reliability of the identified SNPs was confirmed by PRS and Bayesian statistics after merging the three datasets. The SNPs were then used by an ML classifier to describe the case/control status of individuals. Of the 171 SNPs reported as associated with DTC, 15 were associated with DTC risk in both datasets 1 and 2. Using these markers, PRS showed that individuals in the fifth quintile had a seven-fold higher risk of DTC than those in the first quintile. Bayesian inference confirmed that the selected 15 SNPs differentiate cases from controls. Results were corroborated by ML, which reached a maximum AUC of about 0.7. A restricted selection of only 15 DTC-associated SNPs is able to describe the underlying genetic structure of Italian individuals, and ML allows a fair prediction of case or control status based solely on the individual genetic background.
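
The abstract does not describe how the PRS, the quintile comparison, or the AUC were computed. As a rough illustration only, the Python sketch below shows one common way to obtain such quantities: a PRS as a weighted sum of risk-allele counts, an odds ratio comparing the top and bottom PRS quintiles, and an AUC computed as the probability that a random case outscores a random control. The function names, the per-allele weights, and the simulated genotypes are all hypothetical and are not taken from the study.

```python
import math
import random


def polygenic_risk_score(genotypes, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 copies per SNP)."""
    if len(genotypes) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(g * w for g, w in zip(genotypes, weights))


def quintile_odds_ratio(scores, is_case):
    """Odds of being a case in the top PRS quintile vs. the bottom one."""
    ranked = sorted(zip(scores, is_case), key=lambda pair: pair[0])
    k = len(ranked) // 5
    bottom, top = ranked[:k], ranked[-k:]

    def odds(group):
        cases = sum(1 for _, case in group if case)
        controls = len(group) - cases
        # Haldane-Anscombe correction (+0.5) avoids division by zero.
        return (cases + 0.5) / (controls + 0.5)

    return odds(top) / odds(bottom)


def auc(scores, is_case):
    """Probability that a random case scores above a random control."""
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    # Each case/control pair counts 1 for a win and 0.5 for a tie.
    wins = sum((a > b) + 0.5 * (a == b) for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-allele log odds ratios for 15 SNPs (assumption).
    weights = [0.2] * 15
    scores, labels = [], []
    for _ in range(1000):
        genotype = [rng.choice((0, 1, 2)) for _ in weights]
        prs = polygenic_risk_score(genotype, weights)
        # Toy logistic link between PRS and case status (assumption).
        p_case = 1 / (1 + math.exp(-(prs - 3.0)))
        scores.append(prs)
        labels.append(rng.random() < p_case)
    print("top vs. bottom quintile OR:", round(quintile_odds_ratio(scores, labels), 2))
    print("AUC:", round(auc(scores, labels), 2))
```

In a real analysis the weights would typically come from per-SNP effect estimates, and the quintile odds ratio would usually be estimated with a regression model that adjusts for covariates; the sketch omits both.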
