Article

Machine learning-based screening of the diagnostic genes and their relationship with immune-cell infiltration in patients with lung adenocarcinoma

Journal

JOURNAL OF THORACIC DISEASE
Volume 14, Issue 3, Pages 699-+

Publisher

AME PUBLISHING COMPANY
DOI: 10.21037/jtd-22-206

Keywords

Lung adenocarcinoma (LUAD); immune-cell infiltration; diagnosis; bioinformatic analysis; machine learning

Funding

  1. National Natural Science Foundation of China [81970167, 81800108]

This study identified 7 DEGs in LUAD tissue that can be considered diagnostic genes based on 2 machine-learning regression methods. These findings are of great importance for the early diagnosis of LUAD. The study also revealed the correlation between immune-infiltrating cells and these diagnostic genes.
Background: Lung adenocarcinoma (LUAD) is the most common type of lung cancer and has a dismal mortality rate of 80%, mainly due to diagnosis at an advanced stage. Biomarkers with high specificity and sensitivity for the early diagnosis of LUAD are scarce. This study aimed to identify markers for the early diagnosis of LUAD.
Methods: The GSE32863 and GSE75037 data sets were standardized and merged to screen for differentially expressed genes (DEGs). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted. The DEGs selected by both the least absolute shrinkage and selection operator (LASSO) and support vector machine (SVM) regression analyses were considered the hub genes. The diagnostic ability and expression of the hub genes were then tested in the GSE63459 data set. Finally, CIBERSORT was used to analyze the correlation between the immune-infiltrating cells and the hub genes.
Results: The following 7 DEGs were selected by both the LASSO and SVM regression analyses: Locus 401286 (LOC401286), flavin-containing monooxygenase 2 (FMO2), XLKD1, Ras homolog family member J (RHOJ), scavenger receptor Class A member 5 (SCARA5), heat shock protein beta-2 (HSPB2), and serine incorporator 2 (SERINC2). In the training groups, the areas under the receiver operating characteristic curve (AUCs) of LOC401286, FMO2, XLKD1, RHOJ, SCARA5, HSPB2, and SERINC2 were 0.99, 1.00, 0.99, 1.00, 0.99, 0.99, and 0.98, respectively. In the validation group, the AUCs of LOC401286, FMO2, XLKD1, RHOJ, SCARA5, HSPB2, and SERINC2 were 0.97, 0.96, 0.94, 0.88, 0.85, 0.94, and 0.89, respectively. The infiltration levels of naive B cells, memory B cells, plasma cells, naive cluster of differentiation (CD) 4 T cells, T follicular helper cells, regulatory T cells, gamma delta T cells, monocytes, M0 macrophages, M1 macrophages, resting mast cells, activated mast cells, and neutrophils differed between the normal and tumor tissues. Notably, these immune cells were correlated with the above-mentioned 7 diagnostic genes.
Conclusions: We identified 7 DEGs in LUAD tissue that can be considered diagnostic genes based on 2 machine-learning regression methods, which could be very helpful for the early diagnosis of LUAD in clinical practice.
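The two steps the abstract relies on most, intersecting the genes picked by two selection methods and scoring each hub gene by its AUC, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `hub_genes` and `single_gene_auc`, the placeholder genes `GENE_X` and `GENE_Y`, and all expression values are hypothetical, and the AUC is computed here as a pairwise (Mann-Whitney style) statistic in plain Python rather than with whatever software the study used.

```python
from typing import Iterable, List, Sequence


def hub_genes(lasso_selected: Iterable[str], svm_selected: Iterable[str]) -> List[str]:
    """Return the genes chosen by both selection methods, sorted by name."""
    return sorted(set(lasso_selected) & set(svm_selected))


def single_gene_auc(expression: Sequence[float], is_tumor: Sequence[bool]) -> float:
    """AUC of one gene's expression as a tumor-vs-normal discriminator.

    Computed as the fraction of (tumor, normal) sample pairs in which the
    tumor sample has the higher expression, counting ties as half a pair.
    """
    tumor = [x for x, t in zip(expression, is_tumor) if t]
    normal = [x for x, t in zip(expression, is_tumor) if not t]
    if not tumor or not normal:
        raise ValueError("need at least one tumor and one normal sample")

    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    auc = wins / (len(tumor) * len(normal))

    # Assumption: direction does not matter for a diagnostic marker, so a gene
    # that is consistently lower in tumors scores as well as one that is higher.
    return max(auc, 1.0 - auc)


# Toy inputs (hypothetical, not taken from the study's data sets).
lasso = ["FMO2", "RHOJ", "GENE_X"]
svm = ["RHOJ", "GENE_Y", "FMO2"]
shared = hub_genes(lasso, svm)

toy_expression = [2.1, 1.8, 2.5, 6.0, 5.5, 7.2]
toy_labels = [True, True, True, False, False, False]
toy_auc = single_gene_auc(toy_expression, toy_labels)
```

In this toy case every tumor sample has lower expression than every normal sample, so the gene separates the groups perfectly. A real analysis would compute the AUC on a held-out data set, as the abstract does with GSE63459, because AUCs near 1.00 on the data used for selection tend to be optimistic.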
