
Prior information-assisted integrative analysis of multiple datasets


Analyzing genetic data with limited sample size and high dimensionality is a challenge in biomedical research. We propose incorporating prior information and using a convolutional neural network to label textual information from previous studies, which improves the accuracy of integrative analysis of multiple genetic datasets. Our simulation studies show satisfactory performance and we further analyze data on skin cutaneous melanoma to establish practical utility.
Motivation: Analyzing genetic data to identify markers and construct predictive models is of great interest in biomedical research. However, limited by cost and sample availability, genetic studies often suffer from the "small sample size, high dimensionality" problem. To tackle this problem, an integrative analysis that collectively analyzes multiple datasets with compatible designs is often conducted. Penalization and other regularization techniques are routinely adopted to regularize estimation and select relevant variables. However, blindly searching over a vast number of variables may not be efficient.

Results: We propose incorporating prior information to assist integrative analysis of multiple genetic datasets. To obtain accurate prior information, we adopt a convolutional neural network with an active learning strategy to label textual information from previous studies. The extracted prior information is then incorporated using a group LASSO-based technique. We conducted a series of simulation studies that demonstrated the satisfactory performance of the proposed method. Finally, data on skin cutaneous melanoma are analyzed to establish practical utility.

Availability and implementation: Code is available at https://github.com/ldz7/PAIA. The data that support the findings in this article are openly available in TCGA (The Cancer Genome Atlas) at https://portal.gdc.cancer.gov/.
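To make the idea of a prior-assisted group LASSO more concrete, here is a minimal sketch. It is not the authors' PAIA implementation. It assumes a linear model fitted by least squares in each dataset, groups the coefficients of one gene across all datasets, and gives genes flagged by prior information a smaller penalty weight. The function names, the `prior_mask` input, the `prior_weight` value, and the proximal gradient solver are all illustrative assumptions.

```python
import numpy as np


def group_soft_threshold(B, thresholds):
    """Shrink the L2 norm of each row of B by the matching threshold.

    Rows whose norm falls below the threshold are set exactly to zero.
    """
    norms = np.linalg.norm(B, axis=1)
    scale = np.maximum(0.0, 1.0 - thresholds / np.maximum(norms, 1e-12))
    return B * scale[:, None]


def fit_prior_weighted_group_lasso(Xs, ys, prior_mask, lam=0.1,
                                   prior_weight=0.2, n_iter=500):
    """Proximal gradient descent for a prior-weighted group LASSO (sketch).

    Minimizes
        sum_m ||y_m - X_m B[:, m]||^2 / (2 n_m) + lam * sum_j w_j ||B[j, :]||_2
    where row j of B holds the coefficients of variable j in every dataset,
    and w_j = prior_weight (assumed < 1) if prior_mask[j] else 1.
    """
    p, n_datasets = Xs[0].shape[1], len(Xs)
    B = np.zeros((p, n_datasets))
    weights = np.where(prior_mask, prior_weight, 1.0)
    # Step size from the largest Lipschitz constant of the per-dataset losses.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Gradient of the smooth least-squares part, one column per dataset.
        grad = np.column_stack([
            X.T @ (X @ B[:, m] - y) / X.shape[0]
            for m, (X, y) in enumerate(zip(Xs, ys))
        ])
        # Gradient step followed by row-wise (per-variable) group shrinkage.
        B = group_soft_threshold(B - step * grad, step * lam * weights)
    return B
```

In this sketch, a variable is either selected in all datasets or in none, and a smaller weight makes prior-supported variables easier to select. How the prior labels are produced (the convolutional neural network with active learning described above) is outside the scope of the example.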

