4.6 Article

DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering

Journal

APPLIED SCIENCES-BASEL
Volume 11, Issue 23

Publisher

MDPI
DOI: 10.3390/app112311251

Keywords

BERT; GPT-2; XGBoost; T5-Small; medical question answering; transfer learning

With the outbreak of COVID-19, demand for disease knowledge has grown. In response, this work proposes a BERT medical pretraining model that uses GPT-2 for question augmentation and T5-Small for topic extraction. The model computes the cosine similarity of the extracted topics and uses XGBoost for prediction, and the authors report strong performance on medical question answering and question generation tasks.
The COVID-19 outbreak has prompted an increased focus on self-care, and more and more people hope to obtain disease knowledge from the Internet. In response to this demand, medical question answering and question generation tasks have become an important part of natural language processing (NLP). However, samples of medical questions and answers are limited, and existing question generation systems cannot fully meet the needs of non-professionals asking medical questions. In this research, we propose a BERT medical pretraining model that uses GPT-2 for question augmentation and T5-Small for topic extraction, calculates the cosine similarity of the extracted topics, and uses XGBoost for prediction. With augmentation using GPT-2, the prediction accuracy of our model exceeds that of the state-of-the-art (SOTA) model. Our experimental results demonstrate the strong performance of our model in medical question answering and question generation tasks, and its potential to address other biomedical question answering challenges.
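
The cosine-similarity step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors below are hypothetical stand-ins for topic embeddings (the abstract does not say how topics are vectorized), and the function and variable names are assumptions made for illustration. In a pipeline like the one described, a score of this kind could serve as an input feature to a classifier such as XGBoost, which is not shown here.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector has zero norm, to avoid division by zero.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumed values, not taken from the paper).
question_topic = [0.2, 0.7, 0.1]
answer_topic_relevant = [0.25, 0.65, 0.05]
answer_topic_unrelated = [0.9, 0.05, 0.4]

# A higher score means the two topic vectors point in a more similar direction.
sim_relevant = cosine_similarity(question_topic, answer_topic_relevant)
sim_unrelated = cosine_similarity(question_topic, answer_topic_unrelated)
```

With these toy values, the topically close answer scores higher than the unrelated one, which is the signal a downstream classifier would be expected to exploit.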