Article

Data-Driven Intelligence System for General Recommendations of Deep Learning Architectures

Journal

IEEE ACCESS
Volume 9, Pages 148710-148720

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3124633

Keywords

Data models; Training; Architecture; Deep learning; Computer architecture; Adaptation models; Intelligent systems; intelligent system; hyperparameters selection; DL architecture selection; multi-label classification

Funding

  1. Slovenian Research Agency [P2-0098, Z2-1867]

Abstract

This paper proposes a novel approach that gives general recommendations for an optimal deep learning (DL) architecture and its hyperparameters, based on an analysis of thousands of published research papers. NLP methods convert the unstructured text of scientific papers into structured data, from which intelligent models learn to propose a suitable DL architecture, layer types, and activation functions. The methodology leverages the knowledge in numerous DL papers, helps researchers select an optimal DL setup for a specific problem, and is scalable and flexible enough to keep improving as new papers are published.
Choosing an optimal Deep Learning (DL) architecture and hyperparameters for a particular problem remains a nontrivial task for researchers. The most common approach relies on popular architectures that have been proven to work in specific problem domains under the same experimental environment and setup. However, this limits the opportunity to choose or invent novel DL networks that could lead to better results. This paper proposes a novel approach for providing general recommendations of an appropriate DL architecture and its hyperparameters, based on the different configurations presented in thousands of published research papers that examine various problem domains. The recommended architecture can then serve as a starting point for investigating DL architectures for a concrete data set. Natural language processing (NLP) methods are used to create structured data from unstructured scientific papers, on which intelligent models are trained to propose an optimal DL architecture, layer types, and activation functions. The proposed methodology has three advantages. The first is the ability to use the knowledge and experience from thousands of DL papers published over the years. The second is its contribution to future research by aiding the choice of an optimal DL setup for the particular problem to be analyzed. The third is the scalability and flexibility of the model: it can be easily retrained as new papers are published and can therefore be continually improved.
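
The pipeline described in the abstract (unstructured paper text, then structured records, then multi-label recommendation) can be illustrated with a toy sketch. This is not the authors' implementation: simple keyword matching stands in for the NLP extraction step, a k-nearest-neighbour vote over Jaccard similarity stands in for the learned multi-label models, and the vocabularies, stop-word list, function names, and example sentences are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical label vocabularies; the abstract does not list the real ones.
ARCHITECTURES = {"cnn", "rnn", "lstm", "transformer", "autoencoder"}
ACTIVATIONS = {"relu", "tanh", "sigmoid", "softmax"}
STOPWORDS = {"a", "an", "and", "of", "in", "with", "using", "the"}

def tokenize(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def to_record(paper_text):
    """Turn unstructured text into a structured record (features + labels)."""
    tokens = tokenize(paper_text)
    labels = tokens & (ARCHITECTURES | ACTIVATIONS)
    return {"features": tokens - labels, "labels": labels}

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(problem_text, records, k=2, min_votes=2):
    """Multi-label vote: keep each label used by at least min_votes
    of the k records most similar to the problem description."""
    query = tokenize(problem_text)
    nearest = sorted(records, key=lambda r: jaccard(query, r["features"]),
                     reverse=True)[:k]
    votes = Counter(label for r in nearest for label in r["labels"])
    return {label for label, n in votes.items() if n >= min_votes}

# Made-up stand-ins for paper text, for illustration only.
papers = [
    "Image classification of chest x-ray scans with a CNN using ReLU",
    "Object detection in street images using a CNN with ReLU and softmax",
    "Sentiment analysis of text reviews with an LSTM using tanh",
    "Machine translation of text with a transformer and softmax",
]
records = [to_record(p) for p in papers]
result = recommend("classification of medical images", records)
print(sorted(result))
```

Because each record is built independently from its source text, adding newly published papers only means appending records, which mirrors the retraining flexibility the abstract claims for the real system.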
