Article

Transformers Meet Small Datasets

IEEE ACCESS (2022)

Journal

IEEE ACCESS

Volume 10, Pages 118454-118464

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2022.3221138

Keywords

Convolutional neural networks; small datasets; transformer; vision transformer

Funding

Major Projects of the National Social Science Foundation of China [20ZD279]

Automated Summary

This paper proposes a hybrid model that combines a transformer with a CNN to improve the classification ability of transformers on small datasets. By introducing more convolution operations, the model achieves state-of-the-art results on four small datasets, opening up new paths for applying transformers to small datasets.

Abstract

The success of vision transformers (ViTs) has greatly broadened the research and application areas of transformers. However, because pure transformer architectures lack the ability to capture local content, they cannot be trained directly on small datasets. In this work, we propose a new hybrid model that combines the transformer with a convolutional neural network (CNN) and improves classification ability on small datasets. This is accomplished by introducing more convolution operations into the transformer's two core sections: 1) instead of the original multi-head attention mechanism, we design a convolutional parameter sharing multi-head attention (CPSA) block that incorporates a convolutional parameter sharing projection into the attention mechanism; 2) the feed-forward network in each transformer encoder block is replaced with a local feed-forward network (LFFN) block, which introduces a sandglass block with more depth-wise convolutions to give the transformer more locality. When training from scratch on four small datasets, without extensive computing resources or auxiliary training, we achieve state-of-the-art results compared with both transformers and CNNs. The proposed strategy opens up new paths for the application of transformers on small datasets.
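To make the locality idea in the abstract more concrete, the following is a minimal pure-Python sketch of a depth-wise 3x3 convolution applied to transformer tokens after they are laid back onto their spatial grid. It is an illustration under stated assumptions, not the authors' LFFN or sandglass block: the function names (`depthwise_conv3x3`, `tokens_to_grid`, `local_mixing`), the 3x3 kernel size, the zero padding, and the residual connection are all hypothetical choices made for this example.

```python
# Illustrative sketch only; this is NOT the paper's LFFN implementation.
# Assumption: the N = H * W patch tokens are arranged row by row on an H x W grid.


def tokens_to_grid(tokens, height, width):
    """Reshape a flat list of H*W token vectors into an H x W grid of vectors."""
    assert len(tokens) == height * width
    return [tokens[r * width:(r + 1) * width] for r in range(height)]


def grid_to_tokens(grid):
    """Flatten an H x W grid of token vectors back into a list of tokens."""
    return [tok for row in grid for tok in row]


def depthwise_conv3x3(grid, kernels):
    """Filter each channel independently with its own 3x3 kernel.

    grid:    H x W x C nested lists.
    kernels: C kernels, each a 3x3 nested list (one kernel per channel).
    Zero padding keeps the output shape equal to the input shape.
    """
    height, width, channels = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for c in range(channels):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < height and 0 <= xx < width:
                            acc += kernels[c][dy + 1][dx + 1] * grid[yy][xx][c]
                out[y][x][c] = acc
    return out


def local_mixing(tokens, height, width, kernels):
    """Add depth-wise-filtered neighbour information to each token (residual)."""
    grid = tokens_to_grid(tokens, height, width)
    mixed = grid_to_tokens(depthwise_conv3x3(grid, kernels))
    return [[t + m for t, m in zip(tok, mix)] for tok, mix in zip(tokens, mixed)]


if __name__ == "__main__":
    # A 3 x 3 grid of 2-channel tokens, with only the centre token non-zero.
    tokens = [[0.0, 0.0] for _ in range(9)]
    tokens[4] = [1.0, 2.0]
    box = [[1.0] * 3 for _ in range(3)]          # channel 0: sum over neighbours
    identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]  # channel 1
    for token in local_mixing(tokens, 3, 3, [box, identity]):
        print(token)
```

With the box kernel on channel 0, the centre token's value spreads to all of its spatial neighbours, which is the kind of local interaction that a position-wise feed-forward network cannot produce by itself. Because each channel has its own kernel and channels are never mixed, the parameter count is 9 per channel, which is why depth-wise convolutions are a comparatively cheap way to add locality.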
