Towards Robust Waveform-Based Acoustic Models

Article Acoustics

Learning Waveform-Based Acoustic Models Using Deep Variational Convolutional Neural Networks

Dino Oglic et al.

Summary: This study explores stochastic neural networks for learning waveform-based acoustic models, combining deep convolutional neural networks with stochastic variational inference. The authors report superior empirical performance.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Frame-Level SpecAugment for Deep Convolutional Neural Networks in Hybrid ASR Systems

Xinwei Li et al.

Summary: Inspired by SpecAugment, a frame-level SpecAugment method (f-SpecAugment) is proposed to improve the performance of deep CNNs for hybrid HMM ASR systems. By applying transformations to each convolution window independently during training, f-SpecAugment reduces WER across different ASR tasks and is shown to be effective even with large training data, with benefits comparable to doubling the amount of training data for deep CNNs.

2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT) (2021)

Proceedings Paper Acoustics

Multi-Scale Octave Convolutions for Robust Speech Recognition

Joanna Rownicka et al.

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Proceedings Paper Computer Science, Artificial Intelligence

Multi-Modal Data Augmentation for End-to-End ASR

Adithya Renduchintala et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Article Computer Science, Artificial Intelligence