Article

Multimodal user interaction with in-car equipment in real conditions based on touch and speech modes in the Persian language

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 82, Issue 9, Pages 12995-13023

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13784-1

Keywords

In-vehicle equipment; Multimodal user interface; Voice command detection; Hidden Markov model; Accessibility

This paper presents a multimodal user interface design based on touch and speech modes for controlling in-car devices. The research collects a dataset of Persian voice commands in real conditions and aims to solve the technical challenges of speech-based interaction. Evaluation results show that the speech input mode is more efficient and less distracting for drivers compared to the touch input mode.
Nowadays, communication with in-car equipment is performed via a large number of buttons or a touch screen. This increases the demand on the driver's visual attention and reduces the driver's concentration while driving. Speech-based interaction has been introduced in recent years as a way to reduce driver distraction. This input mode faces several technical challenges, such as the need to memorize voice commands and the difficulty of canceling them. This paper presents a multimodal user interface design based on touch and speech modes for controlling five in-car devices (radio, CD player or music player, fan, heater, and driver-side window). The research collects a dataset of in-car voice commands in the Persian language in real conditions (in a real car and in the presence of background noise), first to create a dataset of Persian voice commands (given the lack of research in this area in Persian-speaking countries) and second to address the challenges mentioned above. To evaluate the proposed user interface, 15 participants performed ten different tasks using the speech and touch modes, with and without driving simulation. The evaluation results indicated that, in comparison to the touch input mode, the speech input mode with and without driving simulation had, on average, a smaller number of clicks per task (0.2 and 0.6, respectively), a shorter task completion time (7.37 and 3.3 seconds), shorter time intervals between clicks (8.2 and 5 seconds), and a lower driver distraction rate (25.08%). Moreover, using two different input modes in the design of the in-vehicle user interface increases accessibility.
