Multimodal user interaction with in-car equipment in real conditions based on touch and speech modes in the Persian language

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Journal

MULTIMEDIA TOOLS AND APPLICATIONS

Volume 82, Issue 9, Pages 12995-13023

Publisher

SPRINGER

DOI: 10.1007/s11042-022-13784-1

Keywords

In-vehicle equipment; Multimodal user interface; Voice command detection; Hidden Markov model; Accessibility

Automated Summary

This paper presents a multimodal user interface design based on touch and speech modes for controlling in-car devices. The research collects a dataset of Persian voice commands in real conditions and aims to solve the technical challenges of speech-based interaction. Evaluation results show that the speech input mode is more efficient and less distracting for drivers than the touch input mode.

Abstract

Nowadays, communication with in-car equipment is performed via a large number of buttons or a touch screen. This increases the demand on the driver's visual attention and reduces the driver's concentration while driving. Speech-based interaction has been introduced in recent years as a way to reduce driver distraction. This input mode faces several technical challenges, such as the need to memorize voice commands and the difficulty of canceling them. This paper presents a multimodal user interface design, based on touch and speech modes, for controlling five in-car devices (radio, CD player or music player, fan, heater, and driver-side window). The research collects in-car voice commands in the Persian language in real conditions (in a real car and in the presence of background noise), first to create a dataset of Persian voice commands (given the lack of research in this area in Persian-speaking countries) and second to address the challenges mentioned above. To evaluate the proposed user interface, 15 participants performed ten different tasks using the speech and touch modes, with and without driving simulation. The evaluation results indicated that, with and without driving simulation respectively, the speech input mode had on average a smaller number of clicks for performing tasks (0.2 and 0.6), a smaller task completion time (7.37 and 3.3 seconds), smaller time intervals between clicks (8.2 and 5 seconds), and a smaller driver distraction rate (25.08%) in comparison to the touch input mode. Moreover, using two different input modes in the design of the in-vehicle user interface increases accessibility.