Article

An XGBoost-based physical fitness evaluation model using advanced feature selection and Bayesian hyper-parameter optimization for wearable running monitoring

Journal

COMPUTER NETWORKS
Volume 151, Pages 166-180

Publisher

ELSEVIER
DOI: 10.1016/j.comnet.2019.01.026

Keywords

Smart wearables; Physical fitness evaluation model; PPG signal; Advanced feature selection; XGBoost; Bayesian hyper-parameter optimization

Funding

  1. National Natural Science Foundation of China [61401029]
  2. Educational Big Data R&D and its Application - Major Big Data Engineering Project of National Development and Reform Commission
  3. Beijing Advanced Innovation Center for Future Education [BJAICFE2016IR-004]

Abstract

Advances in the Internet of Things, bio-sensing and data mining have drawn increasing attention to smart wearable technologies for monitoring teenagers' sport and health. Although current wearable products on the market have powerful data-acquisition capabilities, they extract little valuable knowledge from the data because they lack accurate computational models and in-depth data analysis. To address this, this paper proposes a machine-learning-based physical fitness evaluation model for wearable running monitoring of teenagers, built on a variant of the gradient boosting machine (GBM) combined with advanced feature selection and Bayesian hyper-parameter optimization. To begin with, we design a dedicated experimental paradigm for data acquisition based on a conventional running activity, in which photoplethysmography (PPG) signals from a group of teenagers are collected at different testing stages by a set of smartbands we developed ourselves. Next, the PPG signals are processed in four steps that correspond to the four modules of the proposed model: signal preprocessing, physiological data estimation, feature engineering and classification. Firstly, the signal preprocessing module suppresses noise and removes baseline drift in the PPG signals by using a smoothness prior approach (SPA) and a median filter (MF), respectively. Secondly, the physiological data estimation module converts the PPG signals into physiological data such as heart rate (HR) and blood oxygen saturation (SpO2). Thirdly, the feature engineering module extracts from the physiological data a group of key features closely related to physical fitness status, and then applies a novel advanced feature selection scheme that uses Pearson correlation and importance score ranking based sequential forward search (PC-ISR-SFS). Fourthly, the classification module uses an extreme gradient boosting (XGBoost) algorithm to classify each teenager's physical fitness level, with hyper-parameters adaptively tuned by Bayesian optimization. Experimental results demonstrate that the proposed model not only achieves higher evaluation accuracy than the existing reference models, but also offers a promising approach to future physical fitness evaluation for teenagers, replacing manual calculation based on traditional empirical models with intelligent computing based on machine learning models. (C) 2019 Elsevier B.V. All rights reserved.
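
The feature selection step named in the abstract (Pearson correlation plus importance score ranking based sequential forward search) can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the function names, the 0.9 correlation threshold, the toy feature names and values, the importance scores (which in practice might come from a boosted-tree model) and the toy scoring function (which would normally be a cross-validated accuracy). It is not the authors' implementation.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(
    columns: Dict[str, Sequence[float]],
    importance: Dict[str, float],
    score: Callable[[List[str]], float],
    max_corr: float = 0.9,  # assumed threshold, not taken from the paper
) -> List[str]:
    # Step 1: rank features by importance score, highest first.
    ranked = sorted(columns, key=lambda name: importance[name], reverse=True)

    # Step 2: drop a feature if it is strongly correlated with a
    # higher-ranked feature that has already been kept.
    kept: List[str] = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < max_corr for k in kept):
            kept.append(name)

    # Step 3: sequential forward search in ranked order; a feature is
    # added only if it improves the score of the current subset.
    selected: List[str] = []
    best = float("-inf")
    for name in kept:
        trial = score(selected + [name])
        if trial > best:
            selected.append(name)
            best = trial
    return selected


# Toy data: hypothetical per-subject features (values are made up).
columns = {
    "mean_hr": [1.0, 2.0, 3.0, 4.0],
    "max_hr": [2.0, 4.0, 6.0, 8.0],      # perfectly correlated with mean_hr
    "spo2_drop": [4.0, 1.0, 3.0, 2.0],
    "recovery": [3.0, 1.0, 4.0, 1.0],
}
importance = {"mean_hr": 0.5, "max_hr": 0.3, "spo2_drop": 0.15, "recovery": 0.05}

# Stand-in for a cross-validated accuracy: a per-feature gain minus a
# small penalty for every feature used.
gain = {"mean_hr": 0.6, "max_hr": 0.6, "spo2_drop": 0.2, "recovery": 0.01}


def toy_score(subset: List[str]) -> float:
    return sum(gain[f] for f in subset) - 0.05 * len(subset)


selected = select_features(columns, importance, toy_score)
print(selected)
```

In this toy run the redundant feature is removed by the correlation filter before the forward search starts, and the forward search then stops adding features once the score no longer improves. With a real classifier, `toy_score` would be replaced by a function that trains and validates the model on the candidate subset.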
