Article

Partitioning Proportion and Pretreatment Method of Infrared Spectral Dataset

Journal

CHINESE JOURNAL OF ANALYTICAL CHEMISTRY
Volume 50, Issue 9, Pages 1415-1424

Publisher

SCIENCE PRESS
DOI: 10.19756/j.issn.0253-3820.221001

Keywords

Infrared spectrum; Hemoglobin; Partitioning dataset; Pretreatment method

Funding

  1. National Natural Science Foundation of China [61501526, 61178087]
  2. Fundamental Research Funds for the Central Universities, South-Central MinZu University [CZQ22006]

This study investigated the predictive ability of partial least squares (PLS) models built with different dataset partitioning methods, and examined how various pretreatment methods affect the prediction accuracy of a PLS quantitative analysis model, using spectral data from blood samples and imitation solution samples with different hemoglobin concentrations. The SPXY method was the optimal dataset partitioning method for both datasets, and the best pretreatment combinations were S_G1 + WT for blood samples and SNV + WT for imitation solution samples.
Hemoglobin is an important physiological index of the human body, and abnormal hemoglobin concentrations are associated with various diseases. Near-infrared spectroscopy can be used to measure hemoglobin content in the human body quickly and without reagents. However, infrared spectral bands overlap severely, the proportion of useful information is low, and the spectra are vulnerable to external noise. It is therefore usually necessary to partition and pretreat the spectral data before establishing a quantitative model, so that interfering information does not degrade the prediction model. How to choose the best partitioning method, the best partitioning proportion, and the best pretreatment methods remains an open problem. To address these issues, this work took the spectral data of 190 blood samples and 150 imitation solution samples with different hemoglobin concentrations as the research object, and studied the predictive ability of partial least squares (PLS) models under four dataset partitioning methods, namely equal interval division, Kennard-Stone (K_S), sample set partitioning based on joint X-Y distances (SPXY), and the duplex algorithm (Duplex), at 41 different partitioning proportions. Four pretreatments, wavelet transform (WT), standard normal variate (SNV), direct orthogonal signal correction (DOSC), and Savitzky-Golay first-order derivative (S_G1), were arranged into 65 pretreatment combinations (order considered), and the influence of these 65 combinations on the prediction accuracy of the PLS quantitative analysis model was studied. The experimental results indicated that SPXY was the optimal dataset partitioning method for the PLS models of both datasets, with an optimal partitioning proportion of 0.48 for the blood samples and 0.90 for the imitation solution samples.
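As an illustration of the SPXY partitioning named above, the following is a minimal sketch, not the authors' implementation. It follows the commonly described procedure: spectral (X) and concentration (y) distances are each normalized by their maximum and summed, and samples are then picked Kennard-Stone style (the two most distant samples first, then repeatedly the sample farthest from those already chosen). The function names are hypothetical, and treating the "partitioning proportion" as the fraction of samples assigned to the calibration set is an assumption, since the abstract does not define it.

```python
import math


def pairwise(rows):
    """Euclidean distance matrix for a list of equal-length numeric rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(rows[i], rows[j])
    return d


def spxy_split(X, y, proportion):
    """Return (calibration_indices, prediction_indices).

    X: list of spectra (lists of floats); y: list of reference concentrations.
    proportion: ASSUMED to be the calibration fraction of the whole dataset.
    """
    n = len(X)
    n_cal = min(n, max(2, round(proportion * n)))

    dx = pairwise(X)
    dy = pairwise([[v] for v in y])
    max_dx = max(map(max, dx)) or 1.0  # guard against all-identical samples
    max_dy = max(map(max, dy)) or 1.0

    # Joint X-Y distance: each part is normalized so both contribute equally.
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]

    # Seed the calibration set with the two most distant samples.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    i0, j0 = max(pairs, key=lambda p: d[p[0]][p[1]])
    selected = [i0, j0]
    remaining = set(range(n)) - {i0, j0}

    # Max-min rule: add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_cal and remaining:
        nxt = max(sorted(remaining),
                  key=lambda k: min(d[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)

    return selected, sorted(remaining)


# Toy data only (four one-point "spectra"); not taken from the paper.
X_demo = [[0.0], [1.0], [2.0], [10.0]]
y_demo = [0.0, 1.0, 2.0, 10.0]
cal_idx, pred_idx = spxy_split(X_demo, y_demo, 0.5)
print(cal_idx, pred_idx)
```

Dropping the `dy` term from the joint distance reduces the same loop to plain Kennard-Stone selection on the spectra alone, which is the main difference between the K_S and SPXY methods compared in the study.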
Among the 65 pretreatment combinations, the best for the blood samples was S_G1 + WT, with a correlation coefficient of the prediction set (R_p) of 0.9808 and a root mean square error of the prediction set (RMSEP) of 0.2701. The best for the imitation solution samples was SNV + WT, with an R_p of 0.9952 and an RMSEP of 3.8154. Stacking two algorithms gave a better denoising effect than a single algorithm. These results provide an approach and a method for processing this kind of spectral data.
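The count of 65 order-sensitive combinations and the two reported metrics can be sketched as follows. The abstract does not say how the 65 is composed; one reading that reproduces it, assumed here, is every ordered sequence of distinct methods drawn from the four pretreatments (4 + 12 + 24 + 24 = 64) plus the unprocessed spectra. The helper names are hypothetical, SNV is written in its usual per-spectrum form (subtract the mean, divide by the sample standard deviation), and the WT, DOSC, and Savitzky-Golay steps themselves are not implemented.

```python
import math
from itertools import permutations

# Labels follow the abstract; S_G1 is the Savitzky-Golay first derivative.
METHODS = ("WT", "SNV", "DOSC", "S_G1")


def ordered_combinations(methods):
    """All ordered sequences of distinct methods, plus () for raw spectra.

    ASSUMPTION: this composition is one way to arrive at 65; the source
    does not spell it out.
    """
    combos = [()]
    for k in range(1, len(methods) + 1):
        combos.extend(permutations(methods, k))
    return combos


def snv(spectrum):
    """Standard normal variate: center and scale a single spectrum."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def rmsep(y_true, y_pred):
    """Root mean square error over the prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(a, b):
    """Pearson correlation between reference and predicted values (R_p)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


combos = ordered_combinations(METHODS)
print(len(combos))
```

Because order is considered, `("S_G1", "WT")` and `("WT", "S_G1")` are separate entries in `combos`, so each would be evaluated with its own PLS model and its own R_p and RMSEP.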
