4.7 Article

Can unsupervised learning methods applied to milk recording big data provide new insights into dairy cow health?

Journal

JOURNAL OF DAIRY SCIENCE
Volume 105, Issue 8, Pages 6760-6772

Publisher

ELSEVIER SCIENCE INC
DOI: 10.3168/jds.2022-21975

Keywords

big data; animal health; unsupervised learning; milk; mid-infrared

Funding

  1. Walloon Government (Service Public de Wallonie, Namur, Belgium) [D31-1390]
  2. European Union (Brussels, Belgium) [613689]
  3. INTERREG NWE HappyMoo project [NWE 730]
  4. National Fund for Scientific Research (F.R.S.-FNRS, Brussels, Belgium) [T.0095.19, J.0174.18]

In this study, a holistic approach using big data from milk recording was proposed to assess dairy cow health status. By analyzing the data with unsupervised learning algorithms and validating the results, the study showed the potential to monitor dairy cows on a large scale and detect health disorders.
Among the dairy sector's current concerns, the assessment of global animal health status is a complex challenge. Its multidimensionality means that global monitoring tools are rarely considered. Instead, specific disease detection is often studied separately and, due to financial and ethical issues, uses small-scale data sets focusing on few biomarkers. Several studies have already been conducted using milk Fourier transform mid-infrared (FT-MIR) spectroscopy to detect mastitis and lameness or to quantify health-related biomarkers in milk or blood. Those studies are relevant, but they focus mainly on one biomarker or disease. To address both this narrow focus and the reliance on small-scale data sets, in this study we proposed a holistic approach using big data obtained from milk recording, including milk yield, somatic cell count, and 27 FT-MIR-based predictors related to milk composition and animal health status. Using 740,454 records collected from 114,536 first-parity Holstein cows in southern Belgium, we repeatedly applied unsupervised learning algorithms based on Ward's agglomerative hierarchical clustering method to find potentially interesting patterns. A divide-and-conquer approach was used to overcome the limitation of computational resources in clustering a relatively large data set. Five groups of records were identified. Differences observed in the fourth group suggested a relationship to metabolic disorders. The fifth group seemed to be related to mastitis. In a second step, we performed a partial least squares discriminant analysis (PLS-DA) to predict the probability of belonging to those specific groups for the entire data set. The obtained global accuracy was 0.77, and the balanced accuracy (i.e., the mean of sensitivity and specificity) of discriminating the fourth and fifth groups was 0.88 and 0.96, respectively. Then, the interpretation of those groups was validated using 204 milk and blood reference records. The predicted probability associated with metabolic disorders had significant correlations of 0.54 with blood beta-hydroxybutyrate, 0.44 with blood nonesterified fatty acids, -0.32 with blood glucose, -0.23 with milk glucose-6-phosphate, and 0.38 with milk isocitrate. In contrast, the predicted probability of belonging to the mastitis group had correlations of 0.69 with milk lactate dehydrogenase, 0.46 with milk N-acetyl-beta-D-glucosaminidase, -0.18 with milk free glucose, and 0.16 with milk glucose-6-phosphate. Consequently, these results suggest that the obtained quantitative traits indirectly reflect some of the main health disorders in dairy farming and could be used to monitor dairy cows on a large scale. By using unsupervised learning on large-scale milk recording data and then validating the patterns using reference laboratory measures, we propose a new approach to quickly assess dairy cow health status.
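The abstract defines balanced accuracy as the mean of sensitivity and specificity when discriminating one group from the rest. The sketch below illustrates that definition only; the function name `balanced_accuracy` and the toy labels are assumptions made for illustration, not code or data from the study.

```python
def balanced_accuracy(y_true, y_pred, positive):
    """One-vs-rest balanced accuracy: mean of sensitivity and specificity."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # record of the target group, correctly assigned
            else:
                fn += 1  # record of the target group, missed
        else:
            if pred == positive:
                fp += 1  # other record wrongly assigned to the target group
            else:
                tn += 1  # other record correctly kept out of the target group
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative record")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy group labels (hypothetical, not taken from the study).
y_true = [4, 4, 4, 4, 1, 2, 3, 5, 1, 2]
y_pred = [4, 4, 4, 1, 1, 2, 4, 5, 1, 2]
score = balanced_accuracy(y_true, y_pred, positive=4)
print(round(score, 3))
```

Unlike global accuracy, this measure weights the target group and all remaining records equally, which matters when the group of interest is small relative to the full data set.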
