4.7 Article

Can unsupervised learning methods applied to milk recording big data provide new insights into dairy cow health?

Journal

JOURNAL OF DAIRY SCIENCE
Volume 105, Issue 8, Pages 6760-6772

Publisher

ELSEVIER SCIENCE INC
DOI: 10.3168/jds.2022-21975

Keywords

big data; animal health; unsupervised learning; milk; mid-infrared

Funding

  1. Walloon Government (Service Public de Wallonie, Namur, Belgium) [D31-1390]
  2. European Union (Brussels, Belgium) [613689]
  3. INTERREG NWE HappyMoo project [NWE 730]
  4. National Fund for Scientific Research (F.R.S.-FNRS, Brussels, Belgium) [T.0095.19, J.0174.18]

In this study, a holistic approach using big data from milk recording was proposed to assess dairy cow health status. By analyzing the data with unsupervised learning algorithms and validating the results, the study showed the potential to monitor dairy cows on a large scale and detect health disorders.
Among the dairy sector's current concerns, the assessment of global animal health status is a complex challenge. Its multidimensionality means that global monitoring tools are rarely considered. Instead, detection of specific diseases is often studied separately and, due to financial and ethical issues, relies on small-scale data sets focusing on few biomarkers. Several studies have already used milk Fourier transform mid-infrared (FT-MIR) spectroscopy to detect mastitis and lameness or to quantify health-related biomarkers in milk or blood. Those studies are relevant, but they focus mainly on one biomarker or disease. To address both this narrow focus and the small size of the data sets, in this study we proposed a holistic approach using big data obtained from milk recording, including milk yield, somatic cell count, and 27 FT-MIR-based predictors related to milk composition and animal health status. Using 740,454 records collected from 114,536 first-parity Holstein cows in southern Belgium, we performed repeated unsupervised learning algorithms based on Ward's agglomerative hierarchical clustering method to find potentially interesting patterns. A divide-and-conquer approach was used to overcome the limitation of computational resources in clustering a relatively large data set. Five groups of records were identified. Differences observed in the fourth group suggested a relationship to metabolic disorders. The fifth group seemed to be related to mastitis. In a second step, we performed a partial least squares discriminant analysis (PLS-DA) to predict the probability of belonging to those specific groups for the entire data set. The obtained global accuracy was 0.77, and the balanced accuracy (i.e., the mean of sensitivity and specificity) of discriminating the fourth and fifth groups was 0.88 and 0.96, respectively. Then, the interpretation of those groups was validated using 204 milk and blood reference records.
The predicted probability of belonging to the metabolic disorders group had significant correlations of 0.54 with blood beta-hydroxybutyrate, 0.44 with blood nonesterified fatty acids, -0.32 with blood glucose, -0.23 with milk glucose-6-phosphate, and 0.38 with milk isocitrate. In contrast, the predicted probability of belonging to the mastitis group had correlations of 0.69 with milk lactate dehydrogenase, 0.46 with milk N-acetyl-beta-D-glucosaminidase, -0.18 with milk free glucose, and 0.16 with milk glucose-6-phosphate. Consequently, these results suggest that the obtained quantitative traits indirectly reflect some of the main health disorders in dairy farming and could be used to monitor dairy cows on a large scale. By using unsupervised learning on large-scale milk recording data and then validating the pattern using reference laboratory measures, we propose a new approach to quickly assess dairy cow health status.
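
The abstract defines balanced accuracy as the mean of sensitivity and specificity when discriminating one group from the rest. As a minimal illustrative sketch of that definition (not the authors' code), the function below computes it for one group versus all others. The function name `balanced_accuracy`, the `positive` parameter, and the toy labels are assumptions made for this example; they are not taken from the paper or its data.

```python
def balanced_accuracy(y_true, y_pred, positive):
    """Mean of sensitivity and specificity for one group versus all others."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # record of the group, correctly assigned to it
            else:
                fn += 1  # record of the group, assigned elsewhere
        else:
            if pred == positive:
                fp += 1  # record of another group, wrongly assigned to the group
            else:
                tn += 1  # record of another group, correctly kept out
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative record")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy labels (group numbers 1 to 5), invented for illustration only.
y_true = [5, 5, 5, 5, 1, 2, 3, 4, 1, 2]
y_pred = [5, 5, 5, 1, 1, 2, 3, 4, 5, 2]
score = balanced_accuracy(y_true, y_pred, positive=5)
print(round(score, 3))
```

Because it averages the two rates, this metric is not inflated when the group of interest is rare, which is presumably why it is reported alongside the global accuracy.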
