Optimized machine learning-collaborative filtering model for mastitis prediction in dairy cows
Abstract
Mastitis is a major disease affecting dairy cow health and milk production. This study established an integrated machine learning (ML) model that combines herd- and individual-level data to achieve efficient and balanced prediction of clinical mastitis. Data were collected from 5284 lactating Holstein cows on two farms in southern and northern China. Five feature processing methods were evaluated: recursive feature elimination (RFE), contrastive learning (CL), slopes and intercept, milk-conductivity ratio, and differences. These were tested with four ML algorithms: support vector machine (SVM), random forest (RF), XGBoost, and backpropagation neural network (BPNN). Among these combinations, the XGBoost model with the milk-conductivity ratio feature performed best, with a sensitivity of 0.81 and a specificity of 0.75. To narrow the remaining gap between sensitivity and specificity, collaborative filtering (CF) was introduced into the XGBoost model to incorporate both herd and individual cow information. The resulting XGBoost–CF model raised sensitivity to 0.83 and specificity to 0.87, improving the model's ability to identify both healthy and diseased cows. This integrated ML–CF framework provides an effective strategy for early mastitis prediction and offers practical support for intelligent dairy herd management and precision livestock farming.
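The two metrics reported above can be computed from a confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function name `sensitivity_specificity`, the label coding (1 = mastitis, 0 = healthy), and the toy labels are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical, not from the paper): 1 = mastitis, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # share of diseased cows correctly identified
    specificity = tn / (tn + fp)  # share of healthy cows correctly identified
    return sensitivity, specificity


# Toy labels invented for illustration; they are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both values matters when disease cases are rare: a model can reach high specificity simply by flagging few cows, so a balanced pair such as the reported 0.83 and 0.87 is more informative than either number alone.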
Keywords: mastitis prediction, machine learning, feature selection, XGBoost, collaborative filtering
DOI: 10.25165/j.ijabe.20261901.10304
Citation: Wu J Z, Liu Y T, Zheng Y J, Yuan X Y, Wang H Y, Yang S H, et al. Optimized machine learning-collaborative filtering model for mastitis prediction in dairy cows. Int J Agric & Biol Eng, 2026; 19(1): 21–25.
License
Copyright (c) 2026 International Journal of Agricultural and Biological Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.