Automatic body condition scoring system for dairy cows based on depth-image analysis

Body condition score (BCS) is an important management tool in the modern dairy industry, and one of the basic techniques for animal welfare and precision dairy farming. The objective of this study was to use a vision system to evaluate the fat cover on the back of cows and to automatically determine BCS. A 3D camera was used to capture the depth images of the back of cows twice a day as each cow passed beneath the camera. Through background subtraction, the back area of the cow was extracted from the depth image. The thurl, sacral ligament, hook bone, and pin bone were located via depth image analysis and evaluated by calculating their visibility and curvature, and those four anatomical features were used to measure fatness. A dataset containing 4820 depth images of cows with 7 BCS levels was built, among which 952 images were used as training data. Taking four anatomical features as input and BCS as output, decision tree learning, linear regression, and BP network were calibrated on the training dataset and tested on the entire dataset. On average, the BP network model scored each cow within 0.25 BCS points compared to their manual scores during the study period. The measured values of visibility and curvature used in this study have strong correlations with BCS and can be used to automatically assess BCS with high accuracy. This study demonstrates that the automatic body condition scoring system has the possibility of being more accurate than human scoring.


Introduction 
The metabolizable energy stored in fat and muscle is vital to maintaining dairy cows. Body weight alone is not a good representative of bodily energy reserves, as the relationship between these variables is affected by parity, stage of lactation, frame size, gestation, and breed [1][2][3][4] . Body condition was defined as the ratio of body fat to nonfat components in the body [5] . Because direct measurements of body adiposity are difficult and expensive, multiple body condition score (BCS) evaluation systems have been developed to indicate and evaluate the relative amount of subcutaneous body fat or energy reserves of a live cow [6] .
BCS evaluations can be used to determine whether a cow is in the proper condition for each stage of the lactation cycle. Using BCS information, appropriate dietary changes can be made to maximize the performance of cows [7] . Cows with unfavorable BCS are at high risk of metabolic and other diseases in the peripartum period. At calving, BCS must be sufficient to allow maximal milk production and health, but excessive BCS at this stage may result in calving difficulties and animal losses [6] . In early lactation, BCS is a vital indicator of excessive weight loss, which can lead to metabolic disorders and should be avoided [8] .
At dry-off, parturition, and throughout the lactation cycle, BCS evaluations can be used to identify cows that are at risk of milk fever, mastitis, lameness, and infertility [9]; therefore, BCS is an important management tool for maximizing milk production and reproductive efficiency [9,10] and even preventing potential disease and lameness [6,11]. Nevertheless, it is estimated that less than 5% of US dairy herd managers regularly assign BCS values to their cows [12]. Most dairy farms that conduct BCS in the US use a 5-point system [13]. This system measures the relative amount of subcutaneous fat in 0.25-point increments, where 1 denotes a very thin cow and 5 indicates an excessively fat cow [14]. The manual scoring of body condition requires experienced personnel with adequate training. Although a well-trained scorer can score one cow in a short amount of time, it is time consuming to score all the cows in a large herd on a daily basis.
Additionally, the perception of fatness and the understanding of the BCS guidelines vary from person to person, which causes inconsistencies in data from different scorers. The dairy industry and research community have recognized the need for a quick and inexpensive but accurate technology with which to automatically measure body fat on cows in different stages of lactation. Bewley, et al. [15] explored the possibility of developing an automatic BCS system based on 2D digital images. A total of 23 points were selected manually from each image to analyze the contour and shape of cows. The study showed that the hook angle, posterior hook angle, and tailhead depression were significant predictors of BCS. According to the above method, Azzaro, et al. [16] developed an application with which to extract the 23 anatomical points. In the study, a shape descriptor based on principal component analysis was built and tested. Validation testing showed that the average error of the polynomial model was 0.31 from the manual scores. Several other studies have also been conducted to automate BCS evaluation based on 2D and thermal image processing technology [17][18][19] .
As three-dimensional (3D) images contain information on the depth dimension of the body surface of a cow, they have great potential to improve the accuracy of automatic BCS systems. Weber et al. [20] developed an automatic 3D optical system to estimate the backfat thickness (BFT) of cows; this system has great potential for scoring body condition. The correlation coefficient between the observed and estimated BFT was 0.96. Fischer, et al. [21] used a 3D camera to capture the surface of the rear of the cow, and four anatomical landmarks were identified manually from the surface. The principal component analysis was applied to the dataset, and the coordinates of these surfaces in the principal component space were used to build a multiple linear regression model with which to assess BCS. The system still requires some level of human involvement. DeLaval [22] released an automatic BCS system that provides continuous and daily BCS readings [23] ; however, the high cost of the system hinders the popularization and application of the BCS system.
Spoliansky, et al. [24] developed an automatic BCS system using a low-cost 3-dimensional Kinect camera. In that study, 14 features were used to build prediction models. When the model was applied, 94% of the errors were under 0.75 BCS points compared to manual scores, and all errors were under 1 BCS point. However, the models required weight information in addition to depth images. Alvarez, et al. [25] used a depth camera to capture images of the backs of dairy cows. After the background was removed, the images were input directly into a convolutional neural network for training. In total, 1158 images were used for model training and 503 images for testing. The results showed that 78% of the samples had a BCS error of less than 0.25.
In light of the above studies, it is apparent that, because of either cost or accuracy, the automated BCS systems proposed so far fall short of the requirements for daily use on farms. As such, there is a considerable need for further research to improve the practicality of those systems. Therefore, the objectives of this study were to (1) develop a fully automatic 3D computer vision-based system with which to assess the BCS of cows with high accuracy (MAE < 0.25) and (2) explore the possibility of the automated system being more accurate than human scoring. To obtain information directly related to BCS from the depth image, four specific areas (bones or body structures) on the back of the cow were located automatically. Their visibility or curvature was analyzed for quantitative evaluation of fatness in that area. Machine learning algorithms and a regression model were constructed to score body condition accurately. The stability and consistency of the system were tested and analyzed by comparing automated BCS values with human scores on a daily basis.

System setup
The data for this study were collected on the University of Kentucky Coldstream Dairy Research Farm. A group of 94 Holstein cows was milked twice a day in the morning (5:00 AM) and afternoon (4:00 PM). All cows left the parlor through a roofed walkway and returned to the free-stall barn after milking. The walkway was paved with a concrete slab floor with walls on both sides. The width of the walkway was 1.03 m, which restricted the movement of the cows.
A PrimeSense™ Carmine 1.08 RGB+depth sensor (PrimeSense™, Tel Aviv, Israel) was used to capture depth images of each cow's back contours as it walked through the return alley. The camera system was placed 3.05 m above the floor of the walkway, with its field of view covering the entire width of the walkway. The camera was connected to a computer in the dairy office via a 30m active repeater USB 2.0 cable. The images taken by the camera were sent to the computer and stored on the hard drive. The resolution of the depth images was 320×240 pixels with a frame rate of 30 fps.
As shown in Figure 1, the walls of the walkway ensured that the cows were always located in the middle area of the frame and oriented roughly parallel to the midline of the image.
To obtain a single file for each cow, we developed software to record the depth images as each cow passed beneath the system. Four fixed lines in the image scene were used to trigger and stop the recording. As the cow walked from the left to the right side of the scene, data recording started when the cow's nose reached the fourth line. Then, depth frames were captured continuously until the tail end of the cow passed the first fixed line. At the beginning of the recording, the software saved the initial image (without a cow in it) as the background image to perform background subtraction later.

Data acquisition
Depth images of 94 cows were obtained twice a day from April 1st to June 7th, 2014. Over this time, three independent human scorers manually scored every cow in the group once a week on the same day when possible or, at most, within a few days of one another. The median of these three scores from one cow in one week was assigned as the score for the cow that week. Within-cow outliers were removed by comparing the BCS obtained during successive weeks. When a given BCS differed from preceding and subsequent scores by more than ±0.25, the score was removed from the dataset. The objective of this editing technique was to remove individual BCS values that were clearly inconsistent with scores for an individual cow in a short time frame. Because very fat cows were rare in the herd (less than 1.5%), cows with scores above 3.75 were assigned a score of 3.75 to eliminate outliers. Over the two-month data collection period, a total of 94 unique cows were examined at various stages of lactation and levels of body condition. The camera recordings of each cow in a given week were paired with the human-evaluated BCS for that week. The dataset contained 4820 images from 94 cows and their related BCS values from 2.25 to 3.75 (7 classes).
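The score-editing steps described above (median of the three scorers, capping at 3.75, and removal of within-cow outliers) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes that a weekly score is dropped only when it differs from both its preceding and its subsequent score by more than 0.25, and that the first and last weeks are always kept.

```python
from statistics import median

MAX_SCORE = 3.75   # scores above this value are capped, as described in the text
TOLERANCE = 0.25   # allowed deviation from neighbouring weekly scores

def weekly_score(scores):
    """Median of the scorers' values for one cow in one week, capped at MAX_SCORE."""
    return min(median(scores), MAX_SCORE)

def remove_outliers(weekly, tol=TOLERANCE):
    """Return (week_index, score) pairs, dropping within-cow outliers.

    Assumption: a score is an outlier only if it differs from BOTH the
    preceding and the subsequent score by more than tol; the first and
    last weeks have a single neighbour and are kept.
    """
    kept = []
    for k, score in enumerate(weekly):
        if 0 < k < len(weekly) - 1:
            differs_prev = abs(score - weekly[k - 1]) > tol
            differs_next = abs(score - weekly[k + 1]) > tol
            if differs_prev and differs_next:
                continue
        kept.append((k, score))
    return kept
```

For example, `remove_outliers([3.0, 3.0, 3.75, 3.0, 3.0])` drops only the third week, whose score jumps by 0.75 relative to both neighbours.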
N images were randomly selected from each class to build a training dataset. With M_i denoting the number of images in class i, N is the minimum of M_i over all classes. Through this approach, the model was trained at the same level for each class, and overtraining on the larger classes was avoided. Table 1 lists the number of images (M_i) in each BCS class and the proportion of training data in each class. In total, 952 images were selected as training data, i.e., 136 images from each BCS class (N = 136). The BCS values of the training dataset were normally distributed. The classification and regression models were calibrated on the training dataset and tested on both the training dataset and the entire dataset.
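The class-balanced selection can be illustrated with a short sketch. The function name and the fixed random seed are assumptions made for reproducibility of the example; the source only states that N = min(M_i) images were drawn at random from each class.

```python
import random
from collections import defaultdict

def balanced_sample(labels, seed=0):
    """Pick N = min_i(M_i) random image indices from every BCS class.

    labels: one BCS class label per image.
    Returns (N, sorted list of selected image indices).
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    n = min(len(indices) for indices in by_class.values())
    rng = random.Random(seed)  # seed is an illustrative choice
    chosen = []
    for label in sorted(by_class):
        chosen.extend(rng.sample(by_class[label], n))
    return n, sorted(chosen)
```

With seven classes and a smallest class of 136 images, this yields 7 × 136 = 952 training images, matching the count reported above.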

Feature definition
The image features used in this study were selected because they are potential indicators of BCS. Body fat reserves on the sacral ligament were measured using the convex hull. Surface curvature was used to define the sharpness of the hook and pin bones. Overall back fatness was also evaluated.
Convex hull. As shown in Figure 2, the sacral ligament has a concave curve on thin cows, and the curve is less concave on fat cows. As a thin cow gains fat on the sacral ligament, the concave parts of the curve are filled in, and the concave curve finally becomes a convex curve. In this paper, the convex hull is defined as a convex curve that is tangent to the concave line and lies at a minimum distance from it; therefore, the convex hull can be a tool with which to simulate how the sacral ligament would look if a cow gained body fat. In this study, the visibility of the sacral ligament was measured by the space between the convex hull and the outline of the sacral ligament. Figure 3 illustrates the flowchart for drawing the convex hull of a discrete concave curve. For a discrete concave curve containing P points, two points, i and j, are iteratively selected from the curve to draw line_{i,j}. If line_{i,j} is tangential to the curve, then it belongs to the convex hull of this curve; otherwise, the points are discarded, and a new pair is selected and tested.

Figure 3 Flowchart of drawing the convex hull for a concave curve
The convex hull of the sacral ligaments of a cow was calculated using the above algorithm and is shown in Figure 4. The space between the convex hull and the sacral ligament indicates the potential space in which the cow can carry fat reserves. The average distance between the convex hull and the sacral ligaments was calculated using Equation (1) to evaluate the visibility of the sacral ligaments quantitatively.
VSL = (SES / LSL) × (ASL / WSL) (1)
where VSL is the visibility of the sacral ligament; SES is the area of the total space between the convex hull and the sacral ligament; LSL is the length of the sacral ligament; WSL is the width from the left hook bone to the right hook bone, and ASL is the average of WSL over the dataset. ASL/WSL is the coefficient used to eliminate the effect of individual size and shape on the VSL, and VSL is independent of the height of the cow. When the VSL is close to 0, the sacral ligament is barely visible.
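The hull construction and the visibility measure can be sketched as below. Two points are assumptions rather than the authors' method: the hull is built with a single monotone-chain scan instead of the pairwise tangent test of Figure 3 (both should give the same upper envelope for a profile sampled at increasing x), and Equation (1) is read as the mean gap SES/LSL scaled by ASL/WSL, with LSL taken as the horizontal extent of the profile. All function names are hypothetical.

```python
def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Upper convex hull of (x, height) samples sorted by increasing x."""
    hull = []
    for p in points:
        # Drop the last vertex while it lies on or below the chord to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def hull_height(hull, x):
    """Linearly interpolate the hull height at position x."""
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the hull")

def visibility(points, wsl, asl):
    """VSL read as (SES / LSL) * (ASL / WSL); an interpretation of Equation (1)."""
    hull = upper_hull(points)
    gaps = [hull_height(hull, x) - y for x, y in points]
    # SES: trapezoidal area between the hull and the profile.
    ses = sum((gaps[k] + gaps[k + 1]) / 2 * (points[k + 1][0] - points[k][0])
              for k in range(len(points) - 1))
    lsl = points[-1][0] - points[0][0]  # assumption: horizontal extent
    return ses / lsl * asl / wsl
```

A V-shaped profile such as `[(0, 10), (1, 8), (2, 6), (3, 8), (4, 10)]` gives a mean gap of 2 units, whereas a profile that is already convex gives 0, in line with the statement that a VSL close to 0 means the ligament is barely visible.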

Figure 4 Space between convex hull and outline of sacral ligament
Surface curvature. The fat reserved on the hook bone and pin bone was measured by the surface curvature (SC). The bones of thin cows appear sharper than those of fat cows; therefore, the SC of the bones is larger in thin cows.
In this study, the SC of a piece of the surface was defined as the ratio of the superficial area to the shadow (projected) area of that piece of surface. Therefore, the SC is independent of the height and size of the surface. Figure 5 illustrates the hook bones of a thin cow and a fat cow. In Figure 5a, the hook bone is sharp, and the SC is 1.4; meanwhile, the hook bone in Figure 5b has a flat surface and an SC of 1.17.
Back fatness. Because the 5-point BCS system mainly focuses on the area of a cow's back above the thurl, a depth threshold, denoted by DT, was used to segment the region of interest in the depth image. For the j-th column from the back to the front of the cow, the height of the point on the spine in that column was denoted by Hsp_j. The points in the depth image were filtered according to the rule described by Equation (2). The points that deviated from Hsp_j by less than DT were retained; otherwise, the points were discarded. The information in the depth image that was unrelated to BCS was filtered out through this process.
Mask_{i,j} = 1 if |H_{i,j} − Hsp_j| < DT, and Mask_{i,j} = 0 otherwise (2)
where Mask_{i,j} indicates whether the point is retained (Mask = 1) or discarded (Mask = 0); Hsp_j is the height of the point on the spine in the j-th column, and H_{i,j} is the height of the point in the i-th row and j-th column. DT was set to 100 in the study.
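The depth-threshold rule can be sketched as follows, assuming the height map is a list of rows and that the spine height in each column is the highest point in that column (the spine detection used later in the image rotation step). The function name is hypothetical.

```python
def spine_mask(height, dt=100):
    """Keep points whose height deviates from the spine height by less than dt.

    height[i][j] is the height of the point in row i and column j.
    Assumption: the spine height of column j is the column maximum.
    """
    n_rows, n_cols = len(height), len(height[0])
    hsp = [max(height[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[1 if abs(height[i][j] - hsp[j]) < dt else 0
             for j in range(n_cols)]
            for i in range(n_rows)]
```

Points exactly DT below the spine are discarded here because the comparison is strict, which follows the wording "less than DT".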
Thin cows had visible thurls with additional points classified as belonging to the thurl area. Therefore, the visibility of the thurl (VTH) can be evaluated by calculating the ratio of the area of the image after and before image cropping using Equation (3).
where APH_after and APH_before are the areas of the region from the pin bone to the hook bone after and before the image cropping, respectively. The methods for locating the pin bone and hook bone are presented in the following sections. VTH is independent of the height and size of the cow.
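Returning to the surface curvature (SC) defined earlier in this section, the ratio of superficial area to shadow area can be approximated on a height grid by triangulating each grid cell. This is a sketch under stated assumptions: the patch is a regular grid, the pixel spacing is expressed in the same unit as the heights, and the function names are hypothetical.

```python
import math

def _triangle_area(p, q, r):
    """Area of the 3D triangle p, q, r (half the cross-product norm)."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def surface_curvature(height, pixel_size=1.0):
    """Ratio of the 3D surface area of a height patch to its projected area."""
    rows, cols = len(height), len(height[0])
    ps = pixel_size
    surface = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            a = (j * ps, i * ps, height[i][j])
            b = ((j + 1) * ps, i * ps, height[i][j + 1])
            c = (j * ps, (i + 1) * ps, height[i + 1][j])
            d = ((j + 1) * ps, (i + 1) * ps, height[i + 1][j + 1])
            # Split each grid cell into two triangles.
            surface += _triangle_area(a, b, c) + _triangle_area(b, d, c)
    shadow = (rows - 1) * (cols - 1) * ps * ps
    return surface / shadow
```

A flat patch gives SC = 1, the lower bound, and a plane rising at 45 degrees gives SC = √2 ≈ 1.41. Adding a constant to every height leaves the ratio unchanged, consistent with the statement that SC is independent of height.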

Image processing
Background subtraction. The first 1200 camera depth frames without cows were used to build the background image. The background image was then continuously updated to avoid potential error from any single background image. The number of frames for building the background was set to 1200 to ensure that the depth-frame sequence contained at least one depth frame that included the floor and walls of the scene. Background subtraction and threshold processing were performed to obtain depth data containing only the cow. After background subtraction, the depth value of each pixel on the cow's body was converted into the distance to the floor by adding the height of the camera, which was 3050 mm.
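A minimal sketch of this step is given below. The source does not state how the 1200 frames are combined, so the per-pixel maximum depth (the farthest surface seen) is an assumption, as are the foreground threshold `min_diff` and the function names. The height above the floor is taken as the camera height minus the measured depth, which is one reading of the conversion described above.

```python
CAMERA_HEIGHT_MM = 3050  # camera height above the walkway floor (from the text)

def build_background(frames):
    """Per-pixel maximum depth over empty-scene frames (assumed combination rule)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[max(frame[i][j] for frame in frames) for j in range(cols)]
            for i in range(rows)]

def cow_height_map(frame, background, min_diff=200):
    """Height above the floor for foreground pixels, 0 for background pixels.

    A pixel is foreground when it is at least min_diff mm closer to the
    camera than the background (min_diff is an illustrative threshold).
    """
    result = []
    for frame_row, background_row in zip(frame, background):
        result.append([CAMERA_HEIGHT_MM - depth if bg - depth > min_diff else 0
                       for depth, bg in zip(frame_row, background_row)])
    return result
```

For a pixel with a background depth of 3050 mm and a measured depth of 1650 mm, the sketch reports a height of 1400 mm above the floor.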
Image rotation. As a cow moves, its body axis may deviate from the horizontal axis of the image, which strongly affects the subsequent image processing. Therefore, the body axis of the cow needs to be detected and corrected. The spine of the cow was detected by finding the highest point in each column of the depth image. A line was fitted to this group of points. The image was rotated according to the angle between the x-axis and the fitted line. The symmetry of the rotated cow was tested by calculating the overall difference between the left and right sides of the cow as defined by the line of symmetry [24].
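The spine detection and angle estimate can be sketched as follows. The least-squares fit and the function names are assumptions (the source only says that a line was fitted), and the sketch stops at the angle; the rotation itself would be done with an image library.

```python
import math

def spine_points(height):
    """(column, row) of the highest point in every column that contains the cow."""
    rows, cols = len(height), len(height[0])
    points = []
    for j in range(cols):
        column = [height[i][j] for i in range(rows)]
        top = max(column)
        if top > 0:  # columns of zeros contain no cow pixels
            points.append((j, column.index(top)))
    return points

def body_axis_angle(points):
    """Angle in degrees between the image x-axis and the least-squares spine line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))
```

Rotating the image by the negative of this angle would align the spine with the x-axis; the sign convention depends on whether image rows are counted downward, so it should be checked against the imaging library in use.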
Image crop. A width threshold was used to eliminate the tail from the depth image. Measuring from the back to the front of the cow, if the width of the cow region in a column was less than the threshold, the pixels in that column were set to 0 to remove the tail.
Hook bone and sacral ligament detection. The contour of the back of the cow was divided into left and right parts by a symmetry line. As shown in Figure 6 [...] The y-coordinate and size of the bump were determined by analyzing the slope of the sacral ligament along with that of the convex hull. By combining the x-coordinate of the sacral ligament and the y-coordinate with the size of the bump, the hook bones in the depth image were detected, as shown in Figure 6. The surface curvatures of the left and right hook bones were calculated, and their average was denoted by CHB.
The sacral ligaments were isolated by connecting points A and B and extracting the slice along that line from the depth image. After the sacral ligaments were extracted from the depth image, their visibility (VSL) was calculated using Equation (1).
Removal of tailhead and detection of pin bones. The tailhead caused a discontinuous change in the depth image and had a major influence on the analysis of the pin bone; therefore, it was necessary to remove the tailhead from the depth image. Figure 7 illustrates the slice containing the tailhead and pin bones. As shown in Figure 7, the tailhead caused two drop points, D_1 and D_2, in the slice. At points D_1 and D_2, the distances between the convex hull and the slice reached their maximum values, denoted as M_1 and M_2, respectively. The points between D_1 and D_2 were set to 0 to remove the tailhead. From the back to the front of the cow, M_1 and M_2 decreased. If the mean of M_1 and M_2 was smaller than a predetermined threshold, the slice was defined as the end of the tailhead, as shown in Figure 7. The areas separated by the removed tailhead on the left and right sides were the left pin bone and right pin bone, respectively. The surface curvatures of the left and right pin bones were calculated, and their average was denoted by CPB. For fat cows, the pin bones are entirely hidden, and the tailhead is dramatically shorter than in thin cows. In this study, if the length of the tailhead of a cow was less than 50 mm, the pin bones of this cow were considered entirely hidden, and CPB was set to 1. For each cow, before any further calculations, the length of the tailhead was multiplied by ASL/WSL, as defined for Equation (1), to eliminate the effect of individual cow size.

Prediction models
For each selected depth image, the image processing generated four features (i.e., VTH, VSL, CHB, and CPB) according to the methods described above. The distributions of those four features among BCS classes were analyzed using boxplots, and one-way analysis of variance (ANOVA) was used to test for differences in the measured features with regard to the BCS class. The decision tree learning method was used to build a classification model to predict BCS values based on the features. Linear regression and a backpropagation (BP) neural network were used to build regression models for continuous BCS evaluation. The features and the BCS values were normalized to a range of 0 to 1 before they were applied to the regression models.
Decision tree learning. Decision tree learning [26] was utilized to classify the four feature variables into seven BCS levels (2.25 to 3.75 at 0.25 intervals) according to the given scores. This predictive model maps a group of observations of an item to the target value of the item. In these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels by using if-then rules. The goal of training is to find the optimal criteria for the if-then rule in each branch of the tree. In the prediction phase, the trained model takes several variables from one observation as input, and a path from the root node to a leaf node is determined by comparing the variables with the criteria of the if-then rule in each branch. The class label that the leaf node represents is the output of the model.
Linear regression. The linear regression model assumed that the larger the four features were, the thinner the cow was. When the four features were all 0, the cow was expected to have a high body condition score. It was assumed that the human evaluator scored the cow by starting from the highest body condition score in the herd and then reducing the score according to the perceived sharpness of the thurl, sacral ligament, hook bone, and pin bone. Based on that assumption, a model for BCS regression was designed as follows:
BCS = µ − w1×VTH − w2×VSL − w3×CHB − w4×CPB (4)
where µ is the highest score in the herd and w1, w2, w3, and w4 are the scores subtracted from µ due to the sharpness of the thurl, sacral ligament, hook bone, and pin bone, respectively. VTH and VSL are the visibility of the thurl and sacral ligament, respectively. CHB and CPB are the curvature of the hook bone and pin bone, respectively.
BP neural networks. The BP neural network is a multilayer feedforward network that is trained with the error backpropagation algorithm. It is one of the most widely used neural network models [27]. A BP network can learn and store the mapping relations of an input-output model without requiring the mathematical equation that describes these relations to be specified [28]. Due to this characteristic, the BP network is a feasible way to regress the relationship between the inputs and output, regardless of whether it is linear or nonlinear. The network in this study had an input layer of four neurons, one for each feature, and two hidden layers with five neurons each. The output of the model is the body condition score predicted from the input features. Neurons in adjacent layers were fully connected. The transfer functions of the hidden layers and output layer were 'tansig' and 'purelin', respectively. The maximum number of training epochs was set to 100. The learning rate and training goal were 0.1 and 0.0004, respectively.
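The forward pass of a network with this 4-5-5-1 layout can be sketched in a few lines. The weights below are random placeholders, not the trained values; 'tansig' is taken to be the hyperbolic tangent and 'purelin' the identity, which is how these MATLAB transfer functions are defined. Training by backpropagation is omitted, and the function names are hypothetical.

```python
import math
import random

LAYER_SIZES = [4, 5, 5, 1]  # four features, two hidden layers of five, one output

def init_network(seed=0):
    """Random placeholder weights and biases (NOT the trained parameters)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers

def forward(layers, features):
    """Propagate normalized features (VTH, VSL, CHB, CPB) through the network."""
    activation = list(features)
    for k, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = k == len(layers) - 1
        # tanh ('tansig') on hidden layers, identity ('purelin') on the output.
        activation = z if is_output else [math.tanh(v) for v in z]
    return activation[0]

def n_parameters(layers):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

This layout has only 61 trainable parameters, (4×5+5) + (5×5+5) + (5×1+1), which is small relative to the 952 training images.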

Model Evaluation
The decision tree model was evaluated on the entire dataset (D_E) and the training dataset (D_T) with 3-, 5-, and 10-fold validation. K-fold validation was not applicable to D_E because the numbers of samples in the different classes of D_E were not even. The linear regression and BP network models were built on D_T and tested on D_E. The three models were evaluated by the mean absolute error (MAE), the rate of correct classifications (only for the decision tree model), and the rates of predicted scores within 0.25 and 0.5 BCS points of manual scores. The correlation (R²) between the results of the model and the target BCS values was also calculated to evaluate the two regression models.

Among the four features, the visibility of the sacral ligament had the strongest correlation with BCS, followed by the curvature of the pin bone (r = −0.85) and the curvature of the hook bone (r = −0.75). The visibility of the thurl (r = −0.73) had the weakest correlation with BCS among the four predictive features. The four features were positively correlated with each other (p<0.01). Among the features, the correlation between the visibility of the sacral ligament and the curvature of the pin bone was the strongest (r = 0.74), while the correlation between the visibility of the thurl and the curvature of the hook bone was the weakest (r = 0.50). Other correlations among the features ranged from 0.62 to 0.70.

Most features differed significantly between BCS levels, except for the VTH values in BCS 3.5 and 3.75, the CHB values in BCS 3 and 3.25, and the CHB values in BCS 3.5 and 3.75. Two groups of data cannot be distinguished from each other if they are not significantly different. However, the other features in these BCS groups (i.e., VSL and CPB) were significantly different and provided enough variability to classify and predict these BCS values. The distributions and medians of the VTH values in BCS 2.5 and 2.75 were close. However, the ANOVA result showed that they were significantly different and came from different normal populations, which means that they can still provide variability for classification and regression. Compared to the other features, the VSL showed a more linear relationship with the BCS and less interclass overlap. The VSL of a fat cow ranged from 2 mm to 6 mm, which indicates that the sacral ligament was barely visible.
The CHB and CPB ranged from 1 to 1.5. The CHB dropped sharply as BCS increased from 2.25 to 2.75, while the trend became flat for BCS values greater than 3, which showed that there was a nonlinear relationship between the CHB and BCS. The CPB showed a similar trend for BCS values from 2.25 to 2.75 and for those from 3 to 3.5. However, the drop from 2.75 points to 3 points was considerable.

Table 3 presents the classification results of decision tree learning using 3-, 5-, and 10-fold cross-validation based on the training dataset (D_T) and the entire dataset (D_E). As shown in Table 3, the accuracies of 3-, 5-, and 10-fold cross-validation were similar, but the model achieved the highest accuracy when using 10-fold cross-validation, with which 95.38% and 99.68% of samples were classified within 0.25 and 0.5 BCS points, respectively, of the manual scores. The classification accuracy improved by 5.25% when the K value was increased from 3 to 10. When the decision tree was trained on D_T and tested on D_E, the result was not significantly different from that of 10-fold cross-validation on D_T.

Results of regression
The BCS regression model was fitted to the training dataset (D_T), and the result was as follows:
BCS = 3.94 − 0.35×VTH − 0.71×VSL − 0.67×CHB − 0.72×CPB (5)
The parameters in the model showed that the highest theoretical score in the herd was 3.94 and that a sharp thurl, sacral ligament, hook bone, and pin bone reduce the BCS by 0.35, 0.71, 0.67, and 0.72, respectively. The R² of the regression was 0.88 (p<0.01), as shown in Table 4. The BP network took 12 epochs to finish training on D_T. The linear regression and BP network were tested on D_T and D_E; the results are shown in Table 4.
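Equation (5) is simple enough to evaluate directly. The sketch below assumes that the four features have already been normalized to the range 0 to 1, as stated for the regression models; the function name and the dictionary layout are illustrative choices.

```python
MU = 3.94  # highest theoretical score in the herd, from Equation (5)
WEIGHTS = {"VTH": 0.35, "VSL": 0.71, "CHB": 0.67, "CPB": 0.72}

def predict_bcs(features):
    """Evaluate Equation (5) for one image.

    features: dict with normalized VTH, VSL, CHB, and CPB values
    (assumed to lie in the range 0 to 1).
    """
    return MU - sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
```

With all features at 0 the prediction is 3.94. As an illustrative input, a cow with every normalized feature at 0.5 would be scored 3.94 − 0.5 × 2.45 = 2.715.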
For the same model, the rates of predicted scores within 0.25 and 0.5 BCS points were not significantly different when using D_T and D_E as testing data. However, the R² was reduced by 0.08 and 0.07 in the linear regression model and BP network, respectively, when they were tested on D_E. In general, the BP network achieved higher accuracy than linear regression on all the performance indicators, especially the proportion of results within 0.25 BCS points of manual scores; in this respect, the BP network was over 5% more accurate than linear regression. The correlations between the camera and manual BCS values were analyzed to evaluate the regression models; the results are shown in Figure 9. The two regression models were similar in overall performance, but the predicted scores from the BP network were more concentrated than those from linear regression.

Predicted scores for individual cows
Figure 10 illustrates the MAE and standard deviation (SD) of the BCS error of each cow during the study period (sorted by MAE) when the BP network was used as the prediction model. The average MAE was 0.11, and all cows had MAEs lower than 0.25. The average standard deviation was 0.069, and 95% of the cows had SD values of less than 0.1. Figure 11 illustrates four patterns of camera scores versus manual scores in four selected cows. In Figure 11a, the predicted scores were close to the manual scores during the study period, and the MAE was 0.02. The result in Figure 11b was the most common case among cows, where all the differences were less than 0.25 and the MAE of the cow was 0.13. Figure 11c illustrates the cow with the maximum MAE (0.22) in Figure 10. In Figure 11d, the cow had an abortion, and its BCS dropped from 3 to 2.5 in the first 21 days of lactation 2; the predicted scores tracked that change successfully and even detected the change earlier than the manual scores did, with an MAE of 0.09.
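The per-cow statistics and the accuracy rates used throughout the results can be computed as sketched below. Two details are assumptions: the SD is taken over the signed error (camera minus manual) using the sample standard deviation, and "within 0.25" is treated as inclusive with a small tolerance for floating-point rounding. The function name is hypothetical.

```python
from statistics import mean, stdev

def error_summary(predicted, manual, eps=1e-9):
    """MAE, SD of the signed error, and rates within 0.25 and 0.5 BCS points."""
    signed = [p - m for p, m in zip(predicted, manual)]
    absolute = [abs(e) for e in signed]
    n = len(signed)
    return {
        "mae": mean(absolute),
        "sd": stdev(signed) if n > 1 else 0.0,  # assumption: sample SD
        "within_0.25": sum(e <= 0.25 + eps for e in absolute) / n,
        "within_0.5": sum(e <= 0.5 + eps for e in absolute) / n,
    }
```

Note that a constant offset between camera and manual scores yields an SD of 0 but a non-zero MAE, which is why both values are reported per cow.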

Discussion
Our work describes a method for scoring the body condition of cows with high accuracy. The system was developed and tested based on longitudinal data (2 months) from 94 cows. Four features were extracted from depth images, and three models were used to predict the scores based on these features. Compared to previous studies, this study improved the percentage of predicted scores within 0.25 BCS points of manual scores, raising this proportion to 90% using a decision tree and a BP network model with a fully automatic system.
The four features identified in this study all have strong negative correlations with BCS. A linear model was built to estimate the weights that the human scorers implicitly gave to the features. The model partly explained the human scoring procedure and provided the theoretical highest score in the herd. Due to the nonlinear characteristics of the features, the accuracy of the linear model was lower than that of the BP network model, which is able to regress both linear and nonlinear relationships. The results showed that 95.48% of the samples were scored within 0.25 BCS points of the manual score using the decision tree learning model, which is a higher accuracy rate than that of the linear regression model (86.14%) or the BP network (91.68%). This demonstrates that the decision tree is better suited than the other models when the scores are treated as categorical data.
In this study, the hook bone and pin bone were detected, and their curvatures were used as indicators of fatness. Bewley et al. (2008) [15] used 2D digital images to analyze the outline of the backs of cows and generate hook and tailhead descriptions related to BCS. The results showed that the hook angle, posterior hook angle, and tailhead depression were significant predictors of BCS. In our study, angle descriptions were not included in the model because the edge of the object in the depth image strongly affected the 3D points close to the outline, which reduced the accuracy of angles calculated from the outline in the 3D image. The descriptions of the sacral ligament and thurl were explored in this study because these two areas are frequently evaluated in the existing BCS system [14].
Other researchers have also developed automatic BCS systems based on 2D [16,19] and thermal [17,18] image processing technology. However, it is difficult to accurately detect specific anatomical points of a cow and to extract fatness-associated features closely related to BCS from two-dimensional image data alone. The 3D images used in this study not only provided depth information but also made it possible to measure the physical traits of the cow without additional image calibration. Weber et al. [20] developed an automatic 3D optical system to estimate the backfat thickness (BFT) of cows. The correlation between the observed and estimated BFT values was 0.96, which demonstrated the feasibility of using 3D images to measure the body fat reserves of a cow. It would be worthwhile to study the relationship between BFT and BCS in order to build a model that predicts the latter from the former. Fischer et al. [21] used a 3D camera to capture the surface of the hindquarters of cows, and four anatomical landmarks were identified manually on the surface to predict BCS. However, that study still involved manual processing. Spoliansky et al. also used 3D images of the backs of cows to calculate the relative heights of different parts of the body, and the height information was combined with weight and age to build a model for predicting BCS. In our study, the model was based only on depth images and required no additional information, which makes the system easy to implement in commercial applications.
Sandgren and Emanuelson [23] reported the validation results from a commercial camera system; a total of 95% of the cows were scored within 0.25 BCS points of manual scores, with 99% of the scores having a standard deviation of less than 0.1. The results of our study showed that all cows were scored within 0.25 BCS points of the manual scores, and 95% of the cows had a standard deviation of less than 0.1. Therefore, our system exhibited performance similar to that of the reported commercial camera. The commercial camera, however, used daily rolling average scores from seven-day periods as its output, which can improve the consistency of the output scores. A rolling average greatly reduces the dynamic tracking performance of a scoring system and makes it insensitive to abnormal changes that occur over a short time. The system proposed in this paper achieves high precision and good tracking performance without a rolling average.
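The effect of a rolling average on tracking can be illustrated with a short sketch. The trailing seven-day window, the function name `trailing_mean`, and the daily scores below are assumptions chosen for illustration; they are not the commercial system's actual algorithm or data.

```python
def trailing_mean(values, window=7):
    """Average each value with up to (window - 1) preceding values."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical cow whose daily score drops abruptly from 3.5 to 3.0 on day 8.
daily = [3.5] * 7 + [3.0] * 7
smoothed = trailing_mean(daily)

for day, (raw, avg) in enumerate(zip(daily, smoothed), start=1):
    print(f"day {day:2d}  raw {raw:.2f}  7-day average {avg:.3f}")
```

In this example the smoothed output reflects only one seventh of the 0.5-point drop on the first day after the change and reaches the new level only once the window contains post-change days alone, seven days after the drop.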
The manual scores were categorical data. However, the fatness of a cow may fall between two categorical values. Therefore, the difference between the predicted scores and manual scores may be caused by the difference between the actual BCS and the manual BCS in some cows. A prior study also showed that the MAE of a well-trained expert was 0.25. Thus, the standard deviation of the predicted scores could be a good indicator for evaluating the performance of the automatic BCS system when the MAE of the system is lower than 0.25.
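A minimal sketch of how a per-cow MAE and a per-cow standard deviation of the predicted scores might be computed is given below. The record layout, the function name `per_cow_stats`, the use of the sample standard deviation, and the example values are all assumptions made for illustration; the source does not specify these details.

```python
from collections import defaultdict
from statistics import mean, stdev


def per_cow_stats(records):
    """records: iterable of (cow_id, predicted, manual).

    Returns {cow_id: (mae, sd)}, where mae is the mean absolute error against
    the manual score and sd is the sample standard deviation of the predictions.
    """
    by_cow = defaultdict(list)
    for cow_id, predicted, manual in records:
        by_cow[cow_id].append((predicted, manual))

    stats = {}
    for cow_id, pairs in by_cow.items():
        mae = mean(abs(p - m) for p, m in pairs)
        preds = [p for p, _ in pairs]
        # The sample SD needs at least two values; report 0.0 otherwise.
        sd = stdev(preds) if len(preds) > 1 else 0.0
        stats[cow_id] = (mae, sd)
    return stats


# Hypothetical records: (cow id, predicted BCS, manual BCS).
records = [
    ("A", 3.0, 3.0), ("A", 3.25, 3.0), ("A", 3.0, 3.0),
    ("B", 2.5, 2.75), ("B", 2.5, 2.75),
]
for cow_id, (mae, sd) in per_cow_stats(records).items():
    print(cow_id, round(mae, 3), round(sd, 3))
```

In this example, cow B has a larger MAE than cow A but a standard deviation of zero. This corresponds to the case discussed above, in which a consistent offset from the manual score may reflect the coarseness of the categorical manual scores rather than instability of the system.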
The region of the short ribs is another anatomical feature associated with the fatness of a cow. This area was not analyzed in the current study because the ends of the short ribs are not visible on a fat cow, which made the area difficult to locate and analyze. Future studies should focus on detecting the short rib area and analyzing the fat reserves over it to further improve the accuracy of the system.

Conclusions
Specific areas that are related to BCS, including the thurls, sacral ligaments, hook bones, and pin bones, can be accurately located through depth-image processing and the use of convex hulls. Measured values of the visibility and curvature of the four areas were strongly correlated with manually assigned BCS values. When the BP network was used, the system scored each cow within 0.25 points of the target BCS during the two-month study period (i.e., the MAE of each individual cow was less than 0.25); the averaged MAE and SD of all cows were 0.11 and 0.069, respectively. These results show that the system has high precision and good tracking performance, and they suggest that the automatic system has the potential to be more accurate than human scoring. Future studies should focus on analyzing the short rib area to provide an additional fatness-related feature and further improve accuracy.