Multi-target pig tracking algorithm based on joint probability data association and particle filter

In order to evaluate the health status of pigs in time, accurately monitor the disease dynamics of live pigs, and reduce the morbidity and mortality of pigs in the existing large-scale farming model, pig detection and tracking technology based on machine vision is used to monitor the behavior of pigs. However, it is challenging to detect and track pigs efficiently in the presence of noise caused by occlusion and interaction between targets. In view of the actual breeding conditions of pigs and the limitations of existing behavior monitoring technology for individual pigs, this study proposed a method that uses color features, the target centroid, and the length-width ratio of the minimum circumscribed rectangle as features to build a multi-target tracking algorithm based on joint probabilistic data association and particle filtering. Experimental results show that the proposed algorithm can quickly and accurately track pigs in video, cope with partial occlusions, and recover the tracks after temporary loss.


Introduction
Multi-target tracking in real scenes is becoming increasingly important in computer vision and image processing applications [1][2][3], such as intelligent surveillance and control, robotics, space navigation, and automatic driving. With an increasing number of tracking targets, resolving complex interactions and occlusions becomes a very difficult and important issue in visual tracking [4]. It has long been considered a crucial and challenging problem because many uncertain factors arise during multi-target tracking [5], such as measurement noise, cluttered backgrounds, occlusions, a changing number of targets, and variation in the targets' appearances and motions.
Generally speaking, multi-target tracking has to solve two problems jointly: the estimation problem as in traditional tracking, and the data association, especially when multi-object interaction exists.
Many tracking algorithms solve the estimation problem in a Maximum A Posteriori (MAP) formulation [6]: the state with the largest posterior probability given the current and previous observations is selected as the estimate of the current object state. Position estimation is one of the most important tasks in any tracking application; it is needed both to extract information about the tracking environment and to determine the concrete location of the tracked target itself. At the same time, the limitations imposed by the measurements must be taken into account: the estimation process has to consider the noise associated with them in order to obtain reliable information about the position.
The representative algorithms for this estimation process include the particle filter and the Kalman filter. The particle filter is the most widely used, since it is applicable to nonlinear and non-Gaussian dynamic systems [7]. Particle filtering is a technique for implementing a recursive Bayesian filter by sequential Monte Carlo simulation. The key idea is to represent the required posterior density function by a set of random samples with associated weights and to compute estimates based on these samples and weights. The weights of the particles are related to the observation of the object in the current frame. In recent years, more and more researchers have used particle filtering to track moving objects in image sequences. Yang et al. [8] proposed a robust player detection algorithm based on salient region detection and tracking based on enhanced particle filtering. The playing field was segmented by salient region detection, and the Otsu algorithm was applied to the edge detection of the soccer players; an improved particle filter was then used, which improves the sampling algorithm and the likelihood function of the color and edge features of the tracked soccer players. Ma et al. [9] proposed a particle filter-based object tracking algorithm fusing color and Haar-like features. It adapts to changes in the object and environment by updating the Haar-like features, because the Haar-like feature-based Semi-Supervised Online Boosting tracking algorithm can distinguish well between object and background. Ma et al. [10] proposed a new particle filter algorithm that processes tracking concurrently in several routes, each route creating particles separately; each particle was optimized with a new regular iteration method. Zuriarrain et al. [11] presented a particle filter for multiple-person tracking designed for an FPGA-based smart camera.
In order to sample the particle swarm in relevant regions of the high-dimensional state space with increased particle diversity, their algorithm proposed a new joint Markov Chain Monte Carlo-based particle filter with short Markov chains devoted to each individual particle. Chavali et al. [12] proposed hierarchical particle filtering for multi-modal data fusion with application to multiple-target tracking. The proposed hierarchical particle filter estimated the global filtered posterior density of the unknown state in multiple stages by partitioning the state space and the measurement space into lower-dimensional subspaces. The particle filter algorithm alone fails to take into account the interactions between targets, so it is difficult to obtain good tracking results under interaction using this algorithm. When the targets are close to each other, or their paths cross, data association techniques can assist tracking without losing the identity of each target. There are many classical data association algorithms [13][14][15], including Nearest Neighbor (NN), multiple hypothesis tracking (MHT), the probabilistic data association filter (PDAF), and the joint probabilistic data association filter (JPDAF). These multi-target tracking algorithms based on data association have been used extensively in the context of computer vision, and many researchers have subsequently improved them. Tchamova et al. [16] presented an approach for target tracking that incorporates the advanced concept of generalized data (kinematics and attribute) association to improve track maintenance performance in complicated situations (closely spaced targets) where kinematics data are insufficient for correct decision making. It used a Global Nearest Neighbor-like approach and the Munkres algorithm to resolve the generalized association matrix. Kim et al. [17] revisited the classical multiple hypothesis tracking (MHT) algorithm in a tracking-by-detection framework. The algorithm introduced a method for training online appearance models for each track hypothesis to further exploit the strength of MHT in using higher-order information; the appearance models can be learned efficiently via a regularized least squares framework. Rasmussen et al. [18] presented probabilistic data association methods for tracking complex visual objects. They introduced a randomized tracking algorithm, adapted from an existing probabilistic data association filter (PDAF), that was resistant to clutter and followed agile motion, and applied it to three different tracking modalities.
The joint probabilistic data association filter (JPDAF) algorithm is an extension of the probabilistic data association filter (PDAF) algorithm for a single target [19] . The joint probabilistic data association filter (JPDAF) finds the state estimate by evaluating the measurement-to-track association probabilities [19] . Some methods are presented to model the data association as random variables which are estimated jointly with state estimation by EM iterations [20,21] .
Rezatofighi et al. [22] revisited the joint probabilistic data association (JPDA) technique and proposed a novel solution based on recent developments in finding the m-best solutions to an integer linear program. The key advantage of this approach is that it makes JPDA computationally tractable in applications with high target and/or clutter density. Most notably, the joint probabilistic data association filter (JPDAF) is among the most influential of the above methods. However, most of these algorithms originate in the small-target tracking community, where object representation is simple.
The key elements of natural pig behavior traits include feed intake frequency, water intake frequency, and excretion frequency. These factors indicate the growth rate of the animals. Monitoring and analysis of behavioral traits are indispensable for determining animal health [23][24][25]. The methods used in the existing large-scale farming model, for example, eye observation with manual recording, or approaches based only on color features, have some limitations in practical application. Automated, intelligent monitoring of pig behaviors based on live pig motion features and video image information has therefore become one of the most effective monitoring methods. By using pig detection and tracking technology, the health status of pigs can be assessed in time, the disease dynamics of live pigs can be accurately monitored, and the morbidity and mortality of pigs can be reduced. Matthews et al. [26] developed an automated monitoring system that can automatically track pig movement with depth video cameras, but it did not address the specific problems encountered in tracking, such as how to cope with partial occlusions and how to recover the tracks after temporary loss. In view of the limitations of existing individual pig behavior monitoring technology, and based on pig behavior characteristics and moving target tracking technology, this paper proposes a method that uses color features, the target centroid, and the length-width ratio of the minimum circumscribed rectangle as features to build a multi-target tracking algorithm based on joint probabilistic data association and particle filtering.
In this study, a sequential Monte Carlo version of the data association scheme is presented for tracking multiple pigs. Firstly, a particle filter (PF) combined with joint probabilistic data association (JPDA) is proposed to deal with the uncertainty of measurement origin. The algorithm is able to handle partial occlusion and recover a track after temporary loss. The probabilities calculated for the data associations also take part in the calculation of the probabilities of the number of objects. Secondly, color features, the target centroid, and the length-width ratio of the minimum circumscribed rectangle are used as features to confirm target pigs; at the same time, pixel changes in the corresponding target region are used to judge whether they are the tracked moving targets or not.

Joint probabilistic data association
The joint probabilistic data association (JPDA) algorithm was used in this study; it is an extension of the PDA algorithm for a single target [20]. The JPDA algorithm assumes that the number of targets τ is known. The index t ∈ {1, …, τ} designates one of the τ targets. The measurements at time step k are denoted as Z_k = {z_0, z_1, …, z_{M_k}}, where an artificial measurement z_0 is introduced to handle false alarms or clutter and M_k is the number of measurements. The measurement-to-target association probabilities are evaluated jointly across the targets. Let θ denote a joint association event (for simplicity, the time index k is omitted), and let θ_m^t be the particular event that assigns measurement m to target t. Assuming that the estimation problem is Markovian and applying Bayes' theorem, the joint association probabilities are

P(θ|Z_k) = (1/c) p(Z_k|θ, X_k) P(θ|X_k)

where c is a normalization constant, X_k = {x_1, …, x_k}, and the measurements are assumed to be detected independently of each other. The probability of the assignment θ conditioned on the sequence of target states, P(θ|X_k), is approximated by

P(θ|X_k) ∝ P_D^n (1 − P_D)^{τ−n} P_FA^{M_k−n}

where P_D denotes the probability of detection, n is the number of targets to which a measurement is assigned, and P_FA denotes the probability of false alarm.
Assuming that the measurements are of dimension M, the measurement likelihood is Gaussian:

p(z_m|θ_m^t, x_t) = |2π S_t|^{−1/2} exp(−(1/2) ν_m^T S_t^{−1} ν_m)

where ν_m = z_m − ẑ_t is the innovation and |S_t| is the determinant of the innovation covariance S_t. Finally, the probability of a single joint association event is given by the product of the assignment probability P(θ|X_k) and the likelihoods of its individual measurement-to-target pairings. With the increase in the number of measurements and targets, the number of joint association events grows exponentially: given that N_D of the τ targets have been detected, the number of possible associations λ is

λ(N_D) = (τ choose N_D) · M_k! / (M_k − N_D)!

Therefore, since N_D is not known, the total number of possible hypotheses is

Λ = Σ_{N_D=0}^{min(τ, M_k)} λ(N_D)
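As an illustration of the joint association events described above, the following sketch enumerates all feasible events for a toy two-target, two-measurement case. The likelihood values, P_D, and P_FA are made up for illustration, and the gating and clutter-density normalization of a full JPDAF are omitted:

```python
# Toy JPDA joint-event enumeration (illustrative values, no gating).
import itertools

def joint_event_probabilities(likelihood, p_d, p_fa):
    """Enumerate feasible joint association events.

    likelihood[t][m] is p(z_m | x_t); assignment 0 means 'target not detected'.
    Returns a dict mapping each event (tuple of measurement indices per
    target, 0 = missed) to its normalized probability.
    """
    n_targets = len(likelihood)
    n_meas = len(likelihood[0])
    events = {}
    # each target is assigned measurement 1..M_k or 0 (missed detection)
    for assign in itertools.product(range(n_meas + 1), repeat=n_targets):
        used = [m for m in assign if m != 0]
        if len(used) != len(set(used)):   # a measurement may serve one target
            continue
        p = 1.0
        for t, m in enumerate(assign):
            if m == 0:
                p *= (1.0 - p_d)          # target t not detected
            else:
                p *= p_d * likelihood[t][m - 1]
        n_false = n_meas - len(used)      # remaining measurements are clutter
        p *= p_fa ** n_false
        events[assign] = p
    total = sum(events.values())
    return {e: p / total for e, p in events.items()}

probs = joint_event_probabilities([[0.8, 0.1], [0.2, 0.7]], p_d=0.9, p_fa=0.05)
best = max(probs, key=probs.get)          # most probable joint event
```

With these illustrative values the event assigning measurement 1 to target 1 and measurement 2 to target 2 dominates, and the event probabilities sum to one by construction.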

Particle filter
The sequential Monte Carlo (SMC) method, also known as the particle filter (PF) [27][28][29], is a recursive Bayesian filter based on a sample set. The quantity of interest in tracking is the posterior state distribution p(x_k|Z_k), where Z_k = {z_1, …, z_k} denotes all the observations up to the current time step. In Bayesian sequential estimation, the filter distribution can be computed by a two-step recursion, the prediction step (Equation (8)) and the filtering step (Equation (9)):

p(x_k|Z_{k-1}) = ∫ p(x_k|x_{k-1}) p(x_{k-1}|Z_{k-1}) dx_{k-1}    (8)

p(x_k|Z_k) ∝ p(z_k|x_k) p(x_k|Z_{k-1})    (9)
Here the prediction step follows from marginalization, and the new filter distribution is obtained by a direct application of Bayes' rule. The recursion requires the specification of a dynamic model describing the state evolution, p(x_k|x_{k-1}), and a model of the state likelihood in light of the current measurement, p(z_k|x_k). The recursion is initialized with the distribution of the initial state, p(x_0). Particle filters start with a weighted sample set {x_{k-1}^{(i)}, w_{k-1}^{(i)}}, i = 1, …, N, approximately distributed according to p(x_{k-1}|z_{1:k-1}); the new samples are generated by an appropriately designed proposal distribution q(x_k|x_{k-1}, z_k), and the weights are updated as

w_k^{(i)} ∝ w_{k-1}^{(i)} p(z_k|x_k^{(i)}) p(x_k^{(i)}|x_{k-1}^{(i)}) / q(x_k^{(i)}|x_{k-1}^{(i)}, z_k)

where the proportionality is up to a normalizing constant. The weighted sample set {x_k^{(i)}, w_k^{(i)}} is then approximately distributed according to p(x_k|z_{1:k}).
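A minimal sketch of this recursion for a one-dimensional toy model follows. It uses the transition prior as the proposal, so the weight update reduces to the measurement likelihood; the Gaussian noise parameters and the observation values are illustrative, not from the paper:

```python
# Minimal sequential-importance-resampling (SIR) particle filter sketch
# for a 1-D constant-position model with Gaussian noise.
import random
import math

def sir_step(particles, weights, measurement, motion_std=1.0, meas_std=1.0):
    # prediction: propagate each particle through the dynamic model p(x_k|x_{k-1})
    particles = [x + random.gauss(0.0, motion_std) for x in particles]
    # update: weight each particle by the likelihood p(z_k|x_k)
    weights = [w * math.exp(-0.5 * ((measurement - x) / meas_std) ** 2)
               for x, w in zip(particles, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]
    # resample: draw N particles with replacement according to the weights
    particles = random.choices(particles, weights=weights, k=len(particles))
    weights = [1.0 / len(particles)] * len(particles)
    return particles, weights

random.seed(0)
n = 500
particles = [random.uniform(-10, 10) for _ in range(n)]
weights = [1.0 / n] * n
for z in [2.0, 2.1, 1.9, 2.0, 2.05]:        # noisy observations near x = 2
    particles, weights = sir_step(particles, weights, z)
estimate = sum(particles) / n               # posterior mean estimate
```

After a few observations near x = 2, the posterior mean estimate settles close to that value, which is the behavior the weighted-sample approximation is meant to achieve.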

Pig centroid
Firstly, the contour information of the moving targets, a closed curve around each moving target, is extracted. Then the contour centroid is calculated from the edge positions. After the contour of a moving target is obtained, each coordinate point (x_i, y_i) on the contour can be recorded. Supposing the total number of coordinate points is N, the moving target centroid (x_c, y_c) can be calculated using the following equation:

x_c = (1/N) Σ_{i=1}^{N} x_i,  y_c = (1/N) Σ_{i=1}^{N} y_i

The centroid coordinates of eight pigs were extracted from 72 consecutive frames of piggery images, and the change in centroid position of each pig between two adjacent frames was calculated.
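The centroid computation above can be sketched directly; `contour_centroid` is a hypothetical helper, and in practice the contour points would come from an edge detector:

```python
def contour_centroid(points):
    """Centroid (x_c, y_c) of a closed contour given its N edge points,
    following the averaging formula in the text."""
    n = len(points)
    x_c = sum(x for x, _ in points) / n
    y_c = sum(y for _, y in points) / n
    return x_c, y_c

# square contour with corners (0,0)..(4,4): centroid should be at (2, 2)
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
cx, cy = contour_centroid(square)
```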
A Gaussian mixture model algorithm [23] was used to detect the target pigs and to obtain each target pig's minimum circumscribed rectangle as well as the coordinates of its four vertices.
A schematic drawing of the minimum circumscribed rectangle is shown in Figure 1.

Figure 1 Minimum circumscribed rectangle
The aspect ratio of the target's minimum circumscribed rectangle is

R = l / w

where l and w are the length and width of the rectangle, respectively.
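Given the four vertex coordinates obtained from the detection step, the aspect ratio can be computed as in this sketch; `rect_aspect_ratio` is a hypothetical helper, and the vertices are assumed to be in consecutive order around the rectangle:

```python
import math

def rect_aspect_ratio(vertices):
    """Length-to-width ratio of a rectangle given its four vertices in
    consecutive order; the longer side is taken as the length."""
    (x0, y0), (x1, y1), (x2, y2) = vertices[0], vertices[1], vertices[2]
    side_a = math.hypot(x1 - x0, y1 - y0)   # first side length
    side_b = math.hypot(x2 - x1, y2 - y1)   # adjacent side length
    return max(side_a, side_b) / min(side_a, side_b)

# axis-aligned 6x3 rectangle -> aspect ratio 2.0
ratio = rect_aspect_ratio([(0, 0), (6, 0), (6, 3), (0, 3)])
```

Because the formula uses side lengths rather than axis-aligned extents, it also works for the rotated rectangles produced by a minimum-area fit.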

Color measurement model
The work described in this study was based on color measurement features. Following Reference [24], the measurement was not the entire image z_k, but the color histogram q_k extracted from the image, calculated inside the image region specified by the state vector x_k.
The center of the region was defined by (x_k, y_k). The likelihood of the color measurement was modeled as a Gaussian density:

p(z_k|x_k) ∝ exp(−D_k² / (2σ²))    (13)

where D_k is the distance between the reference histogram q* of the pigs to be tracked and the histogram q_k calculated from the current frame z_k in the region specified by the state vector x_k. The standard deviation σ of the Gaussian density in Equation (13) is a design parameter. If the two histograms are calculated over U bins, the distance D_k between them is obtained from the Bhattacharyya similarity coefficient [30] and defined as

D_k = sqrt(1 − Σ_{u=1}^{U} sqrt(q*_u q_{k,u}))

Improved algorithm
In this study, the particle filter and joint probabilistic data association were combined and applied to multi-pig tracking in intelligent pig breeding. Firstly, the state of each target was sampled by the particle filter method to obtain the sample particles. Secondly, combining these with the measurements, the association between the observations and the particles of each target was obtained by the joint probabilistic data association method. Finally, the association probability of each joint event was calculated and used to obtain the particle weights in the particle filter algorithm.
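The color measurement model described above can be sketched as follows; the Bhattacharyya-based distance feeds a Gaussian likelihood, and the histograms and σ value here are purely illustrative:

```python
import math

def bhattacharyya_distance(q_ref, q_cur):
    """D_k = sqrt(1 - sum_u sqrt(q*_u q_u)) over U histogram bins
    (both histograms assumed normalized to sum to 1)."""
    bc = sum(math.sqrt(a * b) for a, b in zip(q_ref, q_cur))
    return math.sqrt(max(0.0, 1.0 - bc))

def color_likelihood(q_ref, q_cur, sigma=0.1):
    """Gaussian likelihood of the color measurement; sigma is a design
    parameter (the value here is illustrative)."""
    d = bhattacharyya_distance(q_ref, q_cur)
    return math.exp(-d * d / (2.0 * sigma * sigma))

identical = color_likelihood([0.5, 0.5], [0.5, 0.5])   # distance 0 -> likelihood 1
disjoint = color_likelihood([1.0, 0.0], [0.0, 1.0])    # maximal distance -> near 0
```

Matching histograms give a likelihood of one, while non-overlapping histograms are suppressed almost to zero, which is what lets the filter down-weight particles that drift off the pig's coloring.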
For target t, the sampling particle set is {x_k^{(i),t}, w_k^{(i),t}}, i = 1, …, N. At time k there are τ visual targets in the scene, t = 1, …, τ, and the state of a target is

X = (x_a, y_a, dx_a, dy_a, l_a, w_a, dl_a, dw_a)^T    (15)

where (x_a, y_a, dx_a, dy_a) represents the location of the centroid coordinates in the x- and y-positions and their changes along the x-axis and y-axis in the a-th frame, respectively, and (l_a, w_a, dl_a, dw_a) stands for the target pig's length, width, length change rate, and width change rate of the minimum circumscribed rectangle in the a-th frame, respectively.
For visual target t, its observations can be mapped to the observation space through the observation model. According to the joint data association probabilities, the weight w_k^{(i),t} of the sampling particle x_k^{(i),t} is obtained. In this study, the pig population was selected as the tracking target; the pigs in the group are similar in appearance and interact with one another, so color features alone cannot distinguish the tracked target pigs. Thus, in the actual tracking process, the target contour centroid is added as a feature to track the pigs. For target t, the weighted sampling particle set is {x_k^{(i),t}, w_k^{(i),t}}, and the target state filter can be achieved by the particle filter. Within the particle filter framework, the joint probabilistic data association filtering algorithm based on independent sampling is given below. The algorithm assumes that the maximum number of targets τ is known. It starts with a single filter of particles evenly distributed over the image scene. The variance of this filter is used to monitor its convergence (by convergence, it is meant that the variance of the particle filter is less than a certain threshold value). Once the filter has converged, a new filter is initialized with particles drawn from a prior that is uniform across the image except in the region around the pig tracked by the first filter; in that excluded region, the prior is zero. This choice of prior distribution avoids the possibility of two filters tracking the same pig. The variance of the second filter monitors the convergence of the new filter. Once the second filter has converged, another filter is initialized with a prior over the region not covered by the first two filters, and so on. The number of pigs and the identities of active pigs are estimated during the tracking process, as described below; this is used to detect when a pig disappears from the scene and to stop the corresponding tracking filter.
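The filter-spawning scheme described above, in which new filters are initialized uniformly outside the regions of already-tracked pigs, can be sketched as follows. The circular exclusion region, its radius, and the particle count are illustrative assumptions, not values from the paper:

```python
# Sketch of spawning a new particle filter with a prior that is zero
# inside the regions already covered by converged filters.
import random

def particle_variance(particles):
    """Scalar spread of a 2-D particle cloud, used to monitor convergence."""
    n = len(particles)
    mx = sum(p[0] for p in particles) / n
    my = sum(p[1] for p in particles) / n
    return sum((p[0] - mx) ** 2 + (p[1] - my) ** 2 for p in particles) / n

def spawn_filter(n_particles, width, height, exclude, radius=50.0):
    """Draw particles uniformly over the image, rejecting samples inside
    the exclusion regions around already-tracked pigs (prior = 0 there)."""
    particles = []
    while len(particles) < n_particles:
        x, y = random.uniform(0, width), random.uniform(0, height)
        if all((x - ex) ** 2 + (y - ey) ** 2 > radius ** 2 for ex, ey in exclude):
            particles.append((x, y))
    return particles

random.seed(1)
tracked = [(100.0, 100.0)]                     # first pig already tracked here
new_particles = spawn_filter(200, 640, 480, tracked)
outside = all((x - 100) ** 2 + (y - 100) ** 2 > 50 ** 2 for x, y in new_particles)
```

Once `particle_variance(new_particles)` falls below the convergence threshold, the new filter is considered locked onto a pig and its region joins the exclusion list for the next spawn.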
Let P(H_t|Z_k) (t = 1, …, τ) denote the posterior probability that t targets exist.
According to the total probability theorem, the existence probability of t pigs is given by

P(H_t|Z_k) = p(Z_k|H_t) P(H_t) / Σ_{j=1}^{τ} p(Z_k|H_j) P(H_j)

where H_t is the event (hypothesis) that t targets exist. From these posterior probabilities, the probability of joint existence of particular target pigs can be estimated.
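As a sketch of this hypothesis evaluation, the posterior over the number of targets can be computed by normalizing likelihood-weighted priors; the likelihood values p(Z_k|H_t) below are made up for illustration:

```python
def existence_posterior(likelihoods, priors):
    """P(H_t|Z_k) = p(Z_k|H_t)P(H_t) / sum_j p(Z_k|H_j)P(H_j)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]

# hypotheses H_1..H_4 (one to four pigs present), uniform prior
post = existence_posterior([0.01, 0.05, 0.30, 0.10], [0.25] * 4)
most_likely = post.index(max(post)) + 1   # estimated number of pigs
```

A tracking filter is stopped when the posterior mass shifts away from the hypothesis that its pig is still present.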

Results and discussion
In this study, in order to achieve effective, real-time tracking of moving target pigs, the OpenCV library was installed and configured in the Visual Studio 2010 environment. The experiments in this paper used an industrial camera (model Hikvision DS-2CD3T86FWDV2-I3S, lens focal length 4 mm); the video resolution was 3840×2160 pixels, and the video frame rate was 25 frames/s.

1) Target pig centroid test
The pixel was used as the basic unit to establish a coordinate system in the video. The point at the top-left of the video was taken as the origin of the coordinate system XOY, with the horizontal direction as the X-axis and the vertically downward direction as the Y-axis. The centroids of the target pigs were calculated with the object contour centroid method, and the aspect ratios of the target pigs were calculated with the minimum circumscribed rectangle method, ensuring the accuracy of the calculation. Figure 2 shows one pig's trajectory over 72 frames, and the average change in the pig's centroid position between two frames is shown in Table 1. Figure 2 shows that the pig's centroid changes only slightly between two adjacent frames. Table 1 shows the change in centroid distance of one pig between two adjacent frames. Since the change between consecutive frames was very small, the data were sampled every three frames for calculation in order to avoid repeated computations.
All the data presented used the pixel in the video as the basic calculation unit and were computed in the video coordinate system [24].
The aspect ratios of the minimum circumscribed rectangles of eight pigs were extracted from 72 consecutive frames of piggery images, and the average change in aspect ratio of the eight pigs between two adjacent frames was calculated. The result is shown in Figure 3. In this research, the time interval between two adjacent frames of the captured video sequence is very small. The experimental results show that the shape and centroid location of the same target pig change little between adjacent frames; that is, the target pig's movement is continuous. Therefore, the change of a pig's centroid can be used to represent its movement, and the aspect ratio of the minimum circumscribed rectangle can be used to represent its shape.
2) Comparison of tracking results
Using the VS 2010 platform (OpenCV) under the Windows 7 system, the algorithm proposed in this study was tested (Figure 4). In order to verify the effectiveness of the algorithm, complete real-world video sequences of one hour were selected for the continuous tracking test of target pigs.
In this study, color features, the target centroid, and the length-width ratio of the minimum circumscribed rectangle were used as features to confirm targets; at the same time, pixel changes in the corresponding target region were used to judge whether these were the tracked moving targets or not. When a standing target pig does not move for a long time, the surrounding pixels do not change, and at that moment the algorithm does not register a target. When a target pig is lying down but parts of its body move, such as ear or tail swinging, the surrounding pixels change; if the area of change is greater than one-third of the pig's body, it is judged to be a target. The minimum circumscribed rectangle is then obtained according to the pixel changes of the corresponding region.
This study adopted a fixed number of target live pigs to track; targets were not tracked when the pigs were at rest. Therefore, only 1 target (Figure 8a, video sequence frame 7456), 2 targets (Figure 8b, video sequence frame 74), or 3 targets (Figure 8c, video sequence frame 539) were tracked in some parts of the video sequence. In order to verify the effectiveness of the algorithm, video sequence frames with 4 moving targets were selected (Figure 4). In Figure 4a, the algorithm tracks 4 pigs. In Figure 4b, despite partial occlusion of the targets, the tracking algorithm can still track all 4 pigs. In Figure 4c, the targets have separated, and the algorithm can still track the 4 targets well. In Figure 4d, when the pigs cluster together during the tracking process, the group is tracked as a whole moving object. In Figure 4e, when a target is more heavily occluded, the algorithm can still track the moving targets well. In Figure 4f, when the occluded object emerges from the occluded area, the algorithm remains robust. In Figure 4g, the algorithm is still valid when two moving targets interact, and in Figure 4h the algorithm continues to track the targets correctly. The JPDA method alone is used to track multiple targets in Figure 5; the JPDA algorithm is only suitable for tracking targets in a simple scene. In Figure 5a, the tracking result contains tracking errors due to noise interference in the video. In Figure 5b, the JPDA algorithm fails to track the targets during interactive behavior of the moving targets, and the interacting targets can only be combined and tracked as a single whole target. In Figure 5c, the JPDA algorithm still fails to track the moving targets correctly after the targets have separated. In Figure 5d, the JPDA algorithm fails again when the group of targets interacts, and the group is merged and tracked as one target. In Figure 5f, when the targets are separated and without interaction, the JPDA algorithm still cannot track the moving targets correctly.
In Figure 5g, the JPDA algorithm fails when a target is slightly occluded, and the moving targets cannot be tracked correctly. In Figure 5h, multiple targets are once again merged and tracked as one target. These tracking failures show that the JPDA algorithm alone is not robust for multi-target tracking.
The authors used 8 consecutive frames to describe the tracking process. As shown in Figure 6, the centroid position of a moving pig target and the size of its minimum circumscribed rectangle change little between two adjacent frames. That is the reason for choosing 72 consecutive frame images (Table 1, Figure 2, Figure 3). Figure 7 shows the 8-frame continuous tracking result of the JPDA algorithm, which again shows that the JPDA tracking algorithm is not robust.
The proposed tracking algorithm is applied to different numbers of moving targets in the video in Figure 8. There is one moving target in Figure 8a, two moving targets in Figure 8b, three moving targets in Figure 8c, and four moving targets in Figure 8d. It can be seen from Figure 8 that the algorithm tracks multiple moving targets well; therefore, the proposed tracking algorithm is robust.
The JPDA algorithm tracks different numbers of moving targets in the video in Figure 9. When there is only one moving target (Figure 9a), the JPDA algorithm cannot track the target correctly. When there are two moving targets (Figure 9b), the JPDA algorithm tracks them, but the tracking result is still affected by noise and contains errors. When there are three moving targets (Figure 9c), the JPDA algorithm tracks only one moving pig target. When there are four moving targets (Figure 9d), the JPDA algorithm cannot correctly identify the moving targets. Because of the limitations of the JPDA algorithm, its tracking robustness is poor. For the tracking results of the two algorithms, the tracking accuracy is compared according to the numbers of erroneous and correct frames, as shown in Table 2. The authors use the total number of moving targets (TM), the number of video image frames (SF), the number of correctly tracked frames (CF), the number of erroneously tracked frames (EF), the correct tracking rate (CR), and the error tracking rate (ER) to evaluate tracking performance and compare the proposed algorithm with the JPDA algorithm. Considering that the movement of the pig population is continuous, a pig may move only partially, with the ears or tail shaking, while otherwise stationary; thus, this study set a minimum area of change for the determination of a moving target pig. Meanwhile, other factors such as lighting changes in the video also affect the performance of the algorithm. In order to verify the effectiveness of the algorithm, consecutive frames were selected for testing and statistics; these consecutive frames include different frame counts and different target counts in the video sequence.
Statistics were obtained over 100 video image frames that include one moving target in the video sequence. The number of correctly tracked frames of the JPDA algorithm was 63 and the number of erroneously tracked frames was 37; thus, its correct tracking rate was 63% and its error tracking rate was 37%. The number of correctly tracked frames of the approach proposed in this study was 99 and the number of erroneously tracked frames was 1, giving a correct tracking rate of 99% and an error tracking rate of 1%.
Statistics were obtained over 200 video image frames that include two moving targets in the video sequence. The number of correctly tracked frames of the JPDA algorithm was 109 and the number of erroneously tracked frames was 91; thus, its correct tracking rate was 54.5% and its error tracking rate was 45.5%. The number of correctly tracked frames of our algorithm was 181 and the number of erroneously tracked frames was 19, giving a correct tracking rate of 90.5% and an error tracking rate of 9.5%.
Statistics were obtained over 300 video image frames that include three moving targets in the video sequence. The number of correctly tracked frames of the JPDA algorithm was 113 and the number of erroneously tracked frames was 187; thus, its correct tracking rate was 37.7% and its error tracking rate was 62.3%. The number of correctly tracked frames of our algorithm was 287 and the number of erroneously tracked frames was 13, giving a correct tracking rate of 95.7% and an error tracking rate of 4.3%.
Statistics were obtained over 1000 video image frames that include four moving targets in the video sequence. The number of correctly tracked frames of the JPDA algorithm was 366 and the number of erroneously tracked frames was 634; thus, its correct tracking rate was 36.6% and its error tracking rate was 63.4%. The number of correctly tracked frames of our algorithm was 903 and the number of erroneously tracked frames was 97, giving a correct tracking rate of 90.3% and an error tracking rate of 9.7%.
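The CR and ER figures above follow directly from the correct and erroneous frame counts; for example, using the four-target counts reported in this section:

```python
def tracking_rates(correct_frames, error_frames):
    """Correct tracking rate (CR) and error tracking rate (ER) as
    percentages of the evaluated frames."""
    total = correct_frames + error_frames
    return 100.0 * correct_frames / total, 100.0 * error_frames / total

# four-target sequence: 1000 frames evaluated for each algorithm
cr_jpda, er_jpda = tracking_rates(366, 634)   # JPDA baseline -> 36.6%, 63.4%
cr_ours, er_ours = tracking_rates(903, 97)    # proposed algorithm -> 90.3%, 9.7%
```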
The comparison of the experimental results shows that the tracking algorithm proposed in this study can effectively track multiple moving targets; the tracking algorithm is robust.

Conclusions
In this study, a multi-target tracking algorithm based on joint probability data association and particle filter was proposed. The experimental results from real video sequences show its reliable performance.
The algorithm is able to cope with partial occlusions.
The joint probabilistic data association filtering algorithm can cope with multi-target tracking when the number of tracked targets is fixed, but it cannot handle a variable number of targets. In a real scene, the number of moving objects is uncertain; as the next step, tracking a variable number of moving targets will be studied.