Prediction and evaluation method of TVB-N values distribution in pork by hyperspectral imaging

Hyperspectral imaging makes it possible to map the spatial distribution of quality-indicating attributes over biological targets. However, inconsistencies may arise between these newly available quality maps of the same piece of a biological target when they are produced by different chemometric models. Such inconsistency in the spatial prediction of the freshness-indicating attribute total volatile basic nitrogen (TVB-N) over pork loins (Longissimus dorsi) was therefore investigated in this work from the perspective of the accuracy and variation of pixel-wise predictions over the top surfaces. Partial least squares regression (PLSR)-based chemometric modelling was performed on the characteristic spectra extracted from regions of interest (ROIs) covering either the whole meat or the lean part only, after spectral preprocessing and spatial filtering, and coupled with waveband selection using the successive projections algorithm (SPA). Results showed that all PLSR models achieved good and stable predictions of the average TVB-N values of pork loins, despite differences in data preprocessing or number of wavebands, with Rp ranging from 0.832 to 0.889 for Meat-ROIs and from 0.846 to 0.912 for Lean-ROIs. Even with a reduced number of wavebands, accuracy remained high, with Rp ranging from 0.722 to 0.889 for Meat-ROIs and from 0.704 to 0.900 for Lean-ROIs. In contrast, large differences emerged between the predicted TVB-N distributions over a top surface. Excessive variation of pixel-wise predictions made the distribution maps obtained by directly applying the PLSR or PLSR-SPA models from Meat- or Lean-ROIs unusable: the root-mean-square error (RMSE) of pixel-wise predictions exceeded 21.6 mg/100 g with PLSR and was even worse with PLSR-SPA, at about 55 mg/100 g. After appropriate spatial filtering, the excessive between-pixel variation was suppressed to below 8.5 mg/100 g in RMSE, giving visually better distribution maps, but at a significant loss in accuracy: Rp dropped from 0.79 to 0.58 for Meat-ROI and from 0.86 to 0.42 for Lean-ROI. In conclusion, although hyperspectral imaging is good and reliable for predicting the average TVB-N of larger surfaces such as pork loins, predicting the TVB-N distribution is more challenging. A compromise must be made between the accuracy and visual quality of such quality maps and a meaningful variation of the pixel-wise values, possibly with appropriate spatial filtering. A quality-indicating map, even a visually pleasing one, should not be trusted without an evaluation of its accuracy and pixel-wise variation.


Introduction
Pork is favored for its unique flavor and is widely consumed around the world. Pork is processed in many ways: some allow the meat to be preserved for a long time [1], while others lead to rapid degradation of its quality [2]. The freshness of pork is therefore an important factor for consumers.
Pork freshness is mainly evaluated by total volatile basic nitrogen (TVB-N), the K-value, the pH value, and polyamine compounds. TVB-N was chosen as the freshness indicator in this work; it results from increased microbial growth and the formation of biogenic amines during spoilage [3].
Detection methods comprise traditional methods and non-destructive methods. Traditional methods are destructive, time-consuming, labor-intensive, and polluting, which has motivated the development of non-destructive testing.
Non-destructive testing methods include near-infrared spectroscopy, terahertz spectroscopy [4], and near-infrared imaging [5,6]. A near-infrared spectrometer, for example, measures the spectral information of a single point, from which a parameter detection model can be built. However, near-infrared spectroscopy (NIRS) is limited to point measurements and lacks spatial information, so it cannot meet the requirements of spatial detection over the tested object. Hyperspectral imaging technology can solve this problem.
Hyperspectral imaging (HSI) is a powerful technology that acquires both spatial and spectral information with the help of computer technology and spectral analysis technology. Zhao et al. [7] studied the detection of greengage acidity based on hyperspectral imaging with three dimension-reduction methods and obtained an RMSE of 0.0706 and a correlation coefficient of prediction of 0.7925.
Recently, more attention has been paid to the visualization of models. Guo et al. [8] obtained an $R_p^2$ of 0.944 and an RMSE of 2.07 mg/100 g by comparing the characteristic wavelengths selected by PCA and 2DPCA.
In the literature, spatial information of a hyperspectral camera has been reported many times, but the spectral-spatial information of hyperspectral imaging technology has not been reported to date.
However, it is known that inconsistencies may arise between quality maps produced by chemometric models of similar competence according to conventional verification [9]. Therefore, this work investigated such inconsistency in the spatial prediction of the freshness-indicating attribute total volatile basic nitrogen (TVB-N) over pork loins (Longissimus dorsi) using hyperspectral imaging-based chemometrics, from the perspectives of both the accuracy and the variation of pixel-wise predictions over the top surfaces.

Materials and measurement

Samples
Chilled fresh pork loins (Longissimus dorsi) were chosen as the research object. Such pork shows spoilage when stored at 4°C for about 20 d. Fresh pork samples were purchased from a local supermarket (Metro Market, Nanjing, China) on the morning of the first day of the experiment and transported to the laboratory in ice boxes at 2°C-6°C within 30-40 min. Five pieces of pork were bought and cut into 100 samples. The samples had a similar size of 10.0 cm × 10.0 cm × 1.0 cm (length × width × thickness). All samples were packaged and tagged in PVC plates sealed with PE film and stored at 4°C in a refrigerator.

Measurement of TVB-N content
The TVB-N content was measured by the semi-micro Kjeldahl method based on the procedure mandated by the Chinese national standard [10]. The TVB-N content was calculated according to the following equation: where TVB-N is the TVB-N content of the meat sample, mg/100 g; $V_1$ is the consumption of HCl by the meat filtrate, mL; $V_2$ is the consumption of HCl by the blank water, mL; $c$ is the concentration of the HCl, mol/L; and $m$ is the mass of the meat, g [11].
TVB-N is a widely accepted freshness indicator for muscle-protein foods such as fish [12] and meat. However, the spoilage threshold varies between standards and between subjects, e.g., fish or meat. Following common practice [12-14], the spoilage threshold for fresh and frozen meat according to the National Standard [15] was adopted in this work as the rejection limit.
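For orientation, the semi-micro Kjeldahl calculation described above is commonly written in the form sketched below. This is an assumption based on the general titration method, not a reproduction of the expression used in this work: the factor 14 is the molar mass of nitrogen in g/mol, and $f$ is a symbol introduced here for the fraction of the sample filtrate that is actually distilled and titrated, which the text does not state.

```latex
\mathrm{TVB\text{-}N} = \frac{(V_1 - V_2)\, c \times 14}{m \times f} \times 100
```

With $V_1$ and $V_2$ in mL, $c$ in mol/L and $m$ in g, the numerator is the titrated nitrogen in mg, so the result is in mg/100 g.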

Non-destructive testing spectral imagery system
A non-destructive spectral imagery system was constructed (Figure 1), comprising a push-broom hyperspectral camera (GaiaField-N17E, Sichuan Dualix Spectral Imaging Technology Co., Ltd, China) with an internal optical push-broom structure, a diffuse-reflective dome, a ring light source, a dark box, and a computer (T570, Lenovo, China).
The system collected spectral images at a full resolution of 371×640 pixels in the range of 900-1700 nm with an increment of 5 nm, producing a spectral cube with a total of 512 bands for each scan. The sample and the system were stationary relative to each other during data acquisition.
Figure 1 Diagram of the non-destructive spectral imagery system

Spectra extraction and preprocessing
To study the performance of models based on spectral data extracted from different regions, the lean regions and the meat regions were segmented from the hyperspectral images. A threshold segmentation method was adopted to choose the regions of interest.
Before that, raw hyperspectral images were calibrated by standard reflectivity transformation. Calibration transformed the image data from raw intensity to relative reflectivity in order to minimize the effect of the system. The calibration used a dark image and a white image. The dark image was captured with the lens cover on. The white image was captured from a reflectance standard (SRT-99-100, Labsphere, USA). Both images were captured daily to calibrate the images. A relative reflectivity image cube was calculated by Equation (2) [16]:
$$R = \frac{R_{raw} - R_{dark}}{R_{white} - R_{dark}} \times 100\% \quad (2)$$
where $R$ is the relative reflectivity image cube, %; $R_{raw}$ is the raw spectral image; $R_{white}$ is the spectral image of the 99% reflectance standard; and $R_{dark}$ is the dark spectral image.
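The calibration step can be illustrated with a short sketch: the dark image is subtracted from both the raw and the white image, and the ratio is expressed in percent. The function name, the array layout and the small guard value `eps` are assumptions made for illustration and are not taken from the original processing code.

```python
import numpy as np

def calibrate_reflectance(raw, white, dark, eps=1e-9):
    """Relative reflectivity in percent: (raw - dark) / (white - dark) * 100.

    raw, white and dark are arrays of the same (or broadcastable) shape,
    e.g. (rows, cols, bands). eps is a hypothetical guard for detector
    elements where the white and dark readings coincide.
    """
    raw = np.asarray(raw, dtype=float)
    white = np.asarray(white, dtype=float)
    dark = np.asarray(dark, dtype=float)
    denom = white - dark
    # Avoid division by zero on dead or saturated detector elements
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return (raw - dark) / denom * 100.0
```

A pixel that reads the same as the dark image maps to 0 %, and a pixel that reads the same as the white standard maps to 100 %.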
To separate the meat from the background and the lean from the fat, a threshold algorithm was used to obtain masks of the meat region and the lean region. The mask of the meat region was generated at band 207, and the mask of the lean region at band 180. The masks were used to extract the corresponding regions, over which the mean value of the pixels was computed. The mean values were treated as the spectral values of the whole pork sample.
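The masking and averaging steps might look like the following minimal sketch. Only the band numbers 207 and 180 come from the text; the threshold values, the direction of each comparison, the zero-based band indexing and the synthetic cube are all assumptions chosen so that the example runs.

```python
import numpy as np

def threshold_mask(cube, band, threshold, above=True):
    """Boolean mask from one band of a (rows, cols, bands) cube."""
    image = cube[:, :, band]
    return image > threshold if above else image < threshold

def mean_spectrum(cube, mask):
    """Average the spectra of all pixels selected by a boolean mask."""
    return cube[mask].mean(axis=0)

# Synthetic scene (hypothetical values): dark background, lean and fat areas
rng = np.random.default_rng(0)
cube = rng.uniform(0.0, 5.0, size=(6, 8, 256))
cube[2:5, 2:7, :] = 40.0   # meat region, lean reflectivity
cube[2:5, 2:4, :] = 70.0   # brighter fat strip inside the meat region

meat_mask = threshold_mask(cube, band=207, threshold=20.0)
# Lean pixels: inside the meat mask and below a hypothetical fat threshold
lean_mask = meat_mask & threshold_mask(cube, band=180, threshold=55.0, above=False)

meat_spectrum = mean_spectrum(cube, meat_mask)
lean_spectrum = mean_spectrum(cube, lean_mask)
```

Each sample is thereby reduced to one Meat-ROI spectrum and one Lean-ROI spectrum, which are the inputs to the regression models.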
The mean spectra were obtained for the range 900-1700 nm. All spectra contained wavebands with high levels of noise, which hindered modeling. A signal-to-noise ratio (SNR)-based waveband elimination algorithm was adopted to remove the parts that did not contain information [17]. Finally, a data range of 993.8-1460.9 nm was retained.
Furthermore, several traditional preprocessing methods were applied to the datasets, including standard normal variate (SNV), multiplicative scatter correction (MSC) and Savitzky-Golay smoothing (S-G).
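Two of the preprocessing methods named above, SNV and MSC, can be sketched in a few lines. This is a generic illustration of the standard definitions, not the code used in this work; in particular, using the mean spectrum of the set as the MSC reference is an assumption.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row)."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum s is regressed on the reference (s ~ slope * ref + offset)
    and corrected as (s - offset) / slope. The reference defaults to the
    mean spectrum of the set, which is an assumption here.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, offset = np.polyfit(ref, s, deg=1)
        corrected[i] = (s - offset) / slope
    return corrected
```

Savitzky-Golay smoothing is available in common libraries, for example as `scipy.signal.savgol_filter`, with a window length and polynomial order that the text does not specify.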

Modeling
To study the relationship between TVB-N values and spectral variables, PLSR was used to build the prediction model of pork freshness. PLSR is a conventional and widely recognized method that has been used in many areas. It is a regression method that relates multivariate inputs to one or more outputs through a small number of latent variables, which makes it suitable for modeling hyperspectral datasets.
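For readers unfamiliar with PLSR, the following is a minimal single-response (PLS1) sketch written with NumPy only. It is not the MATLAB implementation used in this work; the function names are hypothetical, and the number of latent variables is left as a parameter because the text does not report it.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS regression (PLS1 with deflation).

    Returns (beta, intercept) so that y_hat = X @ beta + intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean        # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                      # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                         # scores of this latent variable
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        qk = (yk @ t) / tt                 # y loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - qk * t                   # deflate y
        W.append(w)
        P.append(p)
        q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q) # regression vector in the original X space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def pls1_predict(X, beta, intercept):
    """Predict the response for new spectra (rows of X)."""
    return np.asarray(X, dtype=float) @ beta + intercept
```

When the number of latent variables equals the rank of the centred spectra, this reduces to ordinary least squares; using fewer latent variables is what makes the method usable with many collinear wavebands.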
The predictive accuracy of the PLSR model was described by $\mathrm{RMSE}_p$ and $R_p^2$. A robust model should have a small $\mathrm{RMSE}_p$ and a large $R_p^2$.

Statistics
1) Sampling and replicates
Special care was given to the sampling location and the immediacy of measurements to make sure that a sample's hyperspectral images and its measured TVB-N values matched each other. Each sample was first imaged in the Spectral Imaging Laboratory, Nanjing Forestry University, then transported immediately in an icebox to the Food Engineering Laboratory, two stories below in the same building, where only the 5 mm top layer of the sample was sliced for chemical measurement to match the effective depth of reflectance hyperspectral imaging.
Two replicates were performed for chemical measurement, and the measurement of a sample was valid only when the 2 replicates agreed with each other within a 10% margin.
2) Data processing
The dataset was partitioned into a training set and a test set at a ratio of 3:1. All the modelling and other data processing were programmed using the statistics toolbox in MATLAB R2017b (MathWorks, USA). For PLSR training, five-fold cross-validation was applied to the training set. TVB-N value predictions and quality maps of TVB-N distributions were then produced for the samples of the test set using each modelling method.
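The partitioning and the two accuracy measures can be sketched as follows. The random assignment of samples, the seed, and the use of the coefficient of determination for $R^2$ are assumptions, since the text does not state how the split was drawn or how $R_p^2$ was computed; the function names are hypothetical.

```python
import numpy as np

def split_train_test(n_samples, test_fraction=0.25, seed=0):
    """Randomly split sample indices 3:1 into training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

def k_fold_indices(train_idx, k=5):
    """Yield (fit, validation) index arrays for k-fold cross-validation."""
    folds = np.array_split(train_idx, k)
    for i in range(k):
        fit = np.concatenate([folds[j] for j in range(k) if j != i])
        yield fit, folds[i]

def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

With 97 samples, `split_train_test(97)` gives 73 training and 24 test samples, and `k_fold_indices` can be used to compare candidate numbers of latent variables on the training set only.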

Dataset
The dataset of this work was built from the hyperspectral images and TVB-N reference values of 97 samples, after excluding 2 failed chemical measurements in which the replicates did not match and 1 outlier, possibly due to sample pollution, which is shown in Figure 2 as the 19th datapoint or group D. The TVB-N content of the samples ranged from 3.837 to 38.027 mg/100 g, with an average of 11.817 mg/100 g and a standard deviation of 5.423 mg/100 g. As shown in Figure 2, samples from all five pork-loin subjects measured during the 20 d experiment covered the freshness range of chill-stored pork well, ranging from very fresh at 3.x or 5.x mg/100 g, past the national standard [15] spoilage threshold of 15 mg/100 g, up to almost 30 mg/100 g.

Prediction models
After extracting spectra with the different masks, two groups of data were used to establish prediction models. Different kinds of pre-treatment resulted in different model performances. The original spectra and the spectra after standard normal variate (SNV) transformation of the meat and lean regions are shown in Figure 3, and the model results are listed in Table 1. The meat data group had the best $\mathrm{RMSE}_p$ of 1.828 mg/100 g and the best $R_p^2$ of 0.889 (data without preprocessing). The lean data group had the best $\mathrm{RMSE}_p$ of 1.623 mg/100 g and the best $R_p^2$ of 0.912 (data preprocessed with MSC (mean)). The experimental results show that modeling based on the full spectral range is feasible.

PLSR prediction based on feature wavebands
To reduce the size of the input data and improve the performance of the model, feature wavebands were extracted. The feature wavebands were selected by SPA, and the results are listed in Table 2. The meat data group had the best $\mathrm{RMSE}_p$ of 1.823 mg/100 g and the best $R_p^2$ of 0.889 (data preprocessed with normalization). The lean data group had the best $\mathrm{RMSE}_p$ of 1.734 mg/100 g and the best $R_p^2$ of 0.900 (data preprocessed with MSC (mean)).
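The core of the successive projections algorithm is a chain of orthogonal projections that picks, at each step, the waveband least collinear with those already chosen. The sketch below shows only this projection chain, under stated assumptions: the starting waveband and the number of wavebands are passed in as parameters (in a full SPA they are usually chosen by validation error), the columns are mean-centred first, and the function name is hypothetical.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Select wavebands by successive orthogonal projections.

    X has one spectrum per row and one waveband per column. Starting from
    column `start`, each step projects all columns onto the subspace
    orthogonal to the last selected column and picks the column with the
    largest remaining norm.
    """
    Xp = np.asarray(X, dtype=float)
    Xp = Xp - Xp.mean(axis=0)              # centre each waveband
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Remove from every column its component along the last selected column
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0             # never pick the same waveband twice
        selected.append(int(np.argmax(norms)))
    return selected
```

The selected column indices can then be used to subset the spectra before fitting a PLSR model on the reduced set of wavebands.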

Visualization of freshness
The spectra of each sample were extracted from the meat and lean regions, so that each spectrum represented the spectral information of the whole sample. Most previous research has used this method to build models of different contents. This method collapses the spatial information of the hyperspectral image, in effect fusing the whole image into one pixel. In this way, hyperspectral processing saves a great amount of time. However, most samples have complex spatial information. Pork has lean and fat regions, which differ greatly in reflectivity. Therefore, it is meaningful to reproduce the spatial information.
A TVB-N visualization method was applied to analyze the distribution of TVB-N over the surface of pork. Using the models built before, the spectrum of each pixel position in the spectral cube was used as the input to compute the TVB-N value of that pixel.
The equation is denoted as:
$$V_{\text{TVB-N}} = \mathrm{SPEC}_{1\times(n+1)} \cdot \mathrm{Beta}_{(n+1)\times 1} \quad (3)$$
The results of the visualization are shown in Figure 4. The predicted values, from the freshest to the most spoiled over the surfaces, are displayed in pseudo colors from blue to red. The upper limit of the displayed TVB-N values was set to 30 mg/100 g for better visualization.
Figure 4 Prediction mapping of TVB-N
As shown in Figure 4, TVB-N values display a discrete distribution.
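The pixel-wise prediction of Equation (3) can be sketched as a single matrix product over the flattened cube. It is assumed here that the extra element of the 1×(n+1) spectrum vector is a constant 1 placed last, so that the last element of Beta acts as the intercept; the text does not state this, and the function names are hypothetical.

```python
import numpy as np

def predict_map(cube, beta, intercept, mask=None):
    """Apply a linear model to every pixel of a (rows, cols, n) cube.

    Each pixel spectrum is augmented with a trailing 1 and multiplied by
    the (n+1)-element coefficient vector [beta, intercept].
    """
    rows, cols, bands = cube.shape
    spec = cube.reshape(-1, bands)
    augmented = np.column_stack([spec, np.ones(spec.shape[0])])
    coef = np.append(np.asarray(beta, dtype=float), intercept)
    values = (augmented @ coef).reshape(rows, cols)
    if mask is not None:
        values = np.where(mask, values, np.nan)   # blank out non-ROI pixels
    return values

def cap_for_display(tvbn_map, upper=30.0):
    """Cap values at the display limit (30 mg/100 g in this work)."""
    return np.minimum(tvbn_map, upper)
```

The capped array is intended only for rendering; any accuracy or variation statistics would be computed from the uncapped output of `predict_map`.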
The prediction mappings contain several longitudinal fringes of noise. At these noisy pixels, the predicted TVB-N differed greatly from that of adjacent pixels, which produced the stripes in the prediction. Furthermore, because the mapping was generated pixel by pixel, the predicted values varied strongly from one pixel to the next. In the mappings of some samples, part of the pixels had values over 30 mg/100 g. This made it hard to analyze the exact state of the samples. It was therefore meaningful to preprocess the images to obtain a better mapping.
To reduce the impacts of the noise, spatial mean filters were adopted to process the raw hyperspectral images. The spatial filters used on images were the square filter and disk filter, which have different effects on images. The size of the filters was 25 pixels. The same processing methods adopted for the raw hyperspectral images were also adopted after filtering. The mappings are shown in Figure 4. The mapping contains the results of 10 samples by five kinds of processing.
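A spatial mean filter with a square or disk-shaped window can be sketched with NumPy alone, as below. Interpreting the 25-pixel size as the window width (for the disk, a diameter of 25 pixels), applying the filter band by band, using reflective padding at the borders, and the function names are all assumptions; the text does not give these details. The sketch expects an odd window size.

```python
import numpy as np

def square_kernel(size):
    """Square averaging window of size x size pixels."""
    return np.ones((size, size), dtype=bool)

def disk_kernel(size):
    """Disk-shaped averaging window inscribed in a size x size square."""
    r = size // 2
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    return yy ** 2 + xx ** 2 <= r ** 2

def mean_filter_band(band, kernel):
    """Average each pixel over the True cells of the kernel (odd size)."""
    band = np.asarray(band, dtype=float)
    rows, cols = band.shape
    ph, pw = kernel.shape[0] // 2, kernel.shape[1] // 2
    padded = np.pad(band, ((ph, ph), (pw, pw)), mode="reflect")
    out = np.zeros_like(band)
    # Sum shifted copies of the image, one per active kernel cell
    for dy, dx in zip(*np.nonzero(kernel)):
        out += padded[dy:dy + rows, dx:dx + cols]
    return out / kernel.sum()

def mean_filter_cube(cube, kernel):
    """Filter every band of a (rows, cols, bands) cube independently."""
    bands = [mean_filter_band(cube[:, :, b], kernel) for b in range(cube.shape[2])]
    return np.stack(bands, axis=2)
```

Averaging over such a large window suppresses pixel-to-pixel noise but also blurs genuine local differences, which is the trade-off between map smoothness and accuracy examined in this section.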
As shown in Figure 4, the mappings improved after filtering. Five parameters were used to evaluate the performance of each mapping, and the results are shown in Figure 5. In Figure 5a, the bars show that filtering and SPA made little improvement to the accuracy of the model. Because of the complexity of the samples and the presence of noise, the model fitted the reference values as closely as it could, and those values carried the influence of both factors. In contrast, Figure 5b shows large differences in the parameters among the 12 kinds of models. Here, the $R^2$ and RMSE are different from the $R^2$ and RMSE in Figure 5a. The $R^2$ denotes the R-square value between the mean of the predicted TVB-N values of all pixels in the region of interest and the real TVB-N values. The RMSE denotes the root mean squared error between the pixel-wise TVB-N predictions in the region of interest and the real TVB-N value, averaged over the 24 test samples. The models without filtering showed a good $R^2$ performance and a poor RMSE performance.
In particular, after choosing feature wavebands by SPA, $R^2$ and RMSE both deteriorated. Although the $R^2$ value was good, the RMSE could not meet the previously mentioned demand. For models that used filtered hyperspectral images, both $R^2$ and RMSE decreased. In Figure 5b, the models built on data without feature waveband selection yielded $R^2$ values under 0.3, worse than those of the models built on data processed with SPA. The models based on meat and lean data filtered by a disk filter yielded $R^2$ values of 0.5754 and 0.4225, respectively. With regard to RMSE, each model yielded a value under 9 mg/100 g, which means that the TVB-N prediction values fluctuated only slightly around the real TVB-N values.
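The two map-level measures discussed here can be sketched as follows. This reflects one reading of the definitions, namely an $R^2$ computed from the per-sample mean of the pixel predictions and an RMSE computed pixel-wise against each sample's reference value and then averaged over samples; the function name and the use of NaN for pixels outside the region of interest are assumptions.

```python
import numpy as np

def map_metrics(pixel_maps, reference):
    """Evaluate predicted TVB-N maps against one reference value per sample.

    pixel_maps: list of 2-D arrays, NaN outside the region of interest.
    reference:  measured TVB-N value of each sample.
    Returns (r2_of_means, mean_pixelwise_rmse).
    """
    ref = np.asarray(reference, dtype=float)
    means = np.array([np.nanmean(m) for m in pixel_maps])
    ss_res = np.sum((ref - means) ** 2)
    ss_tot = np.sum((ref - ref.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Spread of the pixel predictions around each sample's reference value
    per_sample = [np.sqrt(np.nanmean((m - y) ** 2)) for m, y in zip(pixel_maps, ref)]
    return float(r2), float(np.mean(per_sample))
```

A map whose pixels scatter widely but symmetrically around the reference value scores a high $R^2$ and a high RMSE at the same time, which is the pattern reported for the unfiltered models.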

Conclusions
PLSR and SPA-PLSR models were built to predict pork freshness and its distribution over the top surface. Results showed that good accuracy of the average TVB-N value per piece can be reliably achieved regardless of Meat- or Lean-ROI and of spectral or spatial preprocessing. In contrast, hyperspectral imaging-based prediction of the TVB-N distribution over top surfaces is more challenging and can yield contradictory freshness maps. A quality-indicating map, even a visually pleasing one or one obtained from a chemometric model that predicts average values well, should not be trusted without evaluating its accuracy and pixel-wise variation.
To further improve TVB-N visualization for a possible interpretation of the freshness distribution over surfaces, future work should address the elimination of longitudinal fringe noise over pork surfaces in hyperspectral imaging-based quality prediction.