Nondestructive determination of GABA in germinated brown rice with near infrared spectroscopy based on wavelet transform denoising

The objective of this study was to determine the content of γ-aminobutyric acid (GABA) in germinated brown rice (GBR) using near-infrared spectroscopy (NIRS) with wavelet de-noising (WD) as the spectral pretreatment. The NIRS model established after de-noising with the Daubechies 5 (db5) wavelet basis function at decomposition level 3 had the highest prediction accuracy: the correlation coefficient for calibration (rc) was 0.931, the root mean square error of calibration (RMSEC) was 0.4038 mg/100 g, the Bias of calibration was 0.006, the correlation coefficient for prediction (rp) was 0.916, the root mean square error of prediction (RMSEP) was 0.4329 mg/100 g, the Bias of prediction was 0.010, and the ratio of performance to deviation (RPD) was 4.911. The predicted and measured values were highly correlated. These results indicate that an NIRS model built on WD-pretreated spectra is feasible for detecting GABA content in GBR rapidly and nondestructively.

Introduction
Germinated brown rice (GBR) is a kind of active brown rice obtained by germinating brown rice under suitable conditions until the bud length is 0.5-1.0 mm and then drying it at low temperature [1,2]. During germination, a variety of endogenous enzymes are activated, and macromolecular substances are degraded, transferred and synthesized [3]. GBR has been reported to contain more bioactive substances than brown rice, such as ferulic acid, oryzanol, tocopherol, and especially γ-aminobutyric acid (GABA) [4][5][6]. The content of GABA in GBR is reported to be ten times higher than that of milled white rice and two times higher than that of brown rice [7]. GABA, which is widely distributed in animals and plants, is an important inhibitory neurotransmitter in the central nervous system of mammals [8]. GABA has the physiological activities of lowering blood pressure, improving brain function, activating liver and kidney function, and promoting ethanol metabolism [9].
There are many methods for the analysis of GABA content in GBR. Bi-directional paper chromatography spectrophotometry and enzyme method are the earliest methods [10] . Compared with bi-directional paper chromatography spectrophotometry, enzyme method is simpler and more sensitive, but the reagents are more expensive [11] . Then, high performance liquid chromatography, which is different from traditional detection methods, is used to obtain reference data [12] . Other analysis methods for GABA content are based on colorimetric method [13] . Currently, amino acid analyzer is often used for quantitative analysis of GABA content [14][15][16] . All these detection methods are generally expensive, difficult and destructive to measurement targets. These methods are also not suitable for the real-time detection of GABA content in GBR.
Near-infrared spectroscopy (NIRS) analysis is a fast and non-destructive detection method that requires no sample pretreatment [17][18][19][20]. In recent years, NIRS has been used to analyze the content of rice protein and amylose [21], the moisture content of rough rice [22], the component content of fruits [23] and the postharvest quality of fruits [24]. The spectral instrument and the analysis environment can introduce spectral noise and other non-informative factors during measurement, and these factors affect the prediction accuracy of the near-infrared spectral model. The spectral data are therefore usually pretreated with the first derivative (FD), second derivative (SD), multiplicative scatter correction (MSC), standard normal variate (SNV), baseline offset correction (BOC) or direct orthogonal correction (DOC), or normalized, to remove multiplicative scattering or interference caused by sample size distribution and baseline shifts [25]. The spectral data may also be smoothed or filtered [26]. The idea of wavelet transformation is that the chemical signal can be decomposed into multiple scale components according to frequency, and the sampling step size of each scale component can be chosen accordingly, so as to focus on any part of the signal [27]. Wavelet transformation can be used to filter spectral noise and has been successfully applied to the de-noising pretreatment of NIRS [28,29]. Thus, wavelet transformation has great potential in the field of spectral analysis.
As mentioned above, many studies suggest that NIRS analysis method and the pretreatment method of wavelet transformation may have great potential in the analysis of the component content of agricultural products. Therefore, the objective of this study was to apply wavelet de-noising to preprocess the NIRS signals of GABA content in GBR. The effect of wavelet basis function on the prediction accuracy of the NIRS model of GABA content in GBR was studied. The NIRS prediction model of GABA content in GBR was established. The prediction accuracies of the NIRS models established after the wavelet de-noising and conventional spectral pretreatment were evaluated.

Brown rice samples
The variety of the paddy rice used in the experiment was Early 944, and the rice samples were provided by the rice test station of Huazhong Agricultural University. After harvest, the paddy rice samples were dried to the moisture content of 14.5% immediately [30] . Then, they were stored for 3 months at 4°C. Before the germination test, the paddy rice was hulled with a sheller (THU-35B type, Satake, Tokyo, Japan) and processed into brown rice. Mildew grains, discolored grains, embryo free grains, immature grains and broken grains were removed [30] . The brown rice grains with uniform size were selected [30] . They were first rinsed 3 times with deionized water and disinfected for 25 min with sodium hypochlorite solution of 0.2 mol/L. Then, they were rinsed again several times with deionized water. The surface moisture of the brown rice was removed. The samples were kept in reserve.

GBR preparation
GBR was produced by referring to the method of Zhang et al. [13] . The 500 g of brown rice after disinfection and rinsing was placed in a 1000 ml beaker. With 600 ml of distilled water added, the brown rice samples were soaked at 30 °C for 12 h and were then drained. After that, the brown rice samples were evenly put in big culture dishes covered with gauze. The big culture dishes were placed in a constant temperature incubator (CTHI-150(A)B type, temperature fluctuation ±0.2°C, Shi Dukai Equipment Co., Ltd., Shanghai, China) for the germination of brown rice at 15°C, 20°C, 25°C, 30°C and 35°C. The germination time of each germination temperature was 16 h, 20 h, 24 h, 28 h and 32 h, respectively. Then, GBR was dried in an oven (DGH-9053A Yiheng Equipment Co., Ltd., Shanghai, China) for 6 h at 50°C. After cooling, one hundred samples were packaged with self-sealing bags and stored in the refrigerator at 4°C in reserve.

Near-infrared scanning
One hundred GBR samples were collected through germination tests. The sample sets were divided into 2 subsets, including calibration set and prediction set. The ratio of them was 4:1. The calibration set of eighty GBR samples was used to establish the NIRS calibration model. The remaining twenty GBR samples were used as the prediction set to test the prediction performance and stability of the established calibration model. First, GBR samples were used for NIRS analysis. After that, they were ground into powder by a miniature plant sample grinder (FZ102 type, Tester Equipment Co., Ltd., Tianjin, China) and used to obtain reference data. All GBR samples were repeatedly tested.
A Fourier transform near-infrared spectrometer (ANTARIS Ⅱ type, Thermo Nicolet, Madison, USA) was used for spectral scanning in diffuse reflection mode. The spectral range was 10 000-4000 cm⁻¹ (1000-2500 nm). Each spectrum was acquired with 64 scans at a resolution of 8 cm⁻¹. The spectral data of each sample were recorded in the form of log (1/R), where R represents reflectivity. GBR samples were placed on a rotating sample stage for spectral scanning. Three spectra were collected for each GBR sample, and their average was used for spectral analysis.
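The acquisition settings above imply a few simple calculations: converting wavenumber to wavelength, expressing reflectance as log (1/R), and averaging the repeated spectra of one sample. The following is a minimal sketch of these steps, not the instrument software's procedure. The function names are hypothetical, and a base-10 logarithm is assumed for log (1/R), which the text does not state.

```python
import math

def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm (1 cm = 1e7 nm)."""
    return 1e7 / wavenumber_cm1

def absorbance(reflectivity):
    """Express reflectivity R as log(1/R); base 10 is an assumption."""
    return math.log10(1.0 / reflectivity)

def mean_spectrum(spectra):
    """Average several spectra of one sample, point by point."""
    return [sum(values) / len(values) for values in zip(*spectra)]

# The stated range 10 000-4000 cm^-1 corresponds to 1000-2500 nm.
range_nm = (wavenumber_to_nm(10000), wavenumber_to_nm(4000))
```

Averaging the repeated spectra before modeling reduces random measurement noise without changing the shape of the underlying spectrum.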

GABA content analysis
After scanning, 10 g of each GBR sample was ground and passed through a 0.25 mm sieve. A 1 g portion of the GBR powder was weighed into an Erlenmeyer flask and dissolved in 0.02 mol/L hydrochloric acid solution, and 6% sulfosalicylic acid solution was added. The mixture was heated under reflux for 5 min in a boiling water bath. After shaking for 30 min, it was transferred to a 50 mL volumetric flask and diluted to the mark with citrate buffer solution of pH 2.2. Then, after standing for 1 h at room temperature, it was centrifuged at 1000 r/min for 15 min. The preparation of the GABA standard solution and the chromatographic conditions followed the method of Zhang et al. [16]. The solution was filtered through a 0.45 μm membrane and then measured with an amino acid analyzer (L-8800, Hitachi, Hitachinaka, Japan). The minimum value, maximum value, mean value and standard deviation of the measured GABA content are presented in Table 1.

Pretreatment and analysis of NIRS data
Spectral pretreatment method
The conventional spectral pretreatment methods and the wavelet de-noising analysis method were used to preprocess the spectral data of GBR samples.
The conventional spectral pretreatment methods included FD, SD, smoothing, normalizing, BOC, MSC, DOC and SNV pretreatment.
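Three of the conventional pretreatments listed above can be illustrated in a few lines. The following is a minimal sketch, not the Unscrambler implementation used in this study. The function names are hypothetical, the first derivative is taken as a simple finite difference (an assumption; commercial software often uses smoothed derivatives), and MSC is taken against the mean spectrum of the set.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

def first_derivative(spectrum):
    """First derivative as a plain finite difference between adjacent points (assumption)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_points)]
    ref_mean = sum(ref) / n_points
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        # Least-squares fit s = offset + slope * ref
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_var
        offset = s_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in s])
    return corrected
```

SNV works on each spectrum independently, whereas MSC needs the whole set to define the reference spectrum, so an MSC reference computed on the calibration set would normally be reused for the prediction set.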

Spectral wavelet transformation de-noising
The original spectral signals of GBR samples were decomposed and de-noised with Daubechies (dbN), Coiflet (coifN) and Symlet (symN) wavelets. Each of these is a wavelet family containing multiple wavelet bases, and the choice of wavelet basis has a great influence on the de-noising of the spectral signal. To select the wavelet basis, the wavelet decomposition level was first fixed at 3. At a given threshold, the spectral signals were de-noised and reconstructed. The de-noised spectral data and the GABA content reference values of the GBR samples were imported into Unscrambler 10.3 analysis software (Camo, Norway), and the NIRS calibration model of the GABA content in GBR samples was established. The optimal wavelet basis function was selected according to the modeling results. Wavelet decomposition levels of 3, 4, 5 and 6 were then compared for noise elimination. The optimal wavelet basis function and wavelet decomposition level were determined according to the spectral modeling results.
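The decompose, threshold and reconstruct sequence described above can be sketched in a few lines. This is an illustration only, under several assumptions that are not taken from this study: the Haar wavelet is swapped in for db5 so that the example needs no external library, soft thresholding is used, and the default threshold is the universal threshold with the noise level estimated from the finest detail coefficients. All function names are hypothetical. In practice a wavelet library such as PyWavelets would supply the db, coif and sym bases.

```python
import math
import statistics

SQRT_HALF = 1.0 / math.sqrt(2.0)

def haar_dwt(signal):
    """One level of the orthonormal Haar transform (even length required)."""
    approx = [(signal[i] + signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert one level of the Haar transform."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SQRT_HALF, (a - d) * SQRT_HALF])
    return out

def soft_threshold(coeff, threshold):
    """Shrink a coefficient towards zero by the threshold (soft rule, an assumption)."""
    return math.copysign(max(abs(coeff) - threshold, 0.0), coeff)

def wavelet_denoise(signal, level=3, threshold=None):
    """Decompose to `level`, threshold the detail coefficients, reconstruct."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    approx, details = list(signal), []
    for _ in range(level):
        approx, detail = haar_dwt(approx)
        details.append(detail)  # details[0] holds the finest scale
    if threshold is None:
        # Universal threshold with a median-based noise estimate (assumption).
        sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
        threshold = sigma * math.sqrt(2.0 * math.log(len(signal)))
    for detail in reversed(details):
        approx = haar_idwt(approx, [soft_threshold(c, threshold) for c in detail])
    return approx
```

With a threshold of zero the signal is reconstructed unchanged, and with a very large threshold only the level-3 approximation survives. The decomposition level therefore controls how much of the signal can be removed, which is why the level is treated as a tuning parameter alongside the wavelet basis.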

Spectral modeling
All spectral analyses were conducted using the Unscrambler 10.3 (Camo, Norway). The partial least squares regression (PLSR) was used to construct the NIRS prediction model of GABA content in GBR. All spectral data were divided into 2 subsets, including calibration set and prediction set. The ratio of them was 4:1. The spectral data and reference data of the calibration set were used to establish the NIRS prediction model.
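For readers without access to the commercial software, the 4:1 split and a single-response PLS regression can be sketched as follows. This is a minimal NIPALS-style PLS1 illustration in plain Python, not the Unscrambler algorithm or its settings. The function names are hypothetical, and sending every fifth sample to the prediction set is an assumption, since the text does not state how the samples were assigned.

```python
import math

def split_4_to_1(items):
    """Send every fifth item to the prediction set (assignment rule is an assumption)."""
    calibration = [v for i, v in enumerate(items) if i % 5 != 4]
    prediction = [v for i, v in enumerate(items) if i % 5 == 4]
    return calibration, prediction

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; X is a list of spectra, y a list of reference values."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                # centred response
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]  # deflate X
        f = [f[i] - q * t[i] for i in range(n)]                            # deflate y
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the response for one new spectrum by repeating the deflation steps."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(ev * wv for ev, wv in zip(e, w))
        y_hat += q * t
        e = [ev - t * pv for ev, pv in zip(e, p)]
    return y_hat
```

The model is fitted on the calibration set only, and the prediction set is passed through `pls1_predict` afterwards, so the prediction statistics are computed on samples the model has not seen.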

Evaluation method of spectral prediction model
The performance of the NIRS prediction model of GABA content in GBR was evaluated by the correlation coefficient for calibration (rc), the correlation coefficient for prediction (rp), the root mean square error of calibration (RMSEC) and the root mean square error of prediction (RMSEP). Higher rc and rp indicate a closer linear relationship between spectral information and chemical content [31]. Lower RMSEC and RMSEP indicate better prediction performance of the model [31]. The Bias value evaluates the size of the systematic error of the model. The ratio of performance to deviation (RPD) is the ratio of the standard deviation of the reference values of the validation set to the RMSEP of the validation set. RPD reflects the comprehensive performance of the model, and a high RPD indicates that the model has high prediction accuracy and stability. An RPD value greater than 8 is generally considered suitable for process control, development and application research; a value of 5 to 8 indicates that the model can be used for quality control; a value of 2.5 to 5 is acceptable for screening samples. When the RPD value is less than 2.5, the model is not reliable [32].
Table 2 shows the influence of different wavelet basis functions on the NIRS modeling and prediction results. As indicated in Table 2, considering the evaluation indicators of both the calibration set and the prediction set, the PLSR calibration model gave its best prediction results for every wavelet basis function when the filter length was fixed at 5. Therefore, the optimal filter length of each wavelet basis function was determined to be 5. Table 2 also shows that the de-noising result of the db wavelet basis function was better than that of the coif and sym wavelet basis functions. These results indicate that the db5 wavelet basis function had the best de-noising result, and the PLSR calibration model established after db5 wavelet de-noising had the best prediction result.
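The evaluation indices used to compare the models (correlation coefficient, root mean square error, Bias and RPD) can be computed from paired reference and predicted values as in the following minimal sketch. The function names are hypothetical. Two conventions are assumptions not stated in the text: Bias is taken as the mean of predicted minus reference values, and the standard deviation in RPD uses the n-1 denominator.

```python
import math

def pearson_r(reference, predicted):
    """Correlation coefficient between reference and predicted values."""
    n = len(reference)
    mr, mp = sum(reference) / n, sum(predicted) / n
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)

def rmse(reference, predicted):
    """Root mean square error (RMSEC on calibration data, RMSEP on prediction data)."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)

def bias(reference, predicted):
    """Mean signed error, predicted minus reference (sign convention is an assumption)."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)

def rpd(reference, predicted):
    """Ratio of the SD of the reference values to the RMSE of the same set."""
    n = len(reference)
    mean = sum(reference) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in reference) / (n - 1))
    return sd / rmse(reference, predicted)
```

Applying the same functions to the calibration set and to the prediction set gives the paired statistics (rc and RMSEC, rp and RMSEP) that are compared across pretreatments.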

Effects of decomposition scale on the de-noising results
The original spectral data of GBR were decomposed by wavelets. The effective information for modeling was mainly concentrated in the low frequency part, while high frequency components generally represented irrelevant information such as spectral noise. With the increase of NIRS decomposition levels, the high frequency components became less and the low frequency part kept more information. When the decomposition levels were too high, some useful information retained would also be removed. The useful information for modeling was reduced, which affected the modeling quality of NIRS [33] .
The db5, coif5 and sym5 wavelet basis functions were used at wavelet decomposition levels of 3, 4, 5 and 6. As shown in Figures 1-3, the spectral signal of GBR contained considerable noise. The de-noising results could be clearly seen when the three wavelet basis functions were used at decomposition levels of 3, 4, 5 and 6. At decomposition level 3, the de-noised signal was smooth and more information useful for spectral modeling was retained. At decomposition levels of 4, 5 and 6, more information useful for spectral modeling may have been filtered out. The de-noising results of the three wavelet basis functions db5, coif5 and sym5 are shown in Figures 1-3. When the db5 wavelet was used as the decomposing wavelet basis function, the de-noised signal was the smoothest. These results showed that wavelet de-noising not only removed part of the noise in the original spectrum, but also retained the useful spectral information in the original spectrum.
Signal-to-noise ratio (SNR) and root mean square error index were selected to evaluate de-noising effect [34]. The first derivative of spectral signal was defined as x(n). The spectra after de-noising were defined as x̂(n). The SNR calculation formula of estimation signals after the wavelet de-noising was defined as follows:

$$\mathrm{SNR} = 20 \lg \left( \frac{\mathrm{norm}(x(n))}{\mathrm{norm}(x(n) - \hat{x}(n))} \right) \qquad (1)$$

where norm(x(n)) is the Euclidean norm of x(n) and n is the signal sequence. Root mean square error (RMSE) between original signal and de-noising signal was defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( x(n) - \hat{x}(n) \right)^{2}} \qquad (2)$$

where N is the length of noise reduction signal and original signal.

Note: R represents reflectivity.
Figure 3 Effects of decomposing scale levels on the de-noising results with wavelet basis function of sym5

The de-noising targets of spectral signals were to remove the noise as well as to retain the original spectral information. The higher signal-to-noise ratio contributed to the lower RMSE between the original signal and the de-noising signal. Therefore, the de-noising signal was closer to the original signal and the de-noising result was better [35]. When using different levels of decomposition, the spectral de-noising results were shown in Table 3. SNR was the highest and RMSE was the lowest when the decompositions of three wavelet functions of db5, coif5 and sym5 were at the wavelet decomposing level of 3. The results showed that the decomposition of GBR original spectrum at the wavelet decomposing level of 3 had the best de-noising result. Under the condition of wavelet basis function of db5 and wavelet decomposing level at 3, SNR and RMSE were 29.636 and 0.00015, respectively. The de-noising result was the best. These results were consistent with the de-noising result in Figures 1-3.
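The two de-noising indices can be computed directly from a reference signal and its de-noised version. The following is a minimal sketch under two assumptions about the formulas: SNR is taken as 20 lg of the ratio between the Euclidean norm of x(n) and the Euclidean norm of the residual x(n) − x̂(n), and RMSE includes the square root implied by its name. The function names are hypothetical.

```python
import math

def euclidean_norm(values):
    """Euclidean norm of a sequence."""
    return math.sqrt(sum(v * v for v in values))

def snr_db(x, x_hat):
    """SNR in dB: 20 * log10(norm(x) / norm(x - x_hat))."""
    residual = [a - b for a, b in zip(x, x_hat)]
    return 20.0 * math.log10(euclidean_norm(x) / euclidean_norm(residual))

def denoise_rmse(x, x_hat):
    """Root mean square error between the reference and the de-noised signal."""
    n = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / n)
```

Both indices depend only on the residual between the two signals, so a smaller residual raises the SNR and lowers the RMSE at the same time. They measure closeness to the input signal rather than the amount of noise removed, which is why they are read together with the modeling results.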

Effects of conventional spectral pretreatment and wavelet analysis on prediction results of PLSR model
The prediction results of the PLSR quantitative calibration models of GABA content in GBR samples are shown in Table 4. The optimal prediction model and the optimal pretreatment method were determined based on the highest rc and rp and the lowest RMSEC and RMSEP values. Higher RPD values indicate better NIRS prediction accuracy and stability. The optimal NIRS prediction model was obtained by de-noising pretreatment with the db5 wavelet basis at decomposition level 3. The rc and RMSEC values of the calibration set were 0.931 and 0.4038 mg/100 g, respectively. The rp and RMSEP values of the prediction set were 0.916 and 0.4329 mg/100 g, respectively. The RPD value for the prediction set was 4.911. The Bias values of the calibration set and the prediction set were 0.006 and 0.010, respectively. These results indicate that the NIRS model established after de-noising pretreatment with the db5 wavelet basis at decomposition level 3 had better prediction accuracy and robustness, and no over-fitting was found. The rc, rp and RPD of this model were higher than those after other pretreatments, and its RMSEC and RMSEP were lower than those after other spectral pretreatments. This may be because the wavelet de-noising pretreatment not only effectively eliminated the noise in the spectral information, but also retained the information useful for spectral modeling. Therefore, the prediction accuracy of the NIRS model established after the wavelet de-noising pretreatment was high. The rc, rp and RPD of the NIRS models established after other wavelet de-noising treatments were lower, while the RMSEC and RMSEP were higher. This was due to the loss of information useful for modeling during wavelet de-noising. Thus, compared with other wavelet de-noising treatments and other spectral pretreatments, wavelet decomposition of the original GBR spectrum with the db5 wavelet basis at decomposition level 3 gave the best de-noising result.
Researchers have focused on the possibility of developing an NIRS method to measure the quality of grain or grain products. Albanell et al. [36] observed that NIRS could accurately predict the binding phenols and anthocyanins in barley flour. Onmankhong et al. [37] established a prediction model of the texture quality of cooked parboiled rice by near-infrared spectroscopy. The models established in that research were fair for prediction application. Jiang et al. [38] found that near-infrared spectroscopy can be used to monitor fatty acid values in rice storage. The scatter relationship between the predicted and actual values of GABA content for the best model (db5-3) is presented in Figure 4. Table 2 and Figure 4 showed that there was a good linear relationship between the GABA content predicted by NIRS and the measured value, and the correlation was significant. The results showed that the method was feasible for the quantitative analysis of GABA content in GBR.

Conclusions
A feasible pretreatment method was applied to the NIRS prediction of the GABA content in GBR. The effects of wavelet de-noising and other spectral pretreatment methods on the modeling accuracy of NIRS were studied. The rc, rp and RPD of the NIRS model established after de-noising pretreatment with the db5 wavelet basis at decomposition level 3 were higher than those after other pretreatments, and its RMSEC and RMSEP were lower. The prediction accuracy of the NIRS model established after the optimized wavelet de-noising was higher than that after other spectral pretreatments. These results show that the PLSR calibration model established after wavelet de-noising can effectively predict the GABA content in GBR. The present results provide a new method for the detection of GABA content in GBR.