Integration of optical and SAR remote sensing images for crop-type mapping based on a novel object-oriented feature selection method

Remote sensing is an important technical means of investigating land resources. Optical imagery has been widely used in crop classification and can show changes in moisture and chlorophyll content in crop leaves, whereas synthetic aperture radar (SAR) imagery is sensitive to changes in growth states and morphological structures. Crop-type mapping with a single type of imagery sometimes has unsatisfactory precision, so providing precise spatiotemporal information on crop type at a local scale for agricultural applications is difficult. To explore the abilities of combining optical and SAR images and to solve the problem of inaccurate spatial information for land parcels, a new method is proposed in this paper to improve crop-type identification accuracy. Multifeatures were derived from the full polarimetric SAR data (GaoFen-3) and a high-resolution optical image (GaoFen-2), and the farmland parcels used as the basic units for object-oriented classification were obtained from the GaoFen-2 image using optimal scale segmentation. A novel feature subset selection method based on within-class aggregation and between-class scatter (WA-BS) is proposed to extract the optimal feature subset. Finally, a crop-type map was produced by a support vector machine (SVM) classifier. The results showed that the proposed method achieved good classification results with an overall accuracy of 89.50%, which is better than the crop classification results derived from SAR-based segmentation. Compared with the ReliefF, mRMR and LeastC feature selection algorithms, the WA-BS algorithm can effectively remove redundant features that are strongly correlated and obtain a high classification accuracy via the obtained optimal feature subset. This study shows that the accuracy of crop-type mapping in an area with multiple cropping patterns can be improved by the combination of optical and SAR remote sensing images.

1 Introduction
Mapping regional and global cropland distribution in a timely manner is a main basis for crop yield estimation, crop growth monitoring and acreage surveys [1,2]. With the ability to quickly and efficiently collect information in real time over wide ranges, remote sensing is a rich data source for crop-type mapping [3]. Multiresolution or multitemporal optical images with abundant spectral and texture information have been widely used in crop identification [4][5][6]. However, data acquisition of optical remote sensing images will inevitably be affected by clouds and rain. Due to the limitation of optical images for discrimination of crops with similar spectra, the accuracy of crop classification with a single optical image may be reduced.
Radar remote sensing mainly acquires images, which are C-band RADARSAT-2 dataset to evaluate five decomposition techniques based on incoherent and coherent decompositions as well as the radar vegetation index (RVI) for crop classification over heterogeneous agricultural areas. Among the incoherent models, Freeman 3-component decomposition was found to perform better in distinguishing vegetation from other land covers. Jiao [20] applied the polarimetric parameters extracted from Cloude-Pottier and Freeman-Durden decompositions to SAR crop mapping, and object-oriented classification results derived from a single date of PolSAR image indicated an overall accuracy of 95% and a Kappa of 0.93, which was a 6% improvement over linear-polarization only classification.
Due to the different characteristics of SAR and optical remote sensing in imaging mechanisms, the combination of radar and multispectral data has become one of the trends in agricultural remote sensing. Several studies verified that higher mapping accuracies can be reached by combining the two data sources [21][22][23]. Steinhausen et al. [24] applied a random forest-based wrapper approach to select the most relevant SAR images for combining with optical images, and the results indicated that combining Sentinel-1 and Sentinel-2 data can improve land cover mapping in cloud-prone regions. De Alban et al. [25] created image stacks from the Landsat and SAR layers and delineated regions of interest to map land cover using a random forest classifier, indicating that the combined use of Landsat and L-band SAR data was superior to individual sensor data. The fusion of optical and SAR data has received increasing attention in numerous studies and has been performed at three levels: the pixel level, the feature level and the decision level. However, due to speckle noise inherent in SAR imagery, pixel-level fusion is inappropriate for SAR data [26]. Hong et al. [27] noted that some methods, such as high-pass filters (HPFs), wavelet transforms and contourlet transforms, can retain texture information, but high-frequency information and spatial details derived from SAR images were lost. Feature-level fusion is usually implemented on features extracted from images. Perushan et al. [28] adopted feature-level image fusion to fuse Sentinel-2 and Landsat 8 images with SAR images and discriminated an alien plant species from its surroundings using a support vector machine (SVM) supervised classification algorithm, but the classification accuracy needs to be improved. Because of the inherent noise in SAR data and the different imaging models of optical and SAR data, fusion methods between two heterogeneous images are difficult to perform.
Speckle noise resulting from the coherent imaging system of SAR affects the image quality and interpretation, reducing the classification accuracy derived from the pixel-based classification methods [29].
Object-oriented classification, as a novel classification approach, can extract features, such as texture, shape and spatial information, after image segmentation, which would suppress noise and improve the classification accuracy [30]. Some statistical models and segmentation methods for SAR data have been studied [31][32][33]. However, due to the high dynamics and local statistical properties of SAR images, there are still limitations in SAR image segmentation. Therefore, a feasible solution, that is, image segmentation based on high-resolution optical imagery, could accurately express the ground information and avoid salt-and-pepper noise in image processing.
To make full use of the advantages of optical and SAR images, a scheme has been developed for crop-type mapping using GaoFen-2 (GF-2) optical images and GaoFen-3 (GF-3) PolSAR images. In total, 58 feature parameters related to crop growth were extracted from optical and SAR images, and object-oriented crop classification was performed with the support of farmland parcels segmented from the high-resolution GF-2 image. Furthermore, a novel feature selection algorithm based on within-class aggregation and between-class scatter, abbreviated WA-BS, is proposed to extract the optimal feature subset. To utilize and evaluate the complementary advantages of the multisensor method, our main objectives are (1) to test the validity of the combination of GF-2 and GF-3 data for crop-type mapping with the support of farmland parcels derived from optical image segmentation and (2) to quickly select the optimal feature subset from remote sensing images and evaluate the contribution of each feature during the crop growth stage. An experiment was carried out in an agricultural district within Xinjiang Uygur Autonomous Region, China, to examine the feasibility of the proposed method.

Study area and datasets
2.1 Study area
The study area is located in an agricultural district stretching over Bohu and Yanqi counties of Xinjiang Uygur Autonomous Region that faces Bosten Lake to the east, which is the largest inland freshwater lake in China. This area is one of the most productive agricultural regions in Xinjiang Uygur Autonomous Region, mainly for food crops (maize and winter wheat) and economic crops (tomatoes, beets, cotton and fruit).
The geographic extent of the study area is shown in Figure 1. This region has a dry, temperate continental climate. The annual precipitation is approximately 79.2 mm, and the annual evaporation ranges from 1800 to 2000 mm. Due to the multiple crop types and a unique crop calendar, this area is ideal to examine the potential for the combined use of optical and SAR remote sensing images for mapping crops.

Remote sensing data and preprocessing
As one of the new generation satellite programs developed by China, the GF-2 satellite has potential capabilities in many applications [34][35][36].
The GF-2 satellite carries two 1-m panchromatic and 4-m multispectral cameras, with four multispectral bands ranging from 450 to 890 nm. In this study, image data from 25 June 2018 were downloaded from the website (http://www.cresda.com/CN/sjfw/zxsj/index.shtml) provided by the China Centre for Resources Satellite Data and Application (CRESDA), with a cloud coverage of less than 3%. The preprocessing of the GF-2 data was performed in ENVI 5.3, including radiometric calibration, atmospheric correction and geometric correction. Radiometric calibration was performed by using the absolute radiometric calibration coefficients of the GF-2 data provided by CRESDA. Atmospheric correction was carried out using the second simulation of the satellite signal in the solar spectrum (6S) model [37]. Geometric correction was based on a WorldView-2 image with accurate spatial coordinate information. A quadratic polynomial correction algorithm was applied using ground control points (GCPs), and the error was controlled within 0.5 pixels. In the 1-m fused GF-2 image, farmland boundaries were clearly represented, which supported the extraction of the farmland parcels.
The GF-3 satellite, the first Chinese C-band civil radar satellite, has 12 imaging modes with a fine spatial resolution up to 1 m and was launched on 10 August 2016. The GF-3 PolSAR data of quad-polarization strip I (QPSI) mode with a spatial resolution of 8 m were downloaded from the CRESDA website. According to field investigations, the crop types in the study area were mainly wheat, cotton, maize, beets and tomatoes. Considering the phenological characteristics of the crops (Figure 2), GF-3 data from 6 July 2018 were utilized to classify the crop types. At this time, the crop growth characteristics were significantly different. The SAR data were preprocessed in Pixel Information Expert (PIE), including radiometric calibration, multi-look processing, refined Lee filtering (window size of 5×5) [38] and geometric correction. GCPs were used to coregister the GF-3 image and the preprocessed GF-2 image until the error was less than 0.5 pixels.
Figure 2 Crop calendar for the study site

Field data collection
The sample data were collected in mid-July 2018 during fieldwork. With a land use map of the study area and a Google Earth image from this period as reference maps, the random clustered sampling approach [39] was used to obtain the samples. The sampling method first randomly selected certain groups that were non-crossing and non-repeating in the study area, then randomly selected several clusters from the groups and investigated all the individuals or units in these clusters. Then, a simple random sampling method was used to select a certain class in the cluster as the sample to be collected. Each cluster was a geographical area, which was used as an investigation unit. A Trimble Pro XRT GPS receiver with a positioning accuracy of 0.2 m was used to mark the geographical coordinates (latitude and longitude) of these samples, and geotagged photos were taken at each site to record the sample types. In addition, the field data also contained other land cover information, including water bodies, residential areas, woodlands, wetlands and bare land. To train the classification model and investigate the crop classification accuracy, we randomly divided the crop samples into 3224 training samples and 1470 validation samples, as shown in Table 1.
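The random division into training and validation samples can be illustrated with a minimal sketch. The function name `split_samples`, the fixed seed and the use of integer sample identifiers are assumptions made for illustration only; the study does not state how the split was implemented or whether it was stratified by crop type.

```python
import random


def split_samples(samples, n_train, seed=0):
    """Randomly split samples into a training list and a validation list."""
    shuffled = list(samples)
    # A dedicated Random instance keeps the split reproducible (assumed seed).
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample identifiers; only the two counts come from the text.
sample_ids = range(3224 + 1470)
train, valid = split_samples(sample_ids, n_train=3224)
print(len(train), len(valid))
```

A stratified variant would apply the same function separately to the samples of each crop type.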

Methodology
The four-step procedure for crop-type mapping with the combined use of optical and SAR remote sensing images is outlined in Figure 3: (1) acquire the map layer of the farmland parcels at the optimal segmentation scale, which is used to classify crop types at the parcel level; (2) extract multiple polarimetric features from the SAR image and the texture features and vegetation indices from the optical image; (3) select the optimal feature subset from the high-dimensional features through the WA-BS algorithm proposed in this paper; and (4) use the support vector machine one-versus-rest (SVM-OVR) algorithm to perform object-oriented classification with the support of the farmland parcels segmented from the high-resolution optical image.
Figure 3 Overview of the methodology for mapping crops with the combined use of optical and SAR remote sensing images

Farmland parcel extraction from an optical image
Object-oriented analysis relies on two key technologies: image segmentation and object-oriented image classification. Image segmentation greatly affects the quality of subsequent processing. Multiscale segmentation is a bottom-up segmentation method that combines similar adjacent pixels to form image objects by identifying the similarities between pixels. The segmentation effect depends on the parameter settings, including the smoothness, compactness, color, shape and scale. Because of speckle noise and the lack of effective feature expression, multiscale segmentation of SAR images cannot be performed effectively. Therefore, in this study, we performed multiscale segmentation on the GF-2 image. Typical samples of six types of ground objects distributed in the study area were selected for the supervised classification based on an SVM to obtain the distribution of ground objects: water bodies, residential areas, woodlands, wetlands, bare land and farmland. Then, the extracted nonfarming land was masked to obtain the farmland range.
In the process of image segmentation, the segmentation scale greatly affects the spatial structure of the objects. Therefore, it is important to set the optimal scale so that homogeneous pixels with similar spectral characteristics, brightness and texture are grouped into the same object. In this study, we used the local variance method [40,41] to select the optimal scale of farmland and resegmented the farmland layer. The local variance was calculated within a moving window of n×n pixels, and then the average local variance (ALV) map was generated, as shown in Equation (1). The range for the ALV curve was determined by the change rate in the ALV with the segmentation scale. In practice, an extreme point or critical point of the local variance cannot be found mathematically.
Therefore, the threshold of the change rate in the ALV should be set according to the experiment. As the segmentation scale increases, the first scale at which the change rate falls below the threshold is taken as the optimal segmentation scale of the farmland parcels. In Equation (1), M and N represent the number of rows and columns of the image, respectively; i and j are the row and column numbers in the local window, respectively; LV(i, j) is the local variance of the pixel in the i-th row and j-th column; g(i, j) is the gray value of the pixel; and ḡ is the mean of the gray values within the local window.
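The local variance computation can be sketched as follows. This is an illustrative reading of the symbol definitions above, not the exact form of Equation (1): it assumes the population variance (division by n×n) inside each window and averages only over pixels whose window fits entirely inside the image. The function names and the toy images are hypothetical.

```python
def local_variance(img, r, c, n):
    """Variance of the gray values in the n x n window centred on (r, c)."""
    h = n // 2
    window = [img[i][j]
              for i in range(r - h, r + h + 1)
              for j in range(c - h, c + h + 1)]
    mean = sum(window) / len(window)
    return sum((g - mean) ** 2 for g in window) / len(window)


def average_local_variance(img, n=3):
    """Mean local variance over all pixels whose window fits in the image."""
    h = n // 2
    rows, cols = len(img), len(img[0])
    values = [local_variance(img, r, c, n)
              for r in range(h, rows - h)
              for c in range(h, cols - h)]
    return sum(values) / len(values)


# Toy images: a uniform patch and a checkerboard (hypothetical data).
flat = [[5] * 5 for _ in range(5)]
checker = [[(i + j) % 2 for j in range(5)] for i in range(5)]
print(average_local_variance(flat), average_local_variance(checker))
```

A uniform patch yields an ALV of zero, whereas a highly textured patch yields a larger value, which is why the ALV rises as objects begin to mix different ground covers.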

Feature extraction from multisource images
3.2.1 SAR image feature extraction
Features derived from polarimetric decomposition and the radar indices have physical meanings and are sensitive to crop growth stages [42]. Polarimetric decomposition is helpful for revealing the physical mechanism of scatterers by using a polarimetric matrix, such as a covariance matrix, a coherency matrix or a scattering matrix. A number of polarimetric decomposition theories have been proposed, including coherent target decomposition and incoherent target decomposition. The various parameters derived from SAR polarimetric decomposition reflect the scattering difference between the target and the background from different angles, which is helpful for accurately identifying crop targets via the joint use of these parameters. In this study, we extracted three types of features from the SAR imagery: (1) features based on the original PolSAR data (i.e., the Sinclair scattering matrix, coherency matrix and covariance matrix were calculated for the full-polarized image to extract the original matrix parameters); (2) features based on different polarimetric decomposition theories, including Cloude-d [43], Freeman-d [44], VanZyl-d [45], Krogager-d [46], Pauli decomposition [47] (Pauli-d), Barnes-d [48], Holm-d [49], Yamaguchi-d [50], Huynen-d [51], Neumann-d [52] and H/A/Alpha-d [53]; and (3) features including the total scattering power SPAN of the Sinclair scattering matrix, the radar vegetation index (RVI), and the pedestal height (PH). In total, 50 parameters were extracted, as shown in Table 2.
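Three of the simpler SAR features can be written down directly from a single Sinclair scattering matrix. The sketch below uses the commonly cited definitions of SPAN, the RVI and the Pauli components, assuming reciprocity (S_HV = S_VH). The function names and input values are hypothetical, and the software used in the study may compute these quantities from multi-looked matrices instead.

```python
import math


def span(s_hh, s_hv, s_vv):
    """Total scattering power, assuming reciprocity (S_HV = S_VH)."""
    return abs(s_hh) ** 2 + 2 * abs(s_hv) ** 2 + abs(s_vv) ** 2


def rvi(s_hh, s_hv, s_vv):
    """Radar vegetation index: 8*HV / (HH + VV + 2*HV), using powers."""
    p_hh, p_hv, p_vv = abs(s_hh) ** 2, abs(s_hv) ** 2, abs(s_vv) ** 2
    return 8 * p_hv / (p_hh + p_vv + 2 * p_hv)


def pauli_powers(s_hh, s_hv, s_vv):
    """Powers of the three Pauli components (odd, double, volume-like)."""
    k1 = (s_hh + s_vv) / math.sqrt(2)
    k2 = (s_hh - s_vv) / math.sqrt(2)
    k3 = math.sqrt(2) * s_hv
    return abs(k1) ** 2, abs(k2) ** 2, abs(k3) ** 2


# Hypothetical complex scattering coefficients for one pixel.
s = (1 + 1j, 0.5j, 0.5 - 0.2j)
print(span(*s), rvi(*s), pauli_powers(*s))
```

The three Pauli powers sum to SPAN, which gives a quick consistency check on any implementation.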

Optical image feature extraction
The texture features derived from optical images are helpful for distinguishing different objects compared with classification with pure spectral features only [54]. The commonly used texture feature extraction methods include a Gabor wavelet transform, the gray-level co-occurrence matrix (GLCM) and a Markov random field [55][56][57]. The GLCM is an effective method to describe the spatial correlation of the pixel grayscale and to count the occurrence frequency of the gray level of two pixels in a certain spatial relationship. Four directions (0°, 45°, 90° and 135°) were selected in this experiment, considering that the GLCM is related to each direction. The eigenvalues of the subwindows from the four directions were calculated and then averaged to reduce the influence of direction on the feature parameters. After comparing the effect of texture feature extraction under different settings, this experiment finally chose a window of 5×5 pixels, a moving step of (1, 1) and a 64-level grayscale. Six texture feature statistics from the GF-2 optical image were extracted using MATLAB R2016a, including the homogeneity (HOM), energy (ASM), contrast (CON), variance (VAR), mean (MEAN) and correlation (COR), which are numbered from 51 to 56. In addition, the normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) are important indicators for characterizing vegetation growth and are closely related to the vegetation coverage, biomass and the leaf area index. Therefore, the mean NDVI and EVI values within each parcel were calculated as the features used in crop classification, and the NDVI and EVI are numbered as 57 and 58.
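The direction-averaged GLCM statistics can be sketched for a single window as follows. This is a simplified illustration rather than the MATLAB procedure used in the study: it assumes a symmetric, normalized GLCM with a pixel distance of 1, uses common textbook definitions of CON, HOM and ASM, and omits gray-level quantization and the remaining three statistics. All names are hypothetical.

```python
from collections import Counter

# Pixel offsets (row, column) for the four directions at distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(window, offset):
    """Symmetric, normalized co-occurrence probabilities for one offset."""
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = window[r][c], window[r2][c2]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # count both orders for symmetry
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_stats(p):
    """Contrast, homogeneity and angular second moment of one GLCM."""
    con = sum(v * (a - b) ** 2 for (a, b), v in p.items())
    hom = sum(v / (1 + (a - b) ** 2) for (a, b), v in p.items())
    asm = sum(v * v for v in p.values())
    return {"CON": con, "HOM": hom, "ASM": asm}


def mean_texture(window):
    """Average each statistic over the four directions."""
    per_dir = [texture_stats(glcm(window, off)) for off in OFFSETS.values()]
    return {k: sum(d[k] for d in per_dir) / len(per_dir) for k in per_dir[0]}


# Hypothetical 5 x 5 window with two gray levels in a checkerboard pattern.
window = [[(i + j) % 2 for j in range(5)] for i in range(5)]
print(mean_texture(window))
```

Averaging over directions matters for patterned windows such as the checkerboard, where the horizontal and diagonal GLCMs differ strongly.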

Optimal feature subset selection
Selecting the optimal feature subset is essential for quickly finding effective or necessary information in a large feature set. A high-dimensional feature set is hard to process and can lower the classification accuracy because of the redundancy and relatedness of the features. Existing methods of feature subset selection (FSS) are of two types: filters and wrappers. Wrapper models need to utilize a classifier to assess the feature subsets and always have a high computational complexity. Filter models, such as ReliefF [58], mRMR [59] and CFS4 [60], are independent of the classifiers. However, some FSS methods still have limitations since the individually optimal features are not necessarily the optimal combination of these features as a whole. Therefore, a novel feature selection method based on within-class aggregation and between-class scatter, namely, WA-BS, is developed in this study. The aim of the WA-BS algorithm is to choose a compact set of features that is important and sufficient to represent the characteristics of objects. The WA-BS algorithm generally includes two steps: (1) discarding features that are weak in terms of distinguishing target objects to obtain the candidate features and (2) removing redundant features from the candidate features that are strongly correlated. First, the WA-BS functions were defined, and the two functions of each type of crop for a certain feature were calculated. Then, the evaluation criterion function of the feature set was constructed to measure the ability of a feature to distinguish different types of objects. Based on this criterion, the Monte Carlo sampling method was used to select the candidate feature subset from the original feature set. Second, the correlation coefficient matrix was built, and the final optimal feature subset was obtained by setting a reasonable threshold to remove highly correlated features. Through the above steps, the optimal feature set can be selected from high-dimensional features, improving the recognition efficiency.

The extraction of candidate features
Each crop type can be considered an independent class. Therefore, the quantitative criteria for optimal feature selection can be established by calculating the within-class aggregation and the between-class scatter of various types of crops in a certain feature. When the samples of the same class are tightly aggregated (within-class aggregation C_nii) and the between-class scatter D_nij is large, the feature of interest can distinguish these ground objects.
Definition 1: the within-class aggregation C_nii of the crop class i that corresponds to the n-th dimension feature is: (2)
where n is the dimension of the feature vector, E(X_n^i) is the expected value of X_n^i, and ||·||_2 denotes the 2-norm of a vector.
Definition 2: the between-class scatter D_nij of crop class i and crop class j that corresponds to the n-th dimension feature is: (3)
If there are S types (S≥2) of crops to be identified, the complexity of the identification should be considered, so lower dimensions of the selected feature subset are better. Furthermore, the samples in the same class should be as compact as possible, and samples of different classes should be dispersed; that is, larger discriminant functions G_nij are better.
According to the discriminant function in Definition 3, the criterion function f_n for evaluating the quality of the feature set can be expressed as Equation (5). According to this formula, a larger between-class scatter of various ground object types produces a larger criterion function, while a smaller within-class aggregation produces a larger criterion function. Therefore, the larger the value of the criterion function is, the better the ability of the feature to distinguish ground object types.
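The behavior described above, in which a feature scores higher when classes are far apart and internally compact, can be illustrated with a small sketch. The code does not implement Equations (2)-(5). It assumes, purely for illustration, that the within-class term is the mean absolute deviation from the class mean, that the between-class term is the distance between class means, and that the criterion averages the ratio D_ij / (C_ii + C_jj) over all class pairs. Every function name and data value is hypothetical.

```python
from itertools import combinations
from statistics import mean


def within_class(values):
    """Assumed within-class term: mean distance of samples to the class mean."""
    mu = mean(values)
    return mean(abs(v - mu) for v in values)


def between_class(values_i, values_j):
    """Assumed between-class term: distance between the two class means."""
    return abs(mean(values_i) - mean(values_j))


def criterion(samples_by_class, eps=1e-12):
    """Assumed criterion: average over class pairs of D_ij / (C_ii + C_jj)."""
    scores = []
    for ci, cj in combinations(sorted(samples_by_class), 2):
        xi, xj = samples_by_class[ci], samples_by_class[cj]
        d = between_class(xi, xj)
        c = within_class(xi) + within_class(xj)
        scores.append(d / (c + eps))  # eps guards against zero dispersion
    return mean(scores)


# Hypothetical values of one feature for two crop classes.
separable = {"wheat": [1.0, 1.1, 0.9], "cotton": [5.0, 5.1, 4.9]}
overlapping = {"wheat": [1.0, 5.0, 3.0], "cotton": [2.0, 4.0, 3.5]}
print(criterion(separable), criterion(overlapping))
```

Under these assumptions, the separable feature receives a much larger score than the overlapping one, which matches the qualitative statement in the text.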
Based on the criterion function f_n, the Monte Carlo random sampling algorithm was used to select the candidate feature subset to reduce the feature dimensionality and to improve the classification efficiency. The processing chain is shown in Table 3.
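The random numbers in Table 3 come from a linear congruential generator (LCG). A minimal sketch of the generator and of mapping its output to sample indices is given below. The constants a = 1664525, c = 1013904223 and M = 2^32 are one widely used parameter set and are an assumption, since the study does not report its values; the function names are hypothetical.

```python
def lcg(seed, a=1664525, c=1013904223, m=2 ** 32):
    """Yield r_u = v_u / M with v_u = (a * v_{u-1} + c) mod M."""
    v = seed
    while True:
        v = (a * v + c) % m
        yield v / m


def pick_indices(l, h, seed=1):
    """Map H uniform numbers in [0, 1) to sample indices floor(l * r_u)."""
    gen = lcg(seed)
    # int() truncates toward zero, which equals floor for non-negative values.
    return [int(l * next(gen)) for _ in range(h)]


# Hypothetical call: draw 10 indices from 3224 training samples.
print(pick_indices(l=3224, h=10))
```

Indices drawn this way can repeat, so the selection behaves like sampling with replacement unless duplicates are filtered out.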

Removing redundant features that are strongly correlated
Because of the possible correlation between some candidate features, the redundant features that are strongly correlated also need to be removed. A correlation coefficient matrix can be built for multiple vectors since the correlation between two variables can be expressed by the correlation coefficient. Therefore, feature parameters with similar classification capabilities can be deleted by analyzing the correlation between these parameters. First, a feature parameter matrix was constructed according to the feature subset that was extracted above. All the features were sorted in ascending order according to f_n.
Second, the correlation coefficient between every two columns in the feature parameter matrix was calculated using a nonparametric correlation coefficient, and the correlation coefficient matrix P, with elements r_ij, was obtained (Equation (6)). Because r_ij = r_ji, P is a symmetric matrix, so only the lower triangular part of P needs to be analyzed. According to the relationship between the element value r_ij in the correlation coefficient matrix and the threshold T, the features X_i and X_j are strongly correlated if r_ij > T. Thus, the i-th (or j-th) dimension feature in the feature parameter matrix should be removed to obtain the final optimal feature subset.

Table 3
Input
(1) …, where l is the total number of samples;
(2) Types of ground objects X = {x_1, x_2, …, x_S}, where S is the number of ground objects and x_i is the sample set of the i-th class of ground objects;
(3) Number of training samples of each class {M_k}, k = 1, 2, …, N;
(4) Number of features q;
(5) Threshold tv;
(6) Maximum number of iterations mi.

Processing chain
(1) Initialize n = 1 and select the n-th feature;
(2) Use (1)-(4) to calculate f_n. If f_n < tv, delete the n-th feature and reduce the dimension of the feature space, d = d - 1. Otherwise, go to step (3);
(3) Use the linear congruential generator (LCG) v_u = (a·v_{u-1} + c) mod M, r_u = v_u/M (where a is the multiplier, 0 < a < M; c is the increment, 0 ≤ c < M; M is the modulus, M > 0; v_0 is the initial value, 0 ≤ v_0 < M; and r_u is a random number) to obtain H random numbers. Select X′ = {x_e, x_f, …, x_z} from the set X by using these random numbers, where x_f = floor(l × r_u)• H and floor() is the floor function;
(4) Randomly rearrange the selected set X′ to obtain X″;
(5) Combine the rest of set X with set X″ to obtain a new set Y;
(6) Use (1)-(3) to calculate f_n′;
(7) If f_n′ > f_n, the new set Y is more suitable for sample classification than the original set X, so num = num + 1. Otherwise, the original set X is more suitable for sample classification than the new set Y, and num remains the same.

Output
Candidate feature set
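The redundancy-removal step that follows candidate selection can be sketched as a greedy filter over pairwise rank correlations. The text specifies only a "nonparametric correlation coefficient"; the sketch assumes Spearman's rank correlation, compares its absolute value with the threshold T, and keeps the member of a correlated pair that has the larger criterion value f_n. These choices, the function names and the data are all assumptions.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def prune(features, scores, threshold):
    """Visit features by descending score; drop any feature correlated
    above the threshold with a feature that has already been kept."""
    kept = []
    for name in sorted(features, key=scores.get, reverse=True):
        if all(abs(spearman(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical parcel-level values for three candidate features.
feats = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 11], "c": [3, 1, 4, 1, 5]}
f_n = {"a": 3.0, "b": 2.0, "c": 1.0}
print(prune(feats, f_n, threshold=0.8))
```

Here feature "b" is a monotonic function of "a" and is dropped, while the weakly correlated feature "c" is retained.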

Object-oriented classification by combining optical and SAR remote sensing images
Support vector machines (SVMs), a universal learning method based on statistical learning theory, have an excellent learning performance and a good generalization ability when solving small-sample, nonlinear and high-dimensional problems. The SVM algorithm was originally used to solve the problem of two-class classification. When the training sample set originates from m (m>2) classes, the task becomes a multiclass classification problem. At present, many algorithms have extended SVMs to multiclass classification problems, and these algorithms are collectively called multi-category support vector machines (M-SVMs). The one-versus-rest (OVR) method is widely used [61], so this method was utilized to solve the problem of multiclass crop classification in this study. The parameter settings of the SVM model include the type and parameters of the kernel function. Roli et al. [62] showed that the classification accuracy was generally higher when using a radial basis function (RBF) kernel than when using a polynomial kernel or sigmoid kernel, while using a linear kernel yielded the lowest precision. Therefore, the RBF was chosen as the kernel function. With the farmland parcels segmented from the optical image as the basic units for classification, the mean values of the multiple features within each farmland parcel were calculated. Because the ranges of the extracted feature parameters were different, the feature parameters were normalized, and then the SVM-OVR classifier was used for classification processing.
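The two preparation steps before classification, averaging features within each parcel and normalizing them, can be sketched as follows. The text does not state which normalization was used; min-max scaling to [0, 1] is assumed here, and the function names and data are hypothetical. The resulting table would then be passed to an RBF-kernel SVM in a one-versus-rest configuration, which is not shown.

```python
from collections import defaultdict


def parcel_means(parcel_ids, feature_rows):
    """Average the per-pixel feature vectors within each parcel."""
    sums, counts = {}, defaultdict(int)
    for pid, row in zip(parcel_ids, feature_rows):
        if pid not in sums:
            sums[pid] = [0.0] * len(row)
        for k, value in enumerate(row):
            sums[pid][k] += value
        counts[pid] += 1
    return {pid: [s / counts[pid] for s in sums[pid]] for pid in sums}


def min_max_normalize(table):
    """Rescale every feature column to [0, 1] (assumed normalization)."""
    ids = list(table)
    n_feat = len(table[ids[0]])
    lo = [min(table[i][k] for i in ids) for k in range(n_feat)]
    hi = [max(table[i][k] for i in ids) for k in range(n_feat)]
    return {
        i: [(table[i][k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
            for k in range(n_feat)]
        for i in ids
    }


# Hypothetical data: five pixels in three parcels, two features per pixel.
ids = [1, 1, 2, 2, 3]
rows = [[1, 10], [3, 30], [5, 50], [7, 70], [9, 20]]
print(min_max_normalize(parcel_means(ids, rows)))
```

Scaling matters for an RBF kernel because the kernel depends on Euclidean distances, which would otherwise be dominated by features with large numeric ranges.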

Optical-based segmentation of farmland
According to the method in Section 3.1, a mask of the nonfarming areas was built to obtain the range of farmland by using eCognition Developer version 9.0.2. Then, the farmland was resegmented; the segmentation scale was varied from 10 (very fragmented) to 80 (very coarse) in steps of 5 and was continuously adjusted to determine the optimal scale. The shape and compactness parameters were set to 0.3 and 0.6, respectively. These settings emphasized the spectral information, which is closely connected to crop growth, thereby achieving a better segmentation effect for farmland. The ALV and the corresponding change rate were calculated at different scales, and the change rate threshold of the ALV was set according to the experimental results. After testing, the first point at which the change rate fell below 0.2 was used to determine the optimal scale. To compare the trends, the change rate of the ALV was multiplied by a factor of 10 and is shown alongside the ALV in Figure 4. An upward trend was observed in the ALV curve with increasing scale. The fitted curve of the corresponding change rate indicated that the change rate was less than the specified threshold for the first time at a scale of 64. Therefore, the preliminary optimal scale for farmland was 64.
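The threshold rule for the optimal scale can be sketched as a scan over the tested scales. The definition of the change rate is an assumption (the ALV difference divided by the scale difference between consecutive scales), the ALV values are invented for illustration, and the sketch works on the tested scales only, whereas the study read the crossing point from a fitted curve.

```python
def optimal_scale(scales, alv, threshold):
    """Return the first scale whose ALV change rate drops below threshold."""
    for k in range(1, len(scales)):
        # Assumed change rate: slope of the ALV between consecutive scales.
        rate = (alv[k] - alv[k - 1]) / (scales[k] - scales[k - 1])
        if rate < threshold:
            return scales[k]
    return None  # the curve never flattened within the tested range


# Hypothetical ALV values that rise and then level off.
scales = [10, 15, 20, 25, 30, 35, 40]
alv = [0.0, 5.0, 9.0, 12.0, 14.0, 15.5, 16.2]
print(optimal_scale(scales, alv, threshold=0.2))
```

Interpolating or fitting the change-rate curve, as done in the study, allows the selected scale to fall between the tested values.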
Figure 4 ALV and the corresponding change rate at different segmentation scales
To evaluate the optical-based segmentation of farmland, the segmentation results at different scales were analyzed to determine the optimal segmentation scale. Referring to the land cover samples from the GF-2 images of the same period, the overlap method [63] was used to calculate the accuracy of the segmentation results. The accuracy evaluation at different scales is shown in Table 4. The extraction accuracy of the farmland parcels initially increased and then decreased with increasing scale, while the number of parcels gradually decreased. The accuracy was the highest (0.925) at a scale of 65, and the number of parcels was relatively small. This result differed little from the predicted optimal scale of 64 and therefore satisfied the optimal segmentation requirement. Through the above process, the complete boundary of the farmland parcels was acquired (Figure 5).
The WA-BS algorithm was used to determine the optimal feature subset for crop identification from the 58 extracted features through the feature set evaluation criteria.
The dimensions of the optimal feature subset and the runtime of the algorithm were controlled by setting the maximum number of iterations mi.mi was set to values between 0 and 5000 and by gradually adjusting by intervals of 500.Then, the dimensional variation in the optimal feature subset with different iteration numbers was obtained (Figure 6).The dimension of the optimal feature subset converged to 18 and remained constant when mi≥3000.Therefore, setting mi to 3000 could yield an optimal feature matrix for the identification of the 5 crop types.Because the optimal feature matrix is a symmetric matrix, only the lower triangular matrix is shown to express the optimal feature subset for crop identification (Table 5).After analyzing the optimal feature matrix, only 18 of the 58 features were involved in the classification of the five crop types, thus greatly reducing the information redundancy and calculations.The feature parameters derived from different polarimetric decompositions greatly contributed to the classification results, mainly originating from Freeman-Vol, Freeman-Dbl, VanZyl-Odd from VanZyl decomposition, two eigenvectors from Cloude decomposition, Yamaguchi-Vol, Yamaguchi-Hlx from Yamaguchi decomposition, Neumann-dela-mod from Neumann decomposition, and A and H from H/A/Alpha decomposition.The scattering properties of the target crops are uncertain and temporally vary, so the scattering echo of each pixel mostly consists of the scattering information of multiple scatterers.Therefore, incoherent target decomposition played a positive role in the classification results.
The polarimetric physical parameters derived from incoherent target decomposition reduced the influence of speckle noise and thereby explained the scattering process of the target more accurately, which was beneficial for distinguishing different land cover types. NDVI and texture information such as CON, COR and ASM derived from the optical image also played an important role in crop identification, indicating the effectiveness of using spectral and texture information as classification features. Although filtering was performed before polarimetric decomposition, the results still retained noise information. Therefore, some of the extracted polarimetric feature components may have affected the classification results and even reduced the classification accuracy. In this case, this information was excluded by feature selection.
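The iteration sweep described above (mi from 0 to 5000 in steps of 500) can be summarized by finding the smallest setting from which the subset dimension no longer changes. The sketch below is only an illustration: the function name `first_stable_setting` is hypothetical, and the trace values are synthetic placeholders rather than the values plotted in Figure 6; only the plateau at 18 features for mi ≥ 3000 is taken from the text.

```python
def first_stable_setting(dims_by_mi):
    """Return (mi, dimension) for the smallest mi from which the
    subset dimension stays equal to its final value.

    dims_by_mi maps each tested maximum-iteration value mi to the
    dimension of the feature subset obtained with that setting.
    """
    settings = sorted(dims_by_mi)
    final_dim = dims_by_mi[settings[-1]]
    stable_from = settings[-1]
    # Walk backwards while the dimension still equals the final value.
    for mi in reversed(settings):
        if dims_by_mi[mi] != final_dim:
            break
        stable_from = mi
    return stable_from, final_dim


# Sweep grid from the text: mi = 0, 500, ..., 5000.
sweep = list(range(0, 5001, 500))

# Synthetic placeholder trace (NOT the values of Figure 6); only the
# plateau at 18 features for mi >= 3000 comes from the text.
placeholder_dims = {mi: (18 if mi >= 3000 else 58 - mi // 100) for mi in sweep}

print(first_stable_setting(placeholder_dims))
```

With a real trace, the returned setting is the cheapest mi that already yields the converged subset, which is the reasoning behind choosing mi = 3000.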

Optimal feature subset for crop identification
In this section, the optimal features derived from the optical and SAR images during the growing period of various crops were analyzed using boxplots. Figure 7 shows the boxplots of the 18 features in the optimal feature subset, with the 0.5th, 25th, 50th (median), 75th and 99.5th percentiles shown.
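As an illustration of the five percentile levels used in these boxplots, the following minimal sketch computes them for one feature of one crop. It assumes linear interpolation between order statistics, which is one common convention; the source does not state which percentile definition was used, and the function names are hypothetical.

```python
def percentile(values, q):
    """Percentile q (0-100) by linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def boxplot_summary(values):
    """Return the five levels drawn in the boxplots, keyed by percentile."""
    levels = (0.5, 25, 50, 75, 99.5)
    return {q: percentile(values, q) for q in levels}


# Toy input: the integers 0..200 stand in for per-parcel feature values.
print(boxplot_summary(list(range(201))))
```

Using the 0.5th and 99.5th percentiles as whisker ends, rather than the minimum and maximum, keeps a few extreme parcels from stretching the plot.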
As shown in Figure 7, the volume and double-bounce scattering components from Freeman decomposition played a more important role in crop classification than the surface scattering component, and the ability of the volume scattering component to distinguish crop types was greater than that of the double-bounce scattering component. This may be because Freeman decomposition models volume scattering more accurately, and volume scattering is influenced by the actual scattering mechanisms of crops. Analysis of the volume scattering feature of the various crops from Freeman decomposition indicated the following: (1) The volume scattering component of cotton during this period was clearly dominant. This period was the growing stage of cotton, and canopy scattering was the main backscattering component. The incident waves were scattered many times, so the volume scattering increased. (2) Because of the uniform growth of wheat in the ripening stage, the feature value for volume scattering was the smallest among all the crops, and the distribution of the values was relatively dispersed. In contrast, the value for double-bounce scattering was large. (3) The feature value for volume scattering of tomatoes was relatively large, and its distribution was the most concentrated. The height of beet plants was relatively low, so the reflection of radar waves mainly involved surface reflection. The branches and the ground formed a pair of orthogonal surfaces, which only sometimes caused double-bounce scattering. Therefore, beets had a small feature value for double-bounce scattering.
Cloude theory [43] decomposes the polarimetric coherency matrix into a weighted sum of three components.
Figure 7 Boxplots of the optimal feature subset. (a-r) are the optimal features extracted from the 58 features.

The scattering entropy H describes the degree of randomness of a scattering medium, ranging from isotropic scattering (H = 0) to completely random scattering (H = 1). The larger the value of H is, the higher the degree of randomness. The scattering entropy of the various crops showed that the scattering type of maize during this period was complicated. Here, maize was in the early harvest stage and the maize canopy was dense, resulting in a high scattering entropy. In contrast, beets were in the middle of the growing period, and the row spacing of beets was relatively large at seeding, so the area of bare soil was large and the surface structure was simple. In this case, the target contained relatively simple scatterers, and the polarimetric entropy was relatively low. The anti-entropy A, a supplementary parameter of the scattering entropy H, is also in the optimal feature set for crop identification. The energy parameter SPAN contains the intensity information of the scattering mechanism, so introducing SPAN into the classification helped to distinguish ground objects with the same scattering mechanism but different scattering intensities. The RVI can reflect the vegetation features and volume scattering information, which exhibited certain differences because the density of the crop cover during this period differed among crops. The density of wheat at harvest was the highest, so the incident waves underwent sufficient random scattering in the medium, resulting in the depolarization of the scattering echo. The planting of beets was sparse, and the vegetation structure was relatively simple, so random scattering rarely occurred when the incident waves interacted with the vegetation. Most of the incident waves escaped directly after single- or double-bounce scattering, and the depolarization effect was low. In addition, DERD can be used to compare the importance of different scattering mechanisms, and it played an auxiliary role in crop classification. The SE is defined based on the intensity and polarization; the intensity contribution is related to the backscattering energy, and the polarization contribution is related to the polarizability. In the SE boxplot, the value for maize was the largest during this period, and the SEs of tomatoes, wheat, cotton and beets were lower, indirectly reflecting differences in the backscattering energy of the various crops.
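For readers who want to see how H, A and SPAN relate to each other, the following minimal sketch computes them from the three eigenvalues of a coherency matrix using the standard H/A/Alpha definitions: H is the base-3 entropy of the normalized eigenvalues, A is the normalized difference of the two smaller eigenvalues, and SPAN is their sum. The function name and the toy eigenvalues are assumptions for illustration and are not taken from the GF-3 processing chain.

```python
import math


def h_a_span(eigenvalues):
    """Entropy H, anisotropy A and SPAN from three non-negative eigenvalues."""
    lam = sorted(eigenvalues, reverse=True)
    span = sum(lam)
    if len(lam) != 3 or lam[2] < 0 or span <= 0:
        raise ValueError("expected three non-negative eigenvalues, not all zero")
    # Pseudo-probabilities of the three scattering mechanisms.
    probs = [v / span for v in lam]
    # Base-3 logarithm so that H lies in [0, 1]; zero terms contribute nothing.
    entropy = -sum(p * math.log(p, 3) for p in probs if p > 0)
    denom = lam[1] + lam[2]
    # A is set to 0 when both minor eigenvalues vanish (an assumption of this sketch).
    anisotropy = (lam[1] - lam[2]) / denom if denom > 0 else 0.0
    return entropy, anisotropy, span


print(h_a_span([1.0, 1.0, 1.0]))  # equal eigenvalues: fully random scattering
print(h_a_span([1.0, 0.0, 0.0]))  # one eigenvalue: a single scattering mechanism
```

The two toy cases reproduce the end points described in the text: equal eigenvalues give an entropy of 1 (up to rounding), and a single non-zero eigenvalue gives an entropy of 0.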
The texture features extracted from the optical image, including CON, COR and ASM, were important features for crop identification according to the boxplots. The values of these three features for cotton were generally large. During this period, the coverage of cotton was high, and the cotton leaves had unique characteristics; thus, the texture primitives contrasted strongly and the grooves were deep, indicating obvious texture effects. In contrast, the growth of wheat was relatively uniform and the texture was fine, so the ASM and CON values of wheat were the lowest. The NDVI has important ecological significance and can reflect crop growth information, so it was included in the optimal feature set for crop identification. However, because cotton and maize had similar NDVI values during this period, the NDVI had a weaker ability to distinguish between the two crops and was not the main feature for separating them.
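The texture measures CON (contrast), COR (correlation) and ASM (angular second moment) are commonly derived from a grey-level co-occurrence matrix (GLCM). The sketch below is a minimal pure-Python illustration under stated assumptions: a symmetric GLCM, a single pixel offset, and toy images with two grey levels. The window size, offsets and quantization used for the GF-2 image are not given in this section, and the function names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count both directions: symmetric matrix
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Return (CON, COR, ASM); COR is None when a marginal variance is zero."""
    idx = range(len(p))
    con = sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx)
    asm = sum(p[i][j] ** 2 for i in idx for j in idx)
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx)
    cor = None
    if var_i > 0 and var_j > 0:
        cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in idx for j in idx)
        cor = cov / math.sqrt(var_i * var_j)
    return con, cor, asm


uniform = [[1, 1], [1, 1]]   # perfectly homogeneous patch
striped = [[0, 1], [0, 1]]   # alternating grey levels along each row
print(texture_features(glcm(uniform, levels=2)))
print(texture_features(glcm(striped, levels=2)))
```

The homogeneous patch gives zero contrast and the maximum ASM, whereas the striped patch gives a higher contrast and a lower ASM, which matches the qualitative reading of fine versus strongly contrasting textures in the text.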

Crop classification and accuracy assessment
With the SVM-OVR classifier described in Section 3.4, the training samples of the five crop types were used to train the model, and the validation samples were used for accuracy assessment. When employing the classifier, the kernel parameter γ and the penalty factor C of the RBF kernel need to be determined. A cross-validation algorithm was employed to determine these two parameters [64]. In this study, the cross-validation parameter selection tool from LIBSVM-3.22 was used to search for the optimal combination of γ and C, which yielded γ = 0.125 and C = 512. Finally, with the optimal feature subset, the classifier was employed in object-oriented classification. The crop classification result is shown in Figure 8, and the statistics of the crop acreage were calculated using ArcGIS 10.3 (Figure 9). As shown in Figure 8, crop-type mapping supported by segmentation of the optical image accurately identified crops and avoided the "salt-and-pepper" phenomenon that often appears in pixel-based classification. The crop type of each parcel was indicated, and confusion among different crop types was rare. According to Figure 9, tomatoes were the most widely planted crop in the region, with an area of 156.5 km², accounting for 25.98% of the total planting area, followed by cotton (147.6 km², 24.51%), wheat (112.4 km², 18.66%), beets (97.5 km², 16.19%) and maize (88.3 km², 14.66%). According to the data published in the 2018 Xinjiang Agricultural Statistical Yearbook, the actual planting areas of tomatoes, cotton, wheat, beets and maize in the study area were 165.2 km², 151.8 km², 107.3 km², 100.7 km² and 82.6 km², respectively. The mapped areas differ from the yearbook figures by less than 7% for every crop, which supports the feasibility and effectiveness of the method.

The accuracy of the crop classification results was evaluated with the validation samples to assess the effect of the crop identification model. A confusion matrix, a standard format for accuracy evaluation, was used to calculate the overall accuracy (OA), the user's accuracy (UA), the producer's accuracy (PA) and the Kappa coefficient. According to the confusion matrix [65] (Table 6), the OA of the five crop types was 89.50%, and the Kappa coefficient was 0.87. Therefore, the proposed method achieved good performance, indicating that the model is practical for identifying crop types. Generally, when the PA and UA are both higher than 85%, the crop classification is considered to be reliable [66]. The results therefore showed that the features extracted from the optical and full polarimetric SAR images, with the support of farmland parcels, can be used for crop-type mapping and meet the requirements of crop identification.
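The four accuracy measures can be derived from a confusion matrix as in the following minimal sketch. It assumes that rows hold the classified (map) labels and columns hold the reference labels; the layout of Table 6 is not reproduced here, and the 2 x 2 matrix is a toy example rather than data from this study.

```python
def accuracy_metrics(cm):
    """OA, per-class UA and PA, and Kappa from a square confusion matrix.

    cm[i][j] = number of samples classified as class i whose reference
    class is j (rows: classified, columns: reference; an assumption here).
    """
    n = len(cm)
    total = sum(map(sum, cm))
    diag = [cm[k][k] for k in range(n)]
    row_tot = [sum(cm[k]) for k in range(n)]                       # classified totals
    col_tot = [sum(cm[i][k] for i in range(n)) for k in range(n)]  # reference totals
    oa = sum(diag) / total
    ua = [diag[k] / row_tot[k] for k in range(n)]  # complement of commission error
    pa = [diag[k] / col_tot[k] for k in range(n)]  # complement of omission error
    # Chance agreement expected from the marginal totals.
    pe = sum(row_tot[k] * col_tot[k] for k in range(n)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, ua, pa, kappa


toy_cm = [[8, 2],
          [1, 9]]
oa, ua, pa, kappa = accuracy_metrics(toy_cm)
print(f"OA = {oa:.2f}, Kappa = {kappa:.2f}")
print("UA =", [round(v, 3) for v in ua], "PA =", [round(v, 3) for v in pa])
```

Kappa is lower than OA because it discounts the agreement that would be expected by chance from the row and column totals alone.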

Comparison with different datasets
To evaluate the practicability of the proposed method for crop-type mapping, we carried out a comparative experiment under the same experimental conditions, except that the farmland parcels were obtained by segmenting the GF-3 image; the multifeatures were still extracted from the GF-2 optical and GF-3 SAR images. In this experiment, the optimal segmentation parameters (shape = 0.2, compactness = 0.6 and scale = 40) were used to obtain the farmland parcels from the GF-3 SAR image, and the mean values of the optimal features from the optical and SAR images within each farmland parcel were then extracted. The same set of training samples was used to train the classifier, and the classification results are shown in Table 7 and Figure 10. The OA of the comparative experiment was 82.24%, which is lower than that of the proposed method (89.50%). As shown in Figure 10, the farmland parcel boundaries were poorly delineated because of the poor segmentation of the GF-3 image. The inherent speckle noise reduced the spatial resolution of the SAR data and blurred the detail of the image, which led to confusion between crop types and other classes. Because the backscattering of ground objects produces a high variability within a class, SAR image segmentation divided homogeneous regions into several parcels. In the process of farmland extraction, some fields were classified as non-agricultural land and thus masked, resulting in omission errors. For example, the UAs and PAs of wheat and maize were not ideal because of large commission and omission errors; the reported values were 76.89% and 74.39% for wheat and 75.48% and 78.17% for maize. In contrast, the classification results were improved by the object-oriented classification based on optical image segmentation, which was more suitable for extracting farmland parcels and reduced salt-and-pepper noise. This comparison shows that the proposed method is better suited to crop classification and helps to improve classification accuracy.

Comparison with different feature selection methods
To verify the performance of WA-BS, the proposed method was compared with three baseline methods (ReliefF [58], mRMR [59] and LeastC [67]) on a 3.4 GHz, 64-bit AMD CPU. The baseline methods are filter models and are independent of the classifier. The overall accuracy of crop classification obtained by the SVM classifier with the same parameter settings is shown in Figure 11. The accuracy of crop classification using the optimal feature subset selected by the WA-BS algorithm was 89.50%, which is higher than that of the other feature selection methods. The mRMR algorithm considers both the relevance between features and target classes and the independence between features, and it achieved a classification accuracy of 87.59%. The LeastC algorithm is tolerant of noise and is not affected by feature interaction; the classification accuracy using the features selected by the LeastC algorithm was 86.74%. The ReliefF algorithm, however, cannot effectively remove redundant features because it does not consider the correlation between features, and it reached a crop classification accuracy of only 85.01%. The dimensions of the optimal feature subsets obtained using the WA-BS, ReliefF, mRMR and LeastC algorithms were 18, 46, 25 and 33, respectively. Therefore, the WA-BS algorithm can effectively remove redundant features that are strongly correlated and achieves a higher classification accuracy with a smaller feature subset.
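The figures quoted above can be tabulated to make the trade-off between subset size and accuracy explicit. The sketch below uses only the dimensions and overall accuracies reported in this section and the total of 58 extracted features; the function name `summarise` is hypothetical.

```python
TOTAL_FEATURES = 58

# (subset dimension, overall accuracy in %) as reported in the text.
REPORTED = {
    "WA-BS": (18, 89.50),
    "mRMR": (25, 87.59),
    "LeastC": (33, 86.74),
    "ReliefF": (46, 85.01),
}


def summarise(results, total_features):
    """Rows of (method, dimension, % of features retained, OA), smallest subset first."""
    rows = []
    for name, (dim, oa) in results.items():
        retained = 100.0 * dim / total_features
        rows.append((name, dim, round(retained, 1), oa))
    return sorted(rows, key=lambda row: row[1])


for name, dim, retained, oa in summarise(REPORTED, TOTAL_FEATURES):
    print(f"{name:8s} {dim:2d} features ({retained:4.1f}% of {TOTAL_FEATURES})  OA = {oa:.2f}%")
```

In these reported results the ordering by subset size is the reverse of the ordering by accuracy: the method that retains the fewest features also reaches the highest OA.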

Conclusions
Crop-type mapping with a single type of remote sensing image sometimes has unsatisfactory precision. The purpose of this study is to promote the collaborative application of optical and SAR remote sensing data in agriculture. A framework for crop identification is proposed based on GF-2 and GF-3 images. A demonstration in the Xinjiang Uygur Autonomous Region targeting wheat, maize, cotton, beet and tomato crops showed that the crop growth features derived from the GF-2 and GF-3 images within farmland parcels segmented from an optical image can achieve good classification results, with an overall accuracy of 89.50%, which is better than the accuracy of 82.24% derived from SAR-based segmentation. Compared with the ReliefF, mRMR and LeastC feature selection algorithms, the WA-BS algorithm proposed in this paper can effectively remove redundant features that are strongly correlated and can achieve a higher classification accuracy via the obtained optimal feature subset. The results indicate that the combination of optical and full polarimetric SAR images under the constraints of farmland parcels can be used for crop-type mapping and yields reliable crop identification. Furthermore, this study extends the application of the GF series satellites to agriculture and indicates their great potential in crop monitoring.

61375002). Our sincere thanks go to the students at the State Key Laboratory of Remote Sensing Science for their assistance during the field survey campaigns.

Figure 5 Segmented layer for crop planting area at the optimal scale

4.2 Feature selection and analysis
4.2.1 Optimal feature subset for crop identification
The WA-BS algorithm was used to determine the optimal feature subset for crop identification from the 58 extracted features through the feature set evaluation criteria.

Figure 6 Dimensional variation in the optimal feature subsets by the number of iterations

Each component of the Cloude decomposition corresponds to a scattering mechanism (single scattering, bidirectional scattering and cross scattering). Therefore, Cloude decomposition can include all the scattering mechanisms. The boxplots of the Cloude-T11 and Cloude-T33 features indicated that these two components of Cloude decomposition exhibited a strong ability to describe ground objects with different scattering mechanisms.

Figure 8 Crop classification result derived from multifeatures and optical-based segmentation

Figure 10 Crop classification results derived from multifeatures and SAR-based segmentation

Figure 11 Overall accuracy using the optimal feature subset obtained by different feature selection methods