Crop and weed discrimination using Laws’ texture masks

Abstract: Computers have become an integral part of human lives and are used in almost every field, including agriculture. Technologies such as computer vision-based pattern recognition are being used to detect diseases and pests, including weeds, that affect crops. Weeds are unwanted plants that grow among crops and compete with them for nutrients, water, and sunlight. They can significantly reduce the quality and yield of crops, causing huge losses to farmers. This paper investigates the use of texture features extracted with Laws’ texture masks for discriminating Carrot crops from weeds in digital images. Laws’ texture method is popular for extracting texture features in medical image processing but has not been explored much for plant or agricultural images. The experiment was carried out on two publicly available benchmark digital image datasets of Carrot crop and Carrot weed. A total of 70 texture features were extracted. A dimensionality reduction technique was used to select the optimal features, which were then used to train a Random Forest classifier. The results showed that the classifier achieved above 94% accuracy.


Introduction 
In recent years, advances in technology have greatly changed the way agronomists and farmers gather and analyze data. Automated livestock management, precision weed control, and measurement of phenotypic characteristics of plants and crops all help in attaining good yield and profit with less input. The main concept behind these systems is Computer Vision (CV). Computer Vision is defined as the process of analyzing images and videos automatically to obtain meaningful inferences or measurements without human intervention. It is one of the latest technologies being used in precision agriculture. Precision Agriculture is defined as the 'art and science of enhancing crop production using the latest technology' [1]. Technology has been used in agriculture for many years, for example mechanical harvesters and sensor networks that monitor current environmental conditions and predict environmental changes that may happen in the near future. One of the reasons for using computer vision in agriculture is to eliminate the extensive use of chemical herbicides and to favor the development of environment-friendly, non-chemical methodologies. It is predicted that within a few decades most manual farming chores will be replaced by robotic farming, which will rely on computer vision techniques for tasks such as preparing the land for cultivation, weed control, monitoring, and harvesting [2].
Weeds are unwanted plants growing amongst crops. They compete with the crops for nutrients, water, and sunlight and can therefore cause low yield. India ranks second in farm output, but its yield is very low, and weeds are one of the main reasons [3]. Weed management is very poor in India, especially in the coastal Karnataka region, because of the non-availability of laborers. In addition, weeds in India are mainly controlled with chemical herbicides or mechanical weeders. The overuse of chemical herbicides leads to contamination of groundwater.
Many farmers in India lack knowledge of site-specific treatment. That is, for a particular weed, a suitable herbicide has to be used in the right amount; otherwise, herbicide-resistant weeds can develop. Both mechanical and chemical ways of controlling weeds take into account the general condition of the field without considering the spatial or temporal changes that can occur at a minute level. Therefore, this is an opportune time to harness the power of computer vision technology for precision weed management. The use of technology for weed management not only reduces labor problems but also opens the way for chemical-free farming, helping to increase yield and reduce loss. India is the most populous country after China. Therefore, India has to use technology in agriculture to increase food production and meet the growing demand for food. According to Young et al. [4], many tasks in agriculture will in the future be automated using computer vision or robotic vision technology. Common agricultural tasks such as land preparation, sowing, and weeding will be carried out by agricultural robots operating under computer vision technology. Weed detection and removal will also be automated with the help of computer vision, as shown in Figure 1. These field robots have camera sensors that take images of the field. The images are processed using advanced image processing techniques, which help the robots take appropriate action, such as removing a weed when one is detected. Images of the field can also be captured by flying drones over it. These images can be processed using advanced image processing or computer vision techniques to identify the weeds, so that specific types of treatment can be applied to control or eradicate them.
Figure 1 Weed robots removing weeds.
Courtesy [4]
Many studies in the literature distinguish crops from weeds using image processing and soft computing techniques. Philipp Lottes et al. [5] used an Unmanned Aerial Vehicle (UAV) to acquire images and analyzed them for weed types, crop spatial distribution, and weed-crop ratio. Visual and geometrical features were extracted from the vegetation after it was segmented from the background. A Random Forest classifier was used to classify the vegetation into crop and weed. The dataset consisted of sugar beet plants and the weeds commonly found in the fields of Germany and Switzerland. Ch. Gée et al. [6] presented a generalized method to estimate the weed infestation rate in agronomic images. Initially, the method was tested on simulated images of maize fields. After the vegetation was extracted using the spectral index method, crop rows were detected using a double Hough transform. Weed and crop were discriminated using region-based segmentation based on the blob coloring method. The accuracy for crop row detection was found to be 87% for a medium weed infestation rate and 100% for a low weed infestation rate. The result of crop and weed discrimination was compared with manual weed infestation estimation. The correlation coefficient was found to be 0.83 in the case of medium weed infestation. Ciro Potena et al. [7] used convolutional neural networks (CNN) on RGB and near-infrared images to distinguish crop and weed. Vegetation was segmented from the soil background using the normalized difference vegetation index (NDVI) method and a lightweight CNN. To classify the vegetation as crop or weed, a deeper 3-class CNN was used. Wu et al. [8] identified the weeds found between crop rows based on position and edge features. For edge detection, the Roberts edge detection operator was used.
After edge detection, the image was divided into blocks, and the number of white pixels in each block was compared with a threshold to determine whether the block belonged to weed or not. The average correct detection rate was found to be 95%. The algorithm can be used to detect weeds in other crops that are grown row-wise, such as corn, soy, and maize. Burgos-Artizzu et al. [9] performed real-time identification of weed patches in maize fields. Fast image processing (FIP) was combined with slower, robust crop row detection, and this produced very good results even under natural variable conditions. Hong et al. [10] performed the segmentation of corn crops from weeds. Images were acquired under natural variable conditions from various sources such as fields, a laboratory, and a greenhouse. The vegetation was segmented using the normalized excess green index method based on a threshold. A median filter of size 3×3 was used to reduce the noise. Morphological operations were used to calculate features, and an artificial neural network was used for classification of crop and weed. In the initial stage, the accuracy was around 72%, but after the incomplete morphological features near the edges of the image were ignored, the accuracy was found to be 95%.
Hemming et al. [11] developed a classification algorithm for Cabbage and Carrot crops and weeds. Eight morphological and color features were combined to create a joint feature space for each object. Feature selection was then used to determine which features were suitable for discriminating weeds and crops. A fuzzy logic based membership function was used for classification. Kumar et al. [12] used texture features based on the curvelet transform and Tamura texture features to discriminate crops and weeds. Zhiche Li et al. [13] discriminated weed and crop using texture features extracted with the color co-occurrence method. Kumar et al. [14] reviewed crop and weed segmentation. Agrawal et al. [15] extracted texture features such as contrast and energy from leaf images using a gray-level co-occurrence matrix. Ab Jabal et al. [16] identified plants using color, shape, and texture features extracted with a gray-level co-occurrence matrix. Faisal Ahmed et al. [17] extracted only color, size-independent, and size-dependent features to discriminate chilli crops and weeds. Slaughter et al. [18] differentiated plants using shape, color, and texture features based on contrast, regularity, energy, and so forth. Abdul Kadir et al. [19] identified different plants by combining shape, color, and texture features into a common feature space. Wu et al. [20] used the Scale-Invariant Feature Transform (SIFT) to extract features for classification. Ali Caglayan et al. [21] extracted the shape and color features of individual leaves for classification purposes. Andres Milioto et al. [22] developed a convolutional neural network (CNN) based semantic classification of weeds, crops, and background. With the help of existing index-based methods, the CNN learns to make a pixel-wise classification of weed, crop, and background in real time. The accuracy achieved is around 91%. The Bonirob robot was used to capture the images from the fields.
That study was done with the purpose of giving site-specific treatment to the fields, with the aim of reducing the overuse of chemicals. The review paper by Wäldchen et al. [23] gives insight into the various approaches that have been used to identify plant types using computer vision techniques. In addition, the authors give a brief introduction to various publicly available plant image datasets and flower datasets for researchers working in this domain. The authors also state that many researchers have worked on their own datasets, which have not been made public, and on images obtained from web sources.
The objectives of the study were 1) to apply the Laws' texture masks to extract the textural features, and 2) to train the classifier with the textural features and to analyze the results produced by the classifier.

Materials and methods
The first dataset used for this study was created by Haug et al. [24]. This dataset was obtained from http://github.com/cwfid. In the annotated images, crops are marked with green polygons and weeds with red polygons, as shown in Figure 2. Based on color, only the red polygons were retained in the annotated image by masking all green polygons. The resulting image was then multiplied with the corresponding original image, from which the soil background had been removed. In this way, images containing only weeds were obtained. Similarly, by selecting the green polygons, masking all red polygons, and multiplying with the original images, images containing only carrot crops were obtained.
Figure 2 Original images of Carrot crop and weed and annotated images [24]
These images were then converted to the YCbCr color space, and the luminance component 'Y' was separated. The Laws' filter masks [25] were then applied to the luminance component Y. The filter masks were obtained from five one-dimensional (1-D) vectors of length five: E5, L5, R5, S5, and W5, describing edge, level, ripple, spot, and wave microstructures respectively. Multiplying each vector (as a column) with each vector (as a row) gives twenty-five different 5×5 masks. The images in the dataset (after separating crop and weeds) were convolved with these 25 filter masks to get textured images, for example Im_{E5S5} for the mask E5S5. These textured images were then passed through a 15×15 average filter to get macro-structures. After this, the images were normalized using the min-max normalization method to get normalized images, for example NormalizedTIm_{E5S5}. NormalizedTIm_{E5S5} and NormalizedTIm_{S5E5} were combined using equation (1):
$$\mathrm{FTD}_{E5S5} = \frac{\mathrm{NormalizedTIm}_{E5S5} + \mathrm{NormalizedTIm}_{S5E5}}{2} \tag{1}$$
In this way, the final 14 texture descriptors (textured images), called FTD, were obtained. For each FTD, five statistical properties were then calculated.
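The texture extraction steps described above can be sketched in Python as follows. This is a minimal illustration, not the authors' Matlab implementation. It assumes the standard Laws vector values, the ITU-R BT.601 luminance weights, reflective border handling, and that the 14 descriptors are the ten symmetric mask pairs plus the four diagonal masks other than L5L5 (a common convention that matches the count of 14). The function names `luminance`, `min_max`, and `laws_descriptors` are hypothetical.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

# Standard Laws 1-D vectors of length five (level, edge, spot, wave, ripple).
VECTORS = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),
}


def luminance(rgb):
    """Luminance of an RGB array (assumption: BT.601 weights, no offset)."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]


def min_max(img):
    """Min-max normalization to [0, 1]; a constant image maps to zeros."""
    span = img.max() - img.min()
    return (img - img.min()) / span if span > 0 else np.zeros_like(img)


def laws_descriptors(y, window=15):
    """Return a dict with the 14 combined texture descriptors of image y."""
    names = list(VECTORS)
    normalized = {}
    for a in names:
        for b in names:
            mask = np.outer(VECTORS[a], VECTORS[b])       # 5x5 mask, e.g. E5S5
            textured = convolve(y, mask, mode="reflect")  # micro-structure image
            smoothed = uniform_filter(textured, size=window, mode="reflect")
            normalized[a + b] = min_max(smoothed)         # 15x15 average, then min-max
    descriptors = {}
    for i, a in enumerate(names):
        for b in names[i:]:
            if a == b == "L5":
                continue  # assumption: L5L5 is not used as a descriptor
            # Equation (1): average a mask response with its transposed counterpart.
            descriptors[a + b] = (normalized[a + b] + normalized[b + a]) / 2
    return descriptors


# Usage on a random stand-in image (not data from the paper).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(40, 40, 3)).astype(float)
ftd = laws_descriptors(luminance(rgb))
print(len(ftd), sorted(ftd))
```

For the diagonal masks (for example E5E5) the average in equation (1) leaves the normalized image unchanged, so the same code path handles both cases.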
Figure 3 and Figure 4 summarize the texture feature extraction process. The statistical properties considered are mean, standard deviation (SD), skewness, kurtosis, and entropy.
The second dataset used in this study was created by Lameski et al. [26]. This dataset was obtained from https://github.com/lameski/rgbweeddetection. In this dataset, the ground truth images are gray-level images in which the crop is represented in light gray and the weed in dark gray.
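The five statistical properties named in this section (mean, SD, skewness, kurtosis, and entropy) can be illustrated with a short sketch. The exact definitions used in the study are not reproduced in the text, so this example assumes the population standard deviation, standardized third and fourth moments (non-excess kurtosis), and Shannon entropy of a 256-bin histogram. The function name `texture_statistics` is hypothetical.

```python
import numpy as np


def texture_statistics(img, bins=256):
    """Five summary statistics of one texture descriptor image."""
    x = np.asarray(img, dtype=float).ravel()
    mean = x.mean()
    sd = x.std()  # population SD (assumption)
    # Standardized values; a constant image gives zero skewness and kurtosis.
    z = (x - mean) / sd if sd > 0 else np.zeros_like(x)
    skewness = np.mean(z ** 3)
    kurtosis = np.mean(z ** 4)  # non-excess kurtosis (assumption)
    # Shannon entropy (in bits) of the gray-level histogram.
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    entropy = -np.sum(p * np.log2(p))
    return {
        "mean": float(mean),
        "sd": float(sd),
        "skewness": float(skewness),
        "kurtosis": float(kurtosis),
        "entropy": float(entropy),
    }


# Tiny example: two gray levels, equally frequent.
stats = texture_statistics(np.array([[0.0, 0.0], [1.0, 1.0]]))
print(stats)
```

Applying such a function to each of the 14 descriptors gives 14 × 5 = 70 values per image, which matches the number of texture features reported in the study.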
Since the images in this dataset were acquired under variable natural lighting conditions, contrast limited adaptive histogram equalization (CLAHE) [27] was applied before the Laws' texture masks to minimize the effect of illumination variation. In CLAHE, the image is divided into small regions called tiles. The contrast of each tile is enhanced so that its histogram approximately matches a specified histogram. Adjacent tiles are then combined using bilinear interpolation to remove artificially induced boundaries.
Figure 3 Steps to extract only one type of plant (either crop or weed) from the image
Figure 4 Steps involved in extracting statistical features from textured images

Dimensionality Reduction Technique
Dimensionality reduction refers to the process of converting a dataset with a large number of dimensions into one with fewer dimensions while ensuring that the new dataset conveys the same information. It removes redundant features, which may otherwise decrease the accuracy of the classifier. In this study, recursive feature elimination (RFE) [28] was used for feature selection. In this technique, a support vector classifier (SVC) [29,30] was built with all 70 extracted texture features. These 70 texture features are listed in Table 1. The goal of the method was to select the optimal subset of features by removing weak texture features one by one and validating each candidate subset through a 5-fold cross-validation score. Figure 5 shows that the model achieved the highest cross-validation score of 89% when the feature set contained 17 features. These 17 best performing texture features are listed in Table 2. The algorithm was implemented using RFECV (Recursive Feature Elimination with Cross-Validation) from the Sklearn package for Python3 [31].
The datasets were divided into a training dataset and a test dataset. For the first dataset, the size of the training dataset was 48 and the size of the test dataset was 12. For the second dataset, the size of the training dataset was 30 and the size of the test dataset was 9. The training dataset was used to train a Random Forest (RF) classifier [32]. The test dataset was used to evaluate the performance of the classifier. The selected 17 best performing features were highly correlated, as shown in Figure 6. The random forest classifier performs well when highly correlated features are present in the feature subset [33]. Therefore, a random forest classifier was selected for classification.
Figure 6 Correlation matrix of the selected feature subset
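The feature selection step described above can be sketched with RFECV from scikit-learn. This is only an illustration on synthetic data, not the study's feature table. It assumes a linear kernel for the SVC (RFE needs per-feature coefficients, which the linear kernel provides), accuracy as the score, and stratified 5-fold splits. The sample count of 60 mirrors the size of the first dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the 70 texture features (not data from the paper).
X, y = make_classification(
    n_samples=60, n_features=70, n_informative=10, n_redundant=20, random_state=0
)

selector = RFECV(
    estimator=SVC(kernel="linear"),  # linear kernel exposes coef_ for ranking
    step=1,                          # remove one weak feature per iteration
    cv=StratifiedKFold(n_splits=5),  # 5-fold cross-validation
    scoring="accuracy",
)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)  # indices of the retained features
X_reduced = selector.transform(X)             # data restricted to those features
print(selector.n_features_, selected)
```

On real data, the retained indices would be mapped back to the feature names, such as those listed in Table 1.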

Random forest classifier
Tree-based supervised learning algorithms are widely regarded as effective because they provide high accuracy and map non-linear relationships well. Random Forest is popular among data scientists because it can perform both classification and regression. It also performs well in handling outliers, filling missing values, and other essential issues in data analytics. It is an ensemble learning model in which a group of weak learners comes together to form a strong model. In Random Forest, multiple trees are built to classify objects based on their features. Each tree gives a classification, and the forest goes with the majority vote. The RF was implemented using RandomForestClassifier from the Sklearn package [31]. The steps involved in the Random Forest classifier are summarized as follows:
1) Randomly choose m features from the M available features, where m<M.
2) Using these m features, build a node b, which will be the root node, using the best feature among the m features. This is called the best split approach.
3) Give node b child nodes by using the same best split approach.
4) Repeat steps 1 to 3 until 'p' nodes have been reached.
5) Repeat steps 1 to 4 until 'n' trees have been built.
6) Take the test features and apply the rules of each tree to predict the class.
7) Make the final prediction by considering the majority vote in the forest.
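The steps above can be sketched with RandomForestClassifier from scikit-learn. The data here is synthetic, and the hyperparameters (100 trees, square-root feature subsampling) are assumptions for illustration, since the study does not report the settings it used. The 48/12 split mirrors the first dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for 60 images described by 17 selected features.
X, y = make_classification(
    n_samples=60, n_features=17, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=12, stratify=y, random_state=0
)

forest = RandomForestClassifier(
    n_estimators=100,     # 'n' trees in the forest (assumed value)
    max_features="sqrt",  # m < M features considered when splitting (assumed value)
    random_state=0,
)
forest.fit(X_train, y_train)

predictions = forest.predict(X_test)        # one class label per test sample
test_accuracy = forest.score(X_test, y_test)
print(predictions, test_accuracy)
```

Note that scikit-learn draws the random feature subset at each split rather than once per tree, and combines the trees by averaging their class probabilities instead of counting hard votes. For most inputs this agrees with the majority vote described in step 7.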

Results
This study was experimentally implemented using Matlab R2017a. The images were resized to 1296×966 when working with Matlab R2017a. The Random Forest classifier was evaluated quantitatively using the confusion matrix [34]. The confusion matrix summarizes the predictions made on a classification problem. The numbers of correct and incorrect predictions are presented as counts, broken down by class, as shown in Table 3. This essentially shows in what ways a classification model is confused when making predictions. Table 3 shows the confusion matrix for a binary (two-class) classification problem. In this study, there are two classes, crop (Carrot) and weed. If the classifier outcome is positive and the actual case is positive, then it is a true positive. If the classifier outcome is negative but the actual case is positive, then it is a false negative. If the classifier outcome is negative and the actual case is negative, then it is a true negative. If the classifier outcome is positive but the actual case is negative, then it is a false positive. Table 4 shows the evaluation parameters derived from the confusion matrix for a binary classification problem [35].
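The four outcomes defined above, and some evaluation parameters commonly derived from them, can be computed as in the following sketch. The labels are made up for illustration and are not results from this study. The parameters shown (accuracy, precision, recall, specificity) are the usual ones and are assumed here, since the exact contents of Table 4 are not reproduced in the text. The function names are hypothetical.

```python
import numpy as np


def binary_confusion(y_true, y_pred, positive=1):
    """Count true positives, false negatives, true negatives, false positives."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_pred == positive) & (y_true == positive)))
    fn = int(np.sum((y_pred != positive) & (y_true == positive)))
    tn = int(np.sum((y_pred != positive) & (y_true != positive)))
    fp = int(np.sum((y_pred == positive) & (y_true != positive)))
    return tp, fn, tn, fp


def evaluation_parameters(tp, fn, tn, fp):
    """Common measures derived from the confusion matrix (0.0 if undefined)."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up labels: 1 = crop (positive class), 0 = weed (negative class).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
tp, fn, tn, fp = binary_confusion(y_true, y_pred)
params = evaluation_parameters(tp, fn, tn, fp)
print(tp, fn, tn, fp, params)
```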

Discussions
Plant leaves have different natural textures. The texture of a plant leaf refers to the smoothness or roughness of the leaf surface. These natural textural features were extracted very well from the images using the micro-structure properties (level, ripple, edge, spot, and wave) provided by the Laws' masks. In addition, Laws' texture masks capture the leaf venation very well, which is an important parameter in plant discrimination. In this research work, carrot crop and weed were discriminated using Laws' texture masks. An accuracy of 95% was obtained for the first dataset, which is higher than the accuracy obtained by Haug et al. [24] for the same dataset. An accuracy of 100% was achieved for the second dataset, as shown in Figure 7. This shows that textural features extracted with Laws' masks could be an important feature for discriminating different types of plants in computer vision applications. The time taken to extract textural features using the Laws' method was approximately 500 s. This can be further reduced using parallel processing.

Evaluation using receiver operating characteristics (ROC) curve
The ROC curve is used in machine learning to depict the performance of a classifier visually. It is a graph of the true-positive rate (TPR) plotted against the false-positive rate (FPR). The area under the ROC curve, commonly referred to as AUC, is another measure used to assess the performance of a given classifier [35,36]. It gives the discriminative ability of the classifier. That is, it gives the probability with which the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance. For example, an AUC value of 0.8 means that a randomly chosen positive instance has a higher score than a randomly chosen negative instance 80% of the time. If the classifier cannot distinguish between the groups and behaves like a random classifier, the AUC is 0.5. For the best classifier, the AUC is 1. Thus, this area summarizes the predictive accuracy of a classifier model: the higher the value of AUC, the better the model. Figure 8 shows the ROC curve for the first dataset, with an AUC value of 0.912. Figure 9 shows the ROC curve for the second dataset, with an AUC value of 1. Even though the results are promising, more crop and weed datasets have to be created so that each new method of crop and weed discrimination can be tested and evaluated. Eventually, this may lead to the development of a general crop and weed discrimination model that can be used in the realization of robotic weeding.
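The two readings of AUC given above, as an area under the ROC curve and as a ranking probability, can be checked against each other with a small sketch. The labels and scores are made up for illustration and are not the study's classifier outputs. The function names are hypothetical, and ties between scores are counted as half a win (an assumption).

```python
import numpy as np


def roc_points(y_true, scores):
    """FPR and TPR at every score threshold, from strictest to loosest."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.r_[np.inf, np.sort(np.unique(scores))[::-1]]
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    tpr = np.array([np.mean(pos >= t) for t in thresholds])
    fpr = np.array([np.mean(neg >= t) for t in thresholds])
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


def auc_rank(y_true, scores):
    """Probability that a random positive scores above a random negative."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    diff = scores[y_true == 1][:, None] - scores[y_true == 0][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))  # ties count as half


# Made-up example: two negatives and two positives.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, scores)
area = auc_trapezoid(fpr, tpr)
rank_probability = auc_rank(y_true, scores)
print(area, rank_probability)
```

In this example three of the four positive-negative pairs are ranked correctly, so both functions return 0.75.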

Conclusions
Plants have varying degrees of smoothness on the surface of their leaves. These naturally occurring textures also reflect the vein structure of the leaves. The naturally occurring texture of leaves in digital images is captured well by the Laws' texture masks and can therefore be used as a feature for discriminating Carrot crop from weeds. Even though the accuracy of above 94% showed that texture features extracted using Laws' texture masks could be a significant feature for the discrimination of carrot crop and weed, this finding can be confirmed only by further research on larger datasets covering a variety of crops and weeds.
Research on the discrimination of crop and weed using machine vision techniques is lagging because of the lack of publicly available benchmark datasets. More crop and weed datasets should be made available to encourage research in this direction. As part of our future research, we intend to evaluate the efficacy of texture features obtained with the Laws' texture masks for computer vision-based plant discrimination or identification on large plant datasets consisting of several species, and to use parallel processing to reduce the processing time.
[References]