[Retracted] Application of Convolution Neural Network Algorithm Based on Intelligent Sensor Network in Target Recognition of Corn Weeder at Seedling Stage
Abstract
Weed damage in seedling-stage corn fields has long been an important factor affecting crop growth and development. Weeds not only compete with corn seedlings for living space but also readily lead to insect damage, so weeding in seedling-stage corn fields is essential. Existing weeding methods usually rely on manual labor or chemical herbicide spraying, which are time-consuming, laborious, and inefficient. With the development of artificial intelligence and modern agricultural technology, the use of robots for field weeding has become an effective approach and has attracted growing attention from researchers at home and abroad. Based on a survey of relevant technologies at home and abroad, this paper studies real-time target recognition and ranging for field weeding robots and proposes a target recognition method for weeding corn fields at the seedling stage using an intelligent sensor network and a deep convolutional neural network (CNN). The intelligent sensors are mainly used for target ranging and obstacle avoidance, and the CNN is mainly used for target recognition. Taking images of corn seedlings and weeds at the seedling stage under natural environmental conditions as samples, transfer learning is carried out from a deep network model pretrained on the COCO data set, and the convolution features are shared by the CNN deep network model and the Fast-CNN deep network model. VGG and ResNet feature extraction networks are compared. The experimental results show that the Fast-CNN deep network model with the VGG-16 feature extraction network has clear advantages in recognizing corn seedling and weed targets: the recognition accuracy reaches 87.64% and the recall rate reaches 80.23%, both higher than those of the other models tested, which demonstrates the effectiveness of the model.
1. Introduction
Corn is one of the main food and cash crops in China; it has high nutritional value and is a staple of people's daily diet. Its yield and planting area also occupy a major position in China [1]. Corn is also an important industrial raw material that can be processed into more than 3500 kinds of industrial products. It is one of the food crops best suited to industrialization in the world, and it is also one of the main crops that help farmers increase their income [2, 3]. However, in recent years, as the conflict between construction land and agricultural land has deepened, the land available for planting corn has steadily decreased. Producing more corn on limited land has therefore become a research topic, and identifying and removing weeds in seedling-stage corn fields is an important part of it [4–7].
Weed damage in the field has long been an important factor affecting crop growth and development. Weeds not only compete with crops for nutrients in the soil but also compete for limited growth space and sunlight, which easily induces field disasters, seriously affects the growth and development of field crops, and can even reduce crop yield. Existing weeding methods in China usually rely on manual labor or chemical herbicide spraying, which consume large amounts of human, material, and financial resources, are inefficient, and can no longer meet the urgent needs of agricultural development [8–10]. Manual weeding is inefficient and its labor cost is too high for field-scale operation. Spraying large quantities of chemical pesticides pollutes the environment and is not conducive to the growth and development of field crops. It also causes problems such as chemical residues, and its side effects should not be underestimated.
With the development of artificial intelligence and modern agriculture, the use of robots for field weeding has become an effective means to achieve a green, environmentally friendly, low-carbon, and efficient way to remove weeds in the field, which is conducive to farmers’ agricultural production and management, and is of great significance to promote the deep integration of modern information technology and traditional agricultural fields in China [11–13].
In the past 10 years, a large number of agricultural informatization projects in China have been successfully implemented and have achieved fruitful results, such as the collection of soil and crop information, the precise application of water, pesticide, and fertilizer, and key technologies such as environmental control for facility agriculture [14], which have gradually improved the production mode of precision agriculture. More and more researchers have begun to apply deep learning methods to target detection and recognition. Girshick et al. [15] proposed a target detection method based on the region-based convolutional neural network (R-CNN). This method uses the selective search (SS) algorithm to generate candidate regions from which target features are extracted. Although it makes it possible to train a high-quality model from a small amount of labeled data, it performs poorly in feature extraction efficiency and memory usage. Subsequently, to address redundant feature extraction in R-CNN, Wang et al. [16] proposed a Fast R-CNN model, which greatly improved the performance of target detection and recognition by using an adaptive-scale pooling operation. He et al. [17–19] proposed a visual recognition method based on spatial pyramid pooling in deep convolutional networks, which relaxed the restriction on input image size and thereby significantly improved recognition accuracy. Ren et al. [20] proposed the Faster R-CNN model, which is composed of a region proposal network (RPN) and Fast R-CNN. The region proposal network replaces the selective search algorithm, which removes the bottleneck of the large time overhead in computing region proposals and makes real-time target detection and recognition possible.
At the same time, the Fast R-CNN model is widely used in fields such as vehicle detection [21–23], remote sensing image ground object recognition [24], appearance defect detection [25, 26], pedestrian detection and recognition [27–29], and field image detection [30–32]. Applied to weeding robots, such detection models can help people complete weeding work, reduce labor intensity to a certain extent, reduce the use of pesticides, protect the rural ecological environment, promote the automation and intelligence of agricultural machinery, and improve the efficiency of corn production.
It can be seen that there is much research on weeding robots at home and abroad; in foreign countries in particular, weeding robots have been widely used in various agricultural fields. In China, however, relevant research is limited, and the relevant products are few and mostly imported; they are large and expensive, which hinders their wide use and has delayed the mechanization and modernization of corn field management. In addition, weeding robots for the seedling stage do not adapt well to the unstructured characteristics of corn fields, which results in inaccurate obstacle avoidance and a low degree of automation in field management.
Therefore, in view of the shortcomings of existing field weeding machinery, this paper designs a new obstacle-avoiding weeding robot whose configuration can be selected according to different requirements and working spaces. The design aims to improve the robot's adaptability in complex environments, reduce manual labor and environmental pollution, improve weeding efficiency, relieve soil hardening and improve the physical and chemical properties of the soil, expand the growth space of corn at the seedling stage, increase corn yield, promote the automation and intelligence of corn planting machinery, and improve the efficiency of corn production.
2. Overview and Research Status of Intelligent Sensor Technology and Convolutional Neural Network Technology
Intelligent perception based on multisource information fusion is one of the supporting technologies of robots. It can comprehensively analyze the categories and attributes of the environment and work objects to achieve the purpose of intelligent perception. The requirement of mechanical weeding is to remove weeds without damaging crops. Therefore, perception technology mainly includes crop row recognition technology and weed recognition technology. Among them, the perception of crops mainly depends on the intelligent sensor equipped by the weeding robot, while the recognition of crops and weeds mainly depends on image recognition technology and intelligent recognition algorithm.
2.1. Intelligent Sensor Technology
Sensor technology is one of the three foundations of information technology. Sensors are applied very widely, from small components to large machines such as transformers and aircraft, and their use has accelerated the development of many electronic components. With the continuous progress of science and technology, sensors themselves are also developing rapidly and gradually becoming intelligent, which is the inevitable trend of sensor technology. Compared with traditional sensors, intelligent sensors offer information processing functions and a high cost-performance ratio. Intelligent sensor technology has gradually become one of the important indicators of a country's degree of informatization [33].
An intelligent sensor, also known as an integrated sensor, is a sensor element that integrates data acquisition and data transmission and has its own microprocessor, which can perform simple calculations and processing on the data. Its main characteristics are small size and high data acquisition and transmission rates. It is for these reasons that this type of sensor has been widely used in recent years. Its main components include A/D and D/A converters, transceivers, microcontrollers, and amplifiers. There is much research on intelligent sensor technology at home and abroad, and breakthroughs in different fields have greatly promoted both the development of intelligent sensor technology and the progress of intelligent facilities in various fields, including agricultural auxiliary equipment [34, 35].
2.2. Convolutional Neural Network
In the field of artificial intelligence and deep learning, the convolutional neural network is a mainstream and key technology. It has developed rapidly, is widely used, and has achieved milestone results in many fields; within deep learning, it is the most frequently used type of network. In 1998, LeCun et al. [36] proposed the classic LeNet-5 network, as shown in Figure 1.

The convolutional neural network has attracted the attention of relevant scholars and continues to develop. Compared with the traditional fully connected network, it has many advantages. It uses local connections and weight sharing, which significantly reduce the number of parameters, simplify the network, and reduce overfitting, and it is trained with the backpropagation algorithm. The convolutional neural network structure includes the convolution layer, the pooling layer, and other layers. The pooling layer reduces the complexity of the network and improves the robustness of the model. Moreover, the convolutional neural network has good fault tolerance, adaptability, and learning ability.
The convolutional neural network includes two parts. The first part is feature extraction. The network takes each image as a matrix of pixel values and outputs the class it is trying to predict. The convolution layer, which is composed of convolution kernels, is mainly used to extract edges and other features of the image. Each neuron is connected only to a local receptive field in the previous layer, which forms the local connection pattern. The activation function adds nonlinearity to the network so that the model can handle complex situations; the sigmoid activation function is often used for binary classification problems, and the ReLU activation function for image recognition and classification problems. Pooling downsamples the feature maps. The second part is classification and recognition, which mainly refers to the fully connected layer, which usually acts as a classifier: the extracted features are flattened, weighted and summed, and finally output. A simple convolutional neural network structure is shown in Figure 2.

2.2.1. Convolution Layer
The convolution layer is the key part of feature extraction in a convolutional neural network. It perceives the input image locally and retains its spatial features, achieving both feature extraction and dimension reduction. The convolution layer contains many convolution kernels, which may differ in size. Each kernel slides over the input with a given stride; at each position it takes the element-wise product of the kernel and the input window it covers and sums the result, which is a linear operation. The obtained values are stored in a new matrix, which forms a new feature map. Common convolutions include single-channel convolution and multichannel convolution, as shown in Figures 3 and 4.
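The sliding-window operation described above can be illustrated with a minimal pure-Python sketch. It assumes no padding, integer strides, and a single output channel; the function names `conv2d` and `conv2d_multichannel` and the sample values are illustrative and are not taken from the paper's implementation.

```python
def conv2d(image, kernel, stride=1):
    """Single-channel convolution (as used in CNNs: no kernel flip, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            top, left = r * stride, c * stride
            # Element-wise product of the kernel and the window it covers, summed.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


def conv2d_multichannel(channels, kernels, stride=1):
    """Multichannel convolution: one kernel per input channel, responses summed."""
    maps = [conv2d(ch, k, stride) for ch, k in zip(channels, kernels)]
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(cols)] for r in range(rows)]


# Illustrative 4x4 input and 2x2 kernel (made-up values).
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [[1, 0], [0, -1]]

single = conv2d(image, kernel, stride=1)
strided = conv2d(image, kernel, stride=2)
multi = conv2d_multichannel([image, image], [kernel, kernel])
```

With a stride of 1 the 4 × 4 input yields a 3 × 3 feature map, and with a stride of 2 it yields a 2 × 2 map, which shows how the stride contributes to dimension reduction.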


2.2.2. Pooling Layer
The pooling layer is mainly used to extract the main features in the feature map, reduce the amount of computation and memory usage, reduce the number of parameters, expand the receptive field, reduce model complexity, help prevent overfitting, and provide a degree of scale and spatial invariance. Pooling is performed on each channel independently, so the number of channels does not change, and the operation has no learnable parameters. Average pooling and maximum pooling are the most common. Their procedures are basically the same, but because they select different statistics of each window (the mean and the maximum, respectively), their results can differ substantially in specific cases. The two pooling methods are shown in Figures 5 and 6.
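As a rough illustration of how the two pooling variants differ only in the statistic taken over each window, the following sketch applies both to the same feature map. The 2 × 2 window, the stride of 2, the function name `pool2d`, and the sample values are assumptions chosen for the example.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample one channel with max or average pooling (no padding)."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Collect the values covered by the current window.
            window = [
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


# Illustrative 4x4 feature map (made-up values).
fmap = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 9, 4],
    [2, 0, 3, 8],
]

max_pooled = pool2d(fmap, mode="max")
avg_pooled = pool2d(fmap, mode="avg")
```

The same windows give different outputs under the two modes, for instance 7 versus 4.0 for the top-left window, which is the difference the text refers to.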


2.2.3. Activation Function
- (1)
Sigmoid activation function: the graph of the sigmoid activation function is an S-shaped curve built from the exponential function, as shown in Figure 7. The function definition is given in Formula (1):
$$f(x) = \frac{1}{1 + e^{-x}} \tag{1}$$
- (2)
tanh activation function: this function, whose graph is also S-shaped, is called the hyperbolic tangent activation function, as shown in Figure 8. The function definition is given in Formula (2):
$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2}$$
- (3)
ReLU activation function: this is the most commonly used activation function in deep learning, as shown in Figure 9. The function definition is given in Formula (3):
$$f(x) = \max(0, x) \tag{3}$$
According to the graph of the ReLU activation function, when the input is positive the derivative is constantly 1, so there is no gradient saturation problem for positive inputs. Its convergence and computation are much faster than those of the other two activation functions, which greatly saves running time. Therefore, ReLU is selected as the activation function in this paper.
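The saturation argument can be checked numerically. The sketch below uses the standard textbook definitions of the three functions and their derivatives; it is an illustration only, not the paper's code, and it is not hardened against overflow for very large negative inputs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    # Derivative is 1 for positive inputs and 0 otherwise (0 chosen at x == 0).
    return 1.0 if x > 0 else 0.0


# Gradients at a moderately large positive input: sigmoid and tanh are
# close to zero (saturated), while ReLU still passes a gradient of 1.
x = 10.0
grads = {
    "sigmoid": sigmoid_grad(x),
    "tanh": tanh_grad(x),
    "relu": relu_grad(x),
}
```

At x = 10 the sigmoid and tanh gradients are several orders of magnitude below 1, which is the saturation effect that slows training with those functions.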
2.3. Research Status of Weeding Robot
In order to realize the green and pollution-free growth of field crops in the whole life cycle and the sustainable development of the agricultural field, many scientific researchers have focused their research on the field of automatic weeding of agricultural mobile robots, aiming to reduce the use of chemical pesticides in the field, so as to reduce environmental pollution and drug residues of agricultural products caused by the massive use of chemical pesticides [21, 24]. At the same time, the emergence and use of agricultural mobile robots can not only replace human beings to complete boring and repetitive agricultural work but also carry out efficient and sustainable operations in different indoor and outdoor environments and ultimately improve production efficiency and effectively liberate human hands. Therefore, under the conditions of natural growth environment, how to accurately and quickly identify and remove weeds from field crops by agricultural mobile robots plays an important role in realizing intelligent field management.
At present, there is much research on weeding robots. The robots take a variety of forms, their functions are complex and diverse, and dozens of designs have been reported. Among them, the HortiBot robot is reported to handle more than 20 kinds of weeds. In addition, many robots have been developed for corn and other crops at home and abroad. Figure 10 shows the appearance of some commonly used weeding robots in the corn field.

3. Weed Identification Method in Maize Field at Seedling Stage Based on Intelligent Sensor Network and CNN Deep Network
3.1. Data Acquisition
The image data of corn and weeds in this experiment were collected in a corn field from May 10 to 15, 2021, between 9:00 and 12:00 and between 13:00 and 16:00 each day. During this period the corn was at the seedling stage, and sunlight was abundant. Images of corn seedlings and weeds growing under natural conditions were collected from three viewing directions (horizontal, top-down, and 45° oblique), and the acquisition steps were strictly followed. The images were captured with a Canon EOS 6D Mark II high-definition digital camera at a resolution of 5472 × 3648 pixels, and all images are in JPEG format.
Because of the growth cycle of the seedlings and the limited shooting time, not all of the photos taken were directly usable, and their number was limited: only 400 useful photos were collected. The photos were therefore augmented by vertical and horizontal flipping and by adjusting color saturation and brightness. The resulting photos are shown in Figure 11, where A is the original image, B is the image after a horizontal flip, C is the image after a vertical flip, D is the image after the saturation is increased to 1.5 times, E is the image after the saturation is reduced to 0.5 times, and F is the image after the brightness is increased to 1.2 times. The main purpose of these operations is to expand the data set and make better use of the data. Counting the original and its five augmented variants, the 400 photos give 2400 images, which is consistent with the split sizes reported below. To prepare the samples for the model, we preprocess them in code, balance the image types, and split them in a ratio of 7 : 3. For the intelligent sensor and Fast-CNN model constructed in this paper, 70% of the images (1680) are used for training and 30% (720) are used for testing.
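A minimal sketch of this augmentation and split, using only the Python standard library, is given below. It assumes images are nested lists of 8-bit RGB tuples and that saturation is scaled in HSV space; the helper names, the random seed, and the placeholder file names are hypothetical, and the paper does not state which library or color space was actually used.

```python
import colorsys
import random


def flip_horizontal(image):
    return [row[::-1] for row in image]


def flip_vertical(image):
    return image[::-1]


def scale_brightness(image, factor):
    # Scale each RGB channel and clip to the 8-bit range.
    return [[tuple(min(255, round(v * factor)) for v in px) for px in row] for row in image]


def scale_saturation(image, factor):
    # Scale the S channel in HSV space (an assumed choice of color space).
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            r2, g2, b2 = colorsys.hsv_to_rgb(h, min(1.0, s * factor), v)
            new_row.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
        out.append(new_row)
    return out


def augment(image):
    """Return variants A-F: original, two flips, two saturations, one brightness."""
    return [
        image,
        flip_horizontal(image),
        flip_vertical(image),
        scale_saturation(image, 1.5),
        scale_saturation(image, 0.5),
        scale_brightness(image, 1.2),
    ]


def split_dataset(items, train_ratio=0.7, seed=0):
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Tiny 2x2 example image (made-up pixel values).
tiny = [[(200, 100, 50), (0, 0, 0)], [(10, 20, 30), (255, 255, 255)]]
variants = augment(tiny)

# 400 source photos x 6 variants = 2400 images, split 7:3.
names = [f"photo{i:03d}_{tag}" for i in range(400) for tag in "ABCDEF"]
train, test = split_dataset(names)
```

The counts reproduce the figures in the text: 2400 images in total, 1680 for training and 720 for testing.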

3.2. Design of Corn Seedling and Weed Identification Network Based on Intelligent Sensor and CNN
The whole weeding machine is composed of a machine vision system, a tool control system, a walking system, and an anticollision system. Data communication between the systems is mainly through the PCI bus. The machine vision system is mainly composed of a camera, a convolutional neural network image processing system, and a user interface. First, the camera is turned on, and its data are transferred to the image processing system through the PCI bus and displayed on the user interface. The results produced by the image processing system are then transmitted to the tool control system and the walking system through the PCI bus. According to these results, the position and rotation speed of the tool, as well as the forward direction and speed of the weeder, are adjusted.
3.2.1. Image Acquisition System
The camera in the image acquisition module is the aforementioned Canon EOS 6D Mark II, which is connected directly to the PC through a USB interface. This module performs image acquisition under the TensorFlow 2.0 framework at an acquisition frequency of 30 frames per second. Images are read from the camera frame by frame, starting from the first frame, and the real-time video is displayed on the user interface so that the camera's shooting status can be observed.
3.2.2. Image Preprocessing
During acquisition, transmission, and terminal processing, the original image is easily affected by noise and lighting, which degrades it. Before the system analyzes the image information, it therefore carries out image preprocessing to selectively highlight the useful information in the image and remove interference. Preprocessing generally includes image graying, denoising, binarization, and data enhancement. Some processing results are shown in Figure 12.
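The graying, denoising, and binarization steps can be sketched as follows. The paper does not specify the exact operators, so the ITU-R BT.601 luma weights, the 3 × 3 mean filter, and the fixed threshold of 128 used here are assumptions, and the function names are illustrative.

```python
def to_gray(image):
    """Convert RGB tuples to gray levels with the BT.601 luma weights (assumed)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in image]


def mean_filter(gray):
    """3x3 mean filter; border pixels average over the neighbors that exist."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                gray[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(round(sum(window) / len(window)))
        out.append(row)
    return out


def binarize(gray, threshold=128):
    """Map gray levels to 0 or 255 using a fixed threshold (assumed value)."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


# A single bright noise pixel on a dark background is strongly attenuated.
noisy = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
smoothed = mean_filter(noisy)
mask = binarize(to_gray([[(255, 255, 255), (0, 0, 0)]]))
```

In practice an adaptive threshold is often preferred outdoors because lighting varies, but a fixed threshold keeps the sketch short.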

3.2.3. Seedling Positioning Module
To prevent the roots of maize seedlings from being damaged by the movement of the robot during weeding, intelligent sensors and image recognition are used to locate the roots of the seedlings so that the robot can avoid them while moving. Generally, the leaves of a seedling are evenly distributed around its stem. The plant has a single stem, the stem is vertically connected to the root, and the petiole connecting each leaf to the stem is much thinner than the leaf. Therefore, in the image the plant can be described as a contour with convexity defects: between two adjacent leaves there must be a convexity defect in the contour, and the deepest point of the defect lies near the junction of the two leaves. Because the leaves surround the stem, the center of these defect points gives the location of the stem and the root of the plant. The specific algorithm for locating the target root is shown in Figure 13.
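The convexity-defect idea can be sketched in pure Python as follows. This is an illustration of the geometric reasoning only, not the algorithm of Figure 13: it assumes the plant contour is already available as an ordered list of distinct points, and the function names and the `min_depth` threshold are hypothetical.

```python
import math


def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_indices(points):
    """Indices of the convex hull vertices (monotone chain), in contour order."""
    order = sorted(range(len(points)), key=lambda i: points[i])

    def half(seq):
        chain = []
        for i in seq:
            while len(chain) >= 2 and cross(points[chain[-2]], points[chain[-1]], points[i]) <= 0:
                chain.pop()
            chain.append(i)
        return chain

    lower = half(order)
    upper = half(reversed(order))
    return sorted(set(lower[:-1] + upper[:-1]))


def convexity_defects(contour, min_depth=1.0):
    """Deepest contour point between each pair of consecutive hull vertices."""
    hull = convex_hull_indices(contour)
    n = len(contour)
    defects = []
    for k, start in enumerate(hull):
        end = hull[(k + 1) % len(hull)]
        a, b = contour[start], contour[end]
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        if length == 0:
            continue
        best_depth, best_point = 0.0, None
        i = (start + 1) % n
        while i != end:
            # Perpendicular distance from the contour point to the hull edge.
            depth = abs(cross(a, b, contour[i])) / length
            if depth > best_depth:
                best_depth, best_point = depth, contour[i]
            i = (i + 1) % n
        if best_point is not None and best_depth >= min_depth:
            defects.append(best_point)
    return defects


def locate_stem(contour, min_depth=1.0):
    """Estimate the stem position as the centroid of the defect points."""
    defects = convexity_defects(contour, min_depth)
    if not defects:
        return None
    return (
        sum(p[0] for p in defects) / len(defects),
        sum(p[1] for p in defects) / len(defects),
    )


# A plus-shaped contour stands in for a four-leaf seedling seen from above.
plus = [(-1, -3), (1, -3), (1, -1), (3, -1), (3, 1), (1, 1),
        (1, 3), (-1, 3), (-1, 1), (-3, 1), (-3, -1), (-1, -1)]
stem = locate_stem(plus)
```

For the symmetric plus shape, the four inner corners are detected as defect points and their centroid falls at the origin, where the stem would be. Real leaf contours are noisier, so the depth threshold would need tuning.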

3.2.4. Model Algorithm
The CNN deep network is composed of the feature extraction network, the region proposal network (RPN), and the region-of-interest (ROI) subnet. The VGG-16 convolutional neural network is used as the feature extraction network to extract the features of the input image, form a feature map, and share the convolutional features with the region proposal network. The feature extraction network includes 13 convolution layers and 5 pooling layers. Each convolution layer uses the nonlinear ReLU activation function and a convolution kernel size of 3 × 3. Each pooling layer uses max pooling. The feature extraction network is divided into 5 parts, and the parts are connected through the pooling layers. Taking the first part as an example, it follows the pattern convolution + activation function (ReLU) + max pooling, with two convolution layers that each use 3 × 3 convolution kernels and 64 channels.
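The layer layout described above can be written down as a small configuration table. The block sizes below follow the standard published VGG-16 layout, which matches the counts given in the text (13 convolution layers, 5 pooling layers, two 64-channel layers in the first part); the channel widths of the later parts, the "same" padding, the 2 × 2 pooling with stride 2, and the 224 × 224 input size are assumptions, since the paper does not list them.

```python
# (number of 3x3 conv layers, output channels) for each of the 5 parts.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def describe_backbone(input_size=224):
    """List (layer type, channels, spatial size) for the feature extractor.

    Assumes 3x3 convolutions with padding 1 (size preserved) and
    2x2 max pooling with stride 2 (size halved).
    """
    layers = []
    size = input_size
    for num_convs, channels in VGG16_BLOCKS:
        for _ in range(num_convs):
            layers.append(("conv3x3+relu", channels, size))
        size //= 2
        layers.append(("maxpool2x2", channels, size))
    return layers


layers = describe_backbone(224)
num_convs = sum(1 for kind, _, _ in layers if kind == "conv3x3+relu")
num_pools = sum(1 for kind, _, _ in layers if kind == "maxpool2x2")
```

Under these assumptions a 224 × 224 input is reduced to a 7 × 7 feature map with 512 channels after the fifth pooling layer.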
3.2.5. Algorithm Evaluation Index
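The accuracy P and the recall rate R are used as the evaluation indices:
$$P = \frac{n_{TP}}{n_{TP} + n_{FP}}, \qquad R = \frac{n_{TP}}{n_{TP} + n_{FN}}$$
where P refers to accuracy, R refers to recall rate, $n_{TP}$ refers to the number of correctly identified corn seedling and weed targets, $n_{FP}$ refers to the number of incorrectly identified corn seedling and weed targets, and $n_{FN}$ refers to the number of unrecognized corn seedling and weed targets.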
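As a worked example of these two indices, the sketch below computes them from target counts using the standard ratios nTP / (nTP + nFP) and nTP / (nTP + nFN). The counts are hypothetical and are not taken from the paper's experiments; the function name is illustrative.

```python
def precision_recall(n_tp, n_fp, n_fn):
    """Return (P, R) as fractions; 0.0 is returned when a denominator is zero."""
    precision = n_tp / (n_tp + n_fp) if (n_tp + n_fp) else 0.0
    recall = n_tp / (n_tp + n_fn) if (n_tp + n_fn) else 0.0
    return precision, recall


# Hypothetical counts: 80 targets found correctly, 20 false detections,
# and 20 targets missed.
p, r = precision_recall(n_tp=80, n_fp=20, n_fn=20)
```

Multiplying the two fractions by 100 gives values in the percentage form used in Table 1.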
3.3. Model Improvement
Because images of corn seedlings and weeds against a complex background are easily affected by illumination and viewing angle, which produce color differences or partial occlusion, this paper proposes an improved convolutional neural network (CNN) for feature extraction of corn seedlings and weeds, which can extract target features more accurately and completely. The specific process is as follows:
Step 1: although the features of different layers of the CNN model express the attributes of the target at different levels (for example, the low-level convolution features contain more location information), the CNN model alone still has major deficiencies in feature extraction. Fast-CNN is therefore introduced because it is better suited to target localization.
Step 2: introduce high-level features. High-level features carry better target semantics and are suitable for target classification and noise filtering.
Step 3: segment the image into superpixels, and then compute the saliency of each superpixel from its feature covariance to obtain a rough salient region.
Step 4: extract salient features through regional modularization and local complexity comparison, and finally feed them into the improved Fast-CNN model for salient region detection. This avoids the time-consuming full-image search of the CNN model.
4. Test Results and Performance Evaluation
To select a better deep network model and a suitable feature extraction network, we ran experiments with different deep neural network models, namely the traditional CNN model and the improved Fast-CNN model. Because feature extraction is very important to the construction of this model, we paired each of the two models with three feature extraction networks (VGG-16, ResNet-50, and ResNet-101) and conducted the same test.
During model testing, the change in the loss value with the number of iterations, together with the accuracy and recall rate, is used as the basis for judging model quality. The resulting test results are shown in Figures 14 and 15. Figure 14 shows the change in the loss value of the traditional CNN model under three different feature extraction networks. Figure 15 shows the change in the loss value of the improved CNN model, that is, the Fast-CNN model, under the same three feature extraction networks. It can be seen from Figure 14 that, for the ordinary CNN model, the loss value follows essentially the same trend under all three feature extraction networks: as the number of iterations increases, the loss value gradually decreases and finally falls to between $10^{-2}$ and $10^{-3}$. The difference is that the loss of the model using the VGG-16 feature extraction network declines noticeably faster than that of the other two networks, and its overall loss value is also smaller. Using the VGG-16 feature extraction network can therefore effectively reduce the model error, improve the accuracy of model discrimination, and provide good robustness while maintaining the network training speed. The overall trend in Figure 15 is basically the same as that in Figure 14, and the different feature extraction networks rank in the same way as for the traditional CNN model. However, the improved Fast-CNN model performs better overall, and its final loss value is lower than that of the ordinary CNN model. In particular, with the VGG-16 feature extraction network it converges faster, and the loss value stabilizes at around 100 iterations.
It can be seen that the improved Fast-CNN model has a higher recognition rate and efficiency for corn seedlings and weeds.


Table 1 reports the accuracy, recall rate, and single-image detection time of the two models, each paired with three feature extraction networks, for six configurations in total. In terms of accuracy and recall rate, Fast-CNN (VGG-16) is superior to the other five configurations: its accuracy is as high as 87.64%, and its recall rate is 80.23%. The other configurations discriminate corn seedlings from weeds less well. The accuracy of Fast-CNN (ResNet-101) and Fast-CNN (ResNet-50) is 81.34% and 82.35%, respectively, and the recall rate is 80.01% and 77.19%, respectively, which is close to the VGG-16 result. Fast-CNN therefore performs well with all three feature extraction networks. The CNN model, by contrast, has poor overall accuracy: its highest accuracy is only 73.49%, and its lowest recall rate is 65.42%. These data show that the improved CNN model is significantly better than the ordinary CNN model in terms of accuracy and recall rate. Comparing the detection time for a single image, the ordinary CNN model has the advantage: its time (197–234 ms) is shorter than that of the improved CNN model (254–271 ms). However, considering all indicators together, the improved CNN model is the better choice.
Model | Accuracy (%) | Recall (%) | Time (ms) |
---|---|---|---|
CNN (ResNet-50) | 61.77 | 65.42 | 216 |
CNN (ResNet-101) | 62.08 | 70.17 | 234 |
CNN (VGG-16) | 73.49 | 79.59 | 197 |
Fast-CNN (ResNet-50) | 82.35 | 77.19 | 268 |
Fast-CNN (ResNet-101) | 81.34 | 80.01 | 254 |
Fast-CNN (VGG-16) | 87.64 | 80.23 | 271 |
To sum up, the experimental analysis results show that the Fast-CNN deep network model based on the VGG-16 feature extraction network shows obvious advantages for target recognition of corn seedlings and weeds in the field. The results show that the recognition accuracy of seedlings and weeds is higher than that of other deep neural network models, which verifies the practicability of the model constructed in this paper.
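The comparisons drawn from Table 1 can be re-derived directly from the reported values. The short sketch below copies the six rows of the table and computes the best configuration per indicator, together with the trade-off between the two VGG-16 variants; the variable names are illustrative.

```python
# (accuracy %, recall %, time per image in ms), copied from Table 1.
RESULTS = {
    "CNN (ResNet-50)": (61.77, 65.42, 216),
    "CNN (ResNet-101)": (62.08, 70.17, 234),
    "CNN (VGG-16)": (73.49, 79.59, 197),
    "Fast-CNN (ResNet-50)": (82.35, 77.19, 268),
    "Fast-CNN (ResNet-101)": (81.34, 80.01, 254),
    "Fast-CNN (VGG-16)": (87.64, 80.23, 271),
}

best_accuracy = max(RESULTS, key=lambda name: RESULTS[name][0])
best_recall = max(RESULTS, key=lambda name: RESULTS[name][1])
fastest = min(RESULTS, key=lambda name: RESULTS[name][2])

# Trade-off between the improved and the ordinary model with the same backbone.
accuracy_gain = round(RESULTS["Fast-CNN (VGG-16)"][0] - RESULTS["CNN (VGG-16)"][0], 2)
extra_time_ms = RESULTS["Fast-CNN (VGG-16)"][2] - RESULTS["CNN (VGG-16)"][2]
```

With the VGG-16 backbone, the improved model gains 14.15 percentage points of accuracy at a cost of 74 ms more per image, which is the trade-off behind the conclusion above.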
5. Conclusion
- (1)
This paper takes field crops and weeds as the research object, treats the accurate and efficient identification and distinction of field crops and weeds as the prerequisite for robot weeding, and points out that multitarget ranging and weeding path planning are the key technologies for this research.
- (2)
Based on a survey of relevant technologies at home and abroad, this paper studies real-time target recognition and ranging for field weeding robots and puts forward a target recognition method for weeding corn fields at the seedling stage using an intelligent sensor network and a deep convolutional neural network. The intelligent sensors are mainly used for target ranging and obstacle avoidance, and the CNN is mainly used for target recognition.
- (3)
Based on the TensorFlow 2.0 deep learning framework, CNN deep network models with different feature extraction networks are constructed, taking images of corn seedlings and weeds at the seedling stage under natural environmental conditions as samples. Through transfer learning from a deep network model pretrained on the COCO data set, it is concluded that the constructed model can identify weeds among corn seedlings well.
- (4)
Using the convolution features shared by the CNN deep network model and the Fast-CNN deep network model, VGG and ResNet feature extraction networks are compared. The experimental results show that the Fast-CNN deep network model based on the VGG-16 feature extraction network has clear advantages for the target recognition of corn seedlings and weeds in the field. Its recognition accuracy for seedlings and weeds is higher than that of the other deep neural network models, which verifies the practicability of the model constructed in this paper.
Conflicts of Interest
The authors declared that they have no conflicts of interest regarding this work.
Acknowledgments
The authors acknowledge the 2022 National College Students Innovation and Entrepreneurship Training Program (Project No.: X202210712321). This research was funded by the Key Research and Development Projects of Shaanxi Province (2019NY-175).
Open Research
Data Availability
The data set used in this paper is available from the corresponding author upon request.