Image Classification through Convolutional Neural Networks in Deep Learning

- January 22, 2021

Deep Learning (DL) is a branch of Machine Learning (ML) in which a machine learns to perform classification tasks that humans do naturally. DL works on text, audio and images to make accurate decisions. In recognizing objects within images, DL can reach accuracy equal to or better than that of humans. DL models learn from large labeled datasets and require a substantial amount of computing power. They rely on neural networks with several hidden layers, as shown in Figure 1. The deeply interconnected nodes in these layers extract features from the data or image automatically, so the features do not have to be extracted manually.

Figure 1: Neural network with many hidden layers of interconnected nodes, which deep learning uses to recognize features that were not specified in advance.

How Does DL Work?

Deep Learning uses a neural network architecture. Large labeled datasets are fed to DL models, which learn directly from the data without a human extracting features by hand. This can be better understood through the face recognition example explained by (Nielsen, 2019). The author explains that the weights and biases used in a network are learned automatically, so the way the network reaches its output remains hidden. As a result, it may be difficult to understand how such an AI works, and building it may not help us understand how the human brain works either. To build intuition, the author discusses the problem of deciding whether an image contains a face, with the pixels of the image as inputs to a network of artificial neurons. Suppose no learning algorithm is used. Instead, the problem is decomposed into smaller parts using heuristics: is there an eye, is there a nose, is there a mouth, and so on. If each of these sub-problems can be solved by a neural network, the sub-networks can be integrated into a larger network that determines whether a face is present. Figure 2 shows how hidden layers can help deduce the required output in this way.

Figure 2: Hidden Layers depicted for Face Recognition
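The hand-designed approach in this example can be sketched in a few lines of Python. This is a minimal illustration, not code from (Nielsen, 2019): the function names `neuron` and `face_score`, the grouping into two hidden neurons, and every weight and bias value are assumptions, picked by hand so that the output is high only when all sub-network answers are high.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs passed through a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def face_score(eye_left, eye_right, nose, mouth):
    """Combine sub-network answers (each between 0 and 1) into one face score.

    All weights and biases below are hand-picked for illustration only;
    a trained network would learn them from data.
    """
    # Hidden layer: two intermediate questions built from the sub-network answers.
    has_eyes = neuron([eye_left, eye_right], [6.0, 6.0], -9.0)
    has_lower_face = neuron([nose, mouth], [6.0, 6.0], -9.0)
    # Output layer: a face needs both the eyes and the lower face.
    return neuron([has_eyes, has_lower_face], [8.0, 8.0], -12.0)
```

With all four sub-network answers at 1.0 the score is close to 1, and it drops close to 0 when, for example, both eye detectors report 0.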

The output states whether the image contains a face or not. Each hidden layer answers questions that help deduce this output: the initial layers answer simple questions about the input pixels, and the deeper layers answer more complex questions. An architecture with several hidden layers like this is called a deep neural network. Setting up such a deep network by hand is not practical, so learning algorithms are required to learn the weights and biases automatically from the training data.
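As a minimal sketch of what learning automatically from training data means, the following Python code fits a single sigmoid neuron to a toy labeled data set by gradient descent. It is a deliberately tiny stand-in for a deep network: the data set, the names `train_neuron` and `predict`, the learning rate and the number of epochs are all assumptions made for illustration. Deep networks apply the same idea across many layers through backpropagation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Output of the neuron for one input vector."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, epochs=2000, learning_rate=0.5):
    """Fit one sigmoid neuron to (inputs, label) pairs by gradient descent."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            # With a cross-entropy loss, the gradient with respect to the
            # weighted sum is simply (output - label).
            error = predict(weights, bias, inputs) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


# Toy labeled data (hypothetical): the label is 1 only when both features are present.
TRAINING_DATA = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_neuron(TRAINING_DATA)
```

No weight is set by hand here: both weights and the bias start at zero and are adjusted only by comparing the neuron's output with the labels.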

Deep Convolutional Neural Network & Image Classification

Deep Learning has exceeded the performance of other ML algorithms in the image classification domain. Deep learning includes different models such as the convolutional neural network (CNN), recurrent neural network, deep belief network and long short-term memory (LSTM) network. The CNN has excelled at object recognition in image data. It scales with data and model size and can be trained through backpropagation. A CNN combined with an LSTM network offers improved automatic image processing. The CNN technique is powerful, though it has a constrained application programming interface (API) with a fixed number of computational steps. A CNN focuses on a single object at a time, so if an image contains more than one object, the CNN may not detect all of them. A preferred object may be specified through Region-based CNN (R-CNN), in which the CNN is focused on a single region at a time to highlight the existence of a single object in that region. The R-CNN regions are resized to equal sizes before the data is fed into the CNN for classification.
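The region step described above, cropping regions and resizing them to a common size, can be sketched as follows. This is a simplified illustration under stated assumptions, not the R-CNN implementation: the image is a toy grid of numbers, the boxes are hand-picked rather than proposed by an algorithm, nearest-neighbour sampling is only one possible way to resize, and the names `crop`, `resize_nearest` and `prepare_regions` are hypothetical.

```python
def crop(image, top, left, height, width):
    """Cut a rectangular region out of an image stored as a list of rows."""
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, out_height, out_width):
    """Resize with nearest-neighbour sampling so every region gets the same shape."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


def prepare_regions(image, boxes, size=(4, 4)):
    """Crop each (top, left, height, width) box and resize it to a common size."""
    return [resize_nearest(crop(image, *box), *size) for box in boxes]


# Toy 6x8 grayscale image whose pixel value encodes its position.
IMAGE = [[r * 8 + c for c in range(8)] for r in range(6)]
# Two hand-picked regions of different sizes (hypothetical proposals).
BOXES = [(0, 0, 2, 2), (1, 3, 4, 5)]
regions = prepare_regions(IMAGE, BOXES)
```

Both regions come out as 4x4 grids even though the original boxes were 2x2 and 4x5, so a classifier with a fixed input size could process either one.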

Image classification is the process of categorizing and labeling the pixels of an image on the basis of predetermined rules. An image can also be classified by its objects, which represent the scene components that distinguish it. Image classification is useful in the domains of medicine, security and education. As stated by (Deepan, 2020), object-based image classification can be further divided into three classes:

Handcrafted feature learning: It uses features such as shape, texture, color and spatial details.

Unsupervised feature learning: It offers higher image classification performance than the handcrafted feature learning method. It identifies low-dimensional features in an unsupervised way, and those features are then used to improve overall performance in a supervised setting with labeled data.

Deep feature learning
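To make the first of these classes concrete, here is a minimal sketch of one handcrafted color feature: a coarse per-channel histogram. The function name `color_features`, the choice of four bins and the sample patch are assumptions for illustration and are not taken from (Deepan, 2020).

```python
def color_features(pixels, bins=4):
    """Return a normalized per-channel histogram for (r, g, b) pixels in 0-255.

    The result has 3 * bins values: the red bins, then green, then blue.
    """
    histogram = [0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            bin_index = min(value * bins // 256, bins - 1)
            histogram[channel * bins + bin_index] += 1
    total = len(pixels)
    return [count / total for count in histogram]


# A small, strongly red patch (hypothetical sample data).
RED_PATCH = [(250, 10, 10)] * 4
features = color_features(RED_PATCH)
```

A person decides here which property of the image to measure. In deep feature learning the network works out useful features from the labeled data instead.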

Image classification has five phases:

Preparing a training data set from the available images
Training the convolutional neural network
Preparing the test data
Applying the CNN-generated model to the test data
Evaluating the classified images
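The five phases can be sketched end to end as follows. To keep the example short and runnable without any library, a nearest-centroid classifier stands in for the CNN. That substitution, the toy two-value "images", the class names and the function names `train_model`, `classify` and `accuracy` are all assumptions for illustration.

```python
def train_model(training_set):
    """Phase 2 (placeholder for CNN training): average the features of each class."""
    sums, counts = {}, {}
    for features, label in training_set:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in totals] for label, totals in sums.items()}


def classify(model, features):
    """Phase 4: assign the label whose class average is closest to the features."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: distance(model[label]))


def accuracy(model, test_set):
    """Phase 5: fraction of test images that were classified correctly."""
    correct = sum(1 for features, label in test_set if classify(model, features) == label)
    return correct / len(test_set)


# Phase 1: labeled training data (brightness of the top and bottom half of each image).
TRAIN = [([0.9, 0.1], "sky"), ([0.8, 0.2], "sky"),
         ([0.1, 0.9], "field"), ([0.2, 0.7], "field")]
# Phase 3: test data kept separate from the training data.
TEST = [([0.7, 0.3], "sky"), ([0.3, 0.8], "field")]

model = train_model(TRAIN)
test_accuracy = accuracy(model, TEST)
```

Keeping the test data separate from the training data is the important design choice: the evaluation then measures how the model handles images it has not seen before.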

CNNs are the most widely used DL architecture for vision and audio recognition. High classification accuracy is extremely useful in medicine. It is also useful for remote sensing images in surveillance, agriculture, geographic planning and several other applications. The data for such classification is acquired from satellites and aerial devices.

References

Brownlee, J., 2019, “What is Deep Learning?”. Deep Learning Mastery Blog. Last Updated: August 14, 2020.

Deepan, P., Sudha, L.R., 2020, “Object Classification of Remote Sensing Image Using Deep Convolutional Neural Network”. The Cognitive Approach in Cloud Computing and Internet of Things Technologies for Surveillance Tracking Systems, Science Direct.

Gavali, P., Banu, J. S., 2019, “Deep Convolutional Neural Network for Image Classification on CUDA Platform”. Deep Learning and Parallel Computing Environment for Bioengineering Systems, Science Direct.

Nielsen, M., 2019, “Neural Networks and Deep Learning”, Determination Press, http://neuralnetworksanddeeplearning.com/
