Feature extraction for classification problems

Feature extraction

Image features are primitive characteristics or distinguishing attributes of an image. They may be natural (directly perceptible to a person looking at the image) or artificial (produced by computation on, or manipulation of, the image). Feature extraction is the step that extracts specific features or information from an image. The extracted features are usually expressed as multidimensional vectors, called feature vectors. The overall objective of feature extraction is to keep the important information contained in the image while reducing the amount of data needed to represent it.

Many feature extraction algorithms based on color, texture, and shape have been proposed. Color is one of the most important features of an image; it can be represented using color moments, fuzzy color moments, color histograms, and so on. Methods for texture feature extraction include Gabor filters and Haar wavelet decomposition. Shape feature extraction techniques can be classified into two groups: contour-based and region-based methods. The former calculate shape features from the boundary of the shape, whereas the latter extract features from the entire region. A good shape representation should be invariant to translation, rotation, and scaling.
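To make the idea of a color feature vector concrete, here is a minimal sketch of two of the color descriptors mentioned above: a normalized RGB histogram and the first two color moments (mean and standard deviation) per channel. It assumes the image is already available as a flat list of (r, g, b) tuples with 8-bit values; the function names, the bin count, and the tiny sample image are illustrative assumptions, not part of any particular library.

```python
from statistics import fmean, pstdev

def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram with bins**3 entries.

    pixels: list of (r, g, b) tuples, each value in 0..255 (assumed format).
    """
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 8-bit channel value to a bin index in 0..bins-1.
        rb, gb, bb = (v * bins // 256 for v in (r, g, b))
        hist[rb * bins * bins + gb * bins + bb] += 1
    total = len(pixels)
    return [count / total for count in hist]

def color_moments(pixels):
    """Mean and standard deviation of each channel: a 6-dimensional vector."""
    features = []
    for channel in zip(*pixels):  # regroup pixels into R, G and B sequences
        features.append(fmean(channel))
        features.append(pstdev(channel))
    return features

# Hypothetical 2x2 image (two reddish and two bluish pixels), flattened.
image = [(255, 0, 0), (250, 5, 0), (0, 0, 255), (0, 10, 250)]
hist = color_histogram(image)   # 64 values that sum to 1
moments = color_moments(image)  # [mean_R, std_R, mean_G, std_G, mean_B, std_B]
```

Whatever the size of the image, the histogram always has 64 entries and the moment vector 6, which is the size reduction that feature extraction aims for. In practice an array library would normally replace the plain loops.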

Feature extraction in classification

In a classification problem, feature extraction is an important stage that plays a crucial role in the accuracy and speed of the recognition system.

A classification system has three main stages: preprocessing, feature extraction, and classification. Feature extraction is done after the preprocessing phase. It transforms the data from a space in which the classes are hard to separate into one in which they are easier to separate. Extracting good features before applying a classification method (SVM, artificial neural network, etc.) generally leads to better classification results.
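The three stages can be sketched as a tiny pipeline. This is only a toy illustration under stated assumptions: images are small grayscale grids given as nested lists, the two features (mean brightness and mean horizontal gradient) are chosen purely for the example, and a nearest-centroid rule stands in for a real classifier such as an SVM. The class names and centroid values are hypothetical.

```python
import math

def preprocess(image):
    """Stage 1: scale 8-bit pixel values to the range [0, 1]."""
    return [[value / 255.0 for value in row] for row in image]

def extract_features(image):
    """Stage 2: map the image to a 2-D feature vector
    (mean brightness, mean absolute horizontal gradient)."""
    pixels = [value for row in image for value in row]
    mean_brightness = sum(pixels) / len(pixels)
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    mean_gradient = sum(diffs) / len(diffs)
    return (mean_brightness, mean_gradient)

def classify(features, centroids):
    """Stage 3: return the label whose centroid is closest to the features."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical class centroids in feature space (assumed, not learned here).
centroids = {"flat": (0.5, 0.0), "striped": (0.5, 1.0)}

striped = [[0, 255, 0, 255], [0, 255, 0, 255]]
flat = [[128, 128, 128, 128], [128, 128, 128, 128]]

label_striped = classify(extract_features(preprocess(striped)), centroids)
label_flat = classify(extract_features(preprocess(flat)), centroids)
```

Both sample images have almost the same average brightness, so brightness alone could not tell them apart. The gradient feature separates them cleanly, which is the sense in which feature extraction moves the data into a space where the classes are easier to separate.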

Feature extraction in the image classification system

There is no formula for finding the best feature extraction method. Researchers have been trying to improve feature extraction for decades, looking for better ways to extract features that keep the important information and/or reduce the size of the input data.

Some popular feature extraction methods

Various methods have been proposed for feature extraction. In this post, three of the most popular are listed:

  • Scale-Invariant Feature Transform (SIFT)

SIFT is an algorithm for detecting and describing local features in images. These features are invariant to image translation, scaling, and rotation, and partially invariant to illumination changes and affine or 3D projection.

SIFT feature extraction
  • Speeded Up Robust Features (SURF)

SURF was first presented by Herbert Bay et al. at the 2006 European Conference on Computer Vision. It is partly inspired by the SIFT descriptor: SURF is similar to SIFT but faster.

  • Bag of Features

Bag of features is a popular visual descriptor used for visual data classification. The technique is borrowed from natural language processing, where it is known as the bag-of-words model. A bag of features is a vector of occurrence counts over a vocabulary of local image features (for example, SIFT or SURF descriptors).
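The counting step can be sketched in a few lines. The example below assumes that local descriptors have already been extracted and that a vocabulary of "visual words" is given; in practice the vocabulary is typically learned by clustering descriptors from many training images. The 2-D descriptors are hypothetical stand-ins chosen for readability (real SIFT descriptors have 128 dimensions), and the function name is illustrative.

```python
import math
from collections import Counter

def bag_of_features(descriptors, vocabulary):
    """Count, for each visual word, how many descriptors are nearest to it."""
    counts = Counter(
        # Index of the visual word closest to descriptor d (Euclidean distance).
        min(range(len(vocabulary)), key=lambda k: math.dist(d, vocabulary[k]))
        for d in descriptors
    )
    return [counts[k] for k in range(len(vocabulary))]

# Hypothetical vocabulary of three visual words and five local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8), (0.2, 0.1)]

histogram = bag_of_features(descriptors, vocabulary)
print(histogram)  # [2, 2, 1]
```

The resulting histogram has one entry per visual word regardless of how many descriptors the image produced, so images with different numbers of keypoints all map to feature vectors of the same length.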

Bag of features


In conclusion, each method has its own advantages and disadvantages, which make it more or less appropriate for a given classification problem. Moreover, using more features does not necessarily lead to better final performance.