Grey Level Co-occurrence Matrix

The Grey Level Co-occurrence Matrix (GLCM) has been described in the image processing literature under a number of names, including the Spatial Grey Level Dependence (SGLD) matrix [CH80]. As the name suggests, the GLCM is constructed from the image by estimating the pairwise statistics of pixel intensity. Each element (i,j) of the matrix represents an estimate of the probability that two pixels with a specified separation have grey levels i and j. The separation is usually specified by a displacement d and an angle θ. That is,

\[
P(i, j \mid d, \theta) =
\frac{\#\{\,(p_1, p_2) : p_2 = p_1 + (d\cos\theta,\ d\sin\theta),\ I(p_1) = i,\ I(p_2) = j\,\}}
     {\#\{\,(p_1, p_2) : p_2 = p_1 + (d\cos\theta,\ d\sin\theta)\,\}}
\]

P(d, θ) will be a square matrix of side equal to the number of grey levels in the image and will usually not be symmetric. Symmetry is often introduced by effectively adding the GLCM to its transpose and dividing every element by 2. This renders P(d, θ) and P(d, θ + 180°) identical and makes the GLCM unable to detect 180° rotations.
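
As an illustration, the following sketch (in Python with NumPy; the function name, quantisation and normalisation details are ours rather than part of the original implementation) constructs a symmetric GLCM for a given displacement and angle.

    import numpy as np

    def glcm(image, d=1, theta=0.0, levels=8):
        """Estimate a symmetric grey level co-occurrence matrix.

        image  : 2-D integer array of grey levels in the range [0, levels).
        d      : pixel displacement.
        theta  : angle in radians.
        levels : number of grey levels.
        """
        # Offset implied by the displacement d and angle theta.
        dy = int(round(d * np.sin(theta)))
        dx = int(round(d * np.cos(theta)))

        P = np.zeros((levels, levels), dtype=np.float64)
        rows, cols = image.shape
        for y in range(rows):
            for x in range(cols):
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < rows and 0 <= x2 < cols:
                    P[image[y, x], image[y2, x2]] += 1

        # Symmetrise (add the transpose) and normalise to probabilities.
        P = P + P.T
        return P / P.sum()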

In texture classification, individual elements of the GLCM are rarely used. Instead, features are derived from the matrix. A large number of textural features have been proposed, starting with the original fourteen described by Haralick et al [HSD73]; however, only some of these are in wide use. Weszka et al [WDR76] used four of Haralick et al's fourteen features (Energy, Entropy, Correlation, Contrast). Conners and Harlow [CH80] used five features (Energy, Entropy, Local Homogeneity, Inertia, Correlation). Conners, Trivedi and Harlow [CTH84] introduced two new features which address a deficiency in the Conners and Harlow set (Energy, Entropy, Inertia, Correlation, Cluster Shade, Cluster Prominence). The features listed above are defined as

\begin{eqnarray*}
\mathrm{Energy} & = & \sum_{i} \sum_{j} P(i,j)^2 \\
\mathrm{Entropy} & = & - \sum_{i} \sum_{j} P(i,j) \log P(i,j) \\
\mathrm{Correlation} & = & \frac{\sum_{i} \sum_{j} (i - \mu_x)(j - \mu_y)\, P(i,j)}{\sigma_x \sigma_y} \\
\mathrm{Inertia\ (Contrast)} & = & \sum_{i} \sum_{j} (i - j)^2 \, P(i,j) \\
\mathrm{Local\ Homogeneity} & = & \sum_{i} \sum_{j} \frac{P(i,j)}{1 + (i - j)^2} \\
\mathrm{Cluster\ Shade} & = & \sum_{i} \sum_{j} (i + j - \mu_x - \mu_y)^3 \, P(i,j) \\
\mathrm{Cluster\ Prominence} & = & \sum_{i} \sum_{j} (i + j - \mu_x - \mu_y)^4 \, P(i,j)
\end{eqnarray*}

where μ_x and σ_x are the mean and standard deviation of the row sums of P, and μ_y and σ_y are the mean and standard deviation of the column sums, respectively.
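
A minimal sketch of the feature calculations, assuming the definitions above and a GLCM P normalised to sum to one (again illustrative rather than the glcmClass code), might be:

    import numpy as np

    def glcm_features(P):
        """Compute common GLCM features from a normalised co-occurrence matrix P."""
        levels = P.shape[0]
        i, j = np.indices((levels, levels))

        # Means and standard deviations of the row and column sums.
        px, py = P.sum(axis=1), P.sum(axis=0)
        g = np.arange(levels)
        mu_x, mu_y = (g * px).sum(), (g * py).sum()
        sigma_x = np.sqrt(((g - mu_x) ** 2 * px).sum())
        sigma_y = np.sqrt(((g - mu_y) ** 2 * py).sum())

        eps = 1e-12  # guard against log(0) and division by zero
        return {
            "energy": (P ** 2).sum(),
            "entropy": -(P * np.log(P + eps)).sum(),
            "correlation": ((i - mu_x) * (j - mu_y) * P).sum() / (sigma_x * sigma_y + eps),
            "inertia": ((i - j) ** 2 * P).sum(),
            "local_homogeneity": (P / (1.0 + (i - j) ** 2)).sum(),
            "cluster_shade": ((i + j - mu_x - mu_y) ** 3 * P).sum(),
            "cluster_prominence": ((i + j - mu_x - mu_y) ** 4 * P).sum(),
        }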

For any choice of d and θ we obtain a separate GLCM, and the result is generally sensitive to the values of d and θ. The GLCM is commonly implemented with some degree of rotation invariance. This is usually achieved by combining the results for a subset of angles. If the GLCM is calculated with symmetry, then only angles up to 180° need be considered, and the four angles 0°, 45°, 90° and 135° are an effective choice. The results may be combined by averaging the GLCMs for the angles before calculating the features, or by averaging the features calculated for each GLCM. Separate feature sets are then obtainable for different values of d, irrespective of θ; however, values of d other than 1 are rarely used.
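
For example, a rotation-invariant feature set can be obtained by averaging the GLCMs over the four angles before extracting features (using the illustrative glcm and glcm_features helpers above; image here stands for an integer-quantised input array):

    import numpy as np

    # Average the GLCMs for 0, 45, 90 and 135 degrees before extracting features.
    angles = np.deg2rad([0, 45, 90, 135])
    P_mean = np.mean([glcm(image, d=1, theta=a, levels=8) for a in angles], axis=0)
    features = glcm_features(P_mean)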

This algorithm has been implemented in the program glcmClass.

Gabor Convolution Energies

The Gabor Energy method measures the similarity between neighbourhoods in an image and Gabor masks. Each Gabor mask consists of a Gaussian-windowed sinusoidal waveform. Masks can be generated for varying wavelength (λ), orientation (θ), phase shift (φ) and Gaussian window standard deviation (σ) using

\[
g_{\lambda,\theta,\phi}(x, y) =
\exp\!\left( -\frac{x^2 + y^2}{2\sigma^2} \right)
\cos\!\left( \frac{2\pi}{\lambda} \bigl( x\cos\theta + y\sin\theta \bigr) + \phi \right)
\]
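
A sketch of mask generation consistent with these parameters (the isotropic envelope, lack of normalisation and odd mask size are assumptions, since conventions vary between authors) is:

    import numpy as np

    def gabor_mask(wavelength, theta, phase, sigma, size):
        """Generate a Gaussian-windowed sinusoidal (Gabor) mask; size should be odd."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        # Rotate coordinates so the sinusoid varies along the direction theta.
        x_theta = x * np.cos(theta) + y * np.sin(theta)
        gaussian = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
        sinusoid = np.cos(2.0 * np.pi * x_theta / wavelength + phase)
        return gaussian * sinusoid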

The energy is computed at each pixel for each combination of wavelength and orientation; the energy is the sum, over the phases, of the squared convolution values. That is, if we let the image be I(x,y), then

\begin{equation}
E_{\lambda,\theta}(x, y) = \sum_{\phi} \Bigl[ \bigl( I * g_{\lambda,\theta,\phi} \bigr)(x, y) \Bigr]^2
\tag{1}
\end{equation}

The energy calculated using equation 1 for each combination of λ and θ may be used as a set of texture features [FS89].
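
A corresponding energy computation, assuming the illustrative gabor_mask helper above and a quadrature pair of phases (0 and π/2, which the text does not prescribe), might look like:

    import numpy as np
    from scipy.signal import convolve2d

    def gabor_energy(image, wavelength, theta, sigma, size=31):
        """Per-pixel Gabor energy: sum over phases of the squared convolutions."""
        energy = np.zeros(image.shape, dtype=np.float64)
        for phase in (0.0, np.pi / 2.0):  # assumed quadrature pair of phases
            mask = gabor_mask(wavelength, theta, phase, sigma, size)
            response = convolve2d(image, mask, mode="same", boundary="symm")
            energy += response ** 2
        return energy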

A Gabor mask is a sinusoidal waveform which is spatially localised by modulation with a Gaussian envelope. Mathematically, the family of Gabor convolutions is a spatially localised modification of a Fourier analysis. Fourier analyses describe waveforms; specifically, the frequency and orientation of waveforms. Most authors motivate the use of Gabor convolutions by relating them to Fourier analyses.

Another point of view regards the Gabor masks as texton detectors. Textons are spatially local patterns such as oriented lines, line ends, edges and blobs. According to the texton theory of human texture discrimination, preattentive human texture discrimination can be modeled by the first order density of textons and second order statistics of intensity.

So far, we have described the distinction between the Fourier and Texton approaches to Gabor convolutions at a theoretical level. At a practical level, the Fourier and Texton approaches are distinguished by the ratio between the wavelength of the sinusoid and the width of the Gaussian envelope in a Gabor mask. If the Gaussian envelope is large enough to contain several wavelengths, then the Gabor mask will be sensitive to the orientation of the waveform peaks, the width of the peaks and the interpeak distance. In brief, such a Wide mask will detect features appropriate to a Fourier analysis. However, if the Gaussian envelope is approximately one wavelength in width, the Gabor mask will be sensitive to the orientation and width of the peaks, but not the interpeak distance. In brief, such a Narrow mask will detect features appropriate to measuring texton density.
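
Purely as an illustration (the exact ratios of σ to λ are not prescribed above), a Narrow and a Wide mask could be generated with the gabor_mask helper as:

    # Narrow mask: envelope of roughly one wavelength.
    # Wide mask:   envelope spanning several wavelengths.
    narrow = gabor_mask(wavelength=8, theta=0.0, phase=0.0, sigma=4.0,  size=31)
    wide   = gabor_mask(wavelength=8, theta=0.0, phase=0.0, sigma=16.0, size=63)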

Figure 1 gives examples of both Wide and Narrow masks.

Figure 1: Gabor Convolution Masks for wavelengths of 2, 4 and 8 pixels with (a) Narrow envelope and (b) Wide envelope. 

This algorithm has been implemented in the program gaborClass.

Gaussian Markov Random Field

Gaussian Markov Random Field (GMRF) methods characterise the statistical relationship between a pixel and its neighbours [CC85]. The result is a stochastic model with a number of parameters equal to the size of the neighbourhood mask. The parameters are calculated using a least squares algorithm over every valid mask position in the image. A commonly used mask is the symmetric fourth-order mask shown in Figure 2.

Consider a zero-mean image I(s), where s = (i,j), and assume the pixel values are Gaussian. Let the neighbourhood be defined by the set N, where each element is a pixel location relative to the current pixel; for example, (0,1) indicates the image pixel I(i,j+1). Now, assume that the pixels in the image are related by

\begin{equation}
I(s) = \sum_{r \in N} \theta_r \, I(s + r) + e(s),
\tag{2}
\end{equation}

where e(s) is a zero-mean Gaussian noise term.

Using a symmetric mask and allowing 180° rotation invariance, the number of parameters can be halved, as the parameters are made common between I(s+r) and I(s-r), as in

\begin{equation}
I(s) = \sum_{r \in N^{+}} \theta_r \, \bigl[ I(s + r) + I(s - r) \bigr] + e(s),
\tag{3}
\end{equation}

where N^{+} contains one offset from each symmetric pair (r, -r) in N.

The model parameters are obtained by solving for θ in equation 2 or 3 using the least squares method detailed by Chellappa and Chatterjee [CC85].
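
A least squares sketch of the parameter estimation for the symmetric model of equation 3, with an illustrative neighbour set standing in for the fourth-order mask of Figure 2, might be:

    import numpy as np

    def gmrf_parameters(image, neighbours):
        """Least-squares estimate of symmetric GMRF parameters.

        neighbours : list of (dy, dx) offsets; each offset r is implicitly
                     paired with -r, so list only one of each symmetric pair.
        """
        img = image - image.mean()           # zero-mean image, as assumed above
        rows, cols = img.shape
        pad = max(max(abs(dy), abs(dx)) for dy, dx in neighbours)

        targets, regressors = [], []
        for y in range(pad, rows - pad):
            for x in range(pad, cols - pad):
                targets.append(img[y, x])
                regressors.append([img[y + dy, x + dx] + img[y - dy, x - dx]
                                   for dy, dx in neighbours])

        A, b = np.asarray(regressors), np.asarray(targets)
        theta, *_ = np.linalg.lstsq(A, b, rcond=None)
        return theta

    # Example neighbour set (first-order only, purely illustrative; the actual
    # program uses the fourth-order mask of Figure 2):
    # theta = gmrf_parameters(image, [(0, 1), (1, 0)])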

Figure 2: GMRF fourth-order mask. 

This algorithm has been implemented in the program markovClass.

Fractal Dimension

The use of Fractal Dimension (FD) for texture classification and segmentation has been proposed by a number of researchers [CSK93, KC89, Pen84, PNHA84]. The property of self-similarity implies that the FD of an image will be independent of scale. This phenomenon was observed by Pentland [Pen84].

Various methods exist to estimate the FD of an image: 2D generalisations of Mandelbrot's original method for coastlines [PNHA84], Fourier-transform based methods [Pen84], and variations on box counting [KC89, CSK93].

The principle of self-similarity may be stated as follows: if a bounded set A is composed of N_r non-overlapping copies of a set similar to A, but scaled down by a factor r, then A is self-similar. From this definition, the FD is given by

\[
FD = \frac{\log N_r}{\log (1/r)}
\]

The FD can be approximated by estimating N_r for various values of r and then determining the slope of the least-squares linear fit of log N_r versus log (1/r). The differential box-counting method outlined in Chaudhuri et al [CSK93] may be used to achieve this task.

Consider an M × M pixel image as a surface in (x,y,z) space, where (x,y) represents the pixel position and z the pixel intensity. We now partition the (x,y) space into a grid of cells of size s × s pixels. An estimate of the relative scale is r = s/M. At each grid position we stack cubes of size s, numbering each box sequentially from 1 up to the box containing the highest intensity in the image over the s × s area.

Denoting the boxes containing the minimum and maximum grey levels of the image in the s × s area at grid position (i,j) by k and l respectively, we define n_r(i,j) = l - k + 1. This is the differential variation of the box-counting method. N_r is then estimated by summing over the entire grid: N_r = Σ_(i,j) n_r(i,j).

The above procedure is repeated for a number of values of r (equivalently, s), and log N_r is plotted against log (1/r). The FD is then estimated by the gradient of the least-squares linear fit of these points.
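
A sketch of this differential box-counting estimate (the box sizes, the box height sG/M and the boundary handling are illustrative choices; the random shifts described below are omitted) is:

    import numpy as np

    def fractal_dimension(image, sizes=(2, 4, 8, 16)):
        """Differential box-counting estimate of the fractal dimension."""
        M = min(image.shape)
        G = image.max() - image.min() + 1           # grey level range
        log_inv_r, log_Nr = [], []

        for s in sizes:
            r = s / M
            box_height = s * G / M                  # box height for this scale
            Nr = 0
            for i in range(0, image.shape[0] - s + 1, s):
                for j in range(0, image.shape[1] - s + 1, s):
                    block = image[i:i + s, j:j + s]
                    k = int(block.min() // box_height)
                    l = int(block.max() // box_height)
                    Nr += l - k + 1                 # n_r(i, j) = l - k + 1
            log_inv_r.append(np.log(1.0 / r))
            log_Nr.append(np.log(Nr))

        # FD is the slope of the least-squares line through (log 1/r, log N_r).
        slope, _ = np.polyfit(log_inv_r, log_Nr, 1)
        return slope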

To compensate for "image regularity", two random shifts of the box columns are introduced. The first perturbs the grid position of the column, while the second introduces a random offset into the column height of less than one full box height.

Four features are extracted as per Chaudhuri et al [CSK93]: the first three are the FD calculated as above on appropriately modified versions of the image, while the fourth is based on multifractals, which are used for self-similar distributions exhibiting non-isotropic and inhomogeneous scaling properties. Using the notation as before, we introduce p_r(i,j) = n_r(i,j)/N_r. The multifractal FD, D(2), is then

\[
D(2) = \lim_{r \to 0} \frac{\log \sum_{i,j} p_r(i,j)^2}{\log r}
\]

A number of different values of r are used, and the linear regression of log Σ_(i,j) p_r(i,j)^2 versus log r yields an estimate of D(2).
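
A sketch of the D(2) estimate, assuming p_r(i,j) = n_r(i,j)/N_r as above and reusing the differential box-counting quantities, might be:

    import numpy as np

    def multifractal_d2(image, sizes=(2, 4, 8, 16)):
        """Estimate the multifractal dimension D(2) via differential box counting."""
        M = min(image.shape)
        G = image.max() - image.min() + 1
        log_r, log_sum_p2 = [], []

        for s in sizes:
            box_height = s * G / M
            n = []
            for i in range(0, image.shape[0] - s + 1, s):
                for j in range(0, image.shape[1] - s + 1, s):
                    block = image[i:i + s, j:j + s]
                    n.append(int(block.max() // box_height)
                             - int(block.min() // box_height) + 1)
            n = np.asarray(n, dtype=np.float64)
            p = n / n.sum()                         # p_r(i, j) = n_r(i, j) / N_r
            log_r.append(np.log(s / M))
            log_sum_p2.append(np.log((p ** 2).sum()))

        # D(2) is the slope of log sum p^2 versus log r.
        slope, _ = np.polyfit(log_r, log_sum_p2, 1)
        return slope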

All features are normalized to lie between 0.0 and 1.0.

This algorithm has been implemented in the program fractalClass.

References

CC85
R. Chellappa and S. Chatterjee. Classification of Textures using Gaussian Markov Random Fields. IEEE Transactions on Acoustics Speech and Signal Processing, 33:959-963, 1985.

CH80
R.W. Conners and C.A. Harlow. A Theoretical Comparison of Texture Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2:204-222, 1980.

CSK93
B.B. Chaudhuri, N. Sarkar, and P. Kundu. Improved Fractal Geometry Based Texture Segmentation Technique. IEE Proceedings, 140:233-241, 1993.

CTH84
R.W. Conners, M.M. Trivedi, and C.A. Harlow. Segmentation of a High-Resolution Urban Scene using Texture Operators. Computer Vision, Graphics and Image Processing, 25:273-310, 1984.

FS89
I. Fogel and D. Sagi. Gabor Filters as Texture Discriminator. Biological Cybernetics, 61:103-113, 1989.

HSD73
R.M. Haralick, K. Shanmugam, and I. Dinstein. Textural Features for Image Classification. IEEE Transactions on Systems Man and Cybernetics, 3(6):610-621, November 1973.

KC89
J.M. Keller and S. Chen. Texture Description and Segmentation through Fractal Geometry. Computer Vision, Graphics and Image Processing, 45:150-166, 1989.

Pen84
A.P. Pentland. Fractal-based Description of Natural Scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:661-672, 1984.

PNHA84
S. Peleg, J. Naor, R. Hartley, and D. Avnir. Multiple Resolution Texture Analysis and Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:518-523, 1984.

WDR76
J.S. Weszka, C.R. Dyer, and A. Rosenfeld. A Comparative Study of Texture Measures for Terrain Classification. IEEE Transactions on Systems Man and Cybernetics, 6:269-285, 1976.



Ian Burns
Fri Apr 11 13:43:52 EST 1997