Grey Level Co-occurrence Matrix

The Grey Level Co-occurrence Matrix (GLCM) has been described in the image processing literature under a number of names, including the Spatial Grey Level Dependence (SGLD) matrix [CH80]. As the name suggests, the GLCM is constructed from the image by estimating the pairwise statistics of pixel intensity. Each element (i,j) of the matrix represents an estimate of the probability that two pixels with a specified separation have grey levels i and j. The separation is usually specified by a displacement d and an angle θ. That is,

\[
P(i, j \mid d, \theta) =
\frac{\#\{\,(p_1, p_2) : p_2 = p_1 + (d\cos\theta,\ d\sin\theta),\ I(p_1) = i,\ I(p_2) = j\,\}}
     {\#\{\,(p_1, p_2) : p_2 = p_1 + (d\cos\theta,\ d\sin\theta)\,\}}
\]

P(d, θ) will be a square matrix of side equal to the number of grey levels in the image and will usually not be symmetric. Symmetry is often introduced by effectively adding the GLCM to its transpose and dividing every element by 2. This renders P(d, θ) and P(d, θ + 180°) identical and makes the GLCM unable to detect 180° rotations.
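
As an illustration, the following sketch (in Python with NumPy; the function name, quantisation and normalisation details are ours rather than part of the original implementation) constructs a symmetric GLCM for a given displacement and angle.

    import numpy as np

    def glcm(image, d=1, theta=0.0, levels=8):
        """Estimate a symmetric grey level co-occurrence matrix.

        image  : 2-D integer array of grey levels in the range [0, levels).
        d      : pixel displacement.
        theta  : angle in radians.
        levels : number of grey levels.
        """
        # Offset implied by the displacement d and angle theta.
        dy = int(round(d * np.sin(theta)))
        dx = int(round(d * np.cos(theta)))

        P = np.zeros((levels, levels), dtype=np.float64)
        rows, cols = image.shape
        for y in range(rows):
            for x in range(cols):
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < rows and 0 <= x2 < cols:
                    P[image[y, x], image[y2, x2]] += 1

        # Symmetrise (add the transpose) and normalise to probabilities.
        P = P + P.T
        return P / P.sum()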

In texture classification, individual elements of the GLCM are rarely used. Instead, features are derived from the matrix. A large number of textural features have been proposed, starting with the original fourteen described by Haralick et al [HSD73]; however, only some of these are in wide use. Weszka et al [WDR76] used four of Haralick et al's fourteen features (Energy, Entropy, Correlation, Contrast). Conners and Harlow [CH80] used five features (Energy, Entropy, Local Homogeneity, Inertia, Correlation). Conners, Trivedi and Harlow [CTH84] introduced two new features which address a deficiency in the Conners and Harlow set (Energy, Entropy, Inertia, Correlation, Cluster Shade, Cluster Prominence). The features listed above are defined as

\begin{eqnarray*}
\mathrm{Energy} & = & \sum_{i} \sum_{j} P(i,j)^2 \\
\mathrm{Entropy} & = & - \sum_{i} \sum_{j} P(i,j) \log P(i,j) \\
\mathrm{Correlation} & = & \frac{\sum_{i} \sum_{j} (i - \mu_x)(j - \mu_y)\, P(i,j)}{\sigma_x \sigma_y} \\
\mathrm{Inertia\ (Contrast)} & = & \sum_{i} \sum_{j} (i - j)^2 \, P(i,j) \\
\mathrm{Local\ Homogeneity} & = & \sum_{i} \sum_{j} \frac{P(i,j)}{1 + (i - j)^2} \\
\mathrm{Cluster\ Shade} & = & \sum_{i} \sum_{j} (i + j - \mu_x - \mu_y)^3 \, P(i,j) \\
\mathrm{Cluster\ Prominence} & = & \sum_{i} \sum_{j} (i + j - \mu_x - \mu_y)^4 \, P(i,j)
\end{eqnarray*}

where μ_x and σ_x are the mean and standard deviation of the row sums of P, and μ_y and σ_y are the mean and standard deviation of the column sums, respectively.
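
A minimal sketch of the feature calculations, assuming the definitions above and a GLCM P normalised to sum to one (again illustrative rather than the glcmClass code), might be:

    import numpy as np

    def glcm_features(P):
        """Compute common GLCM features from a normalised co-occurrence matrix P."""
        levels = P.shape[0]
        i, j = np.indices((levels, levels))

        # Means and standard deviations of the row and column sums.
        px, py = P.sum(axis=1), P.sum(axis=0)
        g = np.arange(levels)
        mu_x, mu_y = (g * px).sum(), (g * py).sum()
        sigma_x = np.sqrt(((g - mu_x) ** 2 * px).sum())
        sigma_y = np.sqrt(((g - mu_y) ** 2 * py).sum())

        eps = 1e-12  # guard against log(0) and division by zero
        return {
            "energy": (P ** 2).sum(),
            "entropy": -(P * np.log(P + eps)).sum(),
            "correlation": ((i - mu_x) * (j - mu_y) * P).sum() / (sigma_x * sigma_y + eps),
            "inertia": ((i - j) ** 2 * P).sum(),
            "local_homogeneity": (P / (1.0 + (i - j) ** 2)).sum(),
            "cluster_shade": ((i + j - mu_x - mu_y) ** 3 * P).sum(),
            "cluster_prominence": ((i + j - mu_x - mu_y) ** 4 * P).sum(),
        }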

For any choice of d and θ we obtain a separate GLCM, and the result is generally sensitive to the values of d and θ. The GLCM is commonly implemented with some degree of rotation invariance. This is usually achieved by combining the results for a subset of angles. If the GLCM is calculated with symmetry, then only angles up to 180° need be considered, and the four angles 0°, 45°, 90° and 135° are an effective choice. The results may be combined by averaging the GLCMs for the angles before calculating the features, or by averaging the features calculated for each GLCM. Separate feature sets are then obtainable for different values of d, irrespective of θ; however, values of d other than 1 are rarely used.
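
For example, a rotation-invariant feature set can be obtained by averaging the GLCMs over the four angles before extracting features (using the illustrative glcm and glcm_features helpers above; image here stands for an integer-quantised input array):

    import numpy as np

    # Average the GLCMs for 0, 45, 90 and 135 degrees before extracting features.
    angles = np.deg2rad([0, 45, 90, 135])
    P_mean = np.mean([glcm(image, d=1, theta=a, levels=8) for a in angles], axis=0)
    features = glcm_features(P_mean)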

This algorithm has been implemented in the program glcmClass.

Gabor Convolution Energies

The Gabor Energy method measures the similarity between neighbourhoods in an image and Gabor masks. Each Gabor mask consists of a Gaussian-windowed sinusoidal waveform. Masks can be generated for varying wavelength (λ), orientation (θ), phase shift (φ) and Gaussian window standard deviation (σ) using

\[
g_{\lambda,\theta,\phi}(x, y) =
\exp\!\left( -\frac{x^2 + y^2}{2\sigma^2} \right)
\cos\!\left( \frac{2\pi}{\lambda} \bigl( x\cos\theta + y\sin\theta \bigr) + \phi \right)
\]
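
A sketch of mask generation consistent with these parameters (the isotropic envelope, lack of normalisation and odd mask size are assumptions, since conventions vary between authors) is:

    import numpy as np

    def gabor_mask(wavelength, theta, phase, sigma, size):
        """Generate a Gaussian-windowed sinusoidal (Gabor) mask; size should be odd."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        # Rotate coordinates so the sinusoid varies along the direction theta.
        x_theta = x * np.cos(theta) + y * np.sin(theta)
        gaussian = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
        sinusoid = np.cos(2.0 * np.pi * x_theta / wavelength + phase)
        return gaussian * sinusoid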

The energy is computed at each pixel for each combination of wavelength and orientation; the energy is the sum, over the phases, of the squared convolution values. That is, if we let the image be I(x,y), then

\begin{equation}
E_{\lambda,\theta}(x, y) = \sum_{\phi} \Bigl[ \bigl( I * g_{\lambda,\theta,\phi} \bigr)(x, y) \Bigr]^2
\tag{1}
\end{equation}

The energy calculated using equation 1 for each combination of λ and θ may be used as a set of texture features [FS89].
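
A corresponding energy computation, assuming the illustrative gabor_mask helper above and a quadrature pair of phases (0 and π/2, which the text does not prescribe), might look like:

    import numpy as np
    from scipy.signal import convolve2d

    def gabor_energy(image, wavelength, theta, sigma, size=31):
        """Per-pixel Gabor energy: sum over phases of the squared convolutions."""
        energy = np.zeros(image.shape, dtype=np.float64)
        for phase in (0.0, np.pi / 2.0):  # assumed quadrature pair of phases
            mask = gabor_mask(wavelength, theta, phase, sigma, size)
            response = convolve2d(image, mask, mode="same", boundary="symm")
            energy += response ** 2
        return energy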

A Gabor mask is a sinusoidal waveform which is spatially localised by modulation with a Gaussian envelope. Mathematically, the family of Gabor convolutions is a spatially localised modification of a Fourier analysis. Fourier analyses describe waveforms; specifically, the frequency and orientation of waveforms. Most authors motivate the use of Gabor convolutions by relating them to Fourier analyses.

Another point of view regards the Gabor masks as texton detectors. Textons are spatially local patterns such as oriented lines, line ends, edges and blobs. According to the texton theory of human texture discrimination, preattentive human texture discrimination can be modeled by the first order density of textons and second order statistics of intensity.

So far, we have described the distinction between the Fourier and Texton approaches to Gabor convolutions at a theoretical level. At a practical level, the Fourier and Texton approaches are distinguished by the ratio between the wavelength of the sinusoid and the width of the Gaussian envelope in a Gabor mask. If the Gaussian envelope is large enough to contain several wavelengths, then the Gabor mask will be sensitive to the orientation of the waveform peaks, the width of the peaks and the interpeak distance. In brief, such a Wide mask will detect features appropriate to a Fourier analysis. However, if the Gaussian envelope is approximately one wavelength in width, the Gabor mask will be sensitive to the orientation and width of the peaks, but not the interpeak distance. In brief, such a Narrow mask will detect features appropriate to measuring texton density.
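
Purely as an illustration (the exact ratios of σ to λ are not prescribed above), a Narrow and a Wide mask could be generated with the gabor_mask helper as:

    # Narrow mask: envelope of roughly one wavelength.
    # Wide mask:   envelope spanning several wavelengths.
    narrow = gabor_mask(wavelength=8, theta=0.0, phase=0.0, sigma=4.0,  size=31)
    wide   = gabor_mask(wavelength=8, theta=0.0, phase=0.0, sigma=16.0, size=63)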

Figure 1 gives examples of both Wide and Narrow masks.

Figure 1: Gabor Convolution Masks for wavelengths of 2, 4 and 8 pixels with (a) Narrow envelope and (b) Wide envelope. 

This algorithm has been implemented in the program gaborClass.

Gaussian Markov Random Field

Gaussian Markov Random Field (GMRF) methods characterise the statistical relationship between a pixel and its neighbours [CC85]. The result is a stochastic model with a number of parameters equal to the size of the neighbourhood mask. The parameters are calculated using a least squares algorithm over every valid mask position in the image. A commonly used mask is the symmetric fourth-order mask shown in Figure 2.

Consider a zero-mean image I(s), where s = (i,j), and assume the pixel values are Gaussian. Let the neighbourhood be defined by the set N, where each element is a pixel location relative to the current pixel; for example, (0,1) indicates the image pixel I(i,j+1). Now, assume that the pixels in the image are related by

\begin{equation}
I(s) = \sum_{r \in N} \theta_r \, I(s + r) + e(s),
\tag{2}
\end{equation}

where e(s) is a zero-mean Gaussian noise term.

Using a symmetric mask and allowing 180° rotation invariance, the number of parameters can be halved, as the parameters are made common between I(s+r) and I(s-r), as in

\begin{equation}
I(s) = \sum_{r \in N^{+}} \theta_r \, \bigl[ I(s + r) + I(s - r) \bigr] + e(s),
\tag{3}
\end{equation}

where N^{+} contains one offset from each symmetric pair (r, -r) in N.

The model parameters are obtained by solving for θ in equation 2 or 3 using the least squares method detailed by Chellappa and Chatterjee [CC85].
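
A least squares sketch of the parameter estimation for the symmetric model of equation 3, with an illustrative neighbour set standing in for the fourth-order mask of Figure 2, might be:

    import numpy as np

    def gmrf_parameters(image, neighbours):
        """Least-squares estimate of symmetric GMRF parameters.

        neighbours : list of (dy, dx) offsets; each offset r is implicitly
                     paired with -r, so list only one of each symmetric pair.
        """
        img = image - image.mean()           # zero-mean image, as assumed above
        rows, cols = img.shape
        pad = max(max(abs(dy), abs(dx)) for dy, dx in neighbours)

        targets, regressors = [], []
        for y in range(pad, rows - pad):
            for x in range(pad, cols - pad):
                targets.append(img[y, x])
                regressors.append([img[y + dy, x + dx] + img[y - dy, x - dx]
                                   for dy, dx in neighbours])

        A, b = np.asarray(regressors), np.asarray(targets)
        theta, *_ = np.linalg.lstsq(A, b, rcond=None)
        return theta

    # Example neighbour set (first-order only, purely illustrative; the actual
    # program uses the fourth-order mask of Figure 2):
    # theta = gmrf_parameters(image, [(0, 1), (1, 0)])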

Figure 2: GMRF fourth-order mask. 

This algorithm has been implemented in the program markovClass.

Fractal Dimension

The use of Fractal Dimension (FD) for texture classification and segmentation has been proposed by a number of researchers [CSK93, KC89, Pen84, PNHA84]. The property of self-similarity implies that the FD of an image will be independent of scale. This phenomenon was observed by Pentland [Pen84].

Various methods exist to estimate the FD of an image: 2D generalisations of Mandelbrot's original method for coastlines [PNHA84], Fourier-transform based methods [Pen84], and variations on box counting [KC89, CSK93].

The principle of self-similarity may be stated as follows: if a bounded set A is composed of N_r non-overlapping copies of a set similar to A, but scaled down by a factor r, then A is self-similar. From this definition, the FD is given by

\[
FD = \frac{\log N_r}{\log (1/r)}
\]

The FD can be approximated by estimating N_r for various values of r and then determining the slope of the least-squares linear fit of log N_r versus log (1/r). The differential box-counting method outlined in Chaudhuri et al [CSK93] may be used to achieve this task.

Consider an M × M pixel image as a surface in (x,y,z) space, where (x,y) represents the pixel position and z the pixel intensity. We now partition the (x,y) space into a grid of cells of size s × s pixels. An estimate of the relative scale is r = s/M. At each grid position we stack cubes of size s, numbering each box sequentially from 1 up to the box containing the highest intensity in the image over the s × s area.

Denoting the boxes containing the minimum and maximum grey levels of the image in the s × s area at grid position (i,j) by k and l respectively, we define n_r(i,j) = l - k + 1. This is the differential variation of the box-counting method. N_r is then estimated by summing over the entire grid: N_r = Σ_(i,j) n_r(i,j).

The above procedure is repeated for a number of values of r (equivalently, s), and log N_r is plotted against log (1/r). The FD is then estimated by the gradient of the least-squares linear fit of these points.
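
A sketch of this differential box-counting estimate (the box sizes, the box height sG/M and the boundary handling are illustrative choices; the random shifts described below are omitted) is:

    import numpy as np

    def fractal_dimension(image, sizes=(2, 4, 8, 16)):
        """Differential box-counting estimate of the fractal dimension."""
        M = min(image.shape)
        G = image.max() - image.min() + 1           # grey level range
        log_inv_r, log_Nr = [], []

        for s in sizes:
            r = s / M
            box_height = s * G / M                  # box height for this scale
            Nr = 0
            for i in range(0, image.shape[0] - s + 1, s):
                for j in range(0, image.shape[1] - s + 1, s):
                    block = image[i:i + s, j:j + s]
                    k = int(block.min() // box_height)
                    l = int(block.max() // box_height)
                    Nr += l - k + 1                 # n_r(i, j) = l - k + 1
            log_inv_r.append(np.log(1.0 / r))
            log_Nr.append(np.log(Nr))

        # FD is the slope of the least-squares line through (log 1/r, log N_r).
        slope, _ = np.polyfit(log_inv_r, log_Nr, 1)
        return slope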

To compensate for "image regularity", two random shifts of the box columns are introduced. The first perturbs the grid position of the column, while the second introduces a random offset into the column height of less than one full box height.

Four features are extracted as per Chaudhuri et al [CSK93]: the first three are the FD calculated as above on appropriately modified versions of the image, while the fourth is based on multifractals, which are used for self-similar distributions exhibiting non-isotropic and inhomogeneous scaling properties. Using the notation as before, we introduce p_r(i,j) = n_r(i,j)/N_r. The multifractal FD, D(2), is then

\[
D(2) = \lim_{r \to 0} \frac{\log \sum_{i,j} p_r(i,j)^2}{\log r}
\]

A number of different values of r are used, and the linear regression of log Σ_(i,j) p_r(i,j)^2 versus log r yields an estimate of D(2).
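
A sketch of the D(2) estimate, assuming p_r(i,j) = n_r(i,j)/N_r as above and reusing the differential box-counting quantities, might be:

    import numpy as np

    def multifractal_d2(image, sizes=(2, 4, 8, 16)):
        """Estimate the multifractal dimension D(2) via differential box counting."""
        M = min(image.shape)
        G = image.max() - image.min() + 1
        log_r, log_sum_p2 = [], []

        for s in sizes:
            box_height = s * G / M
            n = []
            for i in range(0, image.shape[0] - s + 1, s):
                for j in range(0, image.shape[1] - s + 1, s):
                    block = image[i:i + s, j:j + s]
                    n.append(int(block.max() // box_height)
                             - int(block.min() // box_height) + 1)
            n = np.asarray(n, dtype=np.float64)
            p = n / n.sum()                         # p_r(i, j) = n_r(i, j) / N_r
            log_r.append(np.log(s / M))
            log_sum_p2.append(np.log((p ** 2).sum()))

        # D(2) is the slope of log sum p^2 versus log r.
        slope, _ = np.polyfit(log_r, log_sum_p2, 1)
        return slope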

All features are normalized to lie between 0.0 and 1.0.

This algorithm has been implemented in the program fractalClass.

References

CC85
R. Chellappa and S. Chatterjee. Classification of Textures using Gaussian Markov Random Fields. IEEE Transactions on Acoustics Speech and Signal Processing, 33:959-963, 1985.

CH80
R.W. Conners and C.A. Harlow. A Theoretical Comparison of Texture Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2:204-222, 1980.

CSK93
B.B. Chaudhuri, N. Sarkar, and P. Kundu. Improved Fractal Geometry Based Texture Segmentation Technique. IEE Proceedings, 140:233-241, 1993.

CTH84
R.W. Conners, M.M. Trivedi, and C.A. Harlow. Segmentation of a High-Resolution Urban Scene using Texture Operators. Computer Vision, Graphics and Image Processing, 25:273-310, 1984.

FS89
I. Fogel and D. Sagi. Gabor Filters as Texture Discriminator. Biological Cybernetics, 61:103-113, 1989.

HSD73
R.M. Haralick, K. Shanmugam, and I. Dinstein. Textural Features for Image Classification. IEEE Transactions on Systems Man and Cybernetics, 3(6):610-621, November 1973.

KC89
J.M. Keller and S. Chen. Texture Description and Segmentation through Fractal Geometry. Computer Vision, Graphics and Image Processing, 45:150-166, 1989.

Pen84
A.P. Pentland. Fractal-based Description of Natural Scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:661-672, 1984.

PNHA84
S. Peleg, J. Naor, R. Hartley, and D. Avnir. Multiple Resolution Texture Analysis and Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:518-523, 1984.

WDR76
J.S. Weszka, C.R. Dyer, and A. Rosenfeld. A Comparative Study of Texture Measures for Terrain Classification. IEEE Transactions on Systems Man and Cybernetics, 6:269-285, 1976.



Ian Burns
Fri Apr 11 13:43:52 EST 1997