## 6. Fourier Transform Model of Image Formation

**Discrete Fast Fourier Transform**

**Fourier analysis** has long been used in optics as an effective means of predicting what happens when light passes through various apertures, slits, and curved mirrors, and it has since become the foundational concept behind advanced techniques in image processing. The **Fourier transform** (FT) creates a representation of an image in the frequency domain, mapping it from dimension X to dimension 1/X [1]. The FT is one technique used to produce more compact representations of images (e.g. JPEG compression) by rounding its components to lower arithmetic precision and discarding weak signals. The inverse FT then provides an approximate reconstruction of the original image [2].

The Fourier transform of a 2-dimensional image *f(x,y)* is given by [3]:

$$F(u,v) = \iint f(x,y)\, e^{-i 2\pi (ux + vy)}\, dx\, dy$$

which in discrete form can be expressed as:

$$F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-i 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$
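As a quick sanity check (not part of the original activity), the discrete formula can be evaluated directly and compared against a library FFT. The sketch below uses Python/NumPy rather than Scilab:

```python
import numpy as np

# Direct evaluation of the 2-D DFT sum, exactly as written in the equation.
def dft2(f):
    M, N = f.shape
    x = np.arange(M).reshape(M, 1, 1, 1)
    y = np.arange(N).reshape(1, N, 1, 1)
    u = np.arange(M).reshape(1, 1, M, 1)
    v = np.arange(N).reshape(1, 1, 1, N)
    kernel = np.exp(-2j * np.pi * (u * x / M + v * y / N))
    return (f[:, :, None, None] * kernel).sum(axis=(0, 1))

rng = np.random.default_rng(0)
f = rng.random((8, 8))
assert np.allclose(dft2(f), np.fft.fft2(f))  # matches the fast transform
```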

Scilab contains built-in functions for calculating Fourier transforms of 1- and 2-dimensional arrays: fft() and fft2(). For images, the FFT works best on dimensions that are powers of 2 (e.g. 128×128). Calling these functions returns a result with the quadrants of the image interchanged; the fftshift() function rearranges the blocks back to their original positions (see Figure 1).

Figure 1. A. Original quadrants; B. orientation of quadrants after FFT.

Since the FFT yields an array of complex numbers, we view the intensities by taking absolute values with the abs() function.
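For readers without Scilab, a hypothetical Python/NumPy equivalent of the same steps (fft2, abs, fftshift) looks like this, using the same 129×129 circle as the Scilab example further below:

```python
import numpy as np

# Binary circle of radius 32 on a 129x129 grid, mirroring the Scilab example.
x = np.arange(-64, 65)
X, Y = np.meshgrid(x, x)
circle = (np.sqrt(X**2 + Y**2) <= 32).astype(float)

F = np.fft.fft2(circle)             # complex spectrum, quadrants interchanged
mag = np.abs(F)                     # intensities for viewing
mag_shifted = np.fft.fftshift(mag)  # zero frequency moved to the center

# The DC term (the sum of all pixels) sits at the corner before the shift
# and at the center afterwards.
assert mag[0, 0] == mag_shifted[64, 64]
assert abs(mag[0, 0] - circle.sum()) < 1e-6
```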

*Lens as Fourier Transformer*

We can see this basic concept by simulating the image formed by light entering a lens. For instance, we create a binary image of a circle: centered, colored white, and embedded in a black background.

```scilab
I = imread('circle.bmp');
Igray = im2gray(I);
```

Alternatively, we can create the shape within Scilab and also create a gray representation of it (Figure 2A).

```scilab
x = [-64:1:64];
[X, Y] = meshgrid(x);
r = sqrt(X.^2 + Y.^2);
circle = zeros(size(X, 1), size(X, 2));
circle(find(r <= 32)) = 1;
imshow(circle, []);
Igray = im2gray(circle);
```

We implement FFT on the image and then display the intensity of the result.

```scilab
FIgray = fft2(Igray); // FIgray is complex
imshow(abs(FIgray), []);
```

The brackets passed to imshow() scale the image so that the maximum value maps to white (1) and the minimum to black (0). Figure 2B shows the result of our transformation, although with the quadrants still interchanged.

Figure 2. A. A binary image of a circle; B. the FFT of the circle; C. its shifted FFT; D. its inverse FFT.

We apply fftshift() to have the quadrants back to their proper placements. This result is displayed in Figure 2C.

```scilab
imshow(fftshift(abs(FIgray)), []);
```

Now, if we want to do an inverse of the FFT we have performed, we do this by applying another FFT on our previous result.

```scilab
imshow(abs(fft2(fft2(Igray))));
```

This approximately reconstructs our original circle (Figure 2D). However, although the result appears similar to the input, it is actually an inverted (flipped) version of the original.
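The inversion remark can be verified numerically. A minimal NumPy sketch (assuming a random test image in place of the circle): applying the forward FFT twice returns the original image flipped about the origin and scaled by the number of pixels:

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((16, 16))

# fft2 applied twice, normalized by M*N
twice = np.fft.fft2(np.fft.fft2(img)).real / img.size

# index reversal modulo N: row/column 0 stays put, the rest reverse
idx = np.mod(-np.arange(16), 16)
flipped = img[idx][:, idx]
assert np.allclose(twice, flipped)
```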

We can verify this if we perform the same methods on a text character, as in Figure 3.

Figure 3. A. Binary image containing the character “A”; B. the FFT of the image; C. its shifted FFT; D. its inverse FFT.

---

**Convolution** is a function that measures the amount of overlap of one function *g* as it is shifted over another function *f*. For 2-dimensional functions, convolution is expressed by [4]:

$$h(x,y) = \iint f(x',y')\, g(x - x',\, y - y')\, dx'\, dy'$$

This equation describes the relationship between three signals: the input, the impulse response, and the output.
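The practical consequence is the convolution theorem: multiplying two spectra and inverting the product is equivalent to (circularly) convolving the signals in space. A small NumPy check, with random arrays standing in for real images:

```python
import numpy as np

# Direct circular convolution, term by term, for comparison.
def circular_conv2(f, g):
    M, N = f.shape
    out = np.zeros((M, N))
    for u in range(M):
        for v in range(N):
            out[u, v] = sum(f[x, y] * g[(u - x) % M, (v - y) % N]
                            for x in range(M) for y in range(N))
    return out

rng = np.random.default_rng(2)
f, g = rng.random((6, 6)), rng.random((6, 6))

# Convolution via the frequency domain: multiply spectra, invert.
via_fft = np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(g)).real
assert np.allclose(via_fft, circular_conv2(f, g))
```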

*Simulation of an imaging device*

Photographs are characterized by the transfer function (the transform of the impulse response) of the camera, which is limited by the finite radius of the lens. If the lens admits only a few light rays into the camera, the image is reconstructed at lower resolution.

To simulate this, we create an image composed of the text “VIP” covering about 50% of its total area, and a circle to serve as a lens aperture.

We take the FFT of the “VIP” image and multiply it with the fftshift of the circular aperture. We do not take the FFT of the circle, since the aperture already lies in the Fourier plane. Our Scilab code reads as follows:

```scilab
r = imread('circle.bmp');
a = imread('vip.bmp');
rgray = im2gray(r);
agray = im2gray(a);
Fr = fftshift(rgray);   // aperture is already in the Fourier plane
Fa = fft2(agray);
FRA = Fr.*Fa;
IRA = fft2(FRA);        // inverse FFT
FImage = abs(IRA);
imshow(FImage, []);
```

We can investigate the properties (and integrity) of this method by observing the resulting image as it passes through apertures of various sizes.

The results in Figure 4 show that the size of the aperture matters significantly: a larger aperture admits more light through the lens and therefore produces a better-resolved representation of the object.
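The aperture experiment can also be sketched in NumPy, under stated assumptions: a synthetic bright rectangle stands in for the “VIP” text, and the circular mask is generated in code rather than read from circle.bmp. Since the mask represents the Fourier plane, it multiplies the shifted spectrum directly:

```python
import numpy as np

n = 128
obj = np.zeros((n, n))
obj[40:90, 30:100] = 1.0                     # stand-in for the "VIP" image

Y, X = np.mgrid[-n//2:n//2, -n//2:n//2]
errors = []
for radius in (4, 16, 64):                   # increasing aperture sizes
    aperture = (X**2 + Y**2 <= radius**2).astype(float)
    spectrum = np.fft.fftshift(np.fft.fft2(obj))
    filtered = np.fft.ifftshift(spectrum * aperture)
    image = np.abs(np.fft.ifft2(filtered))
    errors.append(np.abs(image - obj).mean())

# a wider aperture passes more frequencies, so reconstruction error shrinks
assert errors[0] > errors[1] > errors[2]
```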

---

**Correlation** is the function for measuring the degree of similarity between two functions [1]. For 2-dimensional functions, it is expressed as:

$$h(x,y) = \iint f^{*}(x',y')\, g(x + x',\, y + y')\, dx'\, dy'$$

High values indicate high correlation, which occurs where the functions (or images) are nearly identical. This is why correlation is widely used in image processing for pattern recognition.
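The frequency-domain recipe for correlation mirrors the one for convolution, except that one spectrum is conjugated. A NumPy check against a direct circular cross-correlation (random arrays as stand-ins for images):

```python
import numpy as np

# Direct circular cross-correlation: slide g over f and sum the products.
def circular_corr2(f, g):
    M, N = f.shape
    out = np.zeros((M, N))
    for u in range(M):
        for v in range(N):
            out[u, v] = sum(f[(x + u) % M, (y + v) % N] * g[x, y]
                            for x in range(M) for y in range(N))
    return out

rng = np.random.default_rng(3)
f, g = rng.random((5, 5)), rng.random((5, 5))

# Correlation via the frequency domain: conjugate one spectrum first.
via_fft = np.fft.ifft2(np.fft.fft2(f) * np.conj(np.fft.fft2(g))).real
assert np.allclose(via_fft, circular_corr2(f, g))
```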

*Template matching / Pattern recognition*

We test the method of correlation by creating an image containing a short text and correlating it with another image containing a letter that occurs frequently in that text. We perform the correlation by multiplying the FT of the single-letter image with the conjugated FT of the text image, then taking the inverse FT of the product. The Scilab implementation reads:

```scilab
r = imread('a.bmp');
a = imread('spain.bmp');
rgray = im2gray(r);
agray = im2gray(a);
Fr = fft2(rgray);
Fa = conj(fft2(agray));
FRA = Fr.*Fa;
IRA = fft2(FRA);        // inverse FFT
FImage = abs(fftshift(IRA));
imshow(FImage, []);
```

The resulting image marks the areas where the two images are highly correlated (Figure 5C). However, the values are broadly scattered because of the large difference in character size, so we repeat the procedure with a character whose size is closer to that of the characters in the text.

Notice that the more similar the images, the higher the corresponding values in the resulting matrix (Figure 5E). The highest peaks (brightest pixels) land on the words containing the letter “A” (i.e. “rain”, “spain”, “mainly”, and “plain”).

*Edge detection using the convolution integral*

Edge detection in images can be treated as a form of template matching. For instance, we create a pattern within a 3×3 matrix whose values sum to zero, then convolve it with a sample image for edge detection via the imcorrcoef() function.

```scilab
pattern = [-1 -1 -1; 2 2 2; -1 -1 -1]; // horizontal pattern
a = imread('vip.bmp');
agray = im2gray(a);
img = imcorrcoef(agray, pattern);
imshow(img);
```

We also test other templates such as vertical, diagonal and spot.

```scilab
pattern = [-1 2 -1; -1 2 -1; -1 2 -1];    // vertical
//pattern = [-1 -1 2; -1 2 -1; 2 -1 -1];  // diagonal right
//pattern = [-1 -1 -1; -1 8 -1; -1 -1 -1]; // spot pattern
```
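imcorrcoef() belongs to Scilab's SIP toolbox; as a hypothetical stand-in, the sliding-window correlation it performs can be written out directly in NumPy. With the zero-sum horizontal template, a bright horizontal bar produces the strongest response while flat regions give zero:

```python
import numpy as np

# Valid-mode 2-D correlation: slide the kernel over the image, sum products.
def correlate2d_valid(img, kernel):
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]], dtype=float)

img = np.zeros((7, 7))
img[3, :] = 1.0                       # a single bright horizontal bar
response = correlate2d_valid(img, horizontal)

assert response.max() == 6.0          # peak where the bar matches the template
assert response[0, 0] == 0.0          # zero response over flat background
```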

From our results in Figure 6, the edges most pronounced in each output are those matching the template with which the text was correlated. The letters “I” and “P” show the fewest detected edges under the horizontal and diagonal templates. The best detection, however, came from the spot template, which does not favor a specific orientation and was thus able to trace approximately all the boundaries of the text image.

Special thanks to Gladys R. for sharing her thoughts and immediate realizations, and Dr. Soriano for the supplementary insights. I would rate myself 10 in this activity for comprehending the ideas and producing the desired outputs.

---

References:

[1] Soriano, 2010. Fourier Transform Model of Image Formation. Applied Physics 186.

[2] Wikipedia, 2010. Fourier Analysis.

[3] Wang, R., 2007. Two-Dimensional Fourier Transform.

[4] Weisstein, E., 2010. Convolution. From *MathWorld*–A Wolfram Web Resource.

[5] Peters, R. II, 2008. Image processing: Spatial convolution. Vanderbilt University School of Engineering.
