In this chapter, we will use a plane wave expansion of a monochromatic field to study light propagation through an optical system. The simplest of these systems is free space. It will soon become clear why this chapter is specifically titled 'Fourier' Optics.
5.1 Plane Waves and the Helmholtz Equation
As we have seen previously, an arbitrary function may often be constructed from a sum or integral of harmonic functions (plane waves) of different frequencies and complex amplitudes. This principle extends to multiple dimensions: an arbitrary spatial function $f(x, y)$, representing for instance a field distribution in a plane, may be constructed as a superposition of harmonic functions with different spatial frequencies ($k_x$, $k_y$) and complex amplitudes $\tilde{f}(k_x, k_y)$:

$$f(x, y) = \frac{1}{(2\pi)^2} \iint \tilde{f}(k_x, k_y)\, e^{i(k_x x + k_y y)}\, dk_x\, dk_y$$
In the figures above, we describe the optical wave with a scalar function $U(x, y, z)$, which could represent, for example, one Cartesian component of the electric field. The problem at hand is the following: We consider the transmission of an optical wave through an optical system, which is assumed to be linear. The input field $U(x, y, 0)$ is defined in an initial plane, say $z = 0$, and we wish to find the field $U(x, y, z)$ in an output plane, $z > 0$, after propagation through a distance $z$:
As described in more detail here, a linear system is characterised by its impulse response or, equivalently, by its response to a harmonic function (its transfer function).
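We begin the discussion of Fourier optics by recalling the wave equation for the electric field in a source-free, homogeneous, linear, isotropic, and non-dispersive medium (as derived in Chapter 1):

$$\nabla^2 \mathbf{E} - \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0$$

where $c$ is the speed of light in the medium.
For a monochromatic wave, we may write the electric field as $\mathbf{E}(\mathbf{r}, t) = \hat{\mathbf{e}}\, U(\mathbf{r})\, e^{-i\omega t}$, where $\hat{\mathbf{e}}$ is a constant unit vector indicating polarisation, $U(\mathbf{r})$ is a complex scalar function of space representing the complex amplitude (magnitude and phase) of the field component, and $e^{-i\omega t}$ is the time-harmonic factor (using the physics convention, whereas $e^{j\omega t}$ is often used for time evolution in engineering). Substituting this into the wave equation yields the Helmholtz equation for the spatial part $U(\mathbf{r})$:

$$\nabla^2 U(\mathbf{r}) + k^2 U(\mathbf{r}) = 0$$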
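with $k = \omega/c = 2\pi/\lambda$ being the wave number in the medium ($c$ and $\lambda$ are the speed of light and the wavelength in the medium). The time-averaged intensity is then obtained as $I(\mathbf{r}) = |U(\mathbf{r})|^2 / (2\eta)$, where $\eta$ is the impedance of the medium. For the previously discussed monochromatic plane waves, $I$ is constant (independent of $\mathbf{r}$). However, the Helmholtz equation also describes beams where the intensity is not uniform in space, such as Gaussian beams.
For such waves, we define the notion of a wavefront: a wavefront is a surface of constant phase. That is, if $U(\mathbf{r}) = |U(\mathbf{r})|\, e^{i\varphi(\mathbf{r})}$, a wavefront is a 2-D surface on which $\varphi(\mathbf{r})$ is constant (modulo $2\pi$).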
We can see that the wavefronts change curvature upon propagation in a Gaussian beam. For a plane wave, the wavefronts are planes. As we may expect, these wavefronts bend when passing through optical components, such as lenses:
5.2 Paraxial Approximation
In the paraxial approximation, we assume that the light rays (normals to the wavefronts) form only small angles with the principal axis of propagation (conventionally the $z$-axis). Therefore, the transverse components of the wavevector, $k_x$ and $k_y$, which encode deviations from propagation straight along the $z$-axis, are assumed to be much smaller than the total wave number $k$. This approximation is valid for beams with small divergence angles, as is often the case for the output of a laser.
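For a field $U(x, y, z)$ at a fixed plane $z$, we can define its two-dimensional spatial Fourier transform with respect to the transverse coordinates $x$ and $y$:

$$\tilde{U}(k_x, k_y; z) = \iint U(x, y, z)\, e^{-i(k_x x + k_y y)}\, dx\, dy$$

The inverse transformation is:

$$U(x, y, z) = \frac{1}{(2\pi)^2} \iint \tilde{U}(k_x, k_y; z)\, e^{i(k_x x + k_y y)}\, dk_x\, dk_y$$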
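Each component in this expansion corresponds to a plane wave whose wavevector is $\mathbf{k} = (k_x, k_y, k_z)$, where $k_z$ is determined by the Helmholtz equation: $k_x^2 + k_y^2 + k_z^2 = k^2$. Thus, $k_z = \sqrt{k^2 - k_x^2 - k_y^2}$.
The paraxial approximation can be formally stated as $k_x^2 + k_y^2 \ll k^2$.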
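Applying the 2D Fourier transform (with respect to $x, y$) to the Helmholtz equation $\nabla^2 U + k^2 U = 0$, where $\nabla^2 = \partial_x^2 + \partial_y^2 + \partial_z^2$, turns each transverse derivative into a multiplication: $\mathcal{F}\{\partial^2 U / \partial x^2\} = -k_x^2\, \tilde{U}$. Similarly for $y$.
So, we obtain an ordinary differential equation for $\tilde{U}(k_x, k_y; z)$ with respect to $z$:

$$\frac{\partial^2 \tilde{U}}{\partial z^2} + \left(k^2 - k_x^2 - k_y^2\right) \tilde{U} = 0$$

Recognising $k_z^2 = k^2 - k_x^2 - k_y^2$, this is $\partial^2 \tilde{U} / \partial z^2 + k_z^2\, \tilde{U} = 0$.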
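The solution for forward propagation in $+z$ is:

$$\tilde{U}(k_x, k_y; z) = \tilde{U}(k_x, k_y; 0)\, e^{i k_z z}$$

We define the transfer function of free space propagation over distance $z$ as:

$$H(k_x, k_y; z) = \exp\left(i z \sqrt{k^2 - k_x^2 - k_y^2}\right)$$

Thus, $\tilde{U}(k_x, k_y; z) = H(k_x, k_y; z)\, \tilde{U}(k_x, k_y; 0)$. Note that $k_z$ depends on $k_x$ and $k_y$, so $H$ is not a function of an independent variable $k_z$ but only of $k_x$ and $k_y$. This approach is valid for propagation in any homogeneous isotropic medium.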
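Knowing $U(x, y, 0)$ in an initial plane allows us to find its profile in any other plane at a distance $z$:

$$U(x, y, z) = \mathcal{F}^{-1}\left\{ H(k_x, k_y; z)\, \mathcal{F}\{U(x, y, 0)\} \right\}$$

The procedure is:
Calculate the 2D spatial Fourier transform $\tilde{U}(k_x, k_y; 0)$ of the input field $U(x, y, 0)$.
Multiply by the transfer function $H(k_x, k_y; z)$ to obtain $\tilde{U}(k_x, k_y; z)$.
Apply the inverse 2D spatial Fourier transform to $\tilde{U}(k_x, k_y; z)$ to obtain $U(x, y, z)$.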
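The three steps above map directly onto a discrete Fourier transform. The following is a minimal sketch, assuming a field sampled on a uniform grid with square pixels and using NumPy's FFT (whose forward kernel $e^{-i(k_x x + k_y y)}$ matches the physics convention $e^{-i\omega t}$ used in this chapter). The function name `angular_spectrum_propagate` and its parameters are illustrative choices, not part of the original notes.

```python
import numpy as np

def angular_spectrum_propagate(u0, wavelength, dx, z):
    """Propagate a sampled field u0 (2-D complex array, pixel size dx) over a distance z."""
    k = 2 * np.pi / wavelength
    ny, nx = u0.shape
    # Angular spatial frequencies k_x, k_y in the ordering used by the FFT.
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    KX, KY = np.meshgrid(kx, ky)
    # k_z = sqrt(k^2 - k_x^2 - k_y^2); the complex square root makes
    # evanescent components (k_x^2 + k_y^2 > k^2) decay instead of oscillate.
    kz = np.sqrt((k**2 - KX**2 - KY**2).astype(complex))
    H = np.exp(1j * kz * z)                 # transfer function of free space
    spectrum = np.fft.fft2(u0)              # step 1: Fourier transform
    return np.fft.ifft2(spectrum * H)       # steps 2 and 3: multiply, invert
```

Because the FFT assumes a periodic field, the grid should be wide enough that the propagated field does not reach the edges; otherwise it wraps around.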
It is important to remember that we are working with linear systems: harmonic components (spatial frequencies $k_x, k_y$) are not created or destroyed during propagation in a linear homogeneous medium; only their relative phases are modified by $H(k_x, k_y; z)$.
5.3 Fresnel Approximation
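In the expression for the transfer function $H(k_x, k_y; z)$, the term $\sqrt{k^2 - k_x^2 - k_y^2}$ makes analytical inverse Fourier transformation difficult. In the paraxial approximation ($k_x^2 + k_y^2 \ll k^2$), the angles $\theta_x \approx k_x / k$ and $\theta_y \approx k_y / k$ of the constituent plane waves with respect to the $z$-axis are small. We can then expand the phase of the transfer function, $k_z z$:

$$k_z z = z \sqrt{k^2 - k_x^2 - k_y^2} = k z \sqrt{1 - \theta^2}, \qquad \theta^2 = \theta_x^2 + \theta_y^2 = \frac{k_x^2 + k_y^2}{k^2}$$

Using the Taylor expansion for small $\theta$:

$$\sqrt{1 - \theta^2} = 1 - \frac{\theta^2}{2} - \frac{\theta^4}{8} - \dots$$

The Fresnel approximation (or paraxial wave equation approximation) consists of keeping terms only up to the first order in $\theta^2$ within the square root, which means keeping terms quadratic in $k_x, k_y$:

$$k_z \approx k - \frac{k_x^2 + k_y^2}{2k}$$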
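In real space, this amounts to approximating the spherical wavefronts emitted by each point of the input plane with paraboloidal ones. The simplified transfer function becomes:

$$H(k_x, k_y; z) \approx e^{ikz} \exp\left[-\frac{i z \left(k_x^2 + k_y^2\right)}{2k}\right]$$

We can write $H = H_0 \exp\left[-i z (k_x^2 + k_y^2)/(2k)\right]$ with $H_0 = e^{ikz}$, which is the phase accumulated by a plane wave propagating along $z$.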
The Fresnel approximation implies that we are observing the field at a distance $z$ that is large compared to the transverse extent of the source/aperture, but not so large that the wavefronts become essentially planar over the observation region (which leads to Fraunhofer diffraction).
Considering the source of the wave to be spherical, the Fresnel approximation approximates these spherical wavefronts with parabolas:
The validity of the Fresnel approximation requires that the next term in the Taylor expansion of the phase, $z (k_x^2 + k_y^2)^2 / (8 k^3)$, must be much less than $\pi$. This is a stricter condition than just $k_x^2 + k_y^2 \ll k^2$. It can be related to the Fresnel number $N_F = a^2 / (\lambda z)$, where $a$ is a characteristic transverse dimension (of aperture or beam).
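To get a feeling for the validity condition, one can compare the exact phase $z\sqrt{k^2 - k_\perp^2}$ with its Fresnel approximation for a given transverse wavenumber $k_\perp = \sqrt{k_x^2 + k_y^2}$. The sketch below is only illustrative: the function names and the numerical values (1 µm wavelength, 1 mm distance, $k_\perp = 0.05\,k$) are assumptions chosen for the example.

```python
import numpy as np

def fresnel_number(a, wavelength, z):
    """Fresnel number N_F = a^2 / (lambda z) for transverse size a at distance z."""
    return a**2 / (wavelength * z)

def neglected_phase(k_perp, wavelength, z):
    """First neglected term of the phase k_z z, i.e. z k_perp^4 / (8 k^3), in radians."""
    k = 2 * np.pi / wavelength
    return z * k_perp**4 / (8 * k**3)

def phase_error(k_perp, wavelength, z):
    """Exact phase k_z z minus its Fresnel approximation (propagating waves only)."""
    k = 2 * np.pi / wavelength
    exact = z * np.sqrt(k**2 - k_perp**2)
    fresnel = z * (k - k_perp**2 / (2 * k))
    return exact - fresnel

# Illustrative values (assumptions, not from the text).
wavelength, z = 1e-6, 1e-3
k_perp = 0.05 * 2 * np.pi / wavelength
print("neglected term [rad]:", neglected_phase(k_perp, wavelength, z))
print("actual error   [rad]:", abs(phase_error(k_perp, wavelength, z)))
```

For small angles the two printed numbers should nearly coincide, which indicates that the quartic term dominates the error of the Fresnel approximation.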
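Next, we find the impulse response $h(x, y; z)$ for Fresnel propagation, which is the field at $(x, y, z)$ due to a point source $U(x, y, 0) = \delta(x)\,\delta(y)$.
The 2D Fourier transform of $\delta(x)\,\delta(y)$ is $1$.
Therefore, $\tilde{h}(k_x, k_y; z) = H(k_x, k_y; z)$.
We take the inverse Fourier transform:

$$h(x, y; z) = \frac{e^{ikz}}{(2\pi)^2} \iint \exp\left[-\frac{i z \left(k_x^2 + k_y^2\right)}{2k}\right] e^{i(k_x x + k_y y)}\, dk_x\, dk_y$$

We find that the impulse response function is given by:

$$h(x, y; z) = \frac{e^{ikz}}{i \lambda z} \exp\left[\frac{i k \left(x^2 + y^2\right)}{2z}\right]$$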
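This impulse response is useful because for a linear system, the output for an arbitrary input is given by the convolution:

$$U(x, y, z) = \iint U(x', y', 0)\, h(x - x', y - y'; z)\, dx'\, dy'$$

Substituting $h(x, y; z) = \frac{e^{ikz}}{i\lambda z} \exp\left[\frac{ik(x^2 + y^2)}{2z}\right]$:

$$U(x, y, z) = \frac{e^{ikz}}{i \lambda z} \iint U(x', y', 0) \exp\left\{\frac{i k \left[(x - x')^2 + (y - y')^2\right]}{2z}\right\} dx'\, dy'$$

This is the Fresnel diffraction integral.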
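The general steps to find the electric field $U(x, y, z)$ for a given input $U(x, y, 0)$ are:
Find the 2D spatial Fourier transform $\tilde{U}(k_x, k_y; 0)$ of $U(x, y, 0)$.
Multiply by the appropriate transfer function $H(k_x, k_y; z)$ (exact or Fresnel) to obtain $\tilde{U}(k_x, k_y; z)$.
Apply the inverse 2D spatial Fourier transform to $\tilde{U}(k_x, k_y; z)$ to obtain $U(x, y, z)$.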
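As a check on these steps with the Fresnel transfer function $e^{ikz}\exp[-iz(k_x^2 + k_y^2)/(2k)]$, one can propagate a Gaussian beam numerically and compare its width with the well-known paraxial result $w(z) = w_0\sqrt{1 + (z/z_R)^2}$, $z_R = \pi w_0^2/\lambda$. The following is a sketch under stated assumptions: the grid size, pixel size, waist and wavelength are arbitrary illustrative values, and the helper names are hypothetical.

```python
import numpy as np

def fresnel_propagate(u0, wavelength, dx, z):
    """Propagate u0 over z using the Fresnel (paraxial) transfer function."""
    k = 2 * np.pi / wavelength
    ny, nx = u0.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    KX, KY = np.meshgrid(kx, ky)
    H = np.exp(1j * k * z) * np.exp(-1j * z * (KX**2 + KY**2) / (2 * k))
    return np.fft.ifft2(np.fft.fft2(u0) * H)

def beam_radius(u, X):
    """1/e^2 intensity radius of a centred Gaussian: w = 2 sqrt(<x^2>)."""
    intensity = np.abs(u)**2
    return 2 * np.sqrt(np.sum(intensity * X**2) / np.sum(intensity))

# Illustrative parameters (assumptions): 1 um wavelength, 100 um waist.
wavelength, w0 = 1e-6, 100e-6
n, dx = 256, 10e-6
x = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(x, x)
u0 = np.exp(-(X**2 + Y**2) / w0**2)        # Gaussian beam at its waist
z_R = np.pi * w0**2 / wavelength           # Rayleigh range
u1 = fresnel_propagate(u0, wavelength, dx, z_R)
print("w(z_R) / w0 =", beam_radius(u1, X) / w0)
```

At one Rayleigh range the printed ratio should be close to $\sqrt{2}$, provided the grid is wide enough to contain the expanded beam.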
5.4 The Fraunhofer Limit: Far Field
The Fraunhofer approximation, or far-field diffraction, is a limit of the Fresnel approximation valid at sufficiently large distances $z$ from an aperture or object of characteristic transverse size $a$. It requires the Fresnel conditions to be met, plus an even stronger condition.
The condition for Fraunhofer diffraction is often expressed using the Fresnel number $N_F = a^2 / (\lambda z)$. The Fraunhofer regime applies when $N_F \ll 1$, which means $z \gg a^2 / \lambda$.
If $b$ is the size of the observation region (detector), then we also typically require $b^2 / (\lambda z) \ll 1$.
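The crucial simplification in the Fraunhofer limit arises from approximating the quadratic phase term in the Fresnel integral, $\exp\left\{ik\left[(x - x')^2 + (y - y')^2\right]/(2z)\right\}$, where $(x', y')$ are the coordinates in the input plane.
We expand $(x - x')^2 + (y - y')^2 = (x^2 + y^2) - 2(x x' + y y') + (x'^2 + y'^2)$. The term $-2(x x' + y y')$ is kept under the integral. The term $x^2 + y^2$ is taken outside. The Fraunhofer condition allows $\exp\left[ik(x'^2 + y'^2)/(2z)\right] \approx 1$.
The field in the Fraunhofer regime becomes:

$$U(x, y, z) \approx \frac{e^{ikz}}{i \lambda z} \exp\left[\frac{i k \left(x^2 + y^2\right)}{2z}\right] \iint U(x', y', 0) \exp\left[-\frac{i k \left(x x' + y y'\right)}{z}\right] dx'\, dy'$$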
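Recognising $k_x = kx/z$ and $k_y = ky/z$ as spatial frequencies (proportional to the observation angles $\theta_x \approx x/z$ and $\theta_y \approx y/z$), the integral is the 2D Fourier transform of the input field evaluated at these spatial frequencies:

$$U(x, y, z) \approx \frac{e^{ikz}}{i \lambda z} \exp\left[\frac{i k \left(x^2 + y^2\right)}{2z}\right] \tilde{U}\left(\frac{kx}{z}, \frac{ky}{z}; 0\right)$$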
The Fraunhofer approximation essentially states that in the far field, the observed complex amplitude is proportional to the Fourier transform of the aperture distribution, multiplied by a phase factor. Each plane wave component travels in the direction set by its $(k_x, k_y)$ values, so after a sufficiently long distance the components separate, mapping spatial frequencies in the object plane to positions in the observation plane.
5.5 Diffraction Patterns - Amplitude Modulation
When an optical wave passes through an aperture or is otherwise spatially modulated in amplitude and/or phase, and then propagates some distance in free space, the resulting intensity distribution is called a diffraction pattern. From the discussion above, it should be clear that simply expecting the intensity pattern to be a geometric shadow of the aperture is an oversimplification, valid only in the limit of ray optics where the wave nature of light is ignored.
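Consider an aperture described by an aperture function $p(x, y)$ in the input plane $z = 0$:

$$p(x, y) = \begin{cases} 1, & (x, y) \text{ inside the aperture} \\ 0, & \text{otherwise} \end{cases}$$

If an incident wave $U_{\text{inc}}(x, y)$ illuminates this aperture, the field immediately after the aperture is $U(x, y, 0) = U_{\text{inc}}(x, y)\, p(x, y)$. We now know, in principle, how to obtain the field at some distance $z$.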
5.5.1 Rectangular Aperture
Consider a rectangular aperture of width $D_x$ and height $D_y$, centered at the origin. Assume it is illuminated by a normally incident plane wave of constant amplitude $U_0$ (so $U(x, y, 0) = U_0$ inside the aperture and $0$ outside). We apply the Fraunhofer approximation to find the far-field pattern at a distance $z$.
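The 2D Fourier transform of $U(x, y, 0)$ is:

$$\tilde{U}(k_x, k_y; 0) = U_0\, D_x D_y\, \mathrm{sinc}\left(\frac{k_x D_x}{2}\right) \mathrm{sinc}\left(\frac{k_y D_y}{2}\right)$$

where $\mathrm{sinc}(u) = \sin(u)/u$. The far-field intensity is $I(x, y) \propto |\tilde{U}(k_x, k_y; 0)|^2$. If $(k_x, k_y)$ is related to $(x, y)$ through $k_x = kx/z$ and $k_y = ky/z$:

$$I(x, y) = I_0\, \mathrm{sinc}^2\left(\frac{k_x D_x}{2}\right) \mathrm{sinc}^2\left(\frac{k_y D_y}{2}\right)$$

Here $I_0$ is the intensity at the centre of the pattern. Then, we arrive at:

$$I(x, y) = I_0\, \mathrm{sinc}^2\left(\frac{\pi D_x x}{\lambda z}\right) \mathrm{sinc}^2\left(\frac{\pi D_y y}{\lambda z}\right)$$

This result is expected, as the Fourier transform of a rectangular function (top-hat) is a sinc function. The intensity pattern is shown next: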
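The pattern can also be evaluated numerically. The sketch below assumes the unnormalised definition $\mathrm{sinc}(u) = \sin(u)/u$ for the formula $I = I_0\,\mathrm{sinc}^2(\pi D_x x/\lambda z)\,\mathrm{sinc}^2(\pi D_y y/\lambda z)$; note that NumPy's `np.sinc(t)` is the normalised version $\sin(\pi t)/(\pi t)$, so the factor $\pi$ is dropped from its argument. The function name and the example numbers are illustrative assumptions.

```python
import numpy as np

def rect_far_field_intensity(x, y, Dx, Dy, wavelength, z, I0=1.0):
    """Fraunhofer intensity of a Dx-by-Dy rectangular aperture at screen point (x, y)."""
    # np.sinc(t) = sin(pi t) / (pi t), hence no explicit pi in the arguments.
    sx = np.sinc(Dx * x / (wavelength * z))
    sy = np.sinc(Dy * y / (wavelength * z))
    return I0 * sx**2 * sy**2

# Illustrative values (assumptions): 0.2 mm x 0.1 mm slit, 633 nm light, screen at 2 m.
Dx, Dy, wavelength, z = 0.2e-3, 0.1e-3, 633e-9, 2.0
first_zero_x = wavelength * z / Dx          # first dark fringe along x
print("first zero along x [mm]:", first_zero_x * 1e3)
print("intensity there:", rect_far_field_intensity(first_zero_x, 0.0, Dx, Dy, wavelength, z))
```

The narrower aperture dimension produces the wider fringe spacing, as expected from the inverse relation between aperture size and spatial-frequency content.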
5.5.2 Circular Aperture
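Consider a circular aperture of diameter $D$, illuminated by a plane wave of amplitude $U_0$. The 2D Fourier transform of a circular aperture (circ function) is related to a Bessel function of the first kind, $J_1$:

$$\tilde{U}(k_x, k_y; 0) = U_0\, \frac{\pi D^2}{4}\, \frac{2 J_1\left(k_\rho D / 2\right)}{k_\rho D / 2}$$

where $k_\rho = \sqrt{k_x^2 + k_y^2}$. The far-field intensity pattern at radius $\rho = \sqrt{x^2 + y^2}$ on the screen is:

$$I(\rho) = I_0 \left[\frac{2 J_1\left(\pi D \rho / (\lambda z)\right)}{\pi D \rho / (\lambda z)}\right]^2$$

This is the characteristic Airy disk pattern shown in the next figure.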
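A short numerical sketch of the Airy pattern $I(\rho) = I_0\,[2J_1(v)/v]^2$ with $v = \pi D\rho/(\lambda z)$ follows. To keep it dependent on NumPy only, $J_1$ is evaluated here from Bessel's integral representation $J_1(v) = \frac{1}{\pi}\int_0^\pi \cos(\tau - v\sin\tau)\,d\tau$ with a midpoint rule; this is a swapped-in convenience, and a library routine for $J_1$ would serve equally well. All names are illustrative assumptions.

```python
import numpy as np

def bessel_j1(v, m=200):
    """J_1(v) from Bessel's integral, midpoint rule with m nodes (adequate for |v| well below m)."""
    v = np.asarray(v, dtype=float)
    tau = (np.arange(m) + 0.5) * np.pi / m
    return np.mean(np.cos(tau - v[..., None] * np.sin(tau)), axis=-1)

def airy_intensity(rho, D, wavelength, z, I0=1.0):
    """Fraunhofer intensity of a circular aperture of diameter D at screen radius rho."""
    v = np.pi * D * np.asarray(rho, dtype=float) / (wavelength * z)
    safe = np.where(v == 0, 1.0, v)                  # avoid 0/0 on the axis
    amplitude = np.where(v == 0, 1.0, 2 * bessel_j1(safe) / safe)
    return I0 * amplitude**2

# The first dark ring lies at v = 3.8317..., i.e. rho = 1.22 * wavelength * z / D.
D, wavelength, z = 1e-3, 633e-9, 1.0                  # illustrative values (assumptions)
rho = np.linspace(0.0, 2e-3, 5)
print(airy_intensity(rho, D, wavelength, z))
```

The radius of the first dark ring, $\rho \approx 1.22\,\lambda z/D$, follows from the first zero of $J_1$.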
5.6 Fourier Optics with a Lens
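A thin lens introduces a quadratic phase transformation to an incident wavefront. For a lens with focal length $f$, its phase transfer function is $t_{\text{lens}}(x, y) = \exp\left[-ik(x^2 + y^2)/(2f)\right]$ (for a focusing lens, assuming it's thin and located at $z = 0$).
If an object $U_o(x, y)$ is placed at the front focal plane ($z = -f$) of a lens, the field at the back focal plane ($z = f$) is proportional to the Fourier transform of $U_o$:

$$U_f(x, y) = \frac{e^{2ikf}}{i \lambda f}\, \tilde{U}_o\left(\frac{kx}{f}, \frac{ky}{f}\right)$$

More generally, if an object $U_o(x, y)$ is placed immediately before a lens, the field at its back focal plane is proportional to the Fourier transform of $U_o$, up to a quadratic phase factor:

$$U_f(x, y) = \frac{e^{ikf}}{i \lambda f} \exp\left[\frac{i k \left(x^2 + y^2\right)}{2f}\right] \iint U_o(x', y') \exp\left[-\frac{i k \left(x x' + y y'\right)}{f}\right] dx'\, dy'$$

The integral is $\tilde{U}_o(kx/f, ky/f)$, where $\tilde{U}_o$ is the FT of $U_o$ with the appropriate FT kernel sign. This demonstrates the Fourier transforming property of a lens.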
A common configuration is a 2f system, where an object is placed at distance $f$ before a lens, and the output (which is the Fourier transform of the object) is observed at distance $f$ after the lens. If another identical lens is placed at $3f$ from the object plane, it performs a second Fourier transform (equivalent to an inverse transform up to a coordinate inversion), forming an inverted image of the original object at $4f$ from the first object plane. This is a 4f system:
In the Fourier plane (at $2f$ from the object, between the two lenses), one can place an amplitude or phase mask $p(x_F, y_F)$ to filter out or modify specific spatial frequency components of the object. Here $(x_F, y_F)$ are coordinates in the Fourier plane, related to spatial frequencies by $k_x = k x_F / f$ and $k_y = k y_F / f$. The transfer function of such a spatial filtering system is effectively $H(k_x, k_y) = p(f k_x / k, f k_y / k)$.
Consider the image example: A standard circular aperture in the Fourier plane acts as a low-pass filter, blurring the image by removing high spatial frequencies (sharp details). An opaque stop in the centre acts as a high-pass filter, enhancing edges and removing large-scale variations, making the man's skin appear dark while highlighting hair.
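The filtering action of a 4f system can be imitated with two discrete Fourier transforms and a mask in between. The following is a minimal sketch, assuming an idealised system in which the image inversion and all constant prefactors are ignored; the function names and the mask radius in pixels are hypothetical choices for illustration.

```python
import numpy as np

def filter_4f(obj, mask):
    """Apply a Fourier-plane mask (zero frequency at the array centre) to an object field."""
    spectrum = np.fft.fftshift(np.fft.fft2(obj))        # field in the Fourier plane
    return np.fft.ifft2(np.fft.ifftshift(spectrum * mask))

def circular_mask(n, radius):
    """Binary n-by-n mask that transmits spatial frequencies within `radius` pixels of DC."""
    idx = np.arange(n) - n // 2
    X, Y = np.meshgrid(idx, idx)
    return (X**2 + Y**2 <= radius**2).astype(float)

# Illustrative object (assumption): a bright square on a dark background.
n = 64
obj = np.zeros((n, n))
obj[24:40, 24:40] = 1.0
low_pass = circular_mask(n, 6)          # circular aperture: keeps low frequencies
high_pass = 1.0 - low_pass              # central opaque stop: keeps high frequencies
blurred = np.abs(filter_4f(obj, low_pass))**2
edges = np.abs(filter_4f(obj, high_pass))**2
print(blurred.max(), edges.max())
```

Because the system is linear, the low-pass and high-pass output fields add up to the original object field, while the recorded intensities do not.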
A Fresnel zone plate is another optical element that can focus light, but it operates based on diffraction rather than refraction.
Its transmission function $t(r)$, with $r = \sqrt{x^2 + y^2}$, consists of concentric transparent and opaque zones:

$$t(r) = \begin{cases} 1, & \cos\left(\dfrac{\pi r^2}{\lambda f}\right) > 0 \\ 0, & \text{otherwise} \end{cases}$$

for a binary zone plate designed for focal length $f$.
The spacing of these Fresnel zones is such that light diffracted from the transparent zones interferes constructively at the desired focal point. Zone plates are inherently chromatic, focusing different wavelengths to different focal points.
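A binary zone plate of this kind is easy to generate numerically. The sketch below assumes the common convention in which a zone is transparent wherever $\cos(\pi r^2/\lambda f) > 0$; the function name and the numerical values are illustrative assumptions.

```python
import numpy as np

def zone_plate_transmission(r, wavelength, f):
    """Binary transmission (1 = transparent, 0 = opaque) at radius r for design focal length f."""
    r = np.asarray(r, dtype=float)
    return (np.cos(np.pi * r**2 / (wavelength * f)) > 0).astype(float)

# Illustrative design (assumption): 500 nm light, 10 cm focal length.
wavelength, f = 500e-9, 0.1
# Zone boundaries lie where the cosine changes sign: r_m = sqrt((m + 1/2) * wavelength * f).
boundaries = np.sqrt((np.arange(5) + 0.5) * wavelength * f)
print("first zone boundaries [mm]:", boundaries * 1e3)
print(zone_plate_transmission(np.array([0.0, 0.2e-3, 0.3e-3]), wavelength, f))
```

Since the boundaries scale with $\sqrt{\lambda f}$, the same plate used at a different wavelength focuses at a different distance, which is the chromatic behaviour mentioned above.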
5.7 Holography
Holograms are recordings that encode the full optical wave from an object, including both its amplitude and phase information. In principle, if we could create a transparency equal to the complex field from an object, illuminating this transparency with a plane wave would reconstruct the object wave. However, optical detectors are sensitive only to intensity ($I \propto |U|^2$), not directly to phase. Holography overcomes this by interfering a reference wave $U_r$ with the object wave $U_o$.
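If these two waves overlap at $z = 0$, the intensity pattern recorded on a film (hologram) is:

$$I = |U_r + U_o|^2 = |U_r|^2 + |U_o|^2 + U_r^* U_o + U_r U_o^*$$

Assuming the film's amplitude transmittance $t$ after development is proportional to $I$:

$$t \propto I_r + I_o + U_r^* U_o + U_r U_o^*$$

where $I_r = |U_r|^2$ and $I_o = |U_o|^2$.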
When this hologram is illuminated by the original reference wave $U_r$, the transmitted field is:

$$U = t\, U_r \propto I_r U_r + I_o U_r + I_r U_o + U_r^2 U_o^*$$

If the reference wave is a uniform plane wave with amplitude $\sqrt{I_r}$ (so $U_r = \sqrt{I_r}$ in the plane $z = 0$), the terms are:
$I_r U_r$: A scaled version of the reference wave (DC term).
$I_o U_r$: The reference wave modulated by the object intensity (distorted image).
$I_r U_o$: A wave proportional to the original object wave $U_o$. This term reconstructs the virtual image of the object.
$U_r^2 U_o^* = I_r U_o^*$: A wave related to the conjugate of the object wave. This term reconstructs the real image (often called the twin image or conjugate image).
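The decomposition into four terms can be verified numerically for arbitrary sampled waves. The sketch below assumes a transmittance exactly equal to the recorded intensity (proportionality constant set to 1), a uniform reference $U_r = \sqrt{I_r}$, and a made-up object wave with random phase; none of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
I_r = 4.0
U_r = np.sqrt(I_r) * np.ones(n, dtype=complex)            # uniform reference wave at z = 0
U_o = 0.3 * np.exp(1j * 2 * np.pi * rng.random(n))        # hypothetical object wave

t = np.abs(U_r + U_o)**2          # recorded intensity, taken as the transmittance
U = t * U_r                       # field behind the hologram under reference illumination

terms = [
    I_r * U_r,                    # scaled reference wave (DC term)
    np.abs(U_o)**2 * U_r,         # reference modulated by the object intensity
    I_r * U_o,                    # reconstructed object wave (virtual image)
    U_r**2 * np.conj(U_o),        # conjugate wave (real or twin image)
]
print("max deviation:", np.max(np.abs(U - sum(terms))))
```

The deviation is at the level of rounding error, confirming that the transmitted field is exactly the sum of the four contributions.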
To separate these four waves spatially, a common technique is off-axis holography, where the reference wave and object wave are incident on the recording film at a significant angle to each other:
This angular separation ensures that upon reconstruction, the four terms propagate in different directions, allowing the desired term to be viewed without overlap from the others.
Holography generally requires light sources with high temporal and spatial coherence (such as lasers) for both recording and reconstruction. Variations like volume holography (where interference fringes are recorded throughout the depth of a thick medium) and rainbow holography (which allows viewing in white light) address some of these limitations.
Generally, we differentiate two types of hologram based on recording geometry:
Reflection hologram: The reference and object beams approach the recording material from opposite sides. Interference fringes are typically formed in planes nearly parallel to the surface. Reconstruction is usually done by illuminating the hologram from the same side as the original reference beam was incident.
Transmission hologram: The reference and object beams approach the film from the same side. Interference fringes are generally formed nearly perpendicular to the surface. Reconstruction is done by illuminating with the reference beam, and the reconstructed object wave is viewed by looking through the hologram.
An example of a transmission hologram setup is shown in the next figure (source).
In ordinary photography, only the intensity distribution is recorded, so all phase information is lost, and thus three-dimensional reconstruction is not possible.
5.8 Paraxial Ray Optics
Often, for analysing simple optical systems, a full wave-optical (Fourier) treatment is not necessary, and the simpler ray approximation (geometric optics) suffices. This is especially true when effects of diffraction can be neglected, for instance, when all apertures and beam sizes are much larger than the wavelength of light. We define a ray as the local normal to a wavefront. We will work within the paraxial approximation, meaning all rays form small angles with respect to the optical axis (conventionally the $z$-axis).
A single ray at a transverse plane can be described by a 2D vector, commonly its radial distance $r$ from the axis and its angle $\theta$ with respect to that axis (or the slope $r' = dr/dz$). We will assume systems with cylindrical symmetry around the $z$-axis for this ray description. For example, a thin converging lens with focal length $f$ transforms an incident ray $(r_1, \theta_1)$ to an output ray $(r_2, \theta_2)$ according to:

$$r_2 = r_1, \qquad \theta_2 = \theta_1 - \frac{r_1}{f}$$

where we have used the small angle approximation $\tan\theta \approx \theta$.
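This transformation can be expressed using a matrix, called the ABCD matrix or ray-transfer matrix, which relates the output ray vector to the input ray vector:

$$\begin{pmatrix} r_2 \\ \theta_2 \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} r_1 \\ \theta_1 \end{pmatrix}$$

For the thin lens described above, the ray-transfer matrix is:

$$M_{\text{lens}} = \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix}$$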
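Similarly, for propagation through a homogeneous medium of length $d$, a ray $(r_1, \theta_1)$ becomes $(r_2, \theta_2)$ where $r_2 = r_1 + d\,\theta_1$ and $\theta_2 = \theta_1$. The ray-transfer matrix is:

$$M_{\text{free}} = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}$$

For a planar interface between homogeneous media with refractive indices $n_1$ and $n_2$ (ray incident from $n_1$ to $n_2$ at near-normal incidence), the paraxial form of Snell's law, $n_1 \theta_1 = n_2 \theta_2$, gives:

$$M_{\text{interface}} = \begin{pmatrix} 1 & 0 \\ 0 & n_1/n_2 \end{pmatrix}$$

The effect of a spherical mirror with concave radius of curvature $R$ (light incident from left, vertex at origin, centre of curvature at $z = -R$) is equivalent to a lens with $f = R/2$:

$$M_{\text{mirror}} = \begin{pmatrix} 1 & 0 \\ -2/R & 1 \end{pmatrix}$$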
The advantage of this matrix formalism is that the overall ray-transfer matrix for a cascade of optical elements is found by multiplying the individual matrices in the correct order: with the ray written as a column vector to the right of the matrix, the matrix of the first element encountered by the ray stands rightmost and that of the last element stands leftmost. For example, consider propagation through a medium of length $d_1$, followed by a lens of focal length $f$, then another propagation through length $d_2$:
The matrix relating the output ray to the input ray is:

$$M = \begin{pmatrix} 1 & d_2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} \begin{pmatrix} 1 & d_1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 - d_2/f & d_1 + d_2 - d_1 d_2/f \\ -1/f & 1 - d_1/f \end{pmatrix}$$
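The cascade rule is straightforward to try out numerically. The sketch below builds the matrices for free propagation and a thin lens and multiplies them in the order the ray meets the elements; the helper names (`free_space`, `thin_lens`, `system_matrix`) and the numbers are illustrative assumptions.

```python
import numpy as np

def free_space(d):
    """Ray-transfer matrix for propagation over a length d in a homogeneous medium."""
    return np.array([[1.0, d], [0.0, 1.0]])

def thin_lens(f):
    """Ray-transfer matrix of a thin lens with focal length f."""
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

def system_matrix(*elements):
    """Combine matrices given in the order the ray encounters them."""
    M = np.eye(2)
    for element in elements:
        M = element @ M          # each later element multiplies from the left
    return M

# Illustrative example (assumption): f = 10 cm lens, object and image at 2f.
f, d1, d2 = 0.1, 0.2, 0.2
M = system_matrix(free_space(d1), thin_lens(f), free_space(d2))
print(M)
ray_in = np.array([1e-3, 0.01])  # r = 1 mm, theta = 10 mrad
print(M @ ray_in)
```

For $d_1 = d_2 = 2f$ the element $B$ vanishes, so the output position no longer depends on the input angle: this is the imaging condition, here with magnification $A = -1$.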