Thinking back to approximation 4 in Chapter 1, we assumed that the polarisation and magnetisation depend linearly on the instantaneous values of the electric and magnetic fields. However, as we noted, this is only true for slow changes in the fields with time and cannot be entirely true for rapidly oscillating fields. In other words, we assumed the system (material) to have no memory. This assumption implies an absence of dispersion, as we will see soon. While it seems obvious that the polarisation and magnetisation cannot depend on future field values (causality), real materials do possess some limited memory of past fields:
Intuitively, consider that the electric field creates a polarisation by inducing an oscillation of the bound electrons in the atoms of the medium, collectively producing the polarisation density. A time delay between cause ($\mathbf{E}$) and effect ($\mathbf{P}$) arises from the finite response time of the atoms; they cannot react infinitely fast to sudden changes in the applied fields.
With our prior assumption of an instantaneous response, we were able to drop the integral in the general response function. This is no longer the case for dispersive media. For a linear, causal system, the response is a convolution in the time domain:
$$\mathbf{P}(\mathbf{r},t) = \varepsilon_0 \int_{-\infty}^{\infty} \chi_e(t-t')\,\mathbf{E}(\mathbf{r},t')\,dt', \qquad \mathbf{M}(\mathbf{r},t) = \int_{-\infty}^{\infty} \chi_m(t-t')\,\mathbf{H}(\mathbf{r},t')\,dt'$$
Here, $\chi_e(t)$ and $\chi_m(t)$ are the time-dependent electric and magnetic susceptibility response functions, respectively. Causality requires $\chi_e(t) = 0$ and $\chi_m(t) = 0$ for $t < 0$. The presence of these convolution integrals signifies that the material has memory.
A significant advantage of Fourier transformation is that convolution in the time domain becomes simple multiplication in the frequency domain. The above equations are equivalent to:
$$\mathbf{P}(\mathbf{r},\omega) = \varepsilon_0\,\chi_e(\omega)\,\mathbf{E}(\mathbf{r},\omega), \qquad \mathbf{M}(\mathbf{r},\omega) = \chi_m(\omega)\,\mathbf{H}(\mathbf{r},\omega)$$
The frequency-dependent susceptibilities $\chi_e(\omega)$ and $\chi_m(\omega)$ are generally complex quantities. Their form is analogous to the transfer function in linear systems theory: the response ($\mathbf{P}$ or $\mathbf{M}$) is the product of the input signal spectrum ($\mathbf{E}$ or $\mathbf{H}$) and the system's transfer function (the susceptibility). We can now properly define dispersion: Dispersion means that the material response function (susceptibility, and thus permittivity or permeability) depends on frequency.
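To understand the meaning of a complex susceptibility, consider applying a real electric field $E(t) = E_0\cos(\omega_0 t)$ to a medium. The Fourier transform of this field (using the convention $f(\omega) = \int f(t)\,e^{i\omega t}\,dt$) is:
$$E(\omega) = \pi E_0\left[\delta(\omega-\omega_0) + \delta(\omega+\omega_0)\right]$$
With the frequency-domain relation $P(\omega) = \varepsilon_0\,\chi_e(\omega)\,E(\omega)$, we obtain the polarisation spectrum:
$$P(\omega) = \pi\varepsilon_0 E_0\left[\chi_e(\omega_0)\,\delta(\omega-\omega_0) + \chi_e(-\omega_0)\,\delta(\omega+\omega_0)\right]$$
Transforming back to the time domain gives the polarisation as a function of time:
$$P(t) = \frac{\varepsilon_0 E_0}{2}\left[\chi_e(\omega_0)\,e^{-i\omega_0 t} + \chi_e(-\omega_0)\,e^{i\omega_0 t}\right]$$
Since the impulse response $\chi_e(t)$ must be real for a physical system responding to a real field $E(t)$ with a real polarisation $P(t)$, its Fourier transform must satisfy the conjugate symmetry property $\chi_e(-\omega) = \chi_e^*(\omega)$. Thus,
$$P(t) = \varepsilon_0 E_0\,\mathrm{Re}\left[\chi_e(\omega_0)\,e^{-i\omega_0 t}\right] = \varepsilon_0 E_0\,|\chi_e(\omega_0)|\cos(\omega_0 t - \varphi), \qquad \varphi = \arg\chi_e(\omega_0)$$
Therefore, $|\chi_e(\omega_0)|$ relates the amplitude of the polarisation density response to the applied electric field amplitude (as $P_0 = \varepsilon_0|\chi_e(\omega_0)|E_0$), while the phase $\varphi$ of the complex $\chi_e(\omega_0)$ introduces a phase shift between the polarisation $P(t)$ and the driving field $E(t)$. This is shown in the next figure: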
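Similarly, for the magnetisation response to an applied magnetic field $H(t) = H_0\cos(\omega_0 t)$:
$$M(t) = H_0\,|\chi_m(\omega_0)|\cos(\omega_0 t - \varphi_m), \qquad \varphi_m = \arg\chi_m(\omega_0)$$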
This is shown in the next figure:
We will later see that the imaginary part of the susceptibility, $\chi''(\omega) = \mathrm{Im}\,\chi_e(\omega)$, is related to absorption or gain of electromagnetic energy in the medium.
2.1 Origin of Dispersion
2.1.1 Drude-Lorentz model
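Let us start with a simple classical model, where we treat an electron in an atom (or a bound charge carrier) as being connected by a spring to a much heavier, effectively immobile ion core. We assume the electron mass is $m$. The equation of motion for the displacement $\mathbf{x}(t)$ of the electron from its equilibrium position, driven by an external electric field $\mathbf{E}(t)$ and subject to damping, is that of a simple damped harmonic oscillator:
$$m\ddot{\mathbf{x}} + m\gamma\dot{\mathbf{x}} + m\omega_0^2\mathbf{x} = -e\mathbf{E}(t)$$
Here, $\omega_0$ is the natural resonance frequency of the oscillator, and $\gamma$ is a damping coefficient representing energy loss mechanisms. The driving force is $-e\mathbf{E}$, where $-e$ is the charge of the electron and $\mathbf{E}$ is the local electric field experienced by the electron.
The induced dipole moment for one such oscillator is $\mathbf{p} = -e\mathbf{x}$. If there are $N$ such oscillators per unit volume, the macroscopic polarisation density is $\mathbf{P} = N\mathbf{p} = -Ne\mathbf{x}$. Substituting $\mathbf{x} = -\mathbf{P}/(Ne)$, the equation of motion for $\mathbf{P}$ becomes:
$$\ddot{\mathbf{P}} + \gamma\dot{\mathbf{P}} + \omega_0^2\mathbf{P} = \frac{Ne^2}{m}\mathbf{E}(t)$$
Next, we Fourier transform this equation (assuming $\mathbf{P}(t) \to \mathbf{P}(\omega)$ and $\mathbf{E}(t) \to \mathbf{E}(\omega)$ with $\partial/\partial t \to -i\omega$ for fields like $e^{-i\omega t}$):
$$\left(\omega_0^2 - \omega^2 - i\gamma\omega\right)\mathbf{P}(\omega) = \frac{Ne^2}{m}\mathbf{E}(\omega)$$
This allows us to solve for $\mathbf{P}(\omega)$ in terms of $\mathbf{E}(\omega)$ and thus find the electric susceptibility $\chi_e(\omega)$:
$$\chi_e(\omega) = \frac{\mathbf{P}(\omega)}{\varepsilon_0\mathbf{E}(\omega)} = \chi_0\,\frac{\omega_0^2}{\omega_0^2 - \omega^2 - i\gamma\omega}$$
where the static susceptibility $\chi_0 = \chi_e(0)$ is defined as:
$$\chi_0 = \frac{Ne^2}{\varepsilon_0 m\,\omega_0^2}$$
The real and imaginary parts of $\chi_e(\omega) = \chi'(\omega) + i\chi''(\omega)$ are:
$$\chi'(\omega) = \chi_0\,\frac{\omega_0^2\left(\omega_0^2-\omega^2\right)}{\left(\omega_0^2-\omega^2\right)^2 + \gamma^2\omega^2}, \qquad \chi''(\omega) = \chi_0\,\frac{\omega_0^2\,\gamma\omega}{\left(\omega_0^2-\omega^2\right)^2 + \gamma^2\omega^2}$$
Both functions are plotted in the next figure, assuming weak damping ($\gamma \ll \omega_0$):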
For high frequencies ($\omega \gg \omega_0$), the real part of the susceptibility $\chi'$ becomes small and negative, approaching zero, while $\chi''$ also approaches zero. The polarisation is out of phase with the electric field by nearly $\pi$ if $\gamma \ll \omega$.
At resonance ($\omega = \omega_0$), the real part vanishes, $\chi'(\omega_0) = 0$, and the susceptibility is purely imaginary: $\chi_e(\omega_0) = i\chi_0\,\omega_0/\gamma$ for the $e^{-i\omega t}$ time dependence used here (the opposite sign convention, $e^{+i\omega t}$, gives $-i\chi_0\,\omega_0/\gamma$). The polarisation is phase-shifted by $\pi/2$ relative to the electric field. The peak magnitude of $\chi''$ is approximately $\chi_0 Q$, where $Q = \omega_0/\gamma$ is the quality factor of the resonator. It quantifies how much energy is stored in the resonator compared to the energy lost per oscillation cycle. Reducing the damping makes the resonance peak narrower and higher.
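The limiting cases above can be checked numerically. The following is a minimal sketch, assuming the Lorentz form $\chi_e(\omega) = \chi_0\,\omega_0^2/(\omega_0^2 - \omega^2 - i\gamma\omega)$ with the $e^{-i\omega t}$ convention; the parameter values and the function name `lorentz_chi` are illustrative choices, not taken from the text.

```python
import cmath
import math


def lorentz_chi(omega, omega0, gamma, chi0=1.0):
    """Lorentz-oscillator susceptibility, exp(-i*omega*t) sign convention."""
    return chi0 * omega0**2 / (omega0**2 - omega**2 - 1j * gamma * omega)


# Illustrative (assumed) parameters in arbitrary frequency units.
omega0 = 1.0          # resonance frequency
gamma = 0.05          # weak damping, gamma << omega0
Q = omega0 / gamma    # quality factor

chi_low = lorentz_chi(0.01 * omega0, omega0, gamma)   # far below resonance
chi_res = lorentz_chi(omega0, omega0, gamma)          # at resonance
chi_high = lorentz_chi(10.0 * omega0, omega0, gamma)  # far above resonance

# Phase lag of the polarisation relative to the driving field.
phase_res = cmath.phase(chi_res)
phase_high = cmath.phase(chi_high)

for label, chi in [("low", chi_low), ("resonance", chi_res), ("high", chi_high)]:
    print(f"{label:>9}: chi' = {chi.real:+.4f}, chi'' = {chi.imag:+.4f}, "
          f"phase = {cmath.phase(chi) / math.pi:.3f} pi")
```

Under these assumptions the magnitude at resonance equals $\chi_0 Q$, the phase there is $\pi/2$, and far above resonance the phase approaches $\pi$ while $\chi'$ is small and negative.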
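The same general arguments apply to magnetisation density, but at optical frequencies, magnetic resonances are often negligible for most materials ($\chi_m \approx 0$, so $\mu_r \approx 1$).
Very often, real systems can be represented as collections of resonators with a distribution of resonance frequencies $\omega_{0,j}$ and damping factors $\gamma_j$:
$$\chi_e(\omega) = \sum_j \chi_{0,j}\,\frac{\omega_{0,j}^2}{\omega_{0,j}^2 - \omega^2 - i\gamma_j\omega}$$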
These different resonances can arise from different electronic transitions, vibrational modes, and so on.
2.1.2 Drude Model
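A simplification of the Drude-Lorentz model for free charge carriers (like conduction electrons in a metal) is the Drude model. In this model, the electrons experience no restoring spring force, so the resonance frequency is $\omega_0 = 0$. The equation of motion becomes:
$$m\ddot{\mathbf{x}} + m\Gamma\dot{\mathbf{x}} = -e\mathbf{E}(t)$$
where we have replaced the damping rate $\gamma$ with a collision (or scattering) rate $\Gamma$. Assuming $\mathbf{E}(t) = \mathbf{E}_0 e^{-i\omega t}$, the steady-state solution for $\mathbf{x}(t)$ is:
$$\mathbf{x}(t) = \frac{e}{m}\,\frac{1}{\omega^2 + i\Gamma\omega}\,\mathbf{E}(t)$$
With $\mathbf{P} = -Ne\mathbf{x}$, we obtain $\mathbf{P} = -\dfrac{Ne^2}{m}\,\dfrac{1}{\omega^2 + i\Gamma\omega}\,\mathbf{E}$.
The susceptibility is then:
$$\chi_e(\omega) = -\frac{\omega_p^2}{\omega^2 + i\Gamma\omega}$$
where $\omega_p = \sqrt{Ne^2/(\varepsilon_0 m)}$ is the plasma frequency, and $N$ is the free carrier concentration.
The real and imaginary parts are:
$$\chi'(\omega) = -\frac{\omega_p^2}{\omega^2 + \Gamma^2}, \qquad \chi''(\omega) = \frac{\omega_p^2\,\Gamma}{\omega\left(\omega^2 + \Gamma^2\right)}$$
In the high-frequency limit where $\omega \gg \Gamma$ (low damping), the susceptibility is predominantly real, and the relative permittivity becomes:
$$\varepsilon_r(\omega) = 1 + \chi_e(\omega) \approx 1 - \frac{\omega_p^2}{\omega^2}$$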
This is the dispersion relation for an undamped plasma, and we differentiate between different cases:
For frequencies below the plasma frequency ($\omega < \omega_p$), we have $\varepsilon_r < 0$. The refractive index $n = \sqrt{\varepsilon_r}$ is imaginary, so the wavevector $k = n\omega/c$ is imaginary. The wave does not propagate but is attenuated (evanescent) within the metal. Metals are highly reflective.
For frequencies above the plasma frequency ($\omega > \omega_p$), we have $0 < \varepsilon_r < 1$. Both $n$ and $k$ are real. The wave propagates. The metal becomes transparent.
For $\omega = \omega_p$, we have $\varepsilon_r = 0$. In this idealised case, $n = 0$, $k = 0$. This frequency marks the onset of propagation of transverse waves. Longitudinal plasma oscillations (volume plasmons) occur at $\omega = \omega_p$ where $\varepsilon_r = 0$.
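The three regimes can be made concrete with a short numerical sketch. It assumes the Drude permittivity $\varepsilon_r = 1 - \omega_p^2/(\omega^2 + i\Gamma\omega)$ in the $e^{-i\omega t}$ convention, with frequencies measured in units of $\omega_p$; the helper names `drude_eps`, `refractive_index` and `regime` are hypothetical.

```python
import cmath


def drude_eps(omega, omega_p, Gamma=0.0):
    """Drude relative permittivity; Gamma = 0 gives the undamped plasma."""
    return 1.0 - omega_p**2 / (omega**2 + 1j * Gamma * omega)


def refractive_index(eps):
    """Square root of eps, choosing the root with non-negative imaginary part."""
    n = cmath.sqrt(eps)
    return -n if n.imag < 0 else n


def regime(omega, omega_p):
    """Classify the undamped response by the sign of eps_r."""
    eps = drude_eps(omega, omega_p).real
    if eps < 0:
        return "evanescent"    # imaginary n and k: no propagation
    if eps > 0:
        return "propagating"   # real n and k
    return "onset"             # eps_r = 0 at the plasma frequency


omega_p = 1.0  # frequencies in units of the plasma frequency (assumed)
for omega in (0.5, 1.0, 2.0):
    eps = drude_eps(omega, omega_p)
    n = refractive_index(eps)
    print(f"omega/omega_p = {omega}: eps_r = {eps.real:+.3f}, "
          f"n = {n.real:.3f} + {n.imag:.3f}i, {regime(omega, omega_p)}")
```

The explicit root selection in `refractive_index` is a design choice: it keeps the attenuated (rather than growing) solution regardless of how the branch cut of the complex square root happens to resolve a negative real argument.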
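Next, let us compare the Drude-Lorentz and Drude models for the reflectivity $R = |r|^2$, where
$$r = \frac{1 - \tilde{n}}{1 + \tilde{n}}$$
is the Fresnel reflection coefficient for normal incidence from vacuum onto a medium with complex refractive index $\tilde{n} = \sqrt{\varepsilon_r}$.
To model silver, the fit is usually done using a combination:
$$\varepsilon_r(\omega) = 1 - \frac{\omega_p^2}{\omega^2 + i\Gamma\omega} + \sum_j \chi_{0,j}\,\frac{\omega_{0,j}^2}{\omega_{0,j}^2 - \omega^2 - i\gamma_j\omega}$$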
where the Drude part accounts for free electrons and Lorentz oscillators account for interband transitions. The Drude model alone is often a good fit for the low frequency (infrared) region. However, at higher frequencies (visible/UV), contributions from Lorentz oscillators representing interband transitions must be taken into account.
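A minimal sketch of the Drude-only part of this comparison is given below. It evaluates the normal-incidence reflectivity $R = |(1-\tilde{n})/(1+\tilde{n})|^2$ for a Drude permittivity; the damping value is an arbitrary illustrative assumption (not a fit to silver), and the function name `drude_reflectivity` is hypothetical.

```python
import cmath


def drude_reflectivity(omega, omega_p, Gamma):
    """Normal-incidence reflectivity from vacuum onto a Drude medium."""
    eps = 1.0 - omega_p**2 / (omega**2 + 1j * Gamma * omega)
    n = cmath.sqrt(eps)
    if n.imag < 0:           # keep the decaying solution
        n = -n
    r = (1.0 - n) / (1.0 + n)
    return abs(r) ** 2


omega_p = 1.0    # frequencies in units of the plasma frequency (assumed)
Gamma = 0.01     # weak collisional damping (assumed, illustrative only)

R_below = drude_reflectivity(0.5, omega_p, Gamma)
R_above = drude_reflectivity(2.0, omega_p, Gamma)
print(f"R(0.5 omega_p) = {R_below:.3f}")
print(f"R(2.0 omega_p) = {R_above:.4f}")
```

With these assumptions the reflectivity is close to unity below $\omega_p$ and drops to below a few percent above it, which is the qualitative Drude behaviour; reproducing a real metal would additionally require the Lorentz terms.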
Lastly, let us discuss a simple implication of the plasma frequency. Consider gold, which has high reflectivity at longer visible wavelengths (red/yellow light) but lower reflectivity (and increased absorption) for shorter wavelengths (green/blue light). This results in reflected white light appearing yellowish/golden, while transmitted light through a very thin gold film can appear bluish-green. This behaviour is related to its plasma frequency and interband transitions. For a thin film of gold, while some absorption occurs, the relation $A + R + T = 1$ (absorptance + reflectance + transmittance) holds.
In this framework, the colour of metals can be understood by considering the effect of both the plasma frequency (Drude response) and interband transitions (Lorentz oscillators) on the spectral reflectivity.
2.2 Kramers-Kronig Relations
From the previous section, we have seen that the frequency dependence of the refractive index (dispersion, related to $\chi'$) and absorption (related to $\chi''$) are interconnected. A dispersive material must be absorptive over some frequency range, and must exhibit a frequency-dependent absorption coefficient if it is dispersive. This general principle is mathematically captured by the Kramers-Kronig relations. Even with our very simplistic models for susceptibility, linear response theory has important implications regarding causality.
The Kramers-Kronig relations connect the real and imaginary parts of a complex response function, such as the electric susceptibility $\chi_e(\omega) = \chi'(\omega) + i\chi''(\omega)$. They are a direct mathematical consequence of causality (the response cannot precede the stimulus) and the assumption that the time-domain response function $\chi_e(t)$ is real.
The relations are:
$$\chi'(\omega) = \frac{2}{\pi}\,\mathcal{P}\!\int_0^{\infty} \frac{\omega'\,\chi''(\omega')}{\omega'^2 - \omega^2}\,d\omega', \qquad \chi''(\omega) = -\frac{2\omega}{\pi}\,\mathcal{P}\!\int_0^{\infty} \frac{\chi'(\omega')}{\omega'^2 - \omega^2}\,d\omega'$$
$\mathcal{P}$ denotes the Cauchy Principal Value of the integral. The response function being real ensures $\chi_e(-\omega) = \chi_e^*(\omega)$, leading to $\chi'$ being even and $\chi''$ being odd, allowing the integrals to be written over $[0, \infty)$.
These relations are powerful: they allow the calculation of one part of the complex susceptibility (or permittivity, or refractive index) if the other part is known over the entire frequency spectrum. Practically, measuring $\chi''$ over all frequencies is impossible, but measurements over a broad range combined with physically reasonable extrapolations or models for regions far from resonances can be used. For instance, measuring the absorption spectrum ($\chi''(\omega)$) over a wide range allows the calculation of the refractive index dispersion ($\chi'(\omega)$).
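As a sanity check of this idea, the sketch below reconstructs $\chi'$ from $\chi''$ for a single Lorentz oscillator and compares it with the closed form. It assumes the relation $\chi'(\omega) = \frac{2}{\pi}\mathcal{P}\int_0^\infty \omega'\chi''(\omega')/(\omega'^2-\omega^2)\,d\omega'$ and the $e^{-i\omega t}$ convention. The principal value is handled by a midpoint rule whose sample points straddle the pole symmetrically, which is one simple choice among several; all parameter values and function names are illustrative assumptions.

```python
import math


def lorentz_chi(omega, omega0=1.0, gamma=0.1, chi0=1.0):
    """Lorentz susceptibility used as the known test case."""
    return chi0 * omega0**2 / (omega0**2 - omega**2 - 1j * gamma * omega)


def kk_real_part(omega, chi_imag, h=1e-3, omega_max=50.0):
    """Estimate chi'(omega) from chi'' via a Kramers-Kronig integral.

    omega must be an integer multiple of h: the midpoint samples
    (k + 0.5) * h then sit symmetrically around the pole at omega,
    so the divergent contributions cancel pairwise (principal value).
    The integral is truncated at omega_max (assumed large enough).
    """
    n_steps = int(round(omega_max / h))
    total = 0.0
    for k in range(n_steps):
        w = (k + 0.5) * h
        total += w * chi_imag(w) / (w * w - omega * omega)
    return 2.0 / math.pi * total * h


def chi_imag(w):
    return lorentz_chi(w).imag


results = {}
for omega in (0.0, 0.5, 1.5):
    reconstructed = kk_real_part(omega, chi_imag)
    direct = lorentz_chi(omega).real
    results[omega] = (reconstructed, direct)
    print(f"omega = {omega}: KK = {reconstructed:+.4f}, direct = {direct:+.4f}")
```

In a measurement context `chi_imag` would be replaced by interpolated absorption data, and the finite `omega_max` corresponds to the extrapolation issue mentioned above.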
Consider sending light through a medium and detecting the transmitted intensity with a photodetector:
If the goal is to determine the complex susceptibility $\chi_e(\omega)$, one typically needs to measure both amplitude and phase changes upon interaction with the medium. A simple photodetector measures intensity, thereby losing phase information which is crucial for the real part of the refractive index (and thus $\chi'$). However, by carefully measuring the absorption spectrum (related to $\chi''$) as a function of frequency, the Kramers-Kronig relations can be used to reconstruct the dispersive part $\chi'$ (and thus the full complex refractive index).
2.3 Equations in Frequency Domain
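Consider again the macroscopic field relations in the time domain:
$$\mathbf{D}(\mathbf{r},t) = \varepsilon_0\mathbf{E}(\mathbf{r},t) + \mathbf{P}(\mathbf{r},t), \qquad \mathbf{B}(\mathbf{r},t) = \mu_0\left[\mathbf{H}(\mathbf{r},t) + \mathbf{M}(\mathbf{r},t)\right]$$
Assuming linear responses $\mathbf{P}(\omega) = \varepsilon_0\chi_e(\omega)\mathbf{E}(\omega)$ and $\mathbf{M}(\omega) = \chi_m(\omega)\mathbf{H}(\omega)$, Fourier transforming these equations (convolution becomes multiplication) results in:
$$\mathbf{D}(\omega) = \varepsilon_0\left[1 + \chi_e(\omega)\right]\mathbf{E}(\omega) = \varepsilon_0\varepsilon_r(\omega)\mathbf{E}(\omega) = \varepsilon(\omega)\mathbf{E}(\omega)$$
$$\mathbf{B}(\omega) = \mu_0\left[1 + \chi_m(\omega)\right]\mathbf{H}(\omega) = \mu_0\mu_r(\omega)\mathbf{H}(\omega) = \mu(\omega)\mathbf{H}(\omega)$$
Here, $\varepsilon_r(\omega) = 1 + \chi_e(\omega)$ is the complex relative permittivity (or dielectric function), and $\mu_r(\omega) = 1 + \chi_m(\omega)$ is the complex relative permeability. $\varepsilon(\omega) = \varepsilon_0\varepsilon_r(\omega)$ and $\mu(\omega) = \mu_0\mu_r(\omega)$ are the absolute complex permittivity and permeability, respectively.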
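Next, let us transform Maxwell's equations to frequency space. Using the Fourier transform property that $\partial/\partial t \to -i\omega$:
$$\nabla\cdot\mathbf{D} = \rho, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} = i\omega\mathbf{B}, \qquad \nabla\times\mathbf{H} = \mathbf{J} - i\omega\mathbf{D}$$
For no free charges ($\rho = 0$) and no free currents ($\mathbf{J} = 0$), we find:
$$\begin{array}{ll}
\text{Time Domain} & \text{Frequency Domain} \\
\nabla\cdot\mathbf{D}(\mathbf{r},t) = 0 & \nabla\cdot\mathbf{D}(\mathbf{r},\omega) = 0 \\
\nabla\cdot\mathbf{B}(\mathbf{r},t) = 0 & \nabla\cdot\mathbf{B}(\mathbf{r},\omega) = 0 \\
\nabla\times\mathbf{E}(\mathbf{r},t) = -\dfrac{\partial\mathbf{B}(\mathbf{r},t)}{\partial t} & \nabla\times\mathbf{E}(\mathbf{r},\omega) = i\omega\mathbf{B}(\mathbf{r},\omega) \\
\nabla\times\mathbf{H}(\mathbf{r},t) = \dfrac{\partial\mathbf{D}(\mathbf{r},t)}{\partial t} & \nabla\times\mathbf{H}(\mathbf{r},\omega) = -i\omega\mathbf{D}(\mathbf{r},\omega)
\end{array}$$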
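It should be noted here that a truly non-dispersive medium has an instantaneous response. If $\chi_e(\omega) = \chi_e$ (a constant independent of frequency), then its inverse Fourier transform is:
$$\chi_e(t) = \chi_e\,\delta(t)$$
This means $\mathbf{P}(t) = \varepsilon_0\chi_e\mathbf{E}(t)$, an instantaneous relation.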
2.4 Helmholtz Equation
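Let us start from Faraday's law in the frequency domain, $\nabla\times\mathbf{E} = i\omega\mathbf{B}$, and apply the curl operator:
$$\nabla\times\left(\nabla\times\mathbf{E}\right) = i\omega\,\nabla\times\mathbf{B}$$
Using the vector identity $\nabla\times(\nabla\times\mathbf{E}) = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}$, and assuming a homogeneous medium where $\nabla\cdot\mathbf{E} = 0$ (which follows from $\nabla\cdot\mathbf{D} = 0$ if $\varepsilon$ is spatially constant and $\varepsilon \neq 0$), the left side becomes $-\nabla^2\mathbf{E}$.
For the right side, substitute $\mathbf{B} = \mu\mathbf{H}$ and then Ampère-Maxwell's law (for $\mathbf{J} = 0$):
$$i\omega\mu\,\nabla\times\mathbf{H} = i\omega\mu\left(-i\omega\mathbf{D}\right) = \omega^2\mu\,\mathbf{D}$$
Substitute $\mathbf{D} = \varepsilon\mathbf{E}$:
$$-\nabla^2\mathbf{E} = \omega^2\mu\varepsilon\,\mathbf{E}$$
Rearranging gives the Helmholtz equation:
$$\nabla^2\mathbf{E}(\mathbf{r},\omega) + \omega^2\mu(\omega)\varepsilon(\omega)\,\mathbf{E}(\mathbf{r},\omega) = 0$$
Using $\mu_0\varepsilon_0 = 1/c^2$ and defining the complex refractive index $\tilde{n}(\omega) = \sqrt{\varepsilon_r(\omega)\mu_r(\omega)}$, this becomes:
$$\nabla^2\mathbf{E}(\mathbf{r},\omega) + \frac{\omega^2}{c^2}\,\tilde{n}^2(\omega)\,\mathbf{E}(\mathbf{r},\omega) = 0$$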
This is the wave equation in the frequency domain for a linear, homogeneous, isotropic medium.
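We can write $\tilde{n} = n + i\kappa$. This makes the wavevector complex. For a plane wave propagating in the $z$-direction, $k = \tilde{n}\,\omega/c = (n + i\kappa)\,\omega/c$.
The Fourier transform of a monochromatic plane wave of frequency $\omega_0$ is proportional to $\delta(\omega - \omega_0)$. A general solution can be built as a superposition of such waves.
For a specific frequency $\omega$, a plane wave solution is $\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$.
Let $\mathbf{k} = \mathbf{k}' + i\mathbf{k}''$. Then the spatial part is $e^{i\mathbf{k}\cdot\mathbf{r}} = e^{i\mathbf{k}'\cdot\mathbf{r}}\,e^{-\mathbf{k}''\cdot\mathbf{r}}$.
The term $e^{i\mathbf{k}'\cdot\mathbf{r}}$ represents the oscillating phase, while $e^{-\mathbf{k}''\cdot\mathbf{r}}$ represents an exponential decay of the wave amplitude if $\mathbf{k}''$ points in the direction of propagation (the direction of $\mathbf{k}'$). This decay causes a loss of energy carried by the wave as it moves through the medium.
The imaginary part of the refractive index, $\kappa$, determines $k'' = \kappa\,\omega/c$. For a passive, lossy medium, $\kappa > 0$ (by common convention). If so, $\mathbf{k}'$ and $\mathbf{k}''$ are parallel.
The wavelength in the medium is defined by the real part of the wavevector: $\lambda = 2\pi/k' = \lambda_0/n$, where $\lambda_0$ is the vacuum wavelength.
The phase velocity is given by $v_{ph} = \omega/k' = c/n$. It describes the speed at which surfaces of constant phase propagate.
The time-averaged intensity of the wave is related to $|\mathbf{E}|^2 \propto e^{-2k''z}$. We define the intensity absorption coefficient as $\alpha = 2k'' = 2\kappa\,\omega/c$. The intensity then decays as $I(z) = I_0\,e^{-\alpha z}$ (if propagating in $z$-direction). This is illustrated next: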
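As a numerical illustration of these definitions, the following minimal sketch computes the wavelength in the medium, the phase velocity, the absorption coefficient $\alpha = 2\kappa\omega/c$ and the resulting intensity decay. The values of $n$, $\kappa$ and the vacuum wavelength are hypothetical and chosen only to show the orders of magnitude involved.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def absorption_coefficient(kappa, wavelength_vacuum_m):
    """Intensity absorption coefficient alpha = 2 * kappa * omega / c, in 1/m."""
    omega = 2.0 * math.pi * C_LIGHT / wavelength_vacuum_m
    return 2.0 * kappa * omega / C_LIGHT


def transmitted_fraction(kappa, wavelength_vacuum_m, z_m):
    """Fraction I(z) / I_0 = exp(-alpha * z) remaining after a distance z."""
    return math.exp(-absorption_coefficient(kappa, wavelength_vacuum_m) * z_m)


# Hypothetical weakly absorbing medium (assumed values).
n, kappa = 1.5, 1.0e-6
lambda0 = 500e-9  # vacuum wavelength in m

lambda_medium = lambda0 / n      # wavelength inside the medium
v_phase = C_LIGHT / n            # phase velocity
alpha = absorption_coefficient(kappa, lambda0)
decay_length = 1.0 / alpha       # distance over which I drops to 1/e

print(f"wavelength in medium: {lambda_medium * 1e9:.1f} nm")
print(f"phase velocity:       {v_phase:.3e} m/s")
print(f"alpha:                {alpha:.2f} 1/m")
print(f"1/e decay length:     {decay_length * 100:.2f} cm")
print(f"I/I0 after 10 cm:     {transmitted_fraction(kappa, lambda0, 0.10):.3f}")
```

Note that $\alpha$ can equivalently be written as $4\pi\kappa/\lambda_0$, which is often the more convenient form when working with wavelengths.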
2.5 Refractive Index in Dispersive Media
Again, let us stress that the frequency-dependence of the real part of the refractive index (dispersion) is inherently related via the Kramers-Kronig relations to the frequency-dependence of its imaginary part (absorption). If there is absorption ($\kappa \neq 0$) at some frequencies, the refractive index must be dispersive (frequency-dependent) over all frequencies, and vice-versa. While absorption resonances (peaks in $\kappa(\omega)$) are often localised in frequency, the associated changes in the refractive index $n(\omega)$ extend over a much broader frequency range. This is shown in the next figure for a material with three distinct absorption resonances:
For most optically transparent media (meaning they have low absorption in the visible range), the strongest electronic absorption resonances (corresponding to $\omega_0$ in the Lorentz model) typically lie in the ultraviolet. For frequencies in the visible range, we are often in the regime $\omega < \omega_0$. Assuming weak damping ($\gamma \ll \omega_0$), the imaginary part of the susceptibility $\chi''$ (and thus $\kappa$) becomes very small in this transparent region, while its real part $\chi'$ typically increases with $\omega$. Since $n \approx \sqrt{1 + \chi'}$ (for small $\kappa$ or $\chi''$), this implies that the refractive index exhibits 'normal' dispersion:
$$\frac{dn}{d\omega} > 0 \qquad \text{or, equivalently,} \qquad \frac{dn}{d\lambda} < 0$$
Near an absorption resonance ($\omega \approx \omega_0$), the dispersion typically becomes anomalous ($dn/d\omega < 0$) before returning to normal at higher frequencies above the resonance.
In practice, empirical interpolation formulas are used to describe $n(\lambda)$ (or $n(\omega)$) in regions of transparency. The imaginary part $\kappa$ is often ignored if absorption is small. The most well-known of these formulas is the Sellmeier equation:
$$n^2(\lambda) = 1 + \sum_j \frac{B_j\,\lambda^2}{\lambda^2 - C_j}$$
where the vacuum wavelength $\lambda$ is often expressed in micrometres. Usually, two or three terms in the sum are sufficient to fit experimental data for many glasses over the visible and near-infrared. The similarity of the Sellmeier equation to a sum of Lorentz oscillators arises because it can be derived assuming resonances are very narrow (like delta functions in $\chi''$), which is a good approximation far from these resonances. The coefficients $B_j$ are related to the strength of the $j$-th resonance, and $C_j = \lambda_j^2$ where $\lambda_j$ is the resonance wavelength of the $j$-th oscillator.
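The Sellmeier equation is straightforward to evaluate. The sketch below assumes the three-term form $n^2 = 1 + \sum_j B_j\lambda^2/(\lambda^2 - C_j)$ with $\lambda$ in micrometres. The coefficients are the values commonly quoted for fused silica; they are not taken from this text and should be treated as an illustrative assumption to be checked against a trusted data source before use.

```python
import math

# Commonly quoted three-term Sellmeier coefficients for fused silica
# (assumption: not from the text above; wavelengths in micrometres).
B_COEFFS = (0.6961663, 0.4079426, 0.8974794)
RESONANCE_WAVELENGTHS_UM = (0.0684043, 0.1162414, 9.896161)
C_COEFFS = tuple(lam**2 for lam in RESONANCE_WAVELENGTHS_UM)  # C_j = lambda_j^2


def sellmeier_n(wavelength_um, b_coeffs=B_COEFFS, c_coeffs=C_COEFFS):
    """Refractive index from the Sellmeier equation (transparent region only)."""
    lam2 = wavelength_um**2
    n_squared = 1.0 + sum(b * lam2 / (lam2 - c) for b, c in zip(b_coeffs, c_coeffs))
    return math.sqrt(n_squared)


for lam in (0.4, 0.5876, 0.8, 1.55):
    print(f"lambda = {lam:6.4f} um -> n = {sellmeier_n(lam):.5f}")
```

Two of the assumed resonances lie in the ultraviolet and one in the infrared, so across the visible range the index falls with increasing wavelength, which is the normal dispersion described above.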
Lastly, note that the Sellmeier equation typically lacks accuracy in the XUV (extreme ultraviolet) and X-ray range (short wavelengths), where the refractive index of any material is very close to 1. These small deviations from unity are often described using $\tilde{n} = 1 - \delta + i\beta$ (here $\beta$ is the absorption index, not propagation constant). Both $\delta$ and $\beta$ are usually small and positive throughout this range, unless the frequency is very close to an atomic absorption edge (which corresponds to an ionisation threshold or inner-shell transition).
2.6 Light Pulses
Let us consider an important example of a non-monochromatic wave: a light pulse. Light pulses, as opposed to continuous monochromatic plane waves, are finite in time duration and thus also have a finite spatial extent along the propagation direction. Here, we still treat them as infinitely extended transverse to the propagation direction, so as plane wave pulses. It is understood that a pulse can be represented as a superposition of plane waves of varying wavelengths or frequencies:
In this conceptual image, a 'red' (lower frequency) and a 'violet' (higher frequency) wave are superposed to get the resultant 'blue' wave (the pulse envelope). The colours are illustrative and not necessarily related to physical colours. It turns out that the more frequency components (waves with different frequencies) are added coherently, the 'narrower' in time the resulting pulse can become:
The general effect of superimposing multiple waves with frequencies $\omega_m = \omega_0 + m\,\delta\omega$ (where $m$ is an integer) is to produce constructive interference at times $t_j = 2\pi j/\delta\omega$ (where $j$ is an integer) and destructive interference elsewhere, leading to a train of pulses. If many waves with a broad spectrum around a centre frequency $\omega_0$ are superposed, and if $\Delta\omega$ represents the approximate overall bandwidth, then a short pulse of duration $\Delta t \approx 2\pi/\Delta\omega$ can be formed.
Constructive interference of two waves
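Let us take a step back and consider the superposition of two plane waves with real amplitudes $E_0$, frequencies $\omega_1$ and $\omega_2$, and corresponding wavevectors $k_1$ and $k_2$. Assume they are in phase at $t = 0$, $z = 0$.
Let $\omega_0 = (\omega_1 + \omega_2)/2$ be the average (carrier) frequency and $\Delta\omega = \omega_2 - \omega_1$ be the frequency difference. Similarly, let $k_0 = (k_1 + k_2)/2$ and $\Delta k = k_2 - k_1$.
The superposition at position $z$ and time $t$ can be rewritten using trigonometric identities as:
$$E(z,t) = E_0\cos(k_1 z - \omega_1 t) + E_0\cos(k_2 z - \omega_2 t) = 2E_0\cos\!\left(\frac{\Delta k}{2}z - \frac{\Delta\omega}{2}t\right)\cos(k_0 z - \omega_0 t)$$
This represents a fast oscillation at $\omega_0$ modulated by a slower envelope oscillating at $\Delta\omega/2$.
To find the speed of this envelope maximum (a point of constructive interference), consider the condition that the phases of the two waves remain equal as the wave propagates a distance $\Delta z$ in time $\Delta t$ (assuming 1D propagation along $z$, so $\mathbf{k}\cdot\mathbf{r} = kz$):
$$k_1\,\Delta z - \omega_1\,\Delta t = k_2\,\Delta z - \omega_2\,\Delta t$$
This leads to $\Delta k\,\Delta z = \Delta\omega\,\Delta t$, where $\Delta k = k_2 - k_1$.
The velocity of this constant-phase point of the envelope is the group velocity $v_g$:
$$v_g = \frac{\Delta z}{\Delta t} = \frac{\Delta\omega}{\Delta k}$$
In the limit of a small frequency difference ($\Delta\omega \to 0$), we obtain the group velocity as:
$$v_g = \frac{d\omega}{dk} = \left(\frac{dk}{d\omega}\right)^{-1}$$
Using $k = n(\omega)\,\omega/c$, we can express $v_g$ as:
$$v_g = \frac{c}{n + \omega\,\dfrac{dn}{d\omega}}$$
Sometimes, the group index $n_g$ is used, defined as:
$$n_g = \frac{c}{v_g} = n + \omega\frac{dn}{d\omega} = n - \lambda\frac{dn}{d\lambda}$$
The group velocity represents the speed of the pulse envelope if the envelope changes slowly and the spectral bandwidth is narrow enough that $v_g$ is approximately constant over that bandwidth. Deviations from this ideal condition lead to pulse broadening and distortion, known as group velocity dispersion (GVD). For transparent optical materials away from resonances, the group velocity $v_g$ is usually smaller than the phase velocity $v_{ph}$.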
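The relation between the two-wave picture and the group index can be verified numerically. The sketch below uses a hypothetical one-term Sellmeier-type index $n^2 = 1 + B\lambda^2/(\lambda^2 - C)$ with made-up coefficients, evaluates $n_g = n - \lambda\,dn/d\lambda$ by a central finite difference, and compares $c/n_g$ with the ratio $\Delta\omega/\Delta k$ for two closely spaced waves.

```python
import math

C_LIGHT = 299_792_458.0  # m/s

# Hypothetical one-term dispersion model (assumed coefficients, not a real glass).
B_COEFF = 1.1
C_COEFF = 0.01  # um^2, i.e. a resonance at 0.1 um


def n_of_lambda(lam_um):
    """Refractive index as a function of vacuum wavelength in micrometres."""
    return math.sqrt(1.0 + B_COEFF * lam_um**2 / (lam_um**2 - C_COEFF))


def group_index(lam_um, step_um=1e-4):
    """n_g = n - lambda * dn/dlambda, with a central finite difference."""
    dn_dlam = (n_of_lambda(lam_um + step_um) - n_of_lambda(lam_um - step_um)) / (2.0 * step_um)
    return n_of_lambda(lam_um) - lam_um * dn_dlam


def two_wave_group_velocity(lam1_um, lam2_um):
    """Envelope speed (omega_2 - omega_1) / (k_2 - k_1) of a two-wave beat."""
    omega1 = 2.0 * math.pi * C_LIGHT / (lam1_um * 1e-6)
    omega2 = 2.0 * math.pi * C_LIGHT / (lam2_um * 1e-6)
    k1 = n_of_lambda(lam1_um) * omega1 / C_LIGHT
    k2 = n_of_lambda(lam2_um) * omega2 / C_LIGHT
    return (omega2 - omega1) / (k2 - k1)


lam_centre = 0.8005
n_phase = n_of_lambda(lam_centre)
n_group = group_index(lam_centre)
v_group = C_LIGHT / n_group
v_beat = two_wave_group_velocity(0.800, 0.801)

print(f"n = {n_phase:.5f}, n_g = {n_group:.5f}")
print(f"v_ph = {C_LIGHT / n_phase:.4e} m/s, v_g = {v_group:.4e} m/s")
print(f"two-wave envelope speed = {v_beat:.4e} m/s")
```

For this normally dispersive model the group index exceeds the phase index, so the envelope travels more slowly than the carrier, and the beat of two nearby waves reproduces $c/n_g$ to within the finite-difference error.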
Lastly, consider the following figure illustrating phase and group velocities:
In this animation, the green points represent points moving at the group velocity (envelope speed), while red points represent points moving at the phase velocity (carrier wave speed). The figure illustrates how these two velocities can differ in a dispersive medium.
2.7 Time-Bandwidth Product of Wavepackets
A light pulse is formed by superimposing plane waves with different frequencies. In a continuous form, this superposition is described by the inverse Fourier transform:
$$\mathbf{E}(\mathbf{r},t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r},\omega)\,e^{-i\omega t}\,d\omega$$
We generally call $\mathbf{E}(\mathbf{r},\omega)$ the spectrum of the pulse, indicating the complex amplitude (magnitude and phase) of each monochromatic constituent of the pulse. The spread of frequencies in the spectrum is its bandwidth $\Delta\omega$ (or $\Delta\nu = \Delta\omega/2\pi$), which is usually much smaller than the average/centre frequency $\omega_0$ of the pulse for well-defined pulses. The spectral energy density is proportional to $|\mathbf{E}(\mathbf{r},\omega)|^2$.
The electric field as a function of frequency, $E(\omega)$, and the electric field as a function of time, $E(t)$, form a Fourier transform pair. This mathematical relationship has an important implication: there is a fundamental limit to how short a pulse duration $\Delta t$ can be for a given spectral bandwidth $\Delta\omega$. This is quantified by the time-bandwidth product (TBP).
Consider the next figure, showing the intensity profile in time, $I(t) \propto |E(t)|^2$, and the spectral intensity, $S(\omega) \propto |E(\omega)|^2$:
The Fourier transform uncertainty principle states that the product $\Delta t\,\Delta\omega$ (or $\Delta t\,\Delta\nu$) has a minimum possible value, which depends on the pulse shape (definitions of $\Delta t$ and $\Delta\omega$, such as Full Width at Half Maximum, also affect this value). We call this product the time-bandwidth product. Pulses that achieve this minimum are called transform-limited pulses. For such pulses, the spectral phase is constant or, at most, linear in frequency across the pulse bandwidth.
Pulses that are not transform-limited possess a non-linear spectral phase. This results in a pulse duration that is longer than the minimum allowed by its spectral bandwidth. These pulses have a time-dependent instantaneous oscillation frequency, a phenomenon known as chirp. If $\phi(t)$ is the total phase of the complex analytic signal associated with $E(t)$, written as $A(t)\,e^{-i\phi(t)}$ with real amplitude $A(t)$, the instantaneous frequency is $\omega_{inst}(t) = d\phi/dt$. A chirp means $\omega_{inst}(t)$ is not constant.
Consider a transform-limited (therefore unchirped) Gaussian pulse:
The next figure shows a chirped Gaussian pulse. By definition, its time-bandwidth product is greater than the minimum possible for a Gaussian shape: $\Delta t\,\Delta\nu > 2\ln 2/\pi \approx 0.441$ (with FWHM widths of the intensity profiles).
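The Gaussian time-bandwidth product can be reproduced with a direct numerical Fourier integral. The sketch below assumes a field envelope $E(t) = \exp[-(1 + iC)\,a\,t^2]$ with $a = 2\ln 2/\Delta t^2$, so that the intensity FWHM is $\Delta t$ for any value of the dimensionless chirp parameter $C$. This parametrisation and the function names are assumptions made for illustration; the spectral FWHM is located by bisection on the numerically computed spectrum.

```python
import cmath
import math


def spectral_intensity(omega, chirp, dt_fwhm=1.0, n_samples=2001, span=8.0):
    """|E(omega)|^2 of a linearly chirped Gaussian, by direct summation over t."""
    a = 2.0 * math.log(2.0) / dt_fwhm**2
    t_max = span * dt_fwhm
    h = 2.0 * t_max / (n_samples - 1)
    total = 0j
    for k in range(n_samples):
        t = -t_max + k * h
        total += cmath.exp(-(1.0 + 1j * chirp) * a * t * t + 1j * omega * t)
    return abs(total * h) ** 2


def time_bandwidth_product(chirp, dt_fwhm=1.0):
    """Return Delta_t * Delta_nu using FWHM widths of I(t) and S(omega)."""
    half_peak = 0.5 * spectral_intensity(0.0, chirp, dt_fwhm)
    lo, hi = 0.0, 200.0 / dt_fwhm  # bracket for the half-maximum frequency offset
    for _ in range(40):            # bisection; the spectrum decreases away from 0
        mid = 0.5 * (lo + hi)
        if spectral_intensity(mid, chirp, dt_fwhm) > half_peak:
            lo = mid
        else:
            hi = mid
    delta_omega = 2.0 * lo         # full width in angular frequency
    return dt_fwhm * delta_omega / (2.0 * math.pi)


tbp_unchirped = time_bandwidth_product(0.0)
tbp_chirped = time_bandwidth_product(2.0)
print(f"unchirped: {tbp_unchirped:.4f}")
print(f"chirp C=2: {tbp_chirped:.4f}")
```

For this parametrisation the analytic result is $\Delta t\,\Delta\nu = (2\ln 2/\pi)\sqrt{1 + C^2}$: the chirp leaves the duration fixed and widens the spectrum, which is the mirror image of the fixed-spectrum comparison often shown in figures.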
To illustrate the effect of a chirp, consider the difference between an unchirped (blue) and a chirped (orange) light pulse, both having the same spectral amplitude magnitude:
We can see that the pulse duration (FWHM) of the unchirped blue pulse is shorter than that of the chirped orange pulse. If a chirp is positive (up-chirp), lower frequencies precede higher frequencies in time. If a chirp is negative (down-chirp), higher frequencies precede lower frequencies. The type of chirp acquired by a pulse propagating through a dispersive medium depends on the sign of the group velocity dispersion (GVD) of the medium.
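Usually, GVD is quantified by the dispersion parameter $D$:
$$D = -\frac{2\pi c}{\lambda^2}\,\frac{d^2k}{d\omega^2} = -\frac{\lambda}{c}\,\frac{d^2n}{d\lambda^2}$$
with its units typically being ps/(nm·km) in telecommunications, where optical fibres can be kilometres long. The duration $\tau(L)$ of an initially transform-limited Gaussian pulse with duration $\tau_0$ (FWHM) after propagating a distance $L$ through a medium with dispersion $D$ and initial spectral width $\Delta\lambda$ can be approximated by:
$$\tau(L) \approx \sqrt{\tau_0^2 + \left(D\,L\,\Delta\lambda\right)^2}$$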
This shows pulse broadening due to dispersion.
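A short worked example makes the scale of this broadening tangible. The sketch below assumes the approximation $\tau(L) \approx \sqrt{\tau_0^2 + (D L \Delta\lambda)^2}$ and derives $\Delta\lambda$ from the Gaussian transform limit $\Delta t\,\Delta\nu = 2\ln 2/\pi$. The numerical inputs (a 10 ps pulse at 1550 nm, $D = 17$ ps/(nm·km), 10 km of fibre) are typical textbook-style values chosen for illustration, not data from this text.

```python
import math

C_LIGHT = 299_792_458.0  # m/s
GAUSSIAN_TBP = 2.0 * math.log(2.0) / math.pi  # Delta_t * Delta_nu, FWHM


def transform_limited_width_nm(tau0_ps, wavelength_nm):
    """Spectral FWHM (nm) of a transform-limited Gaussian pulse."""
    delta_nu_hz = GAUSSIAN_TBP / (tau0_ps * 1e-12)
    delta_lambda_m = (wavelength_nm * 1e-9) ** 2 * delta_nu_hz / C_LIGHT
    return delta_lambda_m * 1e9


def broadened_duration_ps(tau0_ps, dispersion_ps_nm_km, length_km, delta_lambda_nm):
    """Approximate output duration sqrt(tau0^2 + (D * L * delta_lambda)^2)."""
    spread_ps = dispersion_ps_nm_km * length_km * delta_lambda_nm
    return math.sqrt(tau0_ps**2 + spread_ps**2)


# Illustrative (assumed) inputs.
tau0_ps = 10.0
wavelength_nm = 1550.0
dispersion = 17.0   # ps/(nm km)
length_km = 10.0

delta_lambda_nm = transform_limited_width_nm(tau0_ps, wavelength_nm)
tau_out_ps = broadened_duration_ps(tau0_ps, dispersion, length_km, delta_lambda_nm)

print(f"spectral width: {delta_lambda_nm:.3f} nm")
print(f"output duration after {length_km:.0f} km: {tau_out_ps:.1f} ps")
```

Because the spectral width of a transform-limited pulse scales as $1/\tau_0$, shorter input pulses broaden faster for the same fibre, so halving $\tau_0$ roughly doubles the dispersive term $D L \Delta\lambda$.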
2.8 Phase, Group and Front Velocity
It is known that the phase velocity $v_{ph} = c/n$ can exceed the speed of light in vacuum $c$ (for $n < 1$), particularly for XUV light in materials or near absorption resonances in any spectral range. The group velocity $v_g$ can also exceed $c$ (if $n_g < 1$) or even become negative near a strong absorption line where $dn/d\omega$ can be large and negative. However, these phenomena do not violate the principles of relativity, which state that no energy or information can be transmitted faster than $c$.
Neither the phase velocity nor the group velocity is, in general, a reliable measure for the speed of information transmission. The phase velocity describes the velocity of points of constant phase (such as wave crests or zero-crossings). Since a monochromatic wave extends infinitely in time and space, observing one zero-crossing allows prediction of all others, so no new information is conveyed by its propagation at $v_{ph}$. The group velocity describes the velocity of the peak of the pulse envelope under certain conditions (narrow bandwidth, slowly varying envelope, negligible higher-order dispersion). However, for pulses with broad bandwidths or in regions of strong dispersion, the pulse shape can change dramatically, and the group velocity may not accurately represent the speed of any particular feature of the pulse, nor the speed of information. For analytic pulse shapes (whose future is determined by their past), the propagation of the envelope peak may not constitute signal transmission in the strictest sense.
The true maximum speed of information or energy transmission is the front velocity of a pulse, which is the speed of any non-analytic feature at the very beginning of a pulse (the wavefront). It has been shown that the front velocity is always equal to $c$ in vacuum, regardless of the medium the pulse subsequently enters, and it cannot exceed $c$ in any medium.