# Inverse Problems Course Notes — Numerical Methods for X-Ray Transform Reconstruction

These notes are based on Gunther Uhlmann’s lectures for MATH 581 taught at the University of Washington in Autumn 2009.


The X-Ray transform is particularly interesting because we can use it for tomography. We can send X-rays through a patient along many different lines and measure the accumulated attenuation. Once this is done, we know the X-Ray transform of our object, so we can simply pass it to the inversion formula and reconstruct a three-dimensional model. There are two substantial problems with this:

• We cannot collect data for every line, we can only sample a finite number of lines.
• Our measurements will have error.

There are in fact even more complications (beam hardening comes to mind), but these two must be dealt with first. Loosely speaking, the solution to the first problem relies on stability estimates for the X-Ray Transform, and the solution to the second depends on the range characterization. In this post we will discuss the first problem, finite sample size, in detail. The second problem is manageable too: the range characterization of the X-Ray Transform lets us find the image function most likely to have produced the error-filled data, and it turns out that the filtered backprojection algorithm we introduce below is an important step in that process. The rest of this post is organized as follows: we will start by showing that for an infinite, discrete set of samples we have unique reconstruction. Then we will consider finite sets of samples and see that reconstruction is hopeless without a priori assumptions. From there we will introduce the filtered backprojection algorithm and state Natterer’s theorem about filtered backprojection error estimates.

1. Unique reconstruction for infinite, semi-discrete sample sets

In a practical tomography problem, we are not given the entire X-Ray Transform of a function; instead we just have

$\displaystyle Xf(x_i, \theta_j)$

for some ${x_i \in {\mathbb R}^3}$ and ${\theta_j \in S^2}$, ${1 \leq i \leq M}$, ${1 \leq j \leq N}$. Our first result is a bit of good news: even with discrete angular samples we can get a unique reconstruction (at least for distributions with compact support).

Theorem 1 For ${f \in \mathcal{E}^\prime({\mathbb R}^n)}$, if

$\displaystyle Xf(x, \theta_j) = 0$

for all ${x}$ and for an infinite number of distinct directions ${\theta_j}$, then ${f \equiv 0}$.

Proof: By the Fourier Slice Theorem,

$\displaystyle \hat{f}(\eta) = \mathfrak{F}_{\theta_j^\perp}Xf(\eta, \theta_j) = 0$

for all ${\eta \in \theta_j^\perp}$ and for all ${j}$. The set ${\{\theta_j\}}$ has an accumulation point in the compact sphere ${S^{n-1}}$; call it ${\theta}$. By continuity we must have ${\hat{f}(\eta) = 0}$ on ${{\theta^\perp}}$. For ${\eta \in {\theta^\perp}}$ we can also find a sequence ${\eta_j = \eta + c_j\theta \in \theta_j^\perp}$ such that ${\eta_j \rightarrow \eta}$. Since ${\hat{f}(\eta_j) = 0}$ along this sequence, all derivatives of ${\hat{f}}$ in the ${\theta}$ direction vanish at ${\eta}$:

$\displaystyle (\theta\cdot \nabla)^k\hat{f}(\eta) = 0$

But ${\hat{f}}$ is identically zero on ${\theta^\perp}$ too, and since the same argument applies at every point of ${\theta^\perp}$, every mixed partial derivative of ${\hat{f}}$ vanishes at ${\eta}$. Since ${f}$ has compact support, the Paley-Wiener theorem implies that ${\hat{f}}$ is analytic. An analytic function of several variables whose derivatives all vanish at a point must be identically zero. So ${\hat{f}}$, and hence ${f}$, is zero. $\Box$

2. Reconstruction is impossible for finite sample sets

If the result in the last section gave us some hope, the theorem we see in this section should disillusion us. It turns out that when we have only a finite number of samples, not only is the reconstruction non-unique (which is clear), but it can be horrendously bad: the function can be modified outside the region of interest so that every one of our observed values vanishes.

Theorem 2 (Finite number of X-rays) Let ${K \subset {\mathbb R}^n}$ have compact closure and nonempty interior. Let ${f \in C_0^{\infty}(K)}$ and consider a finite number of directions ${\{\theta_j\}_{1 \leq j \leq N}}$ with ${\theta_j \in S^{n-1}}$. Then there exists an ${f_0 \in C_0^{\infty}({\mathbb R}^n)}$ such that ${f_0 = f}$ on the interior of ${K}$ and, for all ${x \in {\mathbb R}^n}$,

$\displaystyle Xf_0(x, \theta_j) = 0$

for all ${\theta_j}$.

Proof: We will start with a direct proof in the case of a single direction, ${N = 1}$. We want to construct an ${f_0}$ with ${Xf_0(x, \theta_1) = 0}$ for all ${x}$. Using the Fourier Slice Theorem, we see that we need to construct ${f_0}$ with

$\displaystyle \hat{f}_0(\eta) = 0 \qquad \forall \eta \in \theta_1^\perp$

To make sure we have this property, we will construct ${f_0}$ so that

$\displaystyle \hat{f}_0(\eta) = \langle \eta,\theta_1\rangle h(\eta)$

for some function ${h}$. Let’s start by looking at a concrete case, and take ${\theta_1 = (1, 0, \dots, 0)}$. Now we want

$\displaystyle \hat{f}_0(\eta) = \eta_1h(\eta)$

But this happens exactly when

$\displaystyle f_0 = \frac{1}{i}\frac{\partial g}{\partial x_1}$

where ${h = \hat{g}}$. This suggests that if we can solve the differential equation

$\displaystyle \frac{\partial g}{\partial x_1} = f$

on the interior of ${K}$, then we will be done. (Exercise: why can we ignore the constant factor ${\frac{1}{i}}$?) The equation can be solved by integrating ${f}$ along the ${x_1}$ direction; to make the solution compactly supported, let ${\chi \in C_0^\infty({\mathbb R}^n)}$ with ${\chi = 1}$ on the interior of ${K}$. On the interior of ${K}$, where ${\chi \equiv 1}$ and ${\frac{\partial \chi}{\partial x_1} \equiv 0}$, we have

$\displaystyle \frac{\partial (\chi g)}{\partial x_1} = \frac{\partial \chi}{\partial x_1}g + \chi \frac{\partial g}{\partial x_1} = f$

so we can just choose

$\displaystyle f_0 = \frac{\partial \chi g}{\partial x_1}$

to complete the proof. For general ${\theta_1}$, use the same approach to solve

$\displaystyle \theta_1\cdot\nabla g= f$

and take

$\displaystyle f_0 = \theta_1\cdot\nabla(\chi g)$

General case, ${N \geq 1}$. The extension to larger ${N}$ is straightforward. Now we want to find an ${f_0}$ such that

$\displaystyle \hat{f}_0(\eta) = \langle \eta, \theta_1\rangle\langle \eta, \theta_2\rangle\cdots\langle \eta, \theta_N\rangle h(\eta)$

To do this, we solve

$\displaystyle (\theta_1\cdot\nabla)(\theta_2\cdot\nabla)\cdots(\theta_N\cdot\nabla)g = f$

which we can do by integrating ${N}$ times. Now take

$\displaystyle f_0 = \prod_{j = 1}^{N}(\theta_j\cdot\nabla)(\chi g)$

to complete the proof. $\Box$
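The computational heart of the proof is the step where we solve ${\frac{\partial g}{\partial x_1} = f}$ by integrating along the ${x_1}$ direction. Here is a minimal numerical sketch of that step (the grid and the Gaussian test function are my own choices, purely for illustration):

```python
import numpy as np

# Grid on [-2, 2]^2; axis 0 plays the role of the x_1 direction.
n = 401
x = np.linspace(-2.0, 2.0, n)
dx = x[1] - x[0]
X1, X2 = np.meshgrid(x, x, indexing="ij")

# A smooth test function, essentially supported in the unit disk.
f = np.exp(-4.0 * (X1**2 + X2**2))

# Solve dg/dx_1 = f by trapezoid-rule integration along axis 0,
# taking g = 0 on the left edge of the grid.
g = np.zeros_like(f)
g[1:, :] = np.cumsum(0.5 * (f[1:, :] + f[:-1, :]) * dx, axis=0)

# Check: differentiating g along x_1 recovers f.
df = np.gradient(g, dx, axis=0)
err = np.max(np.abs(df - f)[1:-1, :])  # interior rows use central differences
```

The same idea, applied together with a smooth cutoff ${\chi}$, produces the compactly supported ${f_0}$ of the theorem.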

So we cannot reconstruct a function from a finite sample of its X-Ray Transform without a priori assumptions. Let’s make those assumptions and reconstruct.

3. Filtered Back-projection

We want to reconstruct a function from a discrete set of samples of its X-Ray Transform. We will just look at the case of practical interest: ${n = 2}$. Here the X-Ray Transform is the same as the Radon Transform, and we want to compute

$\displaystyle cf = R^tH\partial_s h$

given data ${h = Rf}$. We will do this in two steps. First we will cut off the high frequencies, then we will discretize the integrals.

3.1. Step 1: Cut off high frequencies

We start by observing that for any distribution ${f}$ we have the identity

$\displaystyle f = f*\delta$

where ${\delta}$ is the delta function. We can choose a band-limited approximation to the identity, ${g_b}$ where

• ${\lim_{b \rightarrow \infty} g_b = \delta}$
• ${\hat{g}_b(\xi) = 0}$ for all ${|\xi| \geq b}$

For example, we could take ${\hat{g}_b = \chi_{\{|\xi| \leq b\}}}$. Now we will modify our goals. High frequency components cannot be sampled properly by a finite set of observations: roughly, we cannot resolve angular oscillations on scales finer than the sample spacing ${\frac{2\pi}{N}}$ given ${N}$ samples. So instead of trying to reconstruct ${f}$ precisely, we will try to reconstruct ${f*g_b}$, a band-limited approximation of ${f}$. Observe

Proposition 3

$\displaystyle R(f*g_b) = Rf *_{\tiny s} Rg_b$

Proof: The proof uses the Fourier Slice Theorem and the fact that the Fourier Transform turns convolution into multiplication:

$\displaystyle \begin{array}{rcl} \mathfrak{F}_s\left(R(f*g_b)\right)(\rho, \omega) &=& \hat{f}(\rho\omega)\hat{g}_b(\rho\omega) \\ &=& \mathfrak{F}_s(Rf)\,\mathfrak{F}_s(Rg_b) \\ &=& \mathfrak{F}_s\left(Rf *_{\tiny s} Rg_b\right) \end{array}$

$\Box$

Proposition 4

$\displaystyle H\partial_s(Rf *_{\tiny s} Rg_b) = Rf *_{\tiny s} H\partial_s Rg_b$

Proof: Again, we use the Fourier Transform in the ${s}$ variable.

$\displaystyle \begin{array}{rcl} \mathfrak{F}_s\left(H\partial_s(Rf *_{\tiny s} Rg_b)\right) &=& |\rho|\, \hat{f}(\rho\omega)\hat{g}_b(\rho\omega) \\ &=& \hat{f}(\rho\omega)\left(|\rho|\hat{g}_b(\rho\omega)\right) \\ &=& \mathfrak{F}_s\left(Rf *_{\tiny s} H\partial_s Rg_b\right) \end{array}$

In other words, since the Fourier Transform diagonalizes all of these operators, they commute. $\Box$

Putting these facts together, we see that we want to reconstruct ${f*g_b}$ from our samples of ${Rf}$, and

$\displaystyle f*g_b = R^t(Rf*_{\tiny s}k_b) \ \ \ \ \ (1)$

where

$\displaystyle k_b = H\partial_s Rg_b$
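In the Fourier variable dual to ${s}$, this filter is simply ${\widehat{k}_b(\rho) = |\rho|\,\hat{g}_b(\rho)}$; with the indicator choice of ${\hat{g}_b}$ it is the classical band-limited ramp (Ram-Lak) filter. Here is a sketch of sampling it with the FFT (the grid parameters are my own choices, and the overall constant depends on Fourier-transform conventions):

```python
import numpy as np

# Sample k_b at the points l * hstep for l = -q, ..., q.
q = 128
hstep = 1.0 / q
m = 2 * q + 1

# Frequency grid and the band-limited ramp |rho| * chi_{|rho| <= b}.
rho = 2.0 * np.pi * np.fft.fftfreq(m, d=hstep)
b = np.pi / hstep                      # cut off at the grid's Nyquist frequency
khat = np.where(np.abs(rho) <= b, np.abs(rho), 0.0)

# Inverse FFT gives samples of k_b, reordered so l = 0 sits in the middle.
kb = np.fft.fftshift(np.fft.ifft(khat).real) / hstep
```

The resulting kernel has a positive peak at the origin, negative sidelobes, and zero mean (since ${\widehat{k}_b(0) = 0}$), which is what makes the backprojected image sharp rather than blurred.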

3.2. Step 2: Discretize

If we can compute the quantity in Equation 1 then we are done. But

$\displaystyle Rf*_{\tiny s}k_b = \int Rf(s-t, \theta)k_b(t)dt$

which is an integral that must be approximated. There are many ways to do this, but for our purposes the trapezoid rule will suffice. Specifically, assume that the support of ${f}$ lies in the unit ball ${\{x : |x| \leq 1\}}$. Then ${Rf(s, \omega) = 0}$ for ${|s| > 1}$. Let ${s_\ell = \ell h}$ for ${-q \leq \ell \leq q}$, where ${h = \frac{1}{q}}$ is our discretization step size. Then we will use the following approximation

$\displaystyle Rf *_{\tiny s} k_b(s, \theta) \approx h \sum_{\ell = -q}^{q}Rf(s_\ell, \theta)k_b(s - s_\ell) \overset{def}{=} Rf \underset{h}{*} k_b$

which we will call the discretized convolution. We also need to discretize

$\displaystyle R^tg = \int_{S^1} g(x\cdot\omega, \omega) d\omega$

To do this, pick ${\omega_j = \left(\cos\left(\frac{2\pi j}{N}\right), \sin\left(\frac{2\pi j}{N}\right) \right)}$ and define

$\displaystyle R_d^tg(x) = \frac{2\pi}{N} \sum_{j = 1}^{N} g(x\cdot \omega_j, \omega_j)$
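Both discretizations are straightforward to code. Here is a minimal sketch, with function names of my own choosing; `Rf_samples` is assumed to hold the measured values ${Rf(s_\ell, \omega_j)}$, one column per direction:

```python
import numpy as np

def discrete_convolution(Rf_samples, kb, hstep):
    """Trapezoid-rule approximation of (Rf *_s k_b) on the sample grid.

    Rf_samples: shape (L, N), the values Rf(s_l, omega_j), L odd.
    kb: shape (L,), samples of k_b centered so kb[L // 2] = k_b(0).
    """
    out = np.empty_like(Rf_samples)
    for j in range(Rf_samples.shape[1]):
        out[:, j] = hstep * np.convolve(Rf_samples[:, j], kb, mode="same")
    return out

def discrete_backprojection(filtered, s_grid, xs, ys):
    """R_d^t: sum the filtered data over N equally spaced directions."""
    L, N = filtered.shape
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    recon = np.zeros(X.shape)
    for j in range(N):
        th = 2.0 * np.pi * (j + 1) / N
        s = X * np.cos(th) + Y * np.sin(th)      # x . omega_j
        # Linearly interpolate the filtered data in the s variable.
        recon += np.interp(s, s_grid, filtered[:, j])
    return (2.0 * np.pi / N) * recon
```

With `mode="same"` and a centered kernel of the same odd length as the data, `np.convolve` returns exactly the sum in the discretized convolution above.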

Putting these together, we can precisely define our reconstruction formula.

Definition 5 (Filtered Backprojection)

$\displaystyle f_{\text{FBP}} = R^t_d(Rf\underset{h}{*}k_b)$
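To see the definition in action, here is a small end-to-end sketch: Radon data for a centered disk of radius ${\frac{1}{2}}$ is written down analytically, filtered with a band-limited ramp, and backprojected. All grid parameters and the phantom are my own choices, and the overall scale of the output is only correct up to Fourier-convention constants:

```python
import numpy as np

# Radon data of the unit-intensity disk of radius 0.5: Rf(s) = 2*sqrt(r^2 - s^2).
q, N, r = 64, 90, 0.5
hstep = 1.0 / q
s_grid = hstep * np.arange(-q, q + 1)           # s_l = l * hstep, |s| <= 1
Rf = 2.0 * np.sqrt(np.maximum(r**2 - s_grid**2, 0.0))
Rf = np.repeat(Rf[:, None], N, axis=1)          # same profile in every direction

# Band-limited ramp filter k_b sampled on the same grid (b = pi / hstep).
L = len(s_grid)
rho = 2.0 * np.pi * np.fft.fftfreq(L, d=hstep)
kb = np.fft.fftshift(np.fft.ifft(np.abs(rho)).real) / hstep

# Discretized convolution (trapezoid rule), one direction at a time.
filtered = np.stack(
    [hstep * np.convolve(Rf[:, j], kb, mode="same") for j in range(N)], axis=1)

# Discretized backprojection over N equally spaced directions.
xs = np.linspace(-1.0, 1.0, 65)
X, Y = np.meshgrid(xs, xs, indexing="ij")
recon = np.zeros(X.shape)
for j in range(N):
    th = 2.0 * np.pi * (j + 1) / N
    s = X * np.cos(th) + Y * np.sin(th)
    recon += np.interp(s, s_grid, filtered[:, j])
recon *= 2.0 * np.pi / N

center = recon[32, 32]       # x = (0, 0), inside the disk
outside = recon[2, 2]        # x near (-0.94, -0.94), outside the disk
```

Up to an overall constant, the reconstruction is roughly flat inside the disk and close to zero outside it; refining the grids and increasing ${N}$ pushes it toward ${f*g_b}$.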

3.3. Bounding reconstruction errors: Natterer’s Theorem

Variants of this formula are the standard reconstruction method used in practical tomography. Now we have one pressing question: how good is it? Without making a priori assumptions about ${f}$, we know we can’t hope for much. In fact Natterer has shown that under reasonable conditions we can get a lot.

Theorem 6 (Natterer) Let ${f \in C_0^\infty({\mathbb R}^2)}$ have support in the unit ball ${\{x : |x| \leq 1\}}$. Assume that we have observed samples ${Rf(s_\ell, \theta_j)}$ as indicated above. Then

$\displaystyle f * g_b = f_{\text{FBP}} + e_1(x) + e_2(x)$

where

$\displaystyle \begin{array}{rcl} e_1(x) &=& R^t(k_b * Rf) - R^t(k_b\underset{h}{*} Rf) \\ e_2(x) &=& (R^t - R_d^t)(k_b\underset{h}{*} Rf) \end{array}$

That is, the first error comes from discretization of the convolution and the second comes from the discretization of ${R^t}$. Let ${0 < \alpha < 1}$, ${b \leq \alpha N}$, and ${b \leq \frac{\pi}{h}}$. Then we have pointwise estimates

$\displaystyle \begin{array}{rcl} |e_1(x)| &\leq& C \sup_{\theta \in S^1} \int_{|\sigma| \geq b}|\sigma||\hat{f}(\sigma\theta)|d\sigma \quad \forall x, |x_i| \leq 1 \\ |e_2(x)| &\leq& C(\alpha) e^{-D(\alpha)N}\|f\|_{\infty} \qquad D(\alpha) > 0 \end{array}$

Note that the estimate on ${e_1}$ is related to control of the Sobolev norm of ${f}$. We will not prove this theorem here; see Natterer’s book for the details. Instead, we will spend the remaining lectures exploring helical tomography, local tomography, and other related problems.