To evaluate the spatial resolution, the NEMA standard mandates the use of point source scans which are reconstructed using filtered backprojection. However, basically all modern PET scanners instead use an iterative maximum likelihood expectation maximization (MLEM) algorithm for reconstruction [4–15], so a scanner’s spatial resolution using filtered backprojection is not necessarily indicative of its spatial resolution for applications. While the mandated filtered backprojection is intended to benchmark the detector performance alone, we will demonstrate in the following that it disadvantages certain scanner geometries. Furthermore, the NEMA standard specifies that the spatial resolution must be determined using the projections of the reconstructed point sources inside a window in image space, without strictly specifying the size of this projection window. We will demonstrate that this can lead to an ambiguous spatial resolution which depends on the size of the projection window and allows for artificially enhancing the spatial resolution by choosing a particularly large projection window for certain scanner geometries.
The main disadvantage of filtered backprojection is that it does not include any model of the detector and assumes an ideal, ring-like PET scanner, while the detectors in real-world PET scanners are usually in a block geometry with anisotropic spatial resolutions. Lines of response (LORs) perpendicular to the detector’s front face are detected with the highest resolution, while tilted LORs have a parallax error in the detected position, which increases with the tilt of the LORs relative to the detector’s front faces, as illustrated in Fig. 1. In principle, this effect can be reduced by detectors which are able to determine the depth of interaction (DOI) of the gamma interaction, but in practice most PET systems do not employ detectors with DOI determination [4–6, 11, 12, 14, 16]. Additionally, PET rings have gaps between the detectors, where no LORs are detected at all.
These issues with filtered backprojection will lead to artifacts in the reconstructed activity. For instance, each angle where the PET ring has an enhanced spatial resolution creates an excess in reconstructed activity along the line connecting this position with the point source, and each angle with degraded spatial resolution creates a reduction in reconstructed activity along the respective line. Similarly, gaps between the detectors create a lack of reconstructed activity along these lines.
To understand and demonstrate this behavior, it is instructive to look at these effects in sinogram space. In sinogram space, the enhanced spatial resolution of perpendicular LORs manifests as hot spots, or rather peaks, in the center of each detector module, as Fig. 2g shows. With increasing distance from the center of the detector module, the spatial resolution degrades, blurring the line of the point source in sinogram space. We model this as the convolution of the sinogram of a Gaussian point source and the parallax error of the detector. The parallax error of the detector stack can be modeled as the shape of two triangles connected at their tips, as shown in Fig. 2d. The parallax error is proportional to sinφ, where φ is the angle to the normal of the block detector as defined in Fig. 1. The parallax error shown in Fig. 2d is a small-angle approximation of this.
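The small-angle approximation can be written out explicitly. In the following sketch, the proportionality constant k is an assumed symbol that does not appear in the text above; it stands for the geometry-dependent factors of the detector.

```latex
\Delta(\varphi) = k \sin\varphi \approx k\,\varphi
\qquad \text{for } |\varphi| \ll 1
```

Under this approximation, the width of the blur grows linearly with |φ| on both sides of φ = 0, which corresponds to the shape of two triangles connected at their tips.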
In addition to the inherent problems of mandating the use of filtered backprojection, the NEMA standard mandates projecting the reconstructed three-dimensional activity onto different one-dimensional axes using a projection window. However, the size of the projection window is not fully specified: “The response function is formed by summing all one-dimensional profiles that are parallel to the direction of measurement and within at least two times the FWHM of the orthogonal direction” [1, p. 7]. The first issue is that this definition is circular, since the minimal size of the projection window to determine the FWHM is defined using the FWHM itself. One can easily fix this problem, either by using a sufficiently large projection window in the first place or by reducing the size of the projection window iteratively, depending on the FWHM determined in the previous iteration. However, the much bigger problem is that the size of the projection window can strongly influence the resulting spatial resolution. The cause of this is the integration of the star-like artifacts created by the anisotropic spatial resolution, as we demonstrate with the following simulation, shown in Fig. 2.
We created the activity distribution of an ideally reconstructed point source by assuming a rotationally symmetric two-dimensional normal distribution, shown in Fig. 2a. The position of the point source is off-center at a radial offset of 10 mm. To investigate the essence of the effects, we do not include noise in our simulation. From this ideally reconstructed point source, we create a sinogram by forward projection (i.e., by applying a Radon transformation). The resulting sinogram is shown in Fig. 2b.
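As an illustration of this forward projection step, the following minimal sketch uses the fact that the Radon transform of a rotationally symmetric two-dimensional normal distribution is known analytically: each projection is a one-dimensional Gaussian of the same width, centered at x0·cos θ + y0·sin θ. This is not the code used for Fig. 2; the function name `gaussian_sinogram`, the width sigma = 0.5 mm, and the binning are assumptions chosen for illustration only. Only the radial offset of 10 mm is taken from the text.

```python
import math

def gaussian_sinogram(x0, y0, sigma, n_angles=180, n_bins=129, bin_width=0.25):
    """Analytic sinogram of a unit-integral 2D Gaussian centered at (x0, y0).

    Returns a list of rows, sinogram[angle_index][radial_bin].
    All lengths are in mm; angles cover [0, pi).
    """
    half = (n_bins - 1) / 2
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    sinogram = []
    for a in range(n_angles):
        theta = math.pi * a / n_angles
        # Radial position of the source in this projection
        s0 = x0 * math.cos(theta) + y0 * math.sin(theta)
        row = []
        for b in range(n_bins):
            s = (b - half) * bin_width
            row.append(norm * math.exp(-0.5 * ((s - s0) / sigma) ** 2))
        sinogram.append(row)
    return sinogram

# Point source at a radial offset of 10 mm (sigma is an assumed value)
sino = gaussian_sinogram(x0=10.0, y0=0.0, sigma=0.5)
```

The source traces the familiar sinusoidal line through sinogram space, and no noise is added, in line with the noise-free simulation described above.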
We include the gaps between the detector stacks in our simulation by creating a sensitivity sinogram, where all bins corresponding to gaps are 0 and bins corresponding to sensitive detector area are 1, as shown in Fig. 2c. The simulated geometry is depicted in Fig. 1 and follows the geometry of the Hyperion II D scanner to allow a comparison between simulation and measurement. When we include this model of gaps in our simulation by multiplying the sensitivity sinogram with our point-source sinogram (Fig. 2e) and then performing a filtered backprojection (i.e., an inverse Radon transformation), we get a reconstructed point source with slight artifacts, shown in Fig. 2h. As stated above, the artifacts are a lack of reconstructed activity along the lines connecting the gaps and the point source. When analyzing the spatial resolution of the filtered backprojection with gaps, we observe little influence of the gaps compared to the filtered backprojection of an ideal sinogram without gaps. More importantly, the resulting spatial resolution of 1.2 mm FWHM is stable to changes in the size of the projection window, as shown in Fig. 2k. Thus, gaps between the detectors are not the cause of severe artifacts and only have a very minor influence on the resulting spatial resolution with the usually small gaps of PET scanners.
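A binary sensitivity sinogram of this kind could be sketched as follows. The sketch assumes an idealized circular ring in which gaps occupy a fixed fraction of each module pitch; it does not reproduce the actual Hyperion II D geometry. The names `in_gap`, `sensitivity_sinogram`, and `apply_sensitivity`, as well as all parameter values, are hypothetical.

```python
import math

def in_gap(ring_angle, n_modules, gap_fraction):
    """True if a position on the ring (angle in radians) lies in a gap.

    Assumption: gaps are centered on the module boundaries at multiples
    of the module pitch and cover `gap_fraction` of the pitch.
    """
    pitch = 2.0 * math.pi / n_modules
    u = (ring_angle / pitch) % 1.0  # position within one pitch
    return u < gap_fraction / 2 or u > 1.0 - gap_fraction / 2

def sensitivity_sinogram(n_angles, n_bins, bin_width, ring_radius,
                         n_modules, gap_fraction):
    """Mask with 1 for detectable LORs and 0 for LORs ending in a gap."""
    half = (n_bins - 1) / 2
    mask = []
    for a in range(n_angles):
        theta = math.pi * a / n_angles
        row = []
        for b in range(n_bins):
            s = (b - half) * bin_width
            if abs(s) >= ring_radius:
                row.append(0)
                continue
            # An LOR with normal angle theta and offset s meets the ring
            # at the angles theta +/- arccos(s / R).
            delta = math.acos(s / ring_radius)
            blocked = (in_gap(theta + delta, n_modules, gap_fraction)
                       or in_gap(theta - delta, n_modules, gap_fraction))
            row.append(0 if blocked else 1)
        mask.append(row)
    return mask

def apply_sensitivity(sinogram, mask):
    """Bin-wise product of a sinogram and the sensitivity mask."""
    return [[v * m for v, m in zip(srow, mrow)]
            for srow, mrow in zip(sinogram, mask)]

# Hypothetical example: 10 modules, gaps covering 10% of the pitch
mask = sensitivity_sinogram(180, 129, 0.25, 100.0, 10, 0.1)
uniform = [[1.0] * 129 for _ in range(180)]
masked = apply_sensitivity(uniform, mask)
```

Multiplying by the mask removes the counts of all LORs that end in a gap, which is the step that precedes the filtered backprojection described above.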
When we additionally include the effect of the anisotropic detector resolution due to parallax errors by convolving the point-source sinogram and the point spread function in Fig. 2d, the resulting filtered backprojection in Fig. 2i exhibits a star-like artifact, i.e., the lines connecting the center of each detector stack and the point source exhibit a visible excess in activity.
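The angle-dependent blurring can be illustrated with a simplified sketch. It assumes that the blur acts only along the radial direction of the sinogram, that its width grows linearly with the angle φ to the nearest module normal (the small-angle approximation), and that it can be approximated by a box kernel. It also ignores the dependence of φ on the radial position of the LOR. The function names, the parameter `k_bins_per_rad`, and the reference angle `theta0` are hypothetical and are not taken from the simulation shown in Fig. 2.

```python
import math

def parallax_blur_width(theta, n_modules, k_bins_per_rad, theta0=0.0):
    """Blur width in bins for projection angle theta (radians).

    Assumption: LORs are perpendicular to a module face at
    theta0 + m * pitch, and the blur grows linearly with |phi|.
    """
    pitch = 2.0 * math.pi / n_modules
    phi = (theta - theta0 + pitch / 2) % pitch - pitch / 2
    return k_bins_per_rad * abs(phi)

def box_blur(row, width_bins):
    """Convolve one sinogram row with a normalized box kernel.

    Bins outside the row are treated as zero.
    """
    h = int(round(width_bins / 2))
    if h == 0:
        return list(row)
    n = len(row)
    return [sum(row[max(0, i - h):min(n, i + h + 1)]) / (2 * h + 1)
            for i in range(n)]

def blur_sinogram(sinogram, n_modules, k_bins_per_rad, theta0=0.0):
    """Apply the angle-dependent radial blur to every projection."""
    n_angles = len(sinogram)
    blurred = []
    for a, row in enumerate(sinogram):
        theta = math.pi * a / n_angles
        width = parallax_blur_width(theta, n_modules, k_bins_per_rad, theta0)
        blurred.append(box_blur(row, width))
    return blurred
```

Projections at angles where LORs hit a module perpendicularly stay sharp, while projections in between are smeared out, which produces the hot spots in sinogram space discussed above.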
If one of these excesses aligns with one of the Cartesian projection axes, and with the simulated geometry they do so for the x axis, the projection onto the axis perpendicular to this axis will result in a peaked excess at the maximum of the line profile, as shown in Fig. 3. A scanner’s spatial resolution is defined by the FWHM and FWTM of this profile, which depend strongly on the height of the maximum. Therefore, a peaked excess of the maximum will significantly enhance the resulting spatial resolution. For our geometry, this enhancement is only observed for the y axis, because only the x axis has an excess in activity aligned with it, as there is no detector stack perpendicular to the y axis. This difference between the resolution in x and y is essentially an artifact and basically non-existent in real-world applications using an iterative maximum likelihood expectation maximization (MLEM) reconstruction. More importantly, the extent of this effect depends strongly on the size of the projection window, as demonstrated in Fig. 2k. Increasing the size of the projection window enhances the resulting spatial resolution in y (i.e., decreases FWHM and FWTM) while degrading the spatial resolution in x. This makes comparison of the spatial resolution of different PET systems difficult and maybe even impossible, as the NEMA standard neither specifies a clear projection window size nor mandates that the window size used be reported. Thus, most publications do not state the projection window used [5, 7, 14, 16]. Other geometries may not exhibit this behavior at all, favoring or disfavoring systems which have detectors perpendicular to a Cartesian axis.
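The dependence on the projection window can be reproduced with a toy example. The sketch below assumes that the maximum is estimated by a parabolic fit through the peak sample and its two neighbors and that the FWHM is found by linear interpolation between adjacent samples. The test image, a Gaussian with a thin line of excess activity along the x axis, is a hypothetical stand-in for one arm of the star-like artifact and is not derived from a filtered backprojection; all function names and values are assumptions.

```python
import math

def window_profile(image, direction, center, half_width):
    """Sum all 1D profiles along `direction` within the projection window.

    image[y][x]; center = (cy, cx); half_width in pixels, measured
    along the orthogonal direction.
    """
    cy, cx = center
    if direction == "y":
        cols = range(cx - half_width, cx + half_width + 1)
        return [sum(row[x] for x in cols) for row in image]
    rows = range(cy - half_width, cy + half_width + 1)
    return [sum(image[y][x] for y in rows) for x in range(len(image[0]))]

def fwhm(profile, pixel_size=1.0, fraction=0.5):
    """Width of the profile at `fraction` of its fitted maximum."""
    n = len(profile)
    i = max(range(1, n - 1), key=lambda k: profile[k])
    y0, y1, y2 = profile[i - 1], profile[i], profile[i + 1]
    curvature = y0 - 2.0 * y1 + y2
    # Vertex of the parabola through the three samples around the peak
    peak = y1 - (y2 - y0) ** 2 / (8.0 * curvature) if curvature < 0 else y1
    level = fraction * peak
    left = i
    while left > 0 and profile[left] > level:
        left -= 1
    right = i
    while right < n - 1 and profile[right] > level:
        right += 1
    if profile[left] > level or profile[right] > level:
        raise ValueError("profile does not fall below the requested level")
    # Linear interpolation between the samples enclosing the level
    x_left = left + (level - profile[left]) / (profile[left + 1] - profile[left])
    x_right = right - (level - profile[right]) / (profile[right - 1] - profile[right])
    return (x_right - x_left) * pixel_size

def toy_image(size, c, sigma, line_amplitude):
    """Gaussian point source plus a thin excess line along x (hypothetical)."""
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma ** 2))
             + (line_amplitude if y == c else 0.0)
             for x in range(size)] for y in range(size)]

size, c, sigma = 65, 32, 3.0
clean = toy_image(size, c, sigma, 0.0)
artifact = toy_image(size, c, sigma, 0.05)

fwhm_gauss = fwhm(window_profile(clean, "y", (c, c), 6))
fwhm_small = fwhm(window_profile(artifact, "y", (c, c), 6))
fwhm_large = fwhm(window_profile(artifact, "y", (c, c), 30))
```

In this toy case, the excess line adds to the maximum of the y profile in proportion to the window width, so the FWHM in y shrinks as the window grows, while the profile of the clean Gaussian is unaffected by the window size.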
The measurement and filtered backprojection reconstruction of point sources with the Hyperion II D scanner shown in Fig. 2g and j look very similar to the simulation which includes parallax error and gaps: The sinogram has the same hot spots at the angles where the lines of response are perpendicular to the detector surface, and the reconstruction exhibits the same star-like artifact. The analysis of the reconstruction yields the same observed difference in spatial resolution between the x and y axes. Additionally, we observe the same strong dependence on the size of the projection window, shown in Fig. 2m.
An extreme example of a scanner geometry affected by this issue would be a box geometry instead of the conventional ring geometry, i.e., a PET scanner with 4 large perpendicular detector modules without DOI capabilities. With such a geometry, the filtered backprojection artifact would have the shape of a cross, with both lines of excessive activity aligned with the x and y axes. Thus, the artifact would enhance the resolution along both the x and y axes by boosting the maximum of both projections. This scenario is not solely hypothetical, as small-animal PET scanners with the described box-like geometry exist, such as PETbox4 [17]. In the NEMA NU-4 performance evaluation of PETbox4, the authors state that using FBP was not possible “since a FBP algorithm specific for the PETbox4 system with the unconventional geometry has not been developed” [17, p. 3797].
Other examples of published performance evaluations which have omitted filtered backprojection altogether when evaluating the spatial resolution are [8, 18]. This is an indication that these groups do not consider the results based on filtered backprojection indicative of the performance of their systems.
Fixing the issues of this method and proposing a better method to evaluate the spatial resolution is challenging. The NEMA standards committee was surely aware of many of these issues, and we believe most of the PET community is aware of the issues with filtered backprojection as well. However, so far, none of the performance publications based on NEMA has discussed the issues presented here, so we believe it is worthwhile to state them to start a discussion.
One obvious solution would be to simply not use filtered backprojection and to perform the reconstruction with the default reconstruction method provided with the scanner, which is also used for the evaluation of the image quality phantom and for real-world applications. In modern scanners, this is usually an iterative reconstruction algorithm, e.g., ordered subset expectation maximization [19] or maximum likelihood expectation maximization [20, 21]. However, these algorithms can artificially enhance the spatial resolution of point sources without background activity due to, e.g., the non-negativity constraint or resolution recovery [22–24]. Thus, the reconstruction of a point source would mostly be a benchmark of the reconstruction and not of the underlying detector performance. We suspect that these arguments were the main reason why the NEMA standards committee chose filtered backprojection instead.
One alternative could be the evaluation of spatial resolution using a Derenzo hot-rod phantom. The standard could specify the geometry of such a phantom, specify the activity and scan time, allow the use of the reconstruction method supplied by the manufacturer, and then define a quantitative analysis method. The Derenzo phantom is already well-established in the community as a benchmark to evaluate the spatial resolution. For instance, several NEMA performance publications already include such a measurement as a benchmark of spatial resolution [5, 7, 12, 15]. However, these results are not easily comparable, as there currently is no standardized quantitative analysis method to determine the spatial resolution from a measurement of a Derenzo phantom. Usually, the spatial resolution is estimated by making a qualitative judgment at which distance the hot rods are still discernible. In principle, such a definition of spatial resolution based on the ability to resolve two close points is very reasonable and commonly used as a definition of spatial or angular resolution for telescopes and microscopes [25, 26]. However, for a quantitative definition of spatial resolution, there must be a standardized limit of the peak-to-valley ratio between two resolvable point sources, i.e., how much the intensity between the two peaks must dip to make them just resolvable. In a new standardized definition of PET spatial resolution, the PET community could follow the commonly used Rayleigh criterion with an intensity dip of 26.5% [27], or standardize a different limit.
For the scan of a Derenzo phantom, such a resolvability criterion would require determining the valley-to-peak ratios of the profile lines over the different regions of the phantom. To include anisotropies in the spatial resolution, the profile lines should be defined over multiple angles as demonstrated in Fig. 4a. Figure 4b shows the resulting distribution of valley-to-peak ratios for the phantom’s 0.9-mm region. We would recommend that the spatial resolution is defined as the hot-rod distance in the region where at least 90% of the valley-to-peak ratios are below 0.735, i.e., the valley dips are above 26.5%, for consistency with the Rayleigh criterion. Alternatively, one could define a limit based on the average valley-to-peak ratio of a region or use a different percentile than the suggested 90%. As shown in Fig. 4b, the region with distances of 0.9 mm has 100% of the valley-to-peak ratios below 0.735. For the 0.8-mm region, over half of the valley-to-peak ratios would be above 0.735 in our measurement. Thus, the resulting spatial resolution would be 0.9 mm.
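The proposed criterion could be sketched in code as follows. This is only an illustration under stated assumptions: the peak value of each rod pair is taken as the mean of the two neighboring peaks, the valley as the minimum between them, and the spatial resolution as the smallest rod distance whose region passes the criterion. None of these details are fixed by the text above. The function names and the example ratios are hypothetical and are not measured data.

```python
def valley_to_peak_ratios(profile, peak_indices):
    """Valley-to-peak ratio for each pair of adjacent peaks in a profile."""
    ratios = []
    for left, right in zip(peak_indices, peak_indices[1:]):
        valley = min(profile[left:right + 1])
        peak = 0.5 * (profile[left] + profile[right])  # assumed definition
        ratios.append(valley / peak)
    return ratios

def region_resolved(ratios, threshold=0.735, required_fraction=0.9):
    """True if enough ratios lie below the threshold (dip above 26.5%)."""
    if not ratios:
        return False
    below = sum(1 for r in ratios if r < threshold)
    return below >= required_fraction * len(ratios)

def spatial_resolution(ratios_by_distance, threshold=0.735,
                       required_fraction=0.9):
    """Smallest rod distance (mm) whose region is resolved, or None."""
    resolved = [d for d, ratios in ratios_by_distance.items()
                if region_resolved(ratios, threshold, required_fraction)]
    return min(resolved) if resolved else None

# Made-up ratios per Derenzo region, keyed by rod distance in mm
example = {
    1.0: [0.30, 0.35, 0.40],
    0.9: [0.55, 0.60, 0.70],
    0.8: [0.70, 0.80, 0.85, 0.90],
}
resolution = spatial_resolution(example)
```

Changing `threshold` or `required_fraction` corresponds to standardizing a different limit or percentile, as discussed above.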
To prevent arbitrary selection of peaks and valleys in a noisy reconstruction, the standard could specify a limit for the allowed deviation from the physical hot-rod distances when selecting the positions of peaks and valleys in the profiles of the Derenzo region.
To evaluate the influence of radial and axial offsets on the spatial resolution, the standard could specify different radial distances at which the Derenzo phantom should be placed. Similarly, the standard could also specify additional measurements of the rotated phantom to investigate the isotropy of the spatial resolution.
In our opinion, such a method would depend much less on the system’s geometry and technology and would provide a much more realistic benchmark, closely mirroring real-world use of the system. As one of the disadvantages, the precision of this method would be limited by the differences in hot-rod distances between the phantom’s regions. However, with commonly used Derenzo phantoms, one would achieve a precision in the determination of the spatial resolution of 0.1 mm, which is more than adequate to assess the scanner’s viability for intended applications. Another drawback of the Derenzo phantom is that it is missing warm background activity, which could potentially lead to an artificial enhancement of spatial resolution with a high number of reconstruction iterations.
The outlined method is only intended as one possible first suggestion. We believe that developing a robust and objective method to benchmark the spatial resolution is a challenging and important research problem. One advantage of the current evaluation method is its simplicity, which simplifies Monte Carlo simulation and similar research.
As another alternative, Lodge et al. [28] have recently proposed a novel method for the measurement of clinical PET spatial resolution using a homogeneous cylinder phantom at an oblique angle. Another idea would be to use two adjacent point sources in a warm background, similar to the method described in [24].