Contemporary reconstruction algorithms all include PSF modeling, which provides improved spatial resolution and better lesion detectability [4]. In this project, we developed a phantom to mimic small lesions in a patient, which allowed us to evaluate the accuracy and precision of RCs in small lesions. Our results confirm previous findings that PSF reconstruction leads to higher lesion detectability. However, we found that PSF reconstruction produced image artifacts: RCs that were a non-monotonic function of lesion size (Fig. 1) at all tumor-to-background ratios (Figs. 2 and 3), and RCs that were noisy and could exceed 1 (Fig. 4). It is crucial to understand these PSF artifacts; otherwise, they can lead to serious unwanted consequences when SUV measures are used to monitor disease and treatment response. We note that the PSF artifacts were present on two different PET/CT systems (Figs. 2 and 3) and are thus not caused by vendor-specific implementations of PSF in the reconstruction algorithms.
A limitation of this study is that our phantom did not include all the challenging characteristics of real patient scans, such as varying patient sizes, image noise, location and shape of lesions, and respiratory movement, but the setup was useful for optimizing PSF reconstruction parameters for quantification. The experiment, including production of the spheres, was highly reproducible, as seen for reconstruction without PSF (Fig. 4). For the three sets of spheres, the RCs depended on the surrounding medium, as expected. In free air, the positron range is large, leading to the lowest RCs; with silicone as the scattering medium, the RCs will be slightly higher; and with an 11C-silicone background, there will be less spill-over and the RCs will be highest. Thus, for a late image in the dynamic series with low 11C-background activity, similar RCs should be expected for spheres of the same size, but with a tendency: RCair ≤ RCsilicone ≤ RC11C-silicone. However, RCs were less reproducible for PSF reconstruction.
In practice, the physical size of a lesion is difficult to derive from PET images due to spill-out and partial volume effects [18]. Thus, in clinical studies the quantification is generally based on the maximum voxel value or the average value inside a 3D isocontour [19], and we have shown that this may lead to wrong conclusions. As an example, imagine a patient with a small (sphere-shaped) tumor being treated with 6 cycles of chemotherapy. After 2 cycles, the tumor diameter shrinks from 12 to 8 mm, with the remaining tumor tissue having unchanged physiology. The treatment appears to be effective: the tumor volume has been reduced by more than a factor of 3, and the patient has a chance of being completely cured after the remaining treatments. However, suppose that a PET scan acquired after 2 cycles of chemotherapy is compared with a pretreatment baseline PET scan. As illustrated in Fig. 1b (compare the 12-mm sphere to the 8-mm sphere), the conclusion based on SUVs could be that SUVmax has increased after the treatment, which usually indicates disease progression. This misinterpretation is caused by PSF artifacts and could lead to a wrong decision, e.g., to change or stop an effective treatment. Thus, it is important to optimize image reconstruction parameters to suppress PSF artifacts in images used for treatment response monitoring, because the artificial non-monotonic relation between SUV and lesion size is counterintuitive given previous experience from the PET literature, as illustrated in Fig. 6. We expect that this PSF artifact is only present for high-resolution imaging of small lesions, as studied here, whereas the RCs for larger lesions should not be affected by edge overshoot when using PSF reconstruction. For larger lesions, average-based RCs should approach 1, with maximum-based RCs being a bit higher due to noise.
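The volume reduction quoted in the example above follows directly from the sphere geometry. A minimal sketch of the arithmetic, assuming an ideal sphere (the function names are ours and purely illustrative):

```python
import math


def sphere_volume_mm3(diameter_mm):
    """Volume of an ideal sphere with the given diameter, in cubic millimeters."""
    radius_mm = diameter_mm / 2.0
    return 4.0 / 3.0 * math.pi * radius_mm ** 3


def volume_reduction_factor(diameter_before_mm, diameter_after_mm):
    """Ratio of the sphere volume before treatment to the volume after."""
    return sphere_volume_mm3(diameter_before_mm) / sphere_volume_mm3(diameter_after_mm)


# Diameters taken from the example in the text: 12 mm shrinking to 8 mm.
factor = volume_reduction_factor(12.0, 8.0)
print(f"Volume reduction factor: {factor:.3f}")  # (12/8)^3 = 3.375
```

The factor depends only on the cube of the diameter ratio, so a modest change in diameter corresponds to a large change in volume.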
Our results indicate that for the detection of the smallest lesions (3–4 mm diameter), PSF reconstruction can perform worse than reconstruction without PSF when using few iterations, as commonly recommended by the vendors. PSF reconstruction always improved detection for the larger lesions (6–12 mm diameter). These results may be explained by lack of convergence, and similar observations have been made in studies of small lesions and myocardial defects [20, 21], which also found that TOF reconstruction requires fewer iterations to converge. Figure 5 illustrates that inclusion of PSF delays convergence. Without PSF, RCs were monotonic functions of sphere size and increased slightly as a function of iterations. With PSF, the maximum-based measures, RC50bg and RCmax, converged more slowly than RCavg (compare Fig. 5a–b), probably because the maximum voxel value is more dependent on higher-frequency features, voxel noise, and edge artifacts in the image. For all spheres, RCavg and RCmax continued to increase as a function of iterations (we stopped at 6 iterations with 21 subsets, which is more than commonly used). This effect was particularly pronounced for the 6–8 mm spheres when using the 2-mm post-filter. The overshooting artifact leading to RC > 1 could be related to a broad PSF kernel. It has been shown that a narrower PSF kernel reduces the overshooting artifact [2, 22], and that the artifact persists even when the width of the PSF matches the size of the system’s PSF [23]. It is not possible to change the PSF kernel width on commercial clinical scanners, but our results indicate that their standard PSF kernels may be too broad. However, wider post-filters suppressed the PSF artifacts and restored the expected relation that the RCs fall monotonically as the sphere size decreases (Fig. 5), but the post-filter should not be too wide because this would unnecessarily reduce the RCs.
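The two artifacts discussed above, a non-monotonic RC curve and RC > 1, can be screened for in a few lines. The following is a minimal sketch and not the analysis used in this study; the function names and the RC values are hypothetical and chosen only to illustrate the check.

```python
def is_non_decreasing(values):
    """Return True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def check_rc_curve(diameters_mm, rcs):
    """Flag a non-monotonic RC curve and list the spheres with RC > 1.

    diameters_mm and rcs are parallel sequences (one RC per sphere).
    """
    pairs = sorted(zip(diameters_mm, rcs))  # order by sphere diameter
    ordered_rcs = [rc for _, rc in pairs]
    return {
        "monotonic": is_non_decreasing(ordered_rcs),
        "overshoot_mm": [d for d, rc in pairs if rc > 1.0],
    }


# Hypothetical, made-up RC values for illustration only (not measured data).
diameters_mm = [3, 4, 6, 8, 12]
rc_without_psf = [0.10, 0.20, 0.40, 0.55, 0.75]
rc_with_psf = [0.08, 0.25, 0.90, 1.15, 0.95]

result_without = check_rc_curve(diameters_mm, rc_without_psf)
result_with = check_rc_curve(diameters_mm, rc_with_psf)
print("without PSF:", result_without)
print("with PSF:   ", result_with)
```

Applied to each combination of iterations and post-filter width, such a check would identify the settings for which the expected monotonic relation holds.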
As an example, TrueX with 4 iterations, 21 subsets, and a 3-mm Gaussian filter produced images that suppressed edge artifacts sufficiently to restore the expected monotonic relation between RCs and sphere size, but still provided maximum-based RCs that can be slightly larger than 1 (e.g., up to 1.2 for RC50bg). The wider 4-mm Gaussian filter was needed to also avoid RCs larger than 1, while still providing RCs that are around 20% higher than without PSF for spheres larger than 6 mm diameter.
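To illustrate why a wider Gaussian post-filter lowers maximum-based values, the sketch below smooths a one-dimensional toy profile with an artificial edge overshoot. The profile, the 1-mm sampling, and the overshoot level of 1.3 are assumptions made for illustration; this is not the vendor's filter implementation and does not reproduce our measurements.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel(fwhm_mm, spacing_mm=1.0):
    """Normalized 1D Gaussian kernel, truncated at about 3 sigma."""
    sigma = fwhm_to_sigma(fwhm_mm) / spacing_mm  # sigma in samples
    radius = math.ceil(3.0 * sigma)
    weights = [math.exp(-0.5 * (j / sigma) ** 2) for j in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(profile, kernel):
    """Convolve profile with a symmetric kernel, treating values outside as zero."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(profile)):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = i + j - radius
            if 0 <= k < len(profile):
                acc += w * profile[k]
        out.append(acc)
    return out


# Hypothetical profile: 8-mm object sampled at 1 mm, true level 1.0,
# zero background, and an assumed edge overshoot of 1.3 at both edges.
profile = [0.0] * 16 + [1.3] + [1.0] * 6 + [1.3] + [0.0] * 17

peaks = {fwhm: max(smooth(profile, gaussian_kernel(fwhm))) for fwhm in (2.0, 3.0, 4.0)}
for fwhm, peak in peaks.items():
    print(f"{fwhm:.0f}-mm filter: peak = {peak:.3f}")
```

In this toy profile the peak value decreases as the filter widens, which mirrors the trade-off described above qualitatively; it says nothing about the appropriate filter width for a specific scanner.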
The best choice of PSF reconstruction parameters will be a trade-off between lesion detectability and the avoidance of PSF artifacts affecting quantification, and should be optimized for the specific task. In this work, we studied the accuracy and precision of RCs (and SUVs), which allowed us to optimize PSF reconstruction parameters for quantification. For detection, the PSF reconstruction parameters should optimize the signal-to-background ratio while also accounting for image noise, which was not studied here. PSF reconstruction leads to decreased noise at the individual voxel level but increased correlation among nearby voxels, which produces a distinct noise texture that is less granular but lumpier [9, 22]. The lumpy noise decreases reproducibility when quantifying VOIs of a size similar to or smaller than the PSF noise structures, as we have demonstrated (Fig. 4). Thus, PSF images should be used with caution for quantitative analyses of small lesions.
The phenomena described are relevant not only for the measurement of SUV but also for visual assessment of PET images. In this work, we have studied the performance of PSF reconstruction algorithms for the quantification of tracer uptake in small, sub-centimeter lesions at varying lesion-to-background ratios. It must be anticipated that the phenomena described also apply to thin tissue layers, e.g., pleura and peritoneum, the narrow periphery of cystic and necrotic tumors, and vessel walls. Also, variation in apparent cortical brain uptake could be caused by variation in cortical thickness rather than variation in metabolic activity. As a result of the same phenomena, it must also be anticipated that the SUV might be biased in small liver lesions (e.g., liver cysts) with reduced uptake compared to surrounding normal liver tissue, as well as in the lumen of arteries with increased uptake in the vessel wall (e.g., arteritis). In these special cases, the quantitative properties of high-resolution PET images should be validated using dedicated phantoms, preferably avoiding the artificial non-radioactive layers caused by plastic walls in commercial fillable phantoms [14]. A recent study explored the use of PSF-reconstructed images for volume segmentation in the delineation of tumor volumes for radiotherapy planning [24]. Compared with our study, the authors used a NEMA phantom with extra spheres and reconstructed images using wider filters, i.e., with lower spatial resolution; interestingly, they identified problems related to larger spheres and recommended against using PSF reconstruction for volume segmentation with adaptive thresholding methods. For multicenter studies, the measured SUVs are further impacted by the use of different scanners and reconstruction algorithms.
NEMA studies can be performed to harmonize SUV quantification, and a method to make SUVs comparable without applying wide post-filters that would obscure lesion detection has recently been suggested [25]. However, quantitative PET imaging of small lesions still requires further investigation.