Point-spread function reconstructed PET images of sub-centimeter lesions are not quantitative
© The Author(s). 2017
Received: 30 July 2016
Accepted: 10 December 2016
Published: 13 January 2017
PET image reconstruction methods include modeling of resolution degrading phenomena, often referred to as point-spread function (PSF) reconstruction. The aim of this study was to develop a clinically relevant phantom and characterize the reproducibility and accuracy of high-resolution PSF reconstructed images of small lesions, which is a prerequisite for using PET in the prediction and evaluation of responses to treatment.
Sets of small homogeneous 18F-spheres (range 3–12 mm diameter, relevant for small lesions and lymph nodes) were suspended and covered by a 11C-silicone, which provided a scattering medium and a varying sphere-to-background ratio. Repeated measurements were made on PET/CT scanners from two vendors using a wide range of reconstruction parameters. Recovery coefficients (RCs) were measured for clinically used volume-of-interest definitions.
For non-PSF images, RCs were reproducible and fell monotonically as the sphere diameter decreased, which is the expected behavior. PSF images converged more slowly and had artifacts: RCs did not fall monotonically as sphere diameters decreased but had a maximum RC for sphere sizes around 8 mm, RCs could be greater than 1, and RCs were less reproducible. To some degree, post-reconstruction filters could suppress PSF artifacts.
High-resolution PSF images of small lesions showed artifacts that could lead to serious misinterpretations when used for monitoring treatment response. Thus, it could be safer to use non-PSF reconstruction for quantitative purposes unless PSF reconstruction parameters are optimized for the specific task.
The spatial resolution and signal-to-noise ratio of PET images have improved significantly due to image reconstruction methods that include modeling of resolution-degrading phenomena, particularly detector effects such as crystal size, inter-crystal scattering, and crystal penetration, but also positron range and angle deviation from 180° during electron-positron annihilation. This more accurate system matrix is used in statistical reconstruction algorithms, also referred to as resolution recovery, resolution modeling, or point-spread function (PSF) reconstruction [1, 2]. Today, all major vendors of clinical whole-body PET/CT systems provide PSF reconstruction algorithms for clinical PET imaging. PSF reconstruction produces images with improved isotropic spatial resolution, reduced spill-in/spill-out, and ultimately increased activity concentration (Bq/mL) or standardized uptake value (SUV) in small lesions that are thus more easily detected and characterized. These benefits have been demonstrated as higher recovery coefficients (RCs) in NEMA phantom studies and improved lesion detectability in patient studies.
There is a growing interest in using PET in the prediction and evaluation of early responses to treatment, such as chemotherapy, radiotherapy, and local ablative therapy, e.g., of liver metastases, in order to identify non-responders as soon as possible and optimize their treatment strategy [5–8]. PET images used to plan and evaluate therapeutic strategies should have a high spatial resolution and good contrast-to-noise ratio, but foremost the measurements need to be quantitative and reproducible. PSF images have distinct noise texture [3, 9] and artifacts such as edge overshoot and hyper-resolution of focal uptake [10–12]. These unwanted effects on the quantification of small structures need to be thoroughly studied in a clinically relevant setting, as these artifacts are not revealed by the standard NEMA image quality test. The NEMA image quality measurement is widely used to compare the image quality of different imaging systems: a phantom containing six fillable spheres of different sizes (10–37 mm diameter) is imaged so that the hot spheres have a fixed activity concentration four and eight times that of the background. However, the NEMA phantom and other commercially available phantoms have limitations with respect to evaluation of small lesions, as their spheres are too large compared to the size of small lesions that can be detected in clinical practice. Furthermore, the fillable acrylic glass spheres in the phantoms separate the hot spheres from the background activity by a non-radioactive layer, which does not mimic the physiologic reality and causes quantitative errors, particularly for small diameters.
In this study, we developed a new phantom to produce clinically relevant measurements to study the performance of PSF reconstruction algorithms for the quantification of small sub-centimeter lesions in direct contact with surrounding tissue. We scanned phantoms with small sub-centimeter spheres at a wide range of sphere-to-background ratios (1:1 to 15:1). PET data were acquired on PET/CT systems from two vendors, images were reconstructed with and without PSF, and the effects of convergence and post-reconstruction filtering were explored. We measured activity concentrations and RCs based on volumes of interest (VOIs) commonly used for cancer diagnostics to assess whether PSF reconstructed images of small lesions provide accurate quantification, which is needed for treatment response monitoring or tracer kinetic modeling.
Preparation of wall-less sub-centimeter sphere phantoms
A 50-mL solution of 18FDG was mixed with 4.5-g gel powder (Orthoprint, Zhermack Clinical, Italy) by vigorously shaking for 30 s. The homogeneous mixture was drawn into a syringe and injected into a row of 12 spherical molds sitting along a thin 30-cm string. The gel was allowed to settle for around 2 min before the 12 hot spheres (diameters in mm: 10, 10, 6, 6, 12, 4, 10, 3, 8, 4, 4, and 6) were released (see Additional file 1). Three sets of hot spheres were made from the same 18FDG-solution and carefully inspected for imperfections such as air bubbles. The first set was suspended in free air. The second set was suspended and covered by 250-mL two-component silicone (Magic Gel 1000, Raytech, Italy) that provided a scattering medium. The third set was embedded in 250-mL 11C-silicone that was prepared by mixing a small volume of 11C solution into one of the silicone components using a magnetic stirrer before adding the second component, which allowed us to study the spheres at varying sphere-to-background ratios. We verified that 18FDG did not move by diffusion from the spheres into the silicone. New phantoms were prepared for each PET scan and placed in the scanner with a large source of the 18FDG-reference (around 25 mL remaining from the sphere production). The sphere production method using molds was originally developed by Skretting and co-workers, and in this study it was further improved by the surrounding non-diffusible scatter medium that allowed us to construct a clinically relevant phantom mimicking small homogeneous hot lesions in direct contact with surrounding tissue.
PET list mode data acquisitions were performed on two widely used clinical PET/CT systems: Siemens Biograph 64 Truepoint TrueV (Siemens Healthcare, Erlangen, Germany) and GE Discovery 690 (GE Healthcare, Milwaukee, WI, USA). The scanners have similar spatial resolutions when measured according to NEMA NU2-2007 using filtered back projection: 4.4 and 4.7 mm FWHM. Both PET/CT systems include iterative PSF reconstruction algorithms that further improve the spatial resolutions. List mode data were divided into 10-min frames and reconstructed with PSF. The data were decay-corrected using the 18F half-life and corrected for scatter and attenuation using CT images. For Siemens Biograph 64, list mode data were separated into prompt and random sinograms, and images were reconstructed using attenuation weighted ordered subset expectation maximization without PSF (Iterative3D) and with PSF (TrueX). For GE Discovery 690, list mode data were reconstructed using the VuePoint FX SharpIR algorithm with time-of-flight (TOF) and PSF. Iterative reconstruction algorithms without resolution modeling are not available for GE Discovery 690. We explored the reconstruction algorithms with varying numbers of iterations (keeping the number of subsets constant), widths of post-filters, and sizes of the reconstruction matrices with focus on high-resolution imaging of small lesions (see figures for details).
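Because the spheres contain 18F and the background contains 11C, the sphere-to-background ratio rises from one 10-min frame to the next. The following is a minimal sketch of that relationship, not the authors' analysis code. The half-lives are approximate textbook values rather than numbers taken from this article, the function names are hypothetical, and the ratio is evaluated at a single time point instead of being averaged over a frame.

```python
import math

# Approximate physical half-lives in minutes (assumed textbook values,
# not stated in the article).
T_HALF_F18_MIN = 109.8
T_HALF_C11_MIN = 20.4

LAMBDA_F18 = math.log(2) / T_HALF_F18_MIN
LAMBDA_C11 = math.log(2) / T_HALF_C11_MIN


def sphere_to_background_ratio(ratio_at_start, minutes_elapsed):
    """Ratio of 18F sphere activity to 11C background activity after a delay."""
    # Both isotopes decay, but 11C decays faster, so the ratio grows.
    # A common decay-correction factor would cancel in this ratio.
    return ratio_at_start * math.exp((LAMBDA_C11 - LAMBDA_F18) * minutes_elapsed)


def minutes_to_reach(ratio_at_start, target_ratio):
    """Time needed for the ratio to grow from ratio_at_start to target_ratio."""
    return math.log(target_ratio / ratio_at_start) / (LAMBDA_C11 - LAMBDA_F18)


if __name__ == "__main__":
    # Hypothetical scan starting at a 1:1 ratio, sampled every 10 min.
    for frame_start in range(0, 101, 10):
        ratio = sphere_to_background_ratio(1.0, frame_start)
        print(f"t = {frame_start:3d} min  ratio = {ratio:5.2f}:1")
    print(f"minutes to go from 1:1 to 15:1: {minutes_to_reach(1.0, 15.0):.1f}")
```

Under these assumptions a single dynamic acquisition can cover a wide span of ratios without preparing a new phantom for each contrast level.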
Each PET image of the phantoms was analyzed as follows: the true activity concentration in the spheres was measured using a 6-mL cube-shaped VOI placed on the large homogeneous 18FDG-reference, and the background activity concentration was measured using a cube-shaped 15-mL VOI placed on the 11C-silicone. The large VOIs were not expected to be affected by PSF artifacts and partial volume effects. For the spheres suspended in 11C-silicone, the true sphere-to-background ratio was calculated as the true activity concentration in the spheres divided by the background activity concentration, representing the average activity ratio during the 10-min time frames. In the last PET frame, a centrally placed sphere-shaped VOI was defined on each sphere with the same diameter as the physical sphere. Then, frame-by-frame, three commonly used metrics were extracted within the sphere-shaped VOIs: maximum activity concentration (Amax), background-corrected 3D isocontour at 50% max (A50bg), and average activity concentration (Aavg). Amax and A50bg are measures that are available on most visualization tools and therefore reflect how physicians perceive lesions during image reading when extracting SUV measures. In contrast, Aavg is rarely used clinically because it requires a VOI that exactly covers the lesion, and the exact dimensions of a lesion are usually not known and cannot be reliably deduced from clinical PET images (particularly not for the smallest diameters). The activity concentrations were used to calculate RCs (RCmax, RC50bg, and RCavg) defined as the measured activity concentration divided by the true activity concentration. The RCs and the true sphere-to-background ratio were calculated frame-by-frame in the dynamic series. Both measures are ratios and independent of decay correction. The shorter half-life of the 11C background generated an increasing true sphere-to-background ratio as a function of time.
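The three recovery coefficients described above can be sketched with NumPy on a synthetic volume. This is an illustration under stated assumptions, not the software used in the study: the helper names are hypothetical, the A50bg threshold is assumed to lie halfway between the background and the maximum, the isocontour is approximated by simple thresholding inside the sphere-shaped VOI (no connectivity check), and the activity values are made up.

```python
import numpy as np


def sphere_mask(shape, center_vox, diameter_mm, voxel_mm):
    """Boolean mask of voxels whose centers lie inside a sphere."""
    grids = np.indices(shape).astype(float)
    dist2 = sum(((g - c) * voxel_mm) ** 2 for g, c in zip(grids, center_vox))
    return dist2 <= (diameter_mm / 2.0) ** 2


def recovery_coefficients(image, mask, true_conc, background_conc):
    """Return (RCmax, RC50bg, RCavg) for the voxels selected by mask.

    Assumes a hot lesion, i.e. the VOI maximum exceeds the background.
    """
    voi = image[mask]
    a_max = voi.max()
    a_avg = voi.mean()
    # Assumed definition: threshold halfway between background and maximum.
    threshold = background_conc + 0.5 * (a_max - background_conc)
    a_50bg = voi[voi >= threshold].mean()
    return a_max / true_conc, a_50bg / true_conc, a_avg / true_conc


if __name__ == "__main__":
    shape = (41, 41, 41)
    true_conc, background_conc = 10.0, 1.0  # hypothetical values, e.g. kBq/mL
    # Idealized, unblurred 8-mm sphere on 1-mm voxels in a uniform background.
    mask = sphere_mask(shape, (20, 20, 20), diameter_mm=8.0, voxel_mm=1.0)
    image = np.full(shape, background_conc)
    image[mask] = true_conc
    rc_max, rc_50bg, rc_avg = recovery_coefficients(
        image, mask, true_conc, background_conc
    )
    print(f"RCmax={rc_max:.2f}  RC50bg={rc_50bg:.2f}  RCavg={rc_avg:.2f}")
```

In a real image the true and background concentrations would come from the large reference and background VOIs, so that each RC is a ratio of two values measured in the same frame.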
Visual inspection of hot spheres in free air
Recovery as function of sphere size and sphere-to-background ratio
Reproducibility of RC measurements
RC as function of iterations
Optimized reconstruction parameters
The majority of the tested combinations of iterations and post-filter widths led to RC50bg larger than 1 and a clearly non-monotonic relationship between RC and sphere size (Fig. 5a). However, the use of wider post-filters suppressed the PSF artifact and restored the monotonic relationship but at the cost of lower spatial resolution. Very wide post-filters removed the PSF artifacts but also the benefit of higher RCs using PSF reconstruction. For the data in Fig. 5a, there was not one optimal set of reconstruction parameters for all sphere sizes. The use of 4 iterations, 21 subsets, and a 3-mm Gaussian filter produced images that restored the monotonic relation between RC and lesion size while still providing higher RCs than images reconstructed without PSF for sphere sizes around 6–12 mm. The PSF artifacts were much less pronounced when images were evaluated using Aavg (Fig. 5b). RCavg values were always less than 1, and non-monotonic behavior was observed only for the narrow 2-mm post-filter, where it occurred for all iterations.
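The two artifact signatures discussed here, RC values above 1 and an RC that does not fall monotonically with decreasing sphere size, can be screened for with a small helper. This is only a sketch: the function name and the tolerance parameter are hypothetical, and the RC values in the usage example are invented for illustration and are not measurements from this study.

```python
def flag_psf_artifacts(diameters_mm, rcs, tolerance=0.0):
    """Flag RC > 1 and non-monotonic RC-versus-size behavior.

    diameters_mm and rcs are parallel sequences, one entry per sphere.
    """
    pairs = sorted(zip(diameters_mm, rcs))
    overshoot = [d for d, rc in pairs if rc > 1.0 + tolerance]
    # Expected behavior: RC does not decrease when the diameter increases.
    non_monotonic = [
        (d_small, d_large)
        for (d_small, rc_small), (d_large, rc_large) in zip(pairs, pairs[1:])
        if rc_large < rc_small - tolerance
    ]
    return {"overshoot_diameters": overshoot, "non_monotonic_pairs": non_monotonic}


if __name__ == "__main__":
    diameters = [3, 4, 6, 8, 10, 12]
    # Invented example values with a peak near 8 mm.
    peaked = [0.20, 0.40, 0.90, 1.20, 1.05, 1.00]
    # Invented example values that rise steadily with diameter.
    steady = [0.15, 0.30, 0.55, 0.70, 0.80, 0.85]
    print(flag_psf_artifacts(diameters, peaked))
    print(flag_psf_artifacts(diameters, steady))
```

A nonzero tolerance could be used to avoid flagging differences that are within the measurement noise of repeated scans.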
Contemporary reconstruction algorithms all include PSF, which provides improved spatial resolution and better lesion detectability. In this project, we developed a phantom to mimic small lesions in a patient, which allowed us to evaluate the accuracy and precision of RCs in small lesions. Our results confirm previous findings that PSF reconstruction leads to higher lesion detectability. However, we found that PSF reconstruction produced image artifacts such as RCs that are not a monotonic function of lesion size (Fig. 1) for all tumor-to-background ratios (Figs. 2 and 3), and RCs that are noisy and could have values greater than 1 (Fig. 4). It is crucial to understand the PSF artifacts; otherwise, they can lead to serious unwanted consequences when using SUV measures to monitor disease and treatment response. We note that the PSF artifacts are present on two different PET/CT systems (Figs. 2 and 3) and are thus not caused by vendor-specific implementations of PSF in their reconstruction algorithms.
A limitation of this study is that our phantom did not include all challenging characteristics of real patient scans such as varying patient sizes, image noise, location and shape of lesions, and respiratory movement, but the setup was useful to optimize PSF reconstruction parameters for quantification. The experiment, including production of the spheres, was highly reproducible as seen for reconstruction without PSF (Fig. 4). For the three sets of spheres, the RCs depended on the surrounding medium as expected. In free air, the positron range is large, leading to the lowest RCs; with silicone as scattering medium, the RCs will be slightly higher; and with 11C-silicone background, there will be less spill-over and RCs will be highest. Thus, for a late image in the dynamic series with low 11C-background activity, similar RCs should be expected for spheres of the same size but with a tendency: RCair ≤ RCsilicone ≤ RC11C-silicone. However, RCs were less reproducible for PSF reconstruction.
Our results indicate that for the detection of the smallest lesions (3–4 mm diameter) PSF reconstruction can perform worse than reconstruction without PSF when using few iterations, as commonly recommended by the vendors. PSF reconstruction always improves detection for the larger lesions (6–12 mm diameters). These results may be explained by lack of convergence, and similar observations have been made in studies of small lesions and myocardial defects [20, 21] that also found that TOF reconstruction requires fewer iterations to converge. Figure 5 illustrates that inclusion of PSF delays convergence. Without PSF, RCs were monotonic functions of sphere size and slightly increased as a function of iterations. With PSF, the maximum-based measures, RC50bg and RCmax, converged more slowly than RCavg (compare Fig. 5a–b), probably because the maximum voxel value is more dependent on higher-frequency features, voxel noise, and edge artifacts in the image. For all spheres, RCavg and RCmax continued to increase as a function of iterations (we stopped at 6 iterations and 21 subsets, which is more than commonly used). This effect was particularly pronounced for the 6–8 mm spheres when using the 2-mm post-filter. The overshooting artifact leading to RC > 1 could be related to a broad PSF kernel. It has been shown that a narrower PSF kernel reduces the overshooting artifact [2, 22], and that the artifact persists even when the width of the PSF matches the size of the system’s PSF. It is not possible to change the PSF kernel width on commercial clinical scanners, but our results indicate that their standard PSF kernels may be too broad. However, wider post-filters suppressed the PSF artifacts and restored the expected relation that the RC falls monotonically as the sphere size decreases (Fig. 5), but the post-filter should not be too wide because this would unnecessarily reduce the RCs.
As an example, TrueX with 4 iterations, 21 subsets, and a 3-mm Gaussian filter produced images that suppressed edge artifacts sufficiently to restore the expected monotonic relation between RCs and sphere size, but still provided maximum-based RCs that can be slightly larger than 1 (e.g., up to 1.2 for RC50bg). The wider 4-mm Gaussian filter was needed to also avoid RCs larger than 1, while still providing RCs that are around 20% higher than without PSF for spheres larger than 6 mm diameter.
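The cost of a wider post-filter can be illustrated with an idealized linear model, in which a uniform sphere is blurred by an isotropic 3D Gaussian and the value at the sphere center is taken as the peak recovery. This is a sketch under strong assumptions and not a model of PSF reconstruction: it cannot produce overshoot (the result never exceeds 1), it ignores noise and voxel sampling, and it assumes that the system resolution and the post-filter are both Gaussian so that their FWHMs add in quadrature. The 4.4-mm system width and the 2-, 3-, and 4-mm filter widths are borrowed from the text purely as example inputs; the function names are hypothetical.

```python
import math

# Conversion from FWHM to the standard deviation of a Gaussian.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def effective_fwhm(system_fwhm_mm, postfilter_fwhm_mm):
    """Combined width of two Gaussian blurs (widths add in quadrature)."""
    return math.hypot(system_fwhm_mm, postfilter_fwhm_mm)


def peak_recovery(diameter_mm, fwhm_mm, sphere_to_background=None):
    """Center value of a uniform sphere blurred by an isotropic 3D Gaussian,
    divided by the true sphere concentration."""
    sigma = fwhm_mm * FWHM_TO_SIGMA
    x = (diameter_mm / 2.0) / sigma
    # Fraction of a 3D Gaussian kernel that falls inside the sphere radius.
    inside = math.erf(x / math.sqrt(2.0)) - math.sqrt(2.0 / math.pi) * x * math.exp(
        -x * x / 2.0
    )
    if sphere_to_background is None:
        return inside  # sphere in a cold background
    # The remaining kernel weight picks up background activity (spill-in).
    return inside + (1.0 - inside) / sphere_to_background


if __name__ == "__main__":
    system_fwhm = 4.4  # mm, example input
    for filter_fwhm in (2.0, 3.0, 4.0):
        fwhm = effective_fwhm(system_fwhm, filter_fwhm)
        row = ", ".join(
            f"{d} mm: {peak_recovery(d, fwhm):.2f}" for d in (3, 4, 6, 8, 10, 12)
        )
        print(f"post-filter {filter_fwhm:.0f} mm -> {row}")
```

In this model the recovery always falls monotonically with decreasing diameter and drops further as the filter widens, which is the non-PSF behavior that the wider post-filters restore at the expense of lower RCs.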
The best choice of PSF reconstruction parameters will be a trade-off between lesion detectability and avoiding PSF artifacts affecting quantification, which should be optimized for the specific task. In this work, we studied accuracy and precision of RCs (and SUV values) that allowed us to optimize PSF reconstruction parameters for quantification. For detection, the PSF reconstruction parameters should optimize the signal-to-background ratio while also accounting for image noise, which was not studied here. PSF reconstruction leads to decreased noise at the individual voxel level but with increased correlation among nearby voxels, which leads to a distinct noise texture that is less granular but lumpier [9, 22]. The lumpy noise decreases reproducibility when quantifying VOIs of a size similar to or smaller than the PSF noise structures, as we have demonstrated (Fig. 4). Thus, PSF images should be used with caution for quantitative analyses of small lesions.
The phenomena described are relevant not only for the measurement of SUV but also for visual assessment of PET images. In this work, we have studied the performance of PSF reconstruction algorithms for the quantification of tracer uptake in small, sub-centimeter lesions at varying lesion-to-background ratios. It must be anticipated that the phenomena described are valid for thin tissue layers as well, e.g., pleura and peritoneum, the narrow periphery of cystic and necrotic tumors, and vessel walls. Also, variation in apparent cortical brain uptake could be caused by variation in cortical thickness rather than variation in metabolic activity. As a result of the same phenomena, it must also be anticipated that the SUV might be biased in small liver lesions (e.g., liver cysts) with reduced uptake compared to surrounding normal liver tissue, as well as in the lumen of arteries with increased uptake in the vessel wall (e.g., arteritis). In these special cases, the quantitative properties of high-resolution PET images should be validated using dedicated phantoms, preferably avoiding artificial non-radioactive layers caused by plastic walls in commercial fillable phantoms. A recent study explored the use of PSF reconstructed images for volume segmentation in delineation of tumor volumes in radiotherapy planning. Compared to our study, they used a NEMA phantom with extra spheres and reconstructed images using wider filters, i.e., with lower spatial resolution, but interestingly, they identified problems related to larger spheres and recommended not to use PSF reconstruction for volume segmentation using adaptive thresholding methods. For multicenter studies, the measured SUV values are further impacted by the use of different scanners and reconstruction algorithms. NEMA studies can be made to harmonize SUV quantification, and a method to make SUV values comparable without applying wide post-filters that would obscure lesion detection has recently been suggested.
However, quantitative PET imaging of small lesions still requires further investigation.
PSF reconstruction artifacts can lead to misinterpretations when used for quantitative analyses of small sub-centimeter lesions, e.g., when monitoring treatment response in lymph nodes using SUV measurements. Recovery coefficients were non-monotonic as a function of lesion size, less reproducible, and more slowly converging. Thus, it could be safer to use non-PSF reconstruction for quantitative purposes unless PSF reconstruction parameters are optimized to suppress PSF artifacts.
We acknowledge the late Arne Skretting, medical physicist (The Intervention Centre, Oslo University Hospital, Oslo, Norway), for pivotal discussions on the experimental setup and data interpretation.
All authors conceived the study and participated in its design. OLM produced the sphere phantoms. OLM and LPT performed the PET scans of the phantoms. OLM carried out the data analyses and drafted the manuscript. All authors participated in the interpretation of the data. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
1. Panin VY, Kehren F, Michel C, Casey M. Fully 3-D PET reconstruction with system matrix derived from point source measurements. IEEE Trans Med Imaging. 2006;25:907–21.
2. Sureau FC, Reader AJ, Comtat C, et al. Impact of image-space resolution modeling for studies with the high-resolution research tomograph. J Nucl Med. 2008;49:1000–8.
3. Tong S, Alessio AM, Kinahan PE. Noise and signal properties in PSF-based fully 3D PET image reconstruction: an experimental evaluation. Phys Med Biol. 2010;55:1453–73.
4. Andersen FL, Klausen TL, Loft A, Beyer T, Holm S. Clinical evaluation of PET image reconstruction using a spatial resolution model. Eur J Radiol. 2013;82:862–9.
5. Somer EJ, Pike LC, Marsden PK. Recommendations for the use of PET and PET-CT for radiotherapy planning in research projects. Br J Radiol. 2012;85:e544–8.
6. de Geus-Oei LF, Vriens D, van Laarhoven HW, van der Graaf WT, Oyen WJ. Monitoring and predicting response to therapy with 18F-FDG PET in colorectal cancer: a systematic review. J Nucl Med. 2009;50 Suppl 1:43S–54S.
7. Bussink J, Kaanders JH, van der Graaf WT, Oyen WJ. PET-CT for radiotherapy treatment planning and response monitoring in solid tumors. Nat Rev Clin Oncol. 2011;8:233–42.
8. Dimitrakopoulou-Strauss A. PET-based molecular imaging in personalized oncology: potential of the assessment of therapeutic outcome. Future Oncol. 2015;11:1083–91.
9. Rahmim A, Tang J. Noise propagation in resolution modeled PET imaging and its impact on detectability. Phys Med Biol. 2013;58:6945–68.
10. Watson CC. Estimating effective model kernel widths for PSF reconstruction in PET. IEEE Nuclear Science Symposium and Medical Imaging Conference, 2011; IEEE New York: 2368–2374.
11. Watson CC. Measurement of the physical PSF for an integrated PET/MR using targeted positron beams. IEEE Nuclear Science Symposium and Medical Imaging Conference, 2012; IEEE New York: 2089–2095.
12. Rahmim A, Qi J, Sossi V. Resolution modeling in PET imaging: theory, practice, benefits, and pitfalls. Med Phys. 2013;40:064301.
13. National Electrical Manufacturers Association (NEMA). Standards publication NU 2-2012: performance measurements of positron emission tomographs. Rosslyn: NEMA; 2012.
14. Berthon B, Marshall C, Edwards A, Evans M, Spezi E. Influence of cold walls on PET image quantification and volume segmentation: a phantom study. Med Phys. 2013;40:082505.
15. Skretting A, Glomset O, Bogsrud TV. A phantom for investigation of tumour signal and noise in PET reconstruction with various smoothing filters: experiments and comparisons with simulated intensity diffusion. Radiat Prot Dosimetry. 2010;139:191–4.
16. Jakoby BW, Bercier Y, Watson CC, Bendriem B, Townsend DW. Performance characteristics of a new LSO PET/CT scanner with extended axial field-of-view and PSF reconstruction. IEEE Trans Nucl Sci. 2009;56:633–9.
17. Bettinardi V, Presotto L, Rapisarda E, Picchio M, Gianolli L, Gilardi MC. Physical performance of the new hybrid PET/CT Discovery-690. Med Phys. 2011;38:5394–411.
18. Soret M, Bacharach SL, Buvat I. Partial-volume effect in PET tumor imaging. J Nucl Med. 2007;48:932–45.
19. Boellaard R, Delgado-Bolton R, Oyen WJ, et al. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur J Nucl Med Mol Imaging. 2015;42:328–54.
20. Schaefferkoetter J, Casey M, Townsend D, El Fakhri G. Clinical impact of time-of-flight and point response modeling in PET reconstructions: a lesion detection study. Phys Med Biol. 2013;58:1465–78.
21. Schaefferkoetter J, Ouyang J, Rakvongthai Y, Nappi C, El Fakhri G. Effect of time-of-flight and point spread function modeling on detectability of myocardial defects in PET. Med Phys. 2014;41:062502.
22. Blinder SA, Dinelle K, Sossi V. Scanning rats on the high resolution research tomograph (HRRT): a comparison study with a dedicated micro-PET. Med Phys. 2012;39:5073–83.
23. Tong S, Alessio AM, Thielemans K, Stearns C, Ross S, Kinahan PE. Properties of edge artifacts in PSF-based PET reconstruction. IEEE Nuclear Science Symposium and Medical Imaging Conference, 2010; IEEE New York: 3649–3652.
24. Matheoud R, Ferrando O, Valzano S, et al. Performance comparison of two resolution modeling PET reconstruction algorithms in terms of physical figures of merit used in quantitative imaging. Phys Med. 2015;31:468–75.
25. Quak E, Le Roux PY, Hofman MS, et al. Harmonizing FDG PET quantification while maintaining optimal lesion detection: prospective multicentre validation in 517 oncology patients. Eur J Nucl Med Mol Imaging. 2015;42:2072–82.