This is one of the first studies to show that clinically perceived image quality in PET correlates significantly and positively with NECR measured in patients, RDW, RDBMI and the presence of pathologic lesions. Clinical image quality furthermore correlates significantly and negatively with weight, BMI and the presence of bowel uptake. An optimal threshold for RDW and RDBMI, above which clinical IQ is at least good in more than 90% of the patients, has been defined as well. Lastly, our conclusions (e.g. recommended FDG dose) are available both in terms of perceptual diagnostic image quality and of automated technical image quality measures. This means that our results can be rescaled and applied to other PET scanners.
Image quality
In this study, the methods used for the evaluation of IQ were found to be reliable for two main reasons. First, IQL and IQG, although independent methods, were significantly and strongly correlated with each other (r = 0.94). Second, both IQL and IQG correlated significantly and positively with NECR, a well-established quantitative measure of the IQ of PET systems [3],[5].
No images were scored below 5 on IQG or below 1.44 on IQL, because all patients were referred for a clinical PET examination and therefore required images of diagnostic quality.
NECR
NECR measurements approximate image quality in PET systems, at least to an extent that seems clinically reliable. However, obtaining NECR values is challenging. Several studies have used NECR based on phantom or simulation measurements [9],[11],[12]. One study measured NECR in a patient population [5]; however, it compared this value only with image noise (a different automatic technical measurement). One major difference in our study is that our estimation of NECR is patient-based and that it was additionally correlated with visual IQ assessments. Thus, we have a clinical image quality estimate that can be transferred to other PET/CT systems with different detector technologies in order to compare and estimate the expected clinical IQ. NECR measurements are already widely accepted and used as a means of comparing data acquisition performance between different PET systems. By establishing a link between our clinical (subjective) evaluation and the corresponding raw data quality in terms of NECR, it becomes possible to estimate the tracer doses required to achieve similar results on other scanners. This can be done by comparing the phantom-based NECR curves provided by the respective manufacturers.
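For reference, NECR is commonly defined from the true (T), scattered (S) and random (R) coincidence rates as shown below. This is the general textbook form and is given here as an assumption: the factor k is usually taken as 1 or 2 depending on how randoms are estimated, and the exact variant used for the patient-based estimate is not stated in this section.

```latex
\mathrm{NECR} = \frac{T^{2}}{T + S + kR}, \qquad k \in \{1, 2\}
```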
Correlation to clinical perception
Both IQL and IQG, as well as NECR, showed a significant positive correlation with RDW and RDBMI.
These results are partially in accordance with the literature. Everaert et al. showed that administration of FDG activities of ≥8 MBq/kg results in PET images of good to excellent quality in the vast majority of patients using a standardised protocol [4]. On the other hand, Chang and co-workers have demonstrated that an increase in injected dose leads to a (scanner-dependent) rise in NECR, but only until it reaches the known plateau. Beyond this point, increasing the injected FDG dose no longer changes NECR [5].
Our study showed an optimal RDW threshold of 2.6 MBq/kg, above which the images are of at least good IQ. The threshold we found is lower than that published in the literature because we used the FDG activity at the time of PET acquisition (after the uptake time). This has the advantage that it is not influenced by the delay between injection and the PET examination. Using a standardised uptake time of 60 min, the calculated threshold value at injection in our setup would be 3.8 MBq/kg. However, this is still lower than that published in other studies [9]. Since IQ, NECR and RDW are well correlated with each other, the RDW threshold below which clinical image quality is suboptimal can be translated into a corresponding NECR threshold, which then identifies such suboptimal image quality in a quantitative way.
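The conversion between the threshold at acquisition (2.6 MBq/kg) and the threshold at injection (3.8 MBq/kg) follows from physical decay over the uptake time. The following is a minimal sketch, assuming only physical decay of 18F with a half-life of about 109.77 min (a value not given in the text); the function name is hypothetical.

```python
# Assumed physical half-life of 18F in minutes (not stated in the text).
F18_HALF_LIFE_MIN = 109.77


def activity_at_injection(activity_at_scan, uptake_min,
                          half_life_min=F18_HALF_LIFE_MIN):
    """Back-correct an activity (or activity per kg) from scan time to
    injection time, accounting for physical decay only."""
    return activity_at_scan * 2 ** (uptake_min / half_life_min)


# Threshold at acquisition reported in the text, in MBq/kg.
threshold_at_scan = 2.6

# Equivalent threshold at injection for a 60-min uptake time.
threshold_at_injection = activity_at_injection(threshold_at_scan, 60)
print(round(threshold_at_injection, 1))  # 3.8
```

Under these assumptions the decay factor for 60 min is about 1.46, which reproduces the 3.8 MBq/kg quoted in the text.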
The EANM recommends standardised FDG activity. For systems with a high count rate capability, the administered FDG activity and the scan duration per bed position should be set in relation to each other (injected dose/kg versus time per bed position). Therefore, one may decide to apply a higher activity and reduce the duration of the scan, or to use a reduced activity and increase the scan duration, depending on the clinical situation [13].
Applying these recommendations to our technical parameters and clinical evaluations, the FDG activity would be 6.9 MBq/kg. Our study has shown that PET image quality remains good to excellent in the majority of patients even with a significantly lower dose.
Weight and BMI
Our study has demonstrated that weight and BMI are negatively and significantly correlated with IQ and NECR. This is expected, since other studies have already established this relation. As patient weight and BMI increase, image quality deteriorates and NECR decreases significantly. This can be explained by the standardised fixed dose of 350 MBq (±10%) that was used in our department at the time this study was performed.
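The effect of a fixed dose on the activity per kilogram can be illustrated with simple arithmetic. The sketch below assumes the nominal 350 MBq (ignoring the ±10% tolerance) and, purely for illustration, uses the injection-time threshold of 3.8 MBq/kg quoted earlier for a 60-min uptake time; the function names are hypothetical.

```python
FIXED_DOSE_MBQ = 350.0        # nominal fixed dose from the text
THRESHOLD_MBQ_PER_KG = 3.8    # injection-time threshold (60-min uptake)


def dose_per_kg(weight_kg, dose_mbq=FIXED_DOSE_MBQ):
    """Injected activity per kg of body weight for a fixed dose."""
    return dose_mbq / weight_kg


def max_weight_meeting_threshold(dose_mbq=FIXED_DOSE_MBQ,
                                 threshold=THRESHOLD_MBQ_PER_KG):
    """Heaviest patient (kg) for whom the fixed dose still reaches
    the per-kg threshold."""
    return dose_mbq / threshold


# Per-kg activity falls as weight rises under a fixed dose.
for weight in (60, 80, 100):
    print(weight, "kg:", round(dose_per_kg(weight), 1), "MBq/kg")

print("limit:", round(max_weight_meeting_threshold()), "kg")
```

Under these assumptions, a fixed 350 MBq dose falls below 3.8 MBq/kg for patients heavier than roughly 92 kg, which is consistent with image quality deteriorating as weight increases.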
As described before, we have also found that in patients with higher BMI, higher FDG doses are needed in order to increase NECR and, hence, image quality. However, it has been shown that once the NECR curve approaches its plateau, increasing the injected dose is no longer useful [5]. Furthermore, national laws define maximal FDG activities that may be administered [13]. Thus, in current clinical systems (e.g. lutetium oxyorthosilicate (LSO)-based systems), FDG activities higher than 530 MBq should not be administered [14]. To increase image quality in patients who already receive the maximal FDG dose, it is recommended to increase the time per bed position instead of giving higher doses [13].
Our results concerning perceptual and technical differences between weight/BMI groups are also in accordance with the current literature. Recently, Chang and co-workers demonstrated statistically significant differences in NECR and signal-to-noise ratio between two different BMI groups [15]. In our population, there was no statistically significant difference in NECR between BMI groups II and III, while for IQL there was no statistically significant difference between groups III and IV. One possible explanation is that automatic and subjective criteria assess image quality differently. In this case, it would appear that the subjective perception of image quality loss in larger patients tends to ‘saturate’ as BMI increases.
Presence of FDG bowel uptake
In clinical routine, very high FDG uptake is sometimes seen in the gastrointestinal tract, e.g. in the distal oesophagus, stomach, small intestine and large intestine, representing normal patterns of tracer distribution (due to diabetes medication) or inflammatory disease [16]. Diffusely increased FDG uptake in these areas is defined as physiologic and unrelated to the malignant process with relatively high certainty [17]. However, high bowel uptake, especially when medication-related in patients with diabetes, can sometimes impair reading in clinical PET/CT routine.
Our study has found that the presence of bowel uptake is associated with significantly lower IQ scores, but this relation could not be established for NECR measurements. Thus, although high bowel uptake subjectively impairs the clinical reading, those differences are not reflected in the quantitative measurements.
Our study has shown that the presence of focal FDG uptake, malignant or not, is related to high IQ scores and high NECRs. The former may likewise be explained by the subjective nature of the analysis: the presence of focal FDG uptake increases the soft tissue contrast between structures, thereby enhancing the visual IQ.
Limitations
Our study has several technical limitations. Our data show, on average, a partial correlation between NECR and IQ, but with high dispersion in places. This dispersion of IQ values for a given value of NECR is probably related to the subjective nature of the IQ itself. Although we performed consensus reading, a certain degree of subjectivity cannot be excluded.
Relatively large variations in IQ occur for particular values of NECR. However, the intent of the manuscript was to show a clinically usable approximation between NECR and image quality. This is also why the transferability of our results is qualitative rather than quantitative in nature, which we nevertheless consider clinically usable and useful.