Inter-observer variability of 90Y PET/CT dosimetry in hepatocellular carcinoma after glass microspheres transarterial radioembolization
EJNMMI Physics volume 7, Article number: 29 (2020)
A strong correlation has been demonstrated between tumor dose and response and between healthy liver dose and side effects. Individualized dosimetry is increasingly recommended in current clinical routine. However, hepatic and tumor segmentation can be complex in some cases. The aim of this study is to assess the reproducibility of tumoral and non-tumoral liver dosimetry in selective internal radiation therapy (SIRT).
Material and methods
Twenty-three patients with hepatocellular carcinoma (HCC) who underwent SIRT with glass microspheres were retrospectively included in the study. Tumor (TV) and total liver volumes (TLV), and mean absorbed doses in tumoral liver (TD) and non-tumoral liver (THLD), were determined on the 90Y PET/CT studies using Simplicit90Y™ software by three independent observers. Dosimetry datasets were obtained by a medical physicist assisted by a nuclear medicine (NM) physician with 10 years of experience (A), by an NM physician with 4 years of experience (B), and by a resident who first performed 10 dosimetry assessments as training (C). Inter-observer agreement was evaluated using intra-class correlation coefficients (ICC), coefficients of variation (CV), Bland-Altman plots, and the reproducibility coefficient (RDC).
A strong agreement was observed between all three readers for estimating TLV (ICC 0.98) and THLD (ICC 0.97). Agreement was lower for TV delineation (ICC 0.94) and particularly for TD (ICC 0.73), especially for the highest values. Regarding TD, the CV (%) was 26.5, 26.9, and 20.2 between observers A and B, A and C, and B and C, respectively, and the RDC was 1.5. Regarding THLD, the CV (%) was 8.5, 12.7, and 9.4 for the same observer pairs, and the RDC was 1.3.
Using a standardized methodology, and regardless of the different experiences of the observers, the estimation of THLD is highly reproducible. Although the reproducibility of the assessment of tumor irradiation is overall quite high, large variations may be observed in a limited number of patients.
Selective internal radiation therapy (SIRT) with glass or resin microspheres has been used [1,2,3,4] in the treatment of liver tumors for over two decades. However, SIRT is currently not recommended as a first-line treatment in guidelines, as recent randomized multi-centric studies in HCC and colorectal metastases failed to establish its clinical role [6,7,8]. A major limitation of these trials was the use of empirical dosimetry, which assumes a homogeneous distribution of the injected activity in the liver [9, 10]. Over the past decade, correlations have been established between tumor dose and response, on the one hand, and between healthy liver dose and side effects, on the other hand [11, 12]. Individualized dosimetry is thus increasingly advocated in current clinical routine.
A key step in the dosimetry process is the identification and segmentation of the target volumes. This may prove highly complex, especially in cases of multifocal HCC or diffuse metastatic disease, and may generate inter-observer variability in volume measurements and consequently in estimated doses. Quite surprisingly, very limited data are available on this specific topic. Monsky et al. showed a strong inter-observer agreement for delineating both tumoral and non-tumoral volumes. However, that study focused on a new semi-automated segmentation algorithm and did not report on the dosimetry. The only research investigating dose measurements reported a clinically acceptable inter-observer variability for healthy liver but larger fluctuations for assessing the dose to neoplastic lesions.
The aim of the present study is therefore to assess the reproducibility of the tumoral and non-tumoral liver dosimetry in SIRT, on the basis of post-treatment Yttrium-90 (90Y) combined positron emission tomography-computed tomography (PET/CT) images.
Material and methods
Twenty-three patients with HCC who underwent SIRT with glass microspheres (TheraSphere®) between 2016 and 2019 were retrospectively included in our study. Two of them underwent a second radioembolization, leading to a total of 25 treatments with an average injected activity of 1.98 ± 1.23 GBq. All patients were treated according to the treatment algorithms applied in our institution and after discussion during our multidisciplinary tumor board. In the six cases with multiple lesions, the overall mean tumor dose was considered. The mean age of the cohort was 66 ± 9 years (min 48 and max 84), with a male/female ratio of 1.7.
Patient characteristics are summarized in Table 1.
PET/CT studies were performed in our institution, no later than 24 h after the treatment, using a Big Bore or a Gemini TF 16 PET/CT system (both Philips Medical Systems, Cleveland, OH, USA). Three bed positions of 13 min each (axial field of view (FOV) of 18 cm) were acquired, with an overlap of 50%. All acquisitions were corrected for scatter, decay, attenuation, and random events. Attenuation correction was performed using a low-dose computed tomography (CT) scan (120 kV and 50 mAs or 80 mAs, depending on the patient weight). For reconstruction, we used an iterative list-mode OSEM algorithm incorporating time of flight, with 4 iterations and 8 subsets. The voxel size was 4 × 4 × 4 mm for PET images and 1.17 × 1.17 × 3 mm for low-dose CTs. The CT matrix size was 512, and the PET matrix size was 169 or 144, depending on the dimensions of the acquired field of view.
Hepatic segmentation and dosimetry
Tumor and non-tumor volumes were measured using both the 90Y PET/CT and a three-phase injected CT that was performed for primary staging within 1 month prior to treatment. The mean absorbed doses in the tumoral liver (TD) and non-tumoral liver (THLD) were determined using Simplicit90Y™ (version 2.1) software (MIRADA Medical®) by three sets of independent observers.
One dataset was obtained as part of the daily clinical practice by a medical physicist assisted by a nuclear medicine (NM) physician with 10 years of experience in SIRT (A). The second was performed by an NM physician with 4 years of experience (B). The third was performed by a resident with limited experience, who first performed 10 dosimetry assessments as training (C). This training set is not included in the current study. The time needed for the process was not precisely recorded, but there were no major variations among the 3 operators in this regard.
The tumor volumes were manually delineated slice-by-slice on the three-phase injected CT, transposed to 90Y PET/CT, and adapted to correct for rigid misregistration and volume variations between contrast-enhanced (CE) CT and SIRT. Such manual adaptation was required in all cases. Non-enhancing parts of lesions were included in the tumoral volume. In the subjects with multiple lesions, each tumor volume was similarly delineated on the CE-CT, i.e., manually and slice-by-slice. The sum of these volumes led to the total tumoral volume, which was used to calculate the mean TD, without including any normal tissue. Total liver volume was delineated using the automated segmentation algorithm provided in the software, on the CT part of the PET/CT, and imperfections were manually rectified. The volume of injected liver, corresponding to the portion of the liver that has received microspheres, was determined on the PET images using variable thresholding, to match the CT limits as closely as possible. Non-tumoral liver volume was obtained by subtracting the tumoral volume from the total liver volume. Lung shunt fraction was arbitrarily set at 0%, because this value has no influence on variability.
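As an illustration of the compartment arithmetic described above, the following Python sketch derives the four reported quantities from a voxel dose map and two binary masks. It is not the Simplicit90Y implementation, whose internals are not described here. The function name `compartment_summary`, the array names, and the toy values are hypothetical, and the voxel volume of 0.064 ml simply corresponds to the 4 × 4 × 4 mm PET voxels.

```python
import numpy as np

def compartment_summary(dose_gy, liver_mask, tumor_mask, voxel_ml):
    """Volumes (ml) and mean doses (Gy) for tumor and non-tumoral liver.

    Assumption: masks and dose map share the same voxel grid.
    """
    liver = np.asarray(liver_mask, dtype=bool)
    tumor = np.asarray(tumor_mask, dtype=bool) & liver  # keep tumor inside liver
    healthy = liver & ~tumor  # non-tumoral liver = total liver minus tumor
    return {
        "TLV_ml": float(liver.sum() * voxel_ml),
        "TV_ml": float(tumor.sum() * voxel_ml),
        "TD_Gy": float(dose_gy[tumor].mean()),
        "THLD_Gy": float(dose_gy[healthy].mean()),
    }

# Toy phantom (hypothetical values): a 10 x 10 x 10 liver at 40 Gy
# containing a 2 x 2 x 2 tumor at 250 Gy.
liver = np.ones((10, 10, 10), dtype=bool)
tumor = np.zeros((10, 10, 10), dtype=bool)
tumor[:2, :2, :2] = True
dose = np.full((10, 10, 10), 40.0)
dose[tumor] = 250.0

summary = compartment_summary(dose, liver, tumor, voxel_ml=0.064)
print(summary)
```

Because THLD is averaged over everything that is not labeled as tumor, any change in the tumor contour also shifts the healthy-liver value, which is one way tumor delineation can propagate into THLD.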
The following parameters were obtained for each patient:
Tumoral volume (TV)
Tumoral dose (TD)
Total liver volume (TLV)
Total healthy liver dose (THLD)
The continuous data were tested for normality (Shapiro-Wilk test). For the normally distributed variable (TLV), paired Student’s t test and ANOVA-2 test were used to compare the average values between the readers. Non-parametric tests (Wilcoxon signed-rank and Friedman tests) were performed to compare the other parameters. Agreement was measured by the intra-class correlation coefficient (ICC), by Bland-Altman plots, and by the coefficient of variation (CV). Variability was also assessed by the reproducibility coefficient (RDC) as defined by Haste et al. The RDC represents the maximum ratio of dose values obtained from the three different reviewers in 95% of cases, with a 95% confidence interval (95%CI). The results are considered significant at the 5% significance level (p < 0.05). Calculations were made using SAS version 9.4 and graphics using R version 3.5, except the RDC, for which a user-written routine was developed (Additional file 1).
A strong agreement was observed between all three readers for estimating the total liver volume with an ICC (95%CI) of 0.98 (0.97–0.99). Neither the paired t tests nor the ANOVA-2 showed any statistically significant differences between mean volumes (1749 ± 507 ml, 1748 ± 482 ml, and 1735 ± 496 ml for observers A, B, and C, respectively). This was confirmed by the low values of CV (5.1% between A and B, 3.9% between A and C, and 2.6% between B and C).
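The ICC form used in the analysis is not stated. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)), the coefficient can be computed from the ANOVA mean squares as follows. The function name `icc_2_1` and the toy readings are hypothetical, and this is not the SAS procedure used by the authors.

```python
import numpy as np

def icc_2_1(readings):
    """ICC(2,1) for an (n_subjects, k_observers) array of measurements."""
    x = np.asarray(readings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), observers (columns), and residual
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return float((msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n))

# Hypothetical tumor doses (Gy): rows are patients, columns are observers A, B, C
toy_td = [
    [210.0, 190.0, 205.0],
    [320.0, 410.0, 380.0],
    [95.0, 100.0, 90.0],
    [150.0, 170.0, 160.0],
]
print(round(icc_2_1(toy_td), 3))
```

In this form, a systematic offset of one observer lowers the coefficient through the observer mean square, so it penalizes bias as well as random disagreement.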
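The CV formula behind these pairwise values is not given in the text. One common choice for paired readings is the root-mean-square within-subject CV, sketched below under that assumption. The function name `pairwise_cv_percent` and the toy volumes are hypothetical.

```python
import numpy as np

def pairwise_cv_percent(a, b):
    """Root-mean-square within-subject CV (%) between two observers."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    within_var = (a - b) ** 2 / 2.0  # variance of two readings of one patient
    pair_mean = (a + b) / 2.0
    return 100.0 * float(np.sqrt(np.mean(within_var / pair_mean ** 2)))

# Hypothetical total liver volumes (ml) from two observers
tlv_a = [1500.0, 2000.0, 1750.0]
tlv_b = [1450.0, 2100.0, 1700.0]
print(round(pairwise_cv_percent(tlv_a, tlv_b), 2))
```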
The mean healthy liver doses were 45.3 ± 30.2 Gy for observer A, 47.5 ± 30.5 Gy for observer B, and 50.5 ± 31.5 Gy for observer C. Differences between readers were statistically significant overall (p = 0.0003) and between readers A and C (p < 0.0001) and B and C (p = 0.0054). The observations for non-tumoral volumes and doses are represented in box plot diagrams (Fig. 1). The ICC for doses between the 3 readers was 0.97 (95%CI 0.96–0.99), and 0.98 (95%CI 0.97–0.99) between readers A and B, 0.96 (95%CI 0.88–0.99) between A and C, and 0.98 (95%CI 0.96–0.99) between B and C. As shown in the Bland-Altman plots (Fig. 2), a difference greater than 20 Gy occurred twice. The RDC for THLD was 1.33 (95%CI 1.26–1.68).
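For readers unfamiliar with Bland-Altman analysis, the sketch below computes the quantities usually drawn on such a plot for one observer pair: the mean difference (bias) and the 95% limits of agreement, taken as bias ± 1.96 standard deviations of the differences. The function name `bland_altman` and the toy doses are hypothetical and do not reproduce the study data.

```python
import numpy as np

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical healthy liver doses (Gy) from two observers
thld_a = [10.0, 20.0, 30.0]
thld_b = [12.0, 18.0, 33.0]
bias, lower, upper = bland_altman(thld_a, thld_b)
print(bias, lower, upper)
```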
Regarding the tumor volume delineation, the ICC was 0.94 (95%CI 0.92–0.97) between all three readers. No significant difference was observed by Wilcoxon signed-rank and Friedman tests when comparing the mean volumes (179 ± 208 ml (median 94, min 17, max 813), 199 ± 258 ml (median 77, min 12, max 890), and 185 ± 265 ml (median 76, min 11, max 1017) for observers A, B, and C, respectively). CV was 35.9%, 36.8%, and 25.5% between A and B, A and C, and B and C, respectively. Regarding the TD obtained by the various readers, Wilcoxon signed-rank and Friedman tests did not show statistically significant differences either. The observations for tumoral volumes and doses are represented in box plot diagrams (Fig. 3). Nevertheless, ICC was lower than for THLD: 0.70 (95%CI 0.49–0.85), 0.71 (95%CI 0.51–0.86), and 0.81 (95%CI 0.68–0.91) between A and B, A and C, and B and C, respectively, and 0.73 (95%CI 0.64–0.86) between all three readers. Bland-Altman analysis is shown in Fig. 4. Dispersion was larger in lesions that received a dose greater than 300 Gy. The RDC for TD was 1.52 (95%CI 1.38–2.25).
Figure 5 shows an example of volume delineation.
Considering a target dose > 200 Gy [15,16,17] for the tumor and < 50 Gy for the healthy liver, 7/25 (28%) patients (tumor) and 2/25 (8%) patients (non-tumor) would have been classified by the various observers in different categories, i.e., target reached/not reached. Considering a target dose < 75 Gy for non-tumoral liver, observers agreed on the category for all patients (0% classified in different categories).
The relationship between response and tumor dose is now well established in liver tumors. However, considering 90Y PET/CT-based dosimetry in HCC, data are limited and there remain discrepancies between studies in determining the effective tumoral threshold dose. Song et al. reported a longer PFS with a TD > 200 Gy, while Kao et al. observed a complete radiological response with a minimum dose to 70% tumor volume > 100 Gy. Furthermore, a significantly higher median TD was observed in responders (225 Gy compared to 83 Gy in non-responders) in a recent prospective trial in a cohort of 27 patients. In the latter study, a dose threshold of 200 Gy was 100% specific and had a 100% positive predictive value, as all non-responders received a dose < 200 Gy. On the other hand, Srinivas et al. did not show a statistically significant relationship between TD and radiological response. Considering healthy liver, this study showed an association between THLD and the presence of two or more radiation-induced complications. Another prospective work conducted by Chan et al. demonstrated that the likelihood of toxicity exceeds 50% at a THLD threshold of 54 Gy.
In any case, personalized dosimetry currently seems essential to optimize the clinical results of SIRT, as factors such as small sample size, differences in activity calculation methods, response assessment criteria, microsphere type, and study design all contribute to some extent to the heterogeneity of the published data. Further, the way tumoral and non-tumoral liver are delineated may be subject to inter-observer variability and can potentially generate changes in TD and THLD between different physicians.
Although our results suggest that the delineation of non-tumoral volumes and the estimation of corresponding doses are reproducible, statistical comparison of samples showed significant differences in THLD between observers A-C and B-C, explained by the fact that this parameter is also influenced by the manual delineation of the tumor. The slightly lower ICC between A-C compared to A-B and B-C may be related to the lower experience of observer C. Such an effect appears limited, however, given the overall high ICC, low CV, and, most importantly, the Bland-Altman analysis, which all point toward a high reproducibility. Furthermore, it appears that all patients except 2 (8%) were classified in the same category by all three observers with respect to a THLD of 50 Gy. This would mean that the threshold that used to be well accepted to avoid toxicity can be individually assessed with great reproducibility. If we consider a THLD threshold at 75 Gy, which was recently advocated by international experts, all observers classified every patient in the same category.
Conversely, despite the high ICC values for TV, and although the analysis of mean TD values did not show statistically significant differences between observers, all other variables point toward a greater variability, in particular an ICC as low as 0.70 and a greater dispersion in the Bland-Altman plots. This is probably because the correction for misregistration between contrast-enhanced CT and 90Y PET/CT may differ between observers. From a clinical perspective and considering 200 Gy as the target, 28% of the patients were classified in different groups depending on who performed the dosimetry. In addition to the limitations recently reported regarding the predictive value of the dosimetry performed on the 99mTc-macroaggregated albumin (MAA) single-photon emission computed tomography (SPECT/CT) with regard to the actual dose delivered to the tumor, this emphasizes the need for further work on harmonizing tumor dosimetry, in order to eventually optimize tumor response [15, 23,24,25]. On the other hand, current methodologies appear quite reliable for predicting and assessing the healthy liver dose, which is highly relevant for preventing toxicity.
Regarding hepatic volume segmentation, we demonstrated high agreement both for TV and TLV, comparable to that obtained by Monsky et al. Although they used semi-automated segmentation software, their ICC values (with measurements repeated twice) were similar to ours: 0.98 and 0.99 for TLV, in comparison with 0.98 between all three readers in our study, and 0.98 and 0.99 for TV, whereas we reached 0.94.
Our dosimetry results are overall in line with the only study aimed at addressing this question. Haste et al. indeed observed a better reproducibility for THLD compared to TD estimation, in a cohort of 73 HCC patients. To assess the variability, they used the reproducibility coefficient (RDC), which represents the maximum ratio of dose values obtained from observers in 95% of cases, and obtained values of 1.4 (95%CI 1.3–1.6) for THLD and 1.6 (95%CI 1.5–1.8) for TD. Our results appear comparable, both for THLD, with an RDC of 1.3 (95%CI 1.3–1.7), and for TD, with a value of 1.5 (95%CI 1.4–2.2). Furthermore, our results were obtained in 3 sets of readers with various levels of experience.
In addition, taking into account the strong correlation between 99mTc-MAA SPECT/CT and 90Y PET/CT in HCC in non-tumoral liver, we may infer a clinically meaningful translation of these results in greatly reducing the risk of radiation-induced liver disease in a reliable and reproducible fashion. One should be more careful in predicting the effectiveness of SIRT based upon the tumor dosimetry.
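The category comparison above amounts to checking, patient by patient, whether all observers fall on the same side of a dose threshold. A minimal sketch of that check is given below. The function name `discordant_patients` and the toy doses are hypothetical and are not the study data.

```python
import numpy as np

def discordant_patients(doses_gy, threshold_gy):
    """Count patients whose observers disagree on 'dose above threshold'.

    doses_gy: (n_patients, n_observers) array of mean doses in Gy.
    """
    above = np.asarray(doses_gy, dtype=float) > threshold_gy
    # Discordant: at least one observer above and at least one not above
    discordant = above.any(axis=1) & ~above.all(axis=1)
    return int(discordant.sum())

# Hypothetical tumor doses (Gy): rows are patients, columns are observers A, B, C
toy_td = [
    [210.0, 190.0, 220.0],  # observers disagree around 200 Gy
    [250.0, 260.0, 240.0],  # all above
    [100.0, 120.0, 90.0],   # all below
]
n_discordant = discordant_patients(toy_td, threshold_gy=200.0)
print(n_discordant, "of", len(toy_td), "patients classified differently")
```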
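The exact RDC routine is provided only as Additional file 1 and is not reproduced here. As a rough sketch of how a ratio-type coefficient of this kind can be obtained, the code below assumes the conventional repeatability formula applied on the log scale: the within-patient variance of log doses is pooled across patients, and the coefficient is exp(1.96 × √2 × within-patient SD). This assumption, the function name `rdc_log_scale`, and the toy doses are all hypothetical and may differ from the definition used by Haste et al. or by the authors.

```python
import numpy as np

def rdc_log_scale(doses_gy):
    """Ratio-type reproducibility coefficient from (n_patients, n_observers) doses.

    Assumption: doses are strictly positive and the pooled within-patient
    variance of log doses describes observer variability.
    """
    logs = np.log(np.asarray(doses_gy, dtype=float))
    within_var = logs.var(axis=1, ddof=1).mean()  # pooled across patients
    return float(np.exp(1.96 * np.sqrt(2.0 * within_var)))

# Hypothetical tumor doses (Gy): rows are patients, columns are observers A, B, C
toy_td = [
    [210.0, 190.0, 205.0],
    [320.0, 410.0, 380.0],
    [95.0, 100.0, 90.0],
]
print(round(rdc_log_scale(toy_td), 2))
```

Under this reading, a value of 1.5 would mean that two observers' dose estimates for the same patient are expected to differ by a factor of less than 1.5 in about 95% of cases.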
Of note, the experience of the various operators did not significantly alter the reproducibility of the dosimetry, but a learning curve had been allocated to the least experienced one, and the procedure was performed in a controlled environment, using a well-defined methodology. In the future, semi-automated determination of tumoral and healthy liver could further help to standardize the methodology for hepatic segmentation [14, 27]. Hence, a consensus on TD and THLD goals would be easier to find between different centers.
Accurate estimation of post-treatment TD and THLD is a central element to predict therapeutic efficacy of SIRT. Using a standardized methodology, 90Y PET/CT dosimetry in HCC after glass microspheres is strongly reproducible in healthy liver, even with different levels of experience between observers, which is highly relevant in the clinical practice for preventing potential toxicity. On the other hand, despite a fairly good agreement between observers, caution should be exercised for tumoral lesions as individual variations may be important, especially for the highest TD.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
CV: Coefficients of variation
FOV: Field of view
ICC: Intra-class correlation coefficients
PET/CT: Combined positron emission tomography-computed tomography
SIRT: Selective internal radiation therapy
SPECT: Single-photon emission computed tomography
TD: Mean absorbed doses in tumoral liver
THLD: Mean absorbed doses in non-tumoral liver
TLV: Total liver volumes
Salem R, Lewandowski R, Kulik L, et al. Radioembolization results in longer time-to-progression and reduced toxicity compared with chemoembolization in patients with hepatocellular carcinoma. Gastroenterology. 2011;140(2):497–507.
Salem R, Gilbertsen M, Butt Z, et al. Increased quality of life among hepatocellular carcinoma patients treated with radioembolization, compared with chemoembolization. Clin Gastroenterol Hepatol. 2013;11(10):1358–65.
Memon K, Kulik L, Lewandowski R, et al. Prospective evaluation of patients with early-/intermediate-stage hepatocellular carcinoma with disease progression following arterial locoregional therapy: candidacy for systemic treatment or clinical trial. J Vasc Interv Radiol. 2013;24(8):1189–97.
Sangro B, Salem R, Kennedy A, et al. Radioembolization for hepatocellular carcinoma: a review of the evidence and treatment recommendations. Am J Clin Oncol. 2011;34(4):422–31.
Galle P, Forner A, Llovet J, et al. EASL clinical practice guidelines: management of hepatocellular carcinoma. J Hepatol. 2018;69(1):182–236.
Vilgrain V, Pereira H, Assenat E, et al. Efficacy and safety of selective internal radiotherapy with yttrium-90 resin microspheres compared with sorafenib in locally advanced and inoperable hepatocellular carcinoma (SARAH): an open-label randomised controlled phase 3 trial. Lancet Oncol. 2017;18(12):1624–36.
Chow P, Gandhi M, Tan S, et al. SIRveNIB: Selective internal radiation therapy versus sorafenib in Asia-Pacific patients with hepatocellular carcinoma. J Clin Oncol. 2018;36(19):1913–21.
Wasan H, Gibbs P, Sharma N, et al. First-line selective internal radiotherapy plus chemotherapy versus chemotherapy alone in patients with liver metastases from colorectal cancer (FOXFIRE, SIRFLOX, and FOXFIRE-Global): a combined analysis of three multicentre, randomised, phase 3 trials. Lancet Oncol. 2017;18(9):1159–71.
Garin E, Rolland Y, Campillo-Gimenez B, et al. Negative phase 3 study of 90Y microspheres versus sorafenib in HCC. Lancet Oncol. 2018;19(2):e70.
Braat A, Kappadath S, Bruijnen R, et al. Adequate SIRT activity dose is as important as adequate chemotherapy dose. Lancet Oncol. 2017;18(11):e636.
Garin E, Rolland Y, Pracht M, et al. High impact of macroaggregated albumin-based tumour dose on response and overall survival in hepatocellular carcinoma patients treated with 90Y-loaded glass microsphere radioembolization. Liver Int. 2017;37(1):101–10.
Garin E, Lenoir L, Edeline J, et al. Boosted selective internal radiation therapy with 90Y-loaded glass microspheres (B-SIRT) for hepatocellular carcinoma patients: a new personalized promising concept. Eur J Nucl Med Mol Imaging. 2013;40:1057–68.
Monsky WL, Garza AS, Kim I, et al. Treatment planning and volumetric response assessment for yttrium-90 radioembolization: semiautomated determination of liver volume and volume of tumor necrosis in patients with hepatic malignancy. Cardiovasc Intervent Radiol. 2011;34(2):306–18.
Haste P, Tann M, Persohn S, et al. Correlation of technetium-99m macroaggregated albumin and yttrium-90 glass microsphere biodistribution in hepatocellular carcinoma: a retrospective review of pretreatment single photon emission CT and posttreatment positron emission tomography/CT. J Vasc Interv Radiol. 2017;28(5):722–30.
Garin E, Rolland Y, Laffont S, et al. Clinical impact of 99mTc-MAA SPECT/CT-based dosimetry in the radioembolization of liver malignancies with 90Y-loaded microspheres. Eur J Nucl Med Mol Imaging. 2016;43(3):559–75.
Song YS, Paeng JC, Kim HC, et al. PET/CT-Based dosimetry in 90Y-microsphere selective internal radiation therapy: single cohort comparison with pretreatment planning on (99m)Tc-MAA imaging and correlation with treatment efficacy. Medicine (Baltimore). 2015;94(23):e945.
Chan KT, Alessio AM, Johnson GE, et al. Prospective trial using internal pair-production positron emission tomography to establish the yttrium-90 radioembolization dose required for response of hepatocellular carcinoma. Int J Radiat Oncol Biol Phys. 2018;101(2):358–65.
Chan KT, Alessio AM, Johnson GE, et al. Hepatotoxic dose thresholds by positron-emission tomography after yttrium-90 radioembolization of liver tumors: a prospective single-arm observational study. Cardiovasc Intervent Radiol. 2018;41(9):1363–72.
Kao Y, Steinberg J, Tay Y, et al. Post-radioembolization yttrium-90 PET/CT - part 2: dose-response and tumor predictive dosimetry for resin microspheres. EJNMMI Res. 2013;3(1):57.
Srinivas S, Natarajan N, Kuroiwa J, et al. Determination of radiation absorbed dose to primary liver tumors and normal liver tissue using post radioembolization 90Y PET. Front Oncol. 2014;4:255.
Salem R, Padia SA, Lam M, et al. Clinical and dosimetric considerations for Y90: recommendations from an international multidisciplinary working group. Eur J Nucl Med Mol Imaging. 2019;46(8):1695–704.
Chiesa C, Maccauro M, Romito R, et al. Need, feasibility and convenience of dosimetric treatment planning in liver selective internal radiation therapy with 90Y microspheres: the experience of the National Tumor Institute of Milan. Q J Nucl Med Mol Imaging. 2011;55(2):168–97.
Garin E, Rolland Y, Edeline J, et al. Personalized dosimetry with intensification using 90Y-loaded glass microsphere radioembolization induces prolonged overall survival in hepatocellular carcinoma patients with portal vein thrombosis. J Nucl Med. 2015;56(3):339–46.
Ho S, Lau WY, Leung TWT, et al. Clinical evaluation of the partition model for estimating radiation doses from yttrium-90 microspheres in the treatment of hepatic cancers. Eur J Nucl Med. 1997;24:293–8.
Kao YH, Hock Tan AE, Burgmans MC, et al. Image-guided personalized predictive dosimetry by artery-specific SPECT/CT partition modeling for safe and effective 90Y radioembolization. J Nucl Med. 2012;53(4):559–66.
Jadoul A, Bernard C, Lovinfosse P, et al. Comparative dosimetry between 99mTc-MAA SPECT/CT and 90Y PET/CT in primary and metastatic liver tumors. Eur J Nucl Med Mol Imaging. 2019;6.
Jafargholi Rangraz E, Coudyzer W, Maleux G, et al. Multi-modal image analysis for semi-automatic segmentation of the total liver and liver arterial perfusion territories for radioembolization. EJNMMI Res. 2019;9(1):19.
We would like to express our gratitude toward the statistician Laurence Seidel.
Olivier Detry is supported by a grant from the “Fondation contre le Cancer.”
Ethics approval and consent to participate
All procedures were performed in accordance with the principles of the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The study design and exemption from informed consent were approved by the Institutional Review Board of Liege University Hospital.
Consent for publication
CHU de Liège received speaker fees from Biocompatibles UK Ltd.
Cite this article
Meyers, N., Jadoul, A., Bernard, C. et al. Inter-observer variability of 90Y PET/CT dosimetry in hepatocellular carcinoma after glass microspheres transarterial radioembolization. EJNMMI Phys 7, 29 (2020). https://doi.org/10.1186/s40658-020-00302-1