- Original research
- Open Access
Comparison of novel multi-level Otsu (MO-PET) and conventional PET segmentation methods for measuring FDG metabolic tumor volume in patients with soft tissue sarcoma
EJNMMI Physics volume 4, Article number: 22 (2017)
We have previously developed a novel and highly consistent PET segmentation algorithm using a multi-level Otsu method (MO-PET). The aim of this study was to evaluate the reliability of MO-PET compared to conventional PET segmentation methods for measuring 18F-FDG (FDG) PET metabolic tumor volume (MTV) in patients with soft tissue sarcoma (STS). Clinical and imaging data were obtained from the Cancer Imaging Archive. Forty-eight STS patients with FDG PET/CT and MR prior to therapy were analyzed. MTV of the tumor using MO-PET was compared to that obtained with conventional methods (absolute SUV threshold values of 2.0, 2.5, or 3.0 and percentage of tumor SUVmax values of 30, 40, 50, or 60%) and with a gradient-based method (PET Edge™). The reference volume was defined as an MR-based gross tumor volume (GTV). Spearman correlation, intra-class correlation, and Bland-Altman analyses were performed to evaluate the correlation and agreement of MTV with GTV.
MTVs obtained using each conventional SUV parameter, PET Edge™, and MO-PET were highly correlated with the GTV in Spearman and intra-class correlation analysis (p < 0.05). MO-PET and PET Edge™ showed high intra-class correlation coefficient of MTV to GTV (0.93 and 0.84, respectively). The Bland-Altman bias results showed the highest agreement for MTV using MO-PET with GTV (26.0 ± 489.6 cm3) compared to other methods (SUV 2.0 with − 69.3 ± 765.8, 30% SUVmax with − 255.0 ± 876.6, and PET Edge™ with − 26.46 ± 668.82 cm3).
PET MTV segmented with MO-PET showed higher correlation and agreement with GTV in comparison to conventional percentage SUVmax and absolute SUV threshold-based PET segmentation methods. MO-PET is comparable to PET Edge™. MO-PET is a reliable and consistent method for measuring tumor MTV.
18F-fluoro-2-deoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) is widely used for the initial diagnosis, restaging, and treatment response evaluation of many kinds of tumors. Among the multiple parameters that may be obtained from FDG PET/CT, the standardized uptake value (SUV) is generally measured and accepted as an effective index. Several previous studies reported that tumor maximum SUV (SUVmax) is related to the prognosis of cancers [3,4,5]. However, SUV has some limitations. The SUV measurement can be affected by many factors including time, blood glucose concentration, and partial volume effects. SUVmax does not reflect the metabolic activity of the entire tumor, representing only the maximum SUV in a voxel contained within a tumor region-of-interest. Also, in some tumors, the SUVmax is not correlated with the prognosis [7, 8]. Due to these limitations, it is difficult to use SUVmax alone for the prediction of tumor prognosis, and other significant PET indexes are needed. Parameters including metabolic tumor volume (MTV) and total lesion glycolysis (TLG) have emerged to complement SUV. It was reported that MTV is related to the prognosis of various cancers [1, 3, 9, 10].
The definition of MTV, which is related to the distribution of metabolic activity, is the volume of hypermetabolic tissue that has metabolic activity exceeding a defined threshold. In order to accurately measure MTV for cancer prognosis, various PET tumor segmentation methods have been attempted. These conventional methods include the absolute SUV threshold method (e.g., SUV 2.0), the fixed percentage SUVmax threshold method (e.g., 30% SUVmax), and the signal-to-background method [9, 12]. However, the measured MTV differs according to the segmentation method used, and there is no standard method for measuring MTV. Therefore, which of the methods currently in use can best serve as a reference method remains controversial.
Multi-level Otsu methods have been applied in several other application areas, including segmentation problems related to CT images. In the field of PET imaging, a variation of the basic Otsu method has been introduced as a solution to the PET segmentation problem. However, in our literature search, we did not find any prior work on the use of the multi-level Otsu threshold technique applied to PET. We have applied this multi-level Otsu method to PET segmentation (MO-PET), as previously reported [14, 15]. It was demonstrated that the MO-PET segmentation method is relatively accurate, stable, and consistent across a range of lesion sizes and PET lesion-to-background ratios representative of clinical tumor lesions [14, 15]. The MO-PET algorithm and method are summarized below and detailed in this reference (https://www.google.com/patents/WO2016160538A1?cl=en).
The multi-level Otsu method, based on a more commonly known image threshold method known as Otsu’s method, is a simple and very effective clustering-based approach to convert a gray-level image to a binary image. The original Otsu method assumes that the image contains two classes of pixels (e.g., foreground and background) and then calculates the optimum threshold that separates the two classes. The optimum threshold is computed such that the intra-class variance within the foreground and background classes is minimal, which also corresponds to maximizing the inter-class variance between the two classes. The multi-level Otsu method extends the same basic idea, i.e., minimization of the intra-class variance (which in turn results in maximization of the inter-class variance), to images that contain clusters of pixel populations representing different structures that can therefore be classified at multiple threshold levels. Mathematically, the MO-PET algorithm expands the original Otsu equation for classification into two pixel groups into an equation for classification into an arbitrary number of classes. Thus, given the probability of occurrence P_i of a pixel value i, the algorithm calculates the mean pixel level (μ) of the image and the inter-class variance (σ):
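For reference, the standard multi-level Otsu quantities can be written as follows. This is a sketch of the textbook formulation rather than a reproduction of the authors' exact expressions: the class weights ω_k, the class means μ_k, and the conventions T_0 = 0 and T_K = L are notational assumptions introduced here, and the quantity labeled σ in the text is written as the variance σ².

```latex
\begin{align}
\mu &= \sum_{i=1}^{L} i\,P_i, \\
\omega_k &= \sum_{i=T_{k-1}+1}^{T_k} P_i, \qquad
\mu_k = \frac{1}{\omega_k}\sum_{i=T_{k-1}+1}^{T_k} i\,P_i, \qquad k = 1,\dots,K, \\
\sigma^2 &= \sum_{k=1}^{K} \omega_k\,(\mu_k-\mu)^2 ,
\end{align}
```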
where i is an individual SUV value (within the SUV range), L is the maximum SUV level in a given image, and T1, T2, … TK-1 are the multiple threshold levels that can be computed in a given image based on the distribution of SUV within the image or a region-of-interest. The threshold values are determined by exhaustively searching through all sets of threshold levels for the given number of classes (K) into which the image needs to be divided, to find the combination that gives the minimum within-class variance (or, equivalently, the maximum inter-class variance). Thus, ultimately, the algorithm generates K classes and K-1 thresholds for a given image.
In this research, MO-PET, an automatic algorithm requiring minimal user input, was used for measuring the MTV of soft tissue sarcoma. MTVs measured with MO-PET and with other conventional methods were compared in order to evaluate the usefulness and robustness of MO-PET.
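The exhaustive search described above can be sketched in a few lines. This is a minimal illustration of generic multi-level Otsu thresholding on a histogram, not the MO-PET implementation: the function name `multi_otsu_thresholds`, the use of histogram bin indices in place of SUV levels, and the tie-breaking rule (first maximum found) are all assumptions made for this example.

```python
from itertools import combinations


def multi_otsu_thresholds(hist, n_classes):
    """Return K-1 bin indices that maximize the inter-class variance.

    hist      -- list of voxel counts per bin (bin index stands in for SUV level)
    n_classes -- K, the number of classes to split the histogram into
    A threshold t means "bins <= t belong to the lower class".
    """
    n_bins = len(hist)
    total = sum(hist)
    if total == 0 or n_classes < 2 or n_classes > n_bins:
        raise ValueError("need a non-empty histogram and 2 <= K <= number of bins")

    probs = [count / total for count in hist]

    # Prefix sums of P_i and i*P_i so each class weight and mean is O(1).
    cum_w, cum_m = [0.0], [0.0]
    for i, p in enumerate(probs):
        cum_w.append(cum_w[-1] + p)
        cum_m.append(cum_m[-1] + i * p)
    mu_total = cum_m[-1]

    best_var, best_cuts = -1.0, None
    # Try every strictly increasing set of K-1 thresholds (exhaustive search).
    for cuts in combinations(range(n_bins - 1), n_classes - 1):
        edges = (-1,) + cuts + (n_bins - 1,)
        var = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            weight = cum_w[hi + 1] - cum_w[lo + 1]
            if weight > 0:
                mu_k = (cum_m[hi + 1] - cum_m[lo + 1]) / weight
                var += weight * (mu_k - mu_total) ** 2
        if var > best_var:
            best_var, best_cuts = var, cuts
    return list(best_cuts)


# Toy histogram with three separated clusters of bins.
toy_hist = [10, 10, 0, 0, 5, 5, 0, 0, 8, 8]
thresholds = multi_otsu_thresholds(toy_hist, 3)
```

The number of candidate threshold sets grows combinatorially with the number of bins and classes, so a practical implementation would typically restrict the histogram to the VOI and use a modest number of bins.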
The clinical and imaging data were obtained from the Cancer Imaging Archive (TCIA: http://www.cancerimagingarchive.net), an archive of medical images of cancer provided through the National Cancer Institute (NCI). TCIA is an open-source and open-access database. The soft tissue sarcoma dataset in the TCIA was used for this study [19, 20]. This dataset was acquired under a research ethics board (REB) approval by Vallières et al. The dataset comprises a total of 51 patients with soft tissue sarcoma.
All PET/CT and MRI images were analyzed with Mirada RTx (Mirada Medical Ltd., Denver, CO, USA). The MO-PET segmentation algorithm was additionally developed and implemented as a plugin for ImageJ (https://imagej.nih.gov/ij/index.html), an image processing program developed by NIH. The plugin provided support functions for reading and storing geometric contour information in the RTSS data file format so that the contours could be exchanged between Mirada RTx and the plugin. One ellipsoidal volume of interest (VOI) containing the primary tumor was drawn on each PET image. Various thresholds (absolute SUV threshold values of 2.0, 2.5, or 3.0, and fixed percentages of SUVmax of 30, 40, 50, or 60%) were applied to this VOI, and the MTV obtained for each threshold was termed MTV (2.0), MTV (2.5), MTV (3.0), MTV (30%), MTV (40%), MTV (50%), or MTV (60%), respectively. MTV using the MO-PET software (MTV (MO-PET)) was obtained using the identical VOI that was applied for the various threshold methods. For the reference standard volume, the tumor contours defined on the MRI were used. The MR contours, which had previously been drawn manually on T2-weighted fat-suppression (T2FS) scans by Vallières et al., were obtained from the TCIA database [19, 20]. The gross MR-based tumor volume (GTV) was measured on the MR images with Mirada RTx using the obtained MR contours. MTV was also measured with PET Edge™ (MIM software Inc., Cleveland, OH, USA), a gradient-based PET segmentation method. The ratio of each MTV to the GTV was calculated in order to evaluate the accuracy of each MTV segmentation method. The closer the ratio of MTV to GTV is to 1, the better the segmentation method is considered to agree with the GTV.
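The two conventional threshold families described above can be illustrated with a short sketch. It assumes the VOI has already been flattened to a list of voxel SUVs and that all voxels share one volume; the function names, the inclusive comparison (`>=`), and the toy numbers are assumptions for illustration and do not reflect how Mirada RTx computes these volumes.

```python
def mtv_absolute(suv_values, voxel_volume_cm3, suv_threshold):
    """MTV (cm3) of voxels in the VOI whose SUV is at or above an absolute threshold."""
    n_voxels = sum(1 for suv in suv_values if suv >= suv_threshold)
    return n_voxels * voxel_volume_cm3


def mtv_percent_suvmax(suv_values, voxel_volume_cm3, fraction):
    """MTV (cm3) of voxels at or above a fixed fraction (e.g. 0.30) of the VOI SUVmax."""
    cutoff = fraction * max(suv_values)
    return mtv_absolute(suv_values, voxel_volume_cm3, cutoff)


# Toy VOI: five voxels of 0.5 cm3 each (hypothetical values).
voi_suv = [1.0, 2.0, 2.5, 4.0, 10.0]
mtv_2_0 = mtv_absolute(voi_suv, 0.5, 2.0)        # voxels with SUV >= 2.0
mtv_30 = mtv_percent_suvmax(voi_suv, 0.5, 0.30)  # voxels with SUV >= 30% of SUVmax

# Ratio of MTV to a hypothetical reference GTV; values near 1 indicate closer agreement.
gtv_cm3 = 2.0
ratio_2_0 = mtv_2_0 / gtv_cm3
```

Because the percentage cutoff scales with SUVmax while the absolute cutoff does not, the same VOI can yield quite different volumes under the two rules, which is the behavior the comparison in this study examines.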
Data are expressed as mean ± SD. Spearman correlation, intra-class correlation coefficient, and Bland-Altman analysis were used to compare the MTVs obtained with the various thresholds and MO-PET. Each volume was compared to the MRI-derived GTV. Data were evaluated using statistics software (MedCalc version 10.1.7.0, MedCalc Software, Mariakerke, Belgium).
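As an illustration of the agreement analysis, a Bland-Altman summary can be computed as sketched below. The study used MedCalc; this stand-alone version, the function name `bland_altman`, the sign convention (MTV minus GTV), and the 1.96 × SD limits of agreement are assumptions chosen for the example, and the toy volumes are not study data.

```python
from statistics import mean, stdev


def bland_altman(measured, reference):
    """Return (bias, SD of differences, (lower, upper) 95% limits of agreement)."""
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    # Differences are taken as measured minus reference (here: MTV minus GTV).
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits


# Hypothetical volumes in cm3, for illustration only.
mtv_cm3 = [10.0, 20.0, 30.0]
gtv_cm3 = [8.0, 22.0, 27.0]
bias, sd, limits = bland_altman(mtv_cm3, gtv_cm3)
```

A bias near zero with narrow limits indicates that a segmentation method neither systematically over- nor underestimates the reference volume.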
Fifty-one soft tissue sarcoma cases were obtained from the TCIA. Among them, three cases were excluded because the MTV could not be measured appropriately. In one case, the tumor was located in the left upper arm adjacent to the PET/CT gantry, which made it impossible to draw a precise VOI. The tumors in the other two cases had extensive edema around the primary tumor, and the large discrepancy between tumor and edema precluded accurate tumor delineation. After excluding these 3 cases, a total of 48 cases were included in the final analysis. The tumor features, including histology, location, grade, SUVmax, and GTV, are summarized in Table 1. Detailed clinical data can be accessed via the TCIA site (https://doi.org/10.7937/K9/TCIA.2015.7GO2GSKS).
The ratio of the MTV obtained using each threshold and MO-PET to the gross MR-based tumor volume (GTV) was calculated. MO-PET and the gradient-based method showed MTV to GTV ratios close to 1, at 1.12 ± 0.42 and 1.08 ± 0.38, respectively (Table 2). These two methods also showed the highest Spearman and intra-class correlation with GTV. Percentage SUVmax and absolute SUV threshold-based PET segmentation did not perform as well in comparison to GTV. Among the fixed percentage SUVmax threshold methods (30% SUVmax, 40% SUVmax, 50% SUVmax, and 60% SUVmax), MTV (30%) showed the highest ratio of MTV to GTV (0.64 ± 0.35), while MTV (2.0) showed the highest ratio of MTV to GTV (1.00 ± 0.61) among the absolute SUV threshold methods (SUV 2.0, SUV 2.5, and SUV 3.0).
MTV with MO-PET and with the gradient-based method demonstrated similarly high correlation with GTV (Spearman correlation coefficients of 0.945 and 0.947, respectively; Table 2). Percentage SUVmax and absolute SUV threshold-based PET segmentation did not perform as well in comparison. Spearman correlation coefficients (r) of MTVs using 30% SUVmax, 40% SUVmax, 50% SUVmax, 60% SUVmax, SUV 2.0, SUV 2.5, and SUV 3.0 to the reference GTV were 0.738, 0.621, 0.426, 0.291, 0.799, 0.680, and 0.561, respectively (50% SUVmax, p = 0.003; 60% SUVmax, p = 0.045; all other parameters, p < 0.001; Table 2). In the correlation graphs, the MTV measured by each method showed a significant correlation with GTV (Fig. 1). Furthermore, the trend line of MTV (MO-PET) followed GTV most closely compared with those of the other MTVs.
MO-PET showed the highest intra-class correlation coefficient compared to the reference GTV with 0.93 (95% CI, 0.88–0.96; Table 2), slightly better than the gradient-based method with 0.84 (95% CI, 0.71–0.9; Table 2), although the difference between the two methods was not statistically significant.
The Bland-Altman analysis showed biases between each MTV and the reference GTV, with the lowest variation for MO-PET. The biases of MTV with MO-PET, 30% SUVmax, SUV 2.0, and PET Edge™ were 26.0 ± 489.6, −255.0 ± 876.6, −69.3 ± 765.8, and −26.46 ± 668.82 cm3, respectively (Fig. 2).
Images of representative cases are shown in Fig. 3.
In this study, we evaluated the usefulness of the newly developed PET segmentation method, MO-PET, for measuring MTV. MTVs measured using MO-PET and various threshold methods were compared to the MRI-derived GTV obtained from the TCIA database. It was demonstrated that MO-PET and the gradient-based method (PET Edge™) showed comparable MTV, with the highest correlation to GTV. In addition, these two methods were superior to the absolute SUV and percentage SUVmax threshold methods.
As shown in the results, the calculated ratio of MTV (2.0) to GTV was closest to 1. However, the SD of the MTV (MO-PET) ratio was smaller than that of MTV (2.0). Furthermore, the Spearman and intra-class correlation coefficients of MTV (MO-PET) were higher than those of MTV (2.0). According to the Bland-Altman analysis, MTV (MO-PET) showed better agreement with the MR-based tumor volume than the other methods.
The Bland-Altman analysis showed that MTV (MO-PET) agreed consistently with GTV regardless of the tumor volume, while the tumor volume measured using the absolute SUV threshold (SUV 2.0) or the fixed percentage SUVmax threshold (30% SUVmax) showed greater discrepancy as the tumor volume increased. Furthermore, the absolute SUV threshold methods showed some limitations in tumor delineation in cases where there was heterogeneous metabolic activity in the tumor. In such instances, the parts of the tumor with metabolic activity lower than the threshold could be excluded from the MTV measurement, which may result in underestimation of the metabolic tumor volume. On the other hand, MO-PET showed good tumor delineation in cases where the tumor had heterogeneous SUV. In tumors with high SUVmax, the fixed percentage SUVmax threshold method may underestimate MTV. Likewise, the MTV of tumors with low SUVmax may be underestimated with the absolute SUV threshold method; such underestimation of MTV in patients with low SUVmax has been reported. The MO-PET algorithm may solve this problem.
In order to further evaluate MO-PET against commercially available software, PET Edge™ of MIM software was used to measure MTV. PET Edge™ measures MTV using a gradient-based segmentation method. The Spearman correlation coefficients of MO-PET and PET Edge™ were similar, and the intra-class correlation coefficient of MO-PET was slightly higher than that of PET Edge™. Segmentation using MO-PET was therefore comparable to the gradient-based segmentation method. However, MO-PET derives the tumor contour from a simple VOI, while the gradient-based segmentation method requires manual adjustment of the tumor contour. As a result, tumor contours obtained with the gradient-based method may be less reproducible if the tumor is irregular in shape or has a large necrotic portion.
Recently, MTV has been increasingly studied for the prediction of the prognosis of various cancers [7, 8, 10, 11]. A stronger correlation with tumor prognosis has also been reported for MTV than for SUVmax. However, no ideal method has been established for measuring MTV, and it is difficult to predict prognosis with MTV measured using various non-established threshold methods. Many conventional methods are used for tumor segmentation to measure MTV, including the absolute SUV threshold method, the percentage SUVmax threshold method, the lesion-to-background method, and the gradient method, and the resulting MTV depends on the method used [11, 12, 21]. Manually drawn segmentation can also be used on MRI or CT with visual assessment; however, the tumor volume obtained in this way can be affected by how the segmentation is drawn. Therefore, a reproducible and automatic tumor segmentation method is needed. The MO-PET method was developed in order to overcome these limitations. We previously showed, in a standard NEMA image quality phantom study, that MO-PET is relatively accurate, stable, and consistent for measuring MTV compared to conventional threshold methods. In addition, the present study shows that MO-PET can be applied to clinical images.
Regarding soft tissue sarcoma, which was analyzed in this study, several studies on the correlation between PET parameters and tumor prognosis have been reported. However, the reported results conflict with each other. For instance, one study reported a positive correlation between PET parameters (including SUV and other volume parameters) and metastasis. Another reported that SUVmax and other volume parameters including MTV and TLG are related to tumor prognosis. In contrast, Hong et al. reported that volume-based parameters are not correlated with tumor prognosis and that only SUVmax is correlated with disease progression. It was also reported that TLG is a superior prognostic index to SUVmax and MTV. Due to these contradictions and discrepancies, it is necessary to study the correlation between MTV and tumor prognosis with optimal MTV measurements.
There were two limitations in this study: (1) the reference standard volume was defined using the tumor contour on the MRI as an appropriate anatomic comparator for PET MTV, but MRI tumor contours are not necessarily a definitive reference standard, and there may be discrepancies between the volume defined by the MRI contour and the actual pathologic volume; (2) the PET/CT images analyzed in this study are not whole-body images, so the lesion-to-background method could not be compared in this study.
In conclusion, PET MTV segmented with the MO-PET method showed higher correlation and agreement with MRI-based GTV in comparison to conventional percentage SUVmax threshold and absolute SUV threshold-based PET segmentation methods. MO-PET is a reliable and consistent method for measuring tumor MTV. Quantitation of tumor metabolic burden using the MO-PET segmentation method shows promise for future clinical applications.
Gallamini A, Zwarthoed C, Borra A. Positron emission tomography (PET) in oncology. Cancers. 2014;6(4):1821–89.
Singh D, Miles K. Multiparametric PET/CT in oncology. Cancer Imaging. 2012;12(2):336–44.
Partin AW, Kattan MW, Subong EN, Walsh PC, Wojno KJ, Oesterling JE, et al. Combination of prostate-specific antigen, clinical stage, and Gleason score to predict pathological stage of localized prostate cancer. A multi-institutional update. JAMA. 1997;277(18):1445–51.
Pan L, Gu P, Huang G, Xue H, Wu S. Prognostic significance of SUV on PET/CT in patients with esophageal cancer: a systematic review and meta-analysis. Eur J Gastroenterol Hepatol. 2009;21(9):1008–15.
Park B, Kim HK, Choi YS, Kim J, Zo JI, Choi JY, et al. Prediction of pathologic grade and prognosis in mucoepidermoid carcinoma of the lung using 18F-FDG PET/CT. Korean J Radiol. 2015;16(4):929–35.
Park GC, Kim JS, Roh J-L, Choi S-H, Nam SY, Kim SY. Prognostic value of metabolic tumor volume measured by 18F-FDG PET/CT in advanced-stage squamous cell carcinoma of the larynx and hypopharynx. Ann Oncol. 2013;24(1):208–14.
Ryu IS, Kim JS, Roh JL, Lee JH, Cho KJ, Choi SH, et al. Prognostic value of preoperative metabolic tumor volume and total lesion glycolysis measured by 18F-FDG PET/CT in salivary gland carcinomas. J Nucl Med. 2013;54(7):1032–8.
Chung HH, Lee I, Kim HS, Kim JW, Park N-H, Song YS, et al. Prognostic value of preoperative metabolic tumor volume measured by 18F-FDG PET/CT and MRI in patients with endometrial cancer. Gynecol Oncol. 2013;130(3):446–51.
Moon SH, Hyun SH, Choi JY. Prognostic significance of volume-based PET parameters in cancer patients. Korean J Radiol. 2013;14(1):1–12.
Pak K, Cheon GJ, Nam HY, Kim SJ, Kang KW, Chung JK, et al. Prognostic value of metabolic tumor volume and total lesion glycolysis in head and neck cancer: a systematic review and meta-analysis. J Nucl Med. 2014;55(6):884–90.
Dibble EH, Alvarez AC, Truong MT, Mercier G, Cook EF, Subramaniam RM. 18F-FDG metabolic tumor volume and total glycolytic activity of oral cavity and oropharyngeal squamous cell cancer: adding value to clinical staging. J Nucl Med. 2012;53(5):709–15.
Sridhar P, Mercier G, Tan J, Truong MT, Daly B, Subramaniam RM. FDG PET metabolic tumor volume segmentation and pathologic volume of primary human solid tumors. AJR Am J Roentgenol. 2014;202(5):1114–9.
Khamwan K, Krisanachinda A, Pluempitiwiriyawej C. Automated tumour boundary delineation on 18F-FDG PET images using active contour coupled with shifted-optimal thresholding method. Phys Med Biol. 2012;57(19):5995–6005.
Im H-J, Solaiyappan M, Bradshaw T, Cho S. Validation of multi-level Otsu method to define metabolic tumor volume in positron emission tomography. J Nucl Med. 2016;57(supplement 2):1045.
Huang E, Solaiyappan M, Cho S. Improved stability and performance of 18F-FDG PET automated tumor segmentation using multi-level maximization of inter-class variance method. J Nucl Med. 2015;56(supplement 3):452.
Cho S, Solaiyappan M, Huang E, inventors; The Johns Hopkins University, assignee. Multi-level Otsu for positron emission tomography (MO-PET). 2016.
Otsu N. A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern. 1979;9(1):62–6.
Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, et al. The cancer imaging archive (TCIA): maintaining and operating a public information repository. J Digit Imaging. 2013;26(6):1045–57.
Vallières M, Freeman CR, Skamene SR, Naqa IE. A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities. Cancer Imaging Arch. 2015.
Vallières M, Freeman CR, Skamene SR, Naqa IE. A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities. Phys Med Biol. 2015;60(14):5471.
Kanoun S, Tal I, Berriolo-Riedinger A, Rossi C, Riedinger J-M, Vrigneaud J-M, et al. Influence of software tool and methodological aspects of total metabolic tumor volume calculation on baseline 18F-FDG PET to predict survival in Hodgkin lymphoma. PLoS One. 2015;10(10):e0140830.
Park SY, Yoon J-K, Park KJ, Lee SJ. Prediction of occult lymph node metastasis using volume-based PET parameters in small-sized peripheral non-small cell lung cancer. Cancer Imaging. 2015;15:21.
Liao S, Penney BC, Wroblewski K, Zhang H, Simon CA, Kampalath R, et al. Prognostic value of metabolic tumor burden on 18F-FDG PET in nonsurgical patients with non-small cell lung cancer. Eur J Nucl Med Mol Imaging. 2012;39(1):27–38.
Romesser PB, Qureshi MM, Shah BA, Chatburn LT, Jalisi S, Devaiah AK, et al. Superior prognostic utility of gross and metabolic tumor volume compared to standardized uptake value using PET/CT in head and neck squamous cell carcinoma patients treated with intensity-modulated radiotherapy. Ann Nucl Med. 2012;26(7):527–34.
Bowden P, Fisher R, Mac Manus M, Wirth A, Duchesne G, Millward M, et al. Measurement of lung tumor volumes using three-dimensional computer planning software. Int J Radiat Oncol Biol Phys. 2002;53(3):566–73.
Hong S-p, Lee SE, Choi Y-L, Seo SW, Sung K-S, Koo HH, et al. Prognostic value of 18F-FDG PET/CT in patients with soft tissue sarcoma: comparisons between metabolic parameters. Skelet Radiol. 2014;43(5):641–8.
Choi E-S, Ha S-G, Kim H-S, Ha JH, Paeng JC, Han I. Total lesion glycolysis by 18F-FDG PET/CT is a reliable predictor of prognosis in soft-tissue sarcoma. Eur J Nucl Med Mol Imaging. 2013;40(12):1836–42.
Ethics approval and consent to participate
The clinical and imaging data have been anonymized by The Cancer Imaging Archive (TCIA). These data are available on an open-access database. Therefore, the institutional review board approval was exempted. For this type of retrospective study, formal consent is not required.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.