Skip to main content
  • Original research
  • Open access
  • Published:

Fit of biokinetic data in molecular radiotherapy: a machine learning approach

Abstract

Background

In literature are reported different analytical methods (AM) to choose the proper fit model and to fit data of the time-activity curve (TAC). On the other hand, Machine Learning algorithms (ML) are increasingly used for both classification and regression tasks. The aim of this work was to investigate the possibility of employing ML both to classify the most appropriate fit model and to predict the area under the curve (τ).

Methods

Two different ML systems have been developed for classifying the fit model and to predict the biokinetic parameters. The two systems were trained and tested with synthetic TACs simulating a whole-body Fraction Injected Activity for patients affected by metastatic Differentiated Thyroid Carcinoma, administered with [131I]I-NaI. Test performances, defined as classification accuracy (CA) and percentage difference between the actual and the estimated area under the curve (Δτ), were compared with those obtained using AM varying the number of points (N) of the TACs. A comparison between AM and ML were performed using data of 20 real patients.

Results

As N varies, CA remains constant for ML (about 98%), while it improves for F-test (from 62 to 92%) and AICc (from 50 to 92%), as N increases. With AM, \(\Delta \tau\) can reach down to − 67%, while using ML \(\Delta \tau\) ranges within ± 25%. Using real TACs, there is a good agreement between τ obtained with ML system and AM.

Conclusions

The employing of ML systems may be feasible, having both a better classification and a better estimation of biokinetic parameters.

Background

Biokinetic parameters play a crucial role in molecular radiotherapy (MRT) in the assessment of the absorbed dose to lesions or organs at risk as they are closely related to treatment toxicity and efficacy [1].

Determining the time integrated activity (\(\tau\)) accurately is challenging [2, 3]. The assessment of Time Activity Curve (TAC) depends on the data (TAC-Ps) collection and on the choice of the model that best fits data, in order to calculate the time integrated activity coefficient (TIAc). Many parameters affect the fit model selection: the number of TAC-Ps, the time window, the time sampling and the error that affects each measurement.

In 2007 Glatting et al. [4] proposed the use of the corrected Akaike Information Criterion (AICc) and F-test [5,6,7,8] methods for the comparison of two different models in molecular radiotherapy. In 2013 Kletting et al. [9] showed the need of using a fitting method dedicated to MRT data whereby the physical and biological aspects of the biokinetics curve must be considered in order to compute meaningful parameters.

In recent years, several applications based on Machine Learning (ML) algorithms have been developed in nuclear medicine [10,11,12]. The first aim of this study was to implement ML systems to classify the proper curves model and predict, via regression, the TIAc. The secondary aim was to compare the ML performances with the ones obtained with fit algorithm and the analytical methods, AICc and F-test.

Methods

In order to have a better understanding of this manuscript we assume the following definitions: (i) Points of the Time-Activity Curve (TAC-Ps) are the set of points used to determine a TAC (ii) The training set (TrN) and the test set (TsN) consist of 10,000 and 2000 TAC-Ps, respectively, where each individual TAC-Ps is composed of a number of points equal to N.

The study was subdivided in 3 steps: (i) the generation of synthetic TrN and TsN to train and test the machine learning systems. (ii) The training of the ML systems, (iii) the test of the ML systems and performance evaluation in comparison with analytical methods. All these steps are fully described in the sections below and shown in Fig. 1.

Fig. 1
figure 1

Study workflow

At the end, in order to show the applicability of the developed methods, the ML systems were used to calculate the TIAc of 20 patients’ TAC-Ps.

Training and test dataset generation

To train and test the ML systems a series of TrN and TsN were synthetically generated. The aim of this first step was to generate TAC-Ps simulating whole-body fractions of injected activity (FIAWB) of patient affected by metastatic Differentiated Thyroid Carcinoma (mDTC), defined as:

$$FI{A}_{WB}\left(t\right)=\frac{{A}_{WB}\left(t\right)}{{A}_{Adm}}$$
(1)

where \({A}_{WB}\left(t\right)\) is the whole-body activity at time t and \({A}_{Adm}\) is the administered activity.

The synthetic TAC-Ps were generated with both a Mono-Exponential (MEf) and Bi-Exponential functions (BEf) into a band of possible curves, showed in Fig. 2, and their values are listed in Table 1. In the “Appendix” the generation method is reported.

Fig. 2
figure 2

All the generated TACs are into the blue and red bands for Mono-Exponential and Bi-Exponential functions respectively

To study how N affects the ML performances, 10 different TrN were created, with N ranging from 4 to 20, in a time window of 160 h. In order to simulate the patient’s hospitalization, the first N-1 points were equally distributed in a range of [0; 54] h, while the last point was set at 160 h. Each TrN was used to train the ML systems and it was derived from 5000 MEf and 5000 BEf curves.

Similarly, 10 TsN were generated to test the ML systems; they were derived from 1000 MEf and 1000 BEf curves. The generation procedure is similar to the one of the TrN, with the addition of Gaussian noise on the FIA coordinates, in order to simulate measuring errors. The amount of added noise is randomly extracted from a Gaussian distribution centred on the FIA coordinates, with a standard deviation of 5% on the FIA values [13].

Implementation and training of ML system

Two different supervised learning systems were implemented using Python scripts: one for a binary classification task (ML1) and one for a regression task (ML2). Scikit-Learn library provided different machine learning models.

Hardware was composed of a personal computer having 10th generation Intel I7 CPU with 16 GB RAM.

ML1 is an ensemble of two Logistic Regression models, which used the Soft Voting method to predict the proper class [14]. The TAC-Ps were implemented as features (2N features considering x and y coordinates), and it returns MEf or BEf class as output.

ML2 is an AdaBoost ensemble composed of 5 Gradient Boosting Regressors (GBRs), working sequentially. Each GBR was designed as a chain of 1000 Decision Tree Regressors [14, 16]. The ML2 task is to predict the TIAcs (\({A}_{i}, {\lambda }_{i}\)), in order to calculate \(\tau\). In ML2, TAC-Ps and fit model (MEf or BEf) were used as input features (2N + 1 features). Each TrN with a specific number of points individually trains ML2. Therefore, even though the ML2 algorithms are always the same, ML2 consists of 10 ML2N, depending on the number of points N. To simplify, the subscript N is omitted when referring to ML2 in the rest of this text.

The adopted models from Python Scikit-Learn library were customized with hyperparameters [17] reported in Tables 2 and 3, for the classifier and the regressor respectively. All the other ones that are not reported in tables were set with the default values proposed by the library.

Table 1 TIAc Parameters for the band of possible TACs
Table 2 Scikit-Learn hyperparameters used for the customization of ML1
Table 3 Scikit-Learn hyperparameters used for the customization of ML2

Analytical methods: AICc and F-test

Each TAC-Ps of TsN were fitted using the Trust-Region algorithm, implemented on the Matlab toolkit, while a stripping algorithm was used to evaluate the fit starting point [18]. Each TAC-Ps was fitted both with MEf and BEf and the two results were compared using the AICc and F-Test (with a confidence interval of 95%), as mentioned by Kletting et al. [4].

As reported in [7] the AICc is described by the following equation:

$$AICc=N\cdot {\text{ln}}\left(\frac{SS}{N}\right)+2\cdot K+\frac{2\cdot K\cdot (K+1)}{N-K-1}$$
(2)

where K is the number of estimated parameters included in the model, N is the number of points, and SS is the sum of squared deviations (between the measurement and the fitted curve). The model with the lower AICc score is the model that is more likely to be correct. Once AICc has been calculated for each fitting model, it is possible to compute the probability that the correct model (i) has been chosen, as follows:

$${w}_{i}=\frac{{e}^{\left(-\frac{\Delta }{2}\right)}}{1+{e}^{-\frac{\Delta }{2}}}$$
(3)

where \(\Delta\) is the difference between the AICc score.

The model with the minimal AICc value between all candidate models indicates the best model.

The F-test is a hypothesis test in which only two models can be compared using the following equation [4, 7]:

$$F=\frac{(S{S}_{ME}-S{S}_{BE})/S{S}_{BE}}{(D{F}_{ME}-D{F}_{BE})/D{F}_{BE}}$$
(4)

where \(SS\) are the sums of squared deviations (for MEf and BEf) and DF are the degrees of freedom (DF = N-K). The decision to accept or discard the MEf model is based on the p-value calculated from the F ratio. For a P value below the chosen significance level (0.05) the MEf model is rejected and therefore the more complex model is assumed to fit the data in a significantly better way.

Test and performance evaluation

A tenfold cross validation of ML1 system has been performed using 80% of the TrN in training and the remaining 20% for testing. The classification accuracy (CA) of the correct fit model was chosen as performance estimator. It is defined as:

$$CA=\frac{C{C}_{ME}+C{C}_{BE}}{2000}$$
(5)

where \({CC}_{ME}\) and \(C{C}_{BE}\) are the number of correct classifications for mono- and bi-exponential functions respectively, and 2000 is the number of TACs.

The ability of ML1 system, AICc and F-test to correctly classify the model was evaluated, determining the CA using TsN as input.

The chosen parameter to evaluate the goodness of TIAc prediction was the area under each TAC (\(\tau )\), assessed through the following equation [15]:

$$\tau =\sum_{i=1}^{n}\frac{{A}_{i}}{{\lambda }_{i}}$$
(6)

where \(n\) is equal to 1 for MEf and 2 for BEf and \({A}_{i}\) and \({\lambda }_{i}\) are the parameters obtained from the fit.

Two different tests were performed to evaluate the goodness of TIAc prediction: the first had the aim to evaluate the best possible performances of the fit algorithm and ML2, considering a classification with no errors. Meanwhile the second test evaluated the performances of the chains ML1 + ML2, fit + AICc and fit + F-Test, considering all the classification errors. In addition, a fourth chain was considered, named fit + AICc-W. In this method, an average model was considered, taking into account the probability of each model given by the parameter \({w}_{i}\). The area \(\tau\) was calculated as follows:

$$\tau ={\tau }_{MEf}\cdot {w}_{MEf}+{\tau }_{BEf}\cdot {w}_{BEf}$$
(7)

The performances of the two tests were evaluated assessing the distributions of the percentage differences \((\mathrm{\Delta \tau })\) between the calculated \(\tau\) and true one, the interquartile range and the Maximum Error Range (MER). The MER was represented as the range between the minimum and maximum value of \(\mathrm{\Delta \tau }\).

Test on patient data

The whole-body TAC-Ps of 20 patients, affected by mDTC, were evaluated employing both methods: Fit + F-Test and ML1 + ML2. All TAC data were obtained by calculating the geometric mean of measurements taken with a plastic scintillator placed at a distance of 4 m from the patient and calibrated in terms of H*(10), providing a value of the dose rate in µSv/h.

Each TAC-Ps had 5 or 6 points and they were uniformly classified as 10 MEf and 10 BEf by the F-Test.

Results

The computation time to train ML1 and ML2 sequentially is about eight minutes, while the computer takes few seconds to perform the test with 2000 curves. The same hardware takes about 10 min to perform fit + F-test and fit + AICc using the same datasets.

The CA in the cross validation, varying the number of points (N), is reasonably constant at a value of 99%. Figure 3 shows the CA of the three systems using the test dataset that is about constant at the value of 98% for ML1, while it increases from 50 to 92% for AICc and from 62 to 92% for F-test, when the number of points increases.

Fig. 3
figure 3

Classification accuracy obtained by three methods varying the number of points of the TACs simulated

Figure 4 shows the \(\mathrm{\Delta \tau }\) distributions obtained using a classification without errors. The median values remain approximately constant around 0% for both methods, and also the width of the distribution between the first and third interquartile is independent from the number of points with values equal to 4.5% and 3.6% for ML2 and the fit algorithm respectively. MER are represented by the bands in Fig. 4, which become thinner as the number of points increases, and their ranges vary from [− 14.2; 14.5]% to [− 8.1; 7.5]%, and from [− 16.8; 22.1]% to [− 6.7; 17.3]% for ML2 and fit algorithm respectively.

Fig. 4
figure 4

Boxplots representing the Δτ distributions and the MER bands obtained using the fit algorithm (blue) and ML2 (red). For each set of measures the performances of the ML system and the fit algorithm were compared only for the assessment of the AUC, since it was known a priori if the points follow a mono-exponential or a bi-exponential trend. Through the estimation of the fit model parameters the AUC was calculated and the distributions were represented

The \(\mathrm{\Delta \tau }\) distributions shown in Fig. 5 were obtained including any classification failure from AICc, AICc-W, F-test and ML1. Even in this case, varying N, the median values are approximately constant for all the considered methods, while interquartile range varies from 24.7 to 3.1% for the fit + AICc, from 24.6 to 3.1% for the fit + AICc-W and from 15.8% to 3.1% for the fit + F-test. The width remains constant around 4.5% for the ML1 + ML2 system. With the increase of the number of points, MER varies from [− 66.7; 11.9]% to [− 19.3; 20.9]% for fit + AICc, from [− 66.7; 12.0] to [− 20.7; 18.1] for fit + AICc-W, from [− 65.3; 32.1]% to [− 15.1; 20.9]% for fit + F-test and from [− 25.2; 23.6] to [− 10.6; 12.5]% for ML1 + ML2.

Fig. 5
figure 5

Boxplots representing the Δτ distributions and the MER bands obtained using the fit + AICc (blue), fit + AICc-W (green), fit + F-test (grey) and ML1 + ML2 (red). For the analytical methods, points were fitted with both models and the AICc and F-Test determine which AUC should be considered. For ML, first the best fit model was predicted and then the model parameters in order to calculate the AUC

When N < 8, \(\mathrm{\Delta \tau }\) distributions obtained by AICc and F-Test are asymmetrical, while for N > 8 the distribution width, obtained with the three methods, are equally distributed. The MER band of ML2 is always thinner than the other two.

Considering the TAC-Ps of 20 patients, 9 out of the 10 MEfs were classified as BEfs, while all the 10 presented BEfs were confirmed to be BEfs by ML1. Figure 6 shows the correlation between the \(\tau\) calculated with ML1 + ML2 and the one calculated with the fit Algorithm + F-Test on the 20 patients data, obtaining an R2 equal to 0.93, and a difference percentage of the results in a [− 26.9;12.8]% range, with a median value of 0.2%.

Fig. 6
figure 6

Comparison between the τ values obtained with ML1 + ML2 and Fit Algorithm + F-test. The red points represent the area under the curve of TACs classified as mono-exponential for the F-test while the blue points represent the area under the curve of TACs classified as bi-exponential for the F-test

Discussion

The obtained results show the feasibility of using machine learning system both to classify the proper model and predict the TIAc with an improvement of the performances.

The training phase plays a crucial role into obtaining results close to true values, and one of the major advantages shown by ML is the possibility for the model to be trained with error-free data and then tested with data simulating real curves, resulting in a high CA (about 98.5%) and a \(\mathrm{\Delta \tau }\) less than ± 25%. This aspect can be particularly advantageous in order to train the system without knowing the error of the measurement system “a priori”. Training with error-free data is not only advantageous but also recommended, as reported by Geron et al. [14]. The training phase should be conducted with the cleanest possible data, to avoid confusing the system and to prevent overfitting. To confirm this, training was performed with noisy data similar to the ones used in TrN, and a decrease in classification accuracy of around 5% was recorded.

The time spent for training is very short, and it needs to be executed just once before running classifications and regressions.

The main difference between ML and the analytical method is the workflow. The former predicts the model and then assesses the biokinetic parameters. Instead, the latter calculates the biokinetic parameters for both models first and then chooses the optimal one. This aspect is crucial: the analytical method is not optimized because it necessarily performs both fit models and, in addition, the choice of the model is strongly dependent on the algorithm fit and on the goodness of the TIAc.

The cross-validation results are better than the ones obtained in the test phase, but this was to be expected: in fact, TsN simulates the measurements errors (thanks to the addition of Gaussian noise), and so the results get worse without being associated with an overfitting condition. Figure 3 shows that CA is not dependent on the number of points for ML1, while it is highly dependent for AICc and F-test. The analytical methods show a low accuracy when the number of points is near to the limit of applicability (from Eqs. 3 and 4 it is possible to notice that F-test and AICc are available only if the number of points is at least 5 and 6 respectively). As reported by Kletting et al. in [19], F-test and AICc classify as mono-exponential most of the bi-exponential curves (CA = 61.0% and CA = 50.0%) if the number of points is 5 or 6 respectively. In addition, the CA obtained with ML1, equal to 98.5%, is higher than the results obtained with AICc and F-test.

Figure 4 shows the best results obtainable through the two systems, dedicated to the assessment of biokinetic parameters. Considering an error-free classification, the width of the \(\mathrm{\Delta \tau }\) distributions, obtained with ML2 and with the fit algorithm, are similar: hence, in most cases the two systems are equivalent. The MER obtained from ML2, represented by the red band in Fig. 4, does not exceed ± 15% and it is smaller than the one obtained using the fit algorithm (about ± 20% and represented by the blue band).

Glatting et al. in [4] use fitting algorithms that take into account uncertainties on individual points of the TAC. The fits used in the present study, on the other hand, do not consider such errors and, in future works, it will be necessary to perform a comparison. The use of these algorithms will likely lead to a convergence of performance between analytical methods and ML methods, but a priori knowledge of errors is not always possible. Therefore, “a priori”-trained ML systems remain a more general method for both model selection and obtaining kinetic parameters.

When the input to determine the biokinetic parameters are the classification results obtained through ML1, AICc, AICc-W and F-test, performances get worse for all the examined four systems (Fig. 5), but the ML1 + ML2 system keeps MER within ± 25%. The analytical methods tend to underestimate the area under the curve when the number of points is lower than 8. This effect is due to the tendency of the two methods to classify bi-exponential curves as mono-exponential.

The use of ML in this field can also lead to the possibility of using it in a more radical way. For example, an attempt was made to train the ML2 system to obtain the area under the curve directly as the output, instead of the fit parameters. The obtained results were similar to those shown in Figs. 4 and 5, but, according to the authors, this is an incorrect way of using ML systems: in fact they are already considered black boxes, and using the system in the aforementioned way, without the possibility to assure the goodness of the fit, can be challenging to justify in the clinical use of dosimetry.

It is important to underline that the performance of ML systems has proven to be fairly independent from the number of points composing the TACs, whereas a more pronounced dependence has been observed in relation to errors in the data. Some tests were conducted by varying the error on the points, resulting in a decrease in performance. These analyses were not reported in the results section because they go beyond the scope of this work, which is to demonstrate the feasibility of using ML systems in dosimetry in molecular radiotherapy, for model selection and the calculation of the area under the TAC curve.

Figure 6 shows the applicability of the ML systems on real data: it confirms that, when ML1 and F-test classify the curve with the same model, the assessed biokinetic parameters are very similar. On the other hand, when classification does not coincide, the distance between the two results increases. An example of these results is shown in Fig. 7. Considering CA for 5 points (shown in Fig. 3) and the graph in Fig. 5, there is a high probability that the correct classification is given by ML1, and consequently the ML2 result could better represent the biokinetic curve of the patient.

Fig. 7
figure 7

TACs of Wholebody FIA of three different patients and fit obtained with fit + F-test and ML1 + ML2 systems. a The two model agree classifying and fitting the data with a MEf. b Also in this case ML1 and F-test classify with a BEf. c In the third case the F-test classify the curve as MEf while it is a BEf for ML2. In this case the two obtained fit curves are significantly different

ML1 and ML2 have been trained with curves simulating the whole-body FIA of patients administered with [131I]-NaI, but can be trained to perform cumulated activity calculations also for different radionuclides in metabolic radiotherapy, and with different dosimetry calculation methods (i.e. voxel dosimetry).

In 2017 Sarrut et al. [20] showed that the selection of the proper fit model reduces the number of fit failures in voxel dosimetry (defined as the case in which the optimizer does not converge and reaches the maximum number of iterations, or if the R2 is lower than a certain threshold). The brief time to perform the 2000 test fits and the independence of the performance from the number of points make these systems suitable for this application.

It is necessary to emphasize that each new application (such as the study of new curve models) needs a new training, and that the performance is strongly dependent on the similarity between the training samples and the clinical case. In addition, in this study, Gaussian-type errors were used on the data because the purpose of the work was to demonstrate the feasibility of using ML systems. However, in future studies, it will be necessary to investigate how data errors affect performance using different thresholds and various types of distributions, such as the Poisson distribution.

This is a study on the feasibility of using the ML system for classifying the correct fit model and predicting the TIAc, and it doesn’t have the purpose to study which ML algorithm is the optimal one in performing these two tasks. The choice of the logistic regression and the AdaBoost algorithm is arbitrary, and a subsequent study is necessary to identify which algorithms could increase the performance results.

In conclusion, to the knowledge of the authors, this study is the first to propose the use of ML systems for TIAc calculations for dosimetry in MRT. As demonstrated, the use is feasible and promising, but it requires further investigations in different fields. Therefore, investigations will be performed in order to train such systems with different algorithms, treatments, radionuclides, curve models and for voxel dosimetry.

Availability of data and material

The data that support the findings of this study are available on request from the corresponding author.

References

  1. Dewaraja YK, Schipper MJ, Shen J, et al. Tumor-absorbed dose predicts progression-free survival following (131)I-tositumomab radioimmunotherapy. J Nucl Med. 2014;55(7):1047–53. https://doi.org/10.2967/jnumed.113.136044.

    Article  CAS  PubMed  Google Scholar 

  2. Bolch WE, Eckerman KF, Sgouros G, et al. MIRD pamphlet No. 21: a generalized schema for radiopharmaceutical dosimetry-standardization of nomenclature. J Nucl Med. 2009;50(3):477–84. https://doi.org/10.2967/jnumed.108.056036.

    Article  CAS  PubMed  Google Scholar 

  3. Siegel JA, Thomas SR, Stubbs JB, et al. MIRD pamphlet no. 16: Techniques for quantitative radiopharmaceutical biodistribution data acquisition and analysis for use in human radiation dose estimates. J Nucl Med. 1999;40(2):37S-61S.

    CAS  PubMed  Google Scholar 

  4. Glatting G, Kletting P, Reske SN, et al. Choosing the optimal fit function: comparison of the Akaike information criterion and the F-test. Med Phys. 2007;34(11):4285–92. https://doi.org/10.1118/1.2794176.

    Article  CAS  PubMed  Google Scholar 

  5. Akaike H. Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki F, editors. Second international symposium on inference theory. Budapest: Akademiai Kiado; 1973. p. 267–81.

    Google Scholar 

  6. Burnham KP, Anderson DR. Model selection and multimodel interference-a practical information-theoretic approach. New York: Springer; 2002.

    Google Scholar 

  7. Motulsky HJ, Christopoulos A. Fitting models to biological data using linear and nonlinear regression: a practical guide to curve fitting. Boston: GraphPad Software Inc; 2004.

    Book  Google Scholar 

  8. Hurvich CM, Tsai CL. Regression and time series model selection in small samples. Biometrika. 1989;76(2):297–307. https://doi.org/10.1093/biomet/76.2.297.

    Article  MathSciNet  Google Scholar 

  9. Kletting P, Schimmel S, Kestler HA, et al. Molecular radiotherapy: the NUKFIT software for calculating the time-integrated activity coefficient. Med Phys. 2013;40(10):102504. https://doi.org/10.1118/1.4820367.

    Article  CAS  PubMed  Google Scholar 

  10. Currie G, Hawk KE, Rohren E, et al. Machine learning and deep learning in medical imaging: intelligent imaging. J Med Imaging Radiat Sci. 2019;50(4):477–87. https://doi.org/10.1016/j.jmir.2019.09.005.

    Article  PubMed  Google Scholar 

  11. Uribe CF, Mathotaarachchi S, Gaudet V, et al. Machine learning in nuclear medicine: part 1-introduction. J Nucl Med. 2019;60(4):451–8. https://doi.org/10.2967/jnumed.118.223495.

    Article  PubMed  Google Scholar 

  12. Zukotynski K, Gaudet V, Uribe CF, et al. Machine learning in nuclear medicine: part 2-neural networks and clinical aspects. J Nucl Med. 2021;62(1):22–9. https://doi.org/10.2967/jnumed.119.231837.

    Article  PubMed  Google Scholar 

  13. Flux GD, Guy MJ, Beddows R, Pryor M, et al. Estimation and implications of random errors in whole-body dosimetry for targeted radionuclide therapy. Phys Med Biol. 2002;47(17):3211–23. https://doi.org/10.1088/0031-9155/47/17/311.

    Article  PubMed  Google Scholar 

  14. Géron A. Hands-on machine learning with scikit-learn, keras, and TensorFlow. Sebastopol: O’Reilly Media; 2019.

    Google Scholar 

  15. Lassmann M, Hänscheid H, Chiesa C, et al. EANM Dosimetry Committee. EANM Dosimetry Committee series on standard operational procedures for pre-therapeutic dosimetry I: blood and bone marrow dosimetry in differentiated thyroid cancer therapy. Eur J Nucl Med Mol Imaging. 2008;35(7):1405–12. https://doi.org/10.1007/s00259-008-0761-x.

    Article  PubMed  Google Scholar 

  16. Schapire RE. The strength of weak learnability. Mach Learn. 1990;5:197–227. https://doi.org/10.1007/BF00116037.

    Article  Google Scholar 

  17. https://scikit-learn.org/

  18. Kirkup L, Sutherland J. Curve stripping and nonlinear fitting of polyexponential functions to data using a microcomputer. Comput Phys. 1988;2:64. https://doi.org/10.1063/1.168313.

    Article  ADS  Google Scholar 

  19. Kletting P, Glatting G. Model selection for time-activity curves: the corrected Akaike information criterion and the F-test. Z Med Phys. 2009;19(3):200–6. https://doi.org/10.1016/j.zemedi.2009.05.003.

    Article  PubMed  Google Scholar 

  20. Sarrut D, Halty A, Badel JN, et al. Voxel-based multimodel fitting method for modeling time activity curves in SPECT images. Med Phys. 2017;44(12):6280–8. https://doi.org/10.1002/mp.12586.

    Article  PubMed  Google Scholar 

Download references

Acknowledgements

None.

Funding

This work was supported also by the Italian Ministry of Health with Current Research funds.

Author information

Authors and Affiliations

Authors

Contributions

(I) Conception and design: DC, BC and VC; (II) Patients data acquisition: CA, MFV, MP and MCG; (III) Generation Curve algorithm: DC, BC, AN, GC and LP; (IV) Programming in Python and MATLAB: DC, BC, AN, GC, LP and CP; (V) Synthetic Data Analysis: DC, BC, DS, FM, AN, CP, ESC, GC and LP; (VI) Patient Data Analysis: DC, BC, SD, CP, ESC, CA, MFV, MP, MCG and VC; (VII) Interpretation of data: DC, BC, SD, FM, AN, CP, ESC, GC and LP; (VIII) Manuscript writing: DC, BC and VC (IX) Final approval of manuscript: All authors. Large Language Models: No Large Language Models (LLMs), such as ChatGPT, has been used.

Corresponding author

Correspondence to Bartolomeo Cassano.

Ethics declarations

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Consent for publication

Informed consents for the publication of the study, the diagnostic and therapeutic procedure were obtained from the legally authorized representative.

Competing interests

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

Method for synthetic TAC-Ps generation

The datasets (TRN and TSN) used to train and test the systems consist of TAC-Ps, where each point composing them simulates the FIAWB of patients with metastatic differentiated thyroid carcinoma (mDTC), administered with [131I]I-NaI. Each point in the TAC-Ps simulates the result of the geometric mean between measurements acquired from the anterior and posterior positions, using an external probe located 4 m from the patient. The TIAc values for the two bands are reported in Table 1 and are obtained using the upper and lower bounds of the TACs calculated for 50 patients administered in our institution and subjected to dosimetric study.

To generate each of the TAC-Ps, the following method was used for both mono-exponential and bi-exponential trends. For BEf, four time points were selected at 0, 24, 90, and 160 h and, for each of these times, a range of possible values was determined by the band showed in Fig. 2. From each of these ranges, a random FIAWB value was extracted, resulting in 4 values corresponding to the 4 different times. These values were fitted with a bi-exponential curve, obtaining the parameters of the generated curve. Finally, a check is performed to ensure that the generated curve does not intersect the bands up to 500 h. If the check is successful, the curve parameters are recorded, and the TAC-Ps are created with the chosen number of points for either TrN or TsN. The procedure for TAC-Ps with a mono-exponential trend is identical, with the only difference being that two points are extracted at times 0 and 90 h.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ciucci, D., Cassano, B., Donatiello, S. et al. Fit of biokinetic data in molecular radiotherapy: a machine learning approach. EJNMMI Phys 11, 19 (2024). https://doi.org/10.1186/s40658-024-00623-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s40658-024-00623-5

Keywords