- Original research
- Open access
- Published:
Signal separation of simultaneous dual-tracer PET imaging based on global spatial information and channel attention
EJNMMI Physics volume 11, Article number: 47 (2024)
Abstract
Background
Simultaneous dual-tracer positron emission tomography (PET) imaging efficiently provides more complete information for disease diagnosis. Signal separation has long been a challenge of dual-tracer PET imaging. To predict the single-tracer images, we proposed a separation network based on global spatial information and channel attention, and connected it to FBP-Net to form the FBPnet-Sep model.
Results
Experiments using simulated dynamic PET data were conducted to: (1) compare the proposed FBPnet-Sep model to the Sep-FBPnet model and the existing Multi-task CNN, (2) verify the effectiveness of the modules incorporated in the FBPnet-Sep model, (3) investigate the generalization of the FBPnet-Sep model to low-dose data, and (4) investigate the application of the FBPnet-Sep model to multiple tracer combinations with decay corrections. Compared to the Sep-FBPnet model and Multi-task CNN, the FBPnet-Sep model reconstructed single-tracer images with higher structural similarity and peak signal-to-noise ratio and lower mean squared error, and reconstructed time-activity curves with lower bias and variation in most regions. Excluding the Inception or channel attention module resulted in degraded image quality. The FBPnet-Sep model showed acceptable performance when applied to low-dose data. Additionally, it could deal with multiple tracer combinations. The quality of the predicted images, as well as the accuracy of the derived time-activity curves and macro-parameters, was slightly improved by incorporating a decay correction module.
Conclusions
The proposed FBPnet-Sep model was considered a potential method for the reconstruction and signal separation of simultaneous dual-tracer PET imaging.
Introduction
Positron emission tomography (PET) is a molecular imaging technology that measures metabolic functions in vivo using radionuclide-labeled tracers. Benefiting from the development of various types of PET tracers, a growing number of biomarkers can be measured to aid the early detection and diagnosis of diseases [1,2,3].
For diseases with complex pathological characteristics, measurements of more than one biomarker are needed, e.g., amyloid-\(\beta\) and tau in Alzheimer’s disease [4]. While this is currently accomplished by performing separate PET scans with different tracers [5], it is natural to consider using multiple tracers in a single scan for efficiency. However, the difficulty of signal separation in multi-tracer PET imaging has been an obstacle to its application: all tracers emit 511-keV photons generated from positron-electron annihilation, making it impossible to distinguish different tracers during detection.
Signal separation, or reconstruction, of rapid dual-tracer PET imaging has been studied extensively. Apart from a small number of studies using special tracers that emit additional prompt \(\gamma\)-rays [6,7,8,9], most separation approaches make use of the distinct temporal characteristics of different tracers to separate the mixed dual-tracer signals, as reviewed in [10]. In that case, a dynamic PET scan is necessary.
Before the rise of deep learning, methods based on the difference in tracer half-lives were first proposed [11] and applied [12]. Although these methods established the foundation of using temporal characteristics to separate dual-tracer signals, they could only be applied once tracer concentrations reached equilibrium. Later methods were mainly based on the kinetic modeling of dual-tracer time-activity curves (TACs) by parallel compartment models [13,14,15,16,17,18,19,20]. After estimating the model parameters, the single-tracer TACs could be easily recovered. Other approaches free of compartment models included principal component analysis [21], generalized factor analysis [22], basis function fitting [23], spectral analysis [24] and recurrent extreme gradient boosting [25]. Nevertheless, these methods were limited by the requirement of arterial blood sampling or staggered injection, or could be sensitive to the tracer pair and the order of injection.
Deep learning methods do not require arterial blood sampling and can be applied to simultaneously injected tracers. These methods can be divided into two categories. One is the indirect reconstruction methods, which first reconstruct the dual-tracer images by traditional algorithms and then separate the images by neural networks [26,27,28,29,30,31,32]. The separation can be performed on either the voxel TACs [26,27,28,29,30] or the dynamic image as a whole [31, 32], with the latter utilizing spatial information in addition to temporal information.
The other category is the direct reconstruction methods, which reconstruct the single-tracer images from the dual-tracer sinogram, usually by convolutional neural networks (CNNs), such as FBP-CNN [33] and Multi-task CNN [34, 35]. The reconstruction part of FBP-CNN adopted a two-dimensional convolution layer to approximate the spatial filter and a fully-connected layer to approximate the back-projection of the traditional filtered back-projection (FBP) algorithm, while its separation part used three-dimensional (3D) convolution kernels to learn spatiotemporal features. The Multi-task CNN adopted an encoder-decoder framework instead of the reconstruction-separation framework and incorporated a multi-task learning mechanism; the network was composed entirely of 3D convolution and deconvolution layers. However, the performance of FBP-CNN was limited by its large network scale, mainly caused by the fully-connected layer. Although the Multi-task CNN contained far fewer parameters and was therefore easier to train, it was less interpretable. Considering these limitations, this study aimed to propose a reconstruction-separation neural network with a limited number of parameters. Besides, the 3D convolution or deconvolution kernels used in both methods process spatiotemporal information locally. To expand the receptive field, introducing global feature processing might be more effective than cascading layers with small kernels, which was also considered in the current study.
We first proposed a CNN-based model for the signal separation of simultaneous dual-tracer PET imaging. To extract and process global features, an Inception-like module and channel attention were applied to the spatial and temporal dimensions respectively. The Inception module extracts and concatenates features of different scales in parallel to expand the receptive field [36]. In its advanced version, 3\(\times\)3 kernels were replaced by a combination of 1\(\times\)3 and 3\(\times\)1 kernels to save memory [37]. Motivated by these works, we included an Inception-like module that extracts global spatial features by \(H\times\)1 and 1\(\times W\) kernels, with H and W representing the height and width of the images. The attention mechanism lets neural networks automatically learn to assign different weights to features, i.e., to focus on and select important information, and can be applied to different dimensions [38,39,40,41]. Inspired by the Squeeze-and-Excitation Networks [39], we used channel attention for the time dimension of dynamic PET data.
The proposed separation network was then cascaded with FBP-Net [42], a deep learning implementation of the FBP algorithm, generating a direct reconstruction model named FBPnet-Sep. The FBPnet-Sep model was verified by simulated dynamic PET data, and compared to Multi-task CNN [34]. Moreover, experiments were performed to verify the superiority of image separation over sinogram separation, the effectiveness of using global spatial information and channel attention, as well as the application to low-dose data and to different tracer combinations, which will be illustrated in the following sections.
Methods
Model of simultaneous dual-tracer PET imaging
The simultaneous dual-tracer PET imaging is modeled as follows:

\(S_{dual}(t)=G\,I_1(t)+G\,I_2(t)+r_1(t)+r_2(t)\)

\(S_{dual}(t)\) is the dual-tracer sinogram at time t. \(I_1(t)\) and \(I_2(t)\) are the activity images of Tracer 1 and Tracer 2. G is the system matrix. \(r_1(t)\) and \(r_2(t)\) are random coincidence events. Scatter coincidence events are not considered in this study.
The value of the j-th pixel in a dynamic activity image I depends on regional physiological parameters, the activity in the blood, and the radioactive decay of the tracer:

\(I^{(j)}(t)=C_T\big(t;\varvec{k}^{(j)},\varvec{C_P}\big)\cdot 2^{-t/T}\)

\(C_T\) and \(\varvec{C_P}\) represent the undecayed radioactivity concentrations in tissue and plasma. \(\varvec{k}^{(j)}\) denotes the physiology-related parameters in the region corresponding to the j-th pixel. T is the half-life of radioactive decay.
Network architecture
In this study, we proposed a novel CNN for signal separation of simultaneous dual-tracer PET imaging, incorporating the use of global spatial information and attention mechanism. The structure of the separation network is displayed in Fig. 1a.
The input of the network is the dynamic dual-tracer image, which is in the shape of \(N_b\times C\times H\times W\), where \(N_b\) is the batch size, C is the number of channels (i.e., frames), H and W are the height and width of the image.
The network is made up of several basic convolution modules, an Inception-like module, and a channel attention module. The basic convolution module consists of a convolution layer, a batch normalization layer, and a rectified linear unit. The Inception-like module includes two basic convolution modules, which use \(H\times\)1 and 1\(\times W\) kernels in parallel to extract features by rows and columns; the features are concatenated along the channel dimension. Although it differs from the original Inception, which used small kernels [37], it is still referred to as Inception in the current study. The channel attention module is composed of max pooling, average pooling, fully-connected layers with 2C neurons, and a sigmoid activation function. Finally, the output is split along the channel dimension to obtain two single-tracer dynamic images.
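As a concrete illustration, the two modules described above can be sketched in PyTorch roughly as follows. The layer widths, padding scheme and activation placement are illustrative assumptions; the exact hyperparameters are not published here.

```python
import torch
import torch.nn as nn

class InceptionLike(nn.Module):
    """Parallel H x 1 and 1 x W convolutions extract global row/column features."""
    def __init__(self, channels, height, width):
        super().__init__()
        # 'same' padding keeps the spatial size unchanged (stride 1)
        self.row_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=(height, 1), padding='same'),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True))
        self.col_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=(1, width), padding='same'),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True))

    def forward(self, x):
        # concatenate row-wise and column-wise features along the channel axis
        return torch.cat([self.row_branch(x), self.col_branch(x)], dim=1)

class ChannelAttention(nn.Module):
    """Reweights the C frames using pooled global descriptors (SE/CBAM-style)."""
    def __init__(self, channels):
        super().__init__()
        # hidden layer with 2C neurons, as described in the text
        self.mlp = nn.Sequential(
            nn.Linear(channels, 2 * channels),
            nn.ReLU(inplace=True),
            nn.Linear(2 * channels, channels))

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w                         # per-frame weights in (0, 1)
```

Concatenating the two Inception branches doubles the channel count, so a subsequent convolution would typically restore the original number of frames before the channel-wise split.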
We further cascaded the separation network with FBP-Net to form the FBPnet-Sep model (Fig. 1b). This model first reconstructs and denoises the dual-tracer image by FBP-Net, then separates the image by the separation network in Fig. 1a. The FBP-Net adopts a learnable filter in the frequency domain and maps the filtered sinogram to the image by traditional back-projection without learnable parameters. Following the reconstruction, it denoises the images by a residual CNN. Details of FBP-Net can be found in the previous work of Wang and Liu [42]. For comparison, the Sep-FBPnet model (Fig. 1c) was also proposed, in which sinogram separation is conducted before image reconstruction.
For the application to multiple tracer combinations, especially when the two tracers have different half-lives, we improved the FBPnet-Sep model by adding decay corrections before the separation part, forming the FBPnet-DC-Sep model (Fig. 1d). As shown in Fig. 1d, the reconstructed dual-tracer image is decay-corrected according to the half-lives of the two tracers respectively. The two decay-corrected images, together with the uncorrected image, are input to the separation network and processed by different Inception modules.
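The frame-wise decay correction itself is simple. A minimal NumPy sketch, assuming frame mid-times are used as the correction time points (a common convention; the paper does not state this detail):

```python
import numpy as np

def decay_correct(dynamic_image, frame_mid_times, half_life):
    """Undo physical decay frame by frame: multiply the frame at time t by 2^(t/T).

    dynamic_image: array of shape (frames, H, W).
    frame_mid_times: frame reference times, same unit as half_life.
    In FBPnet-DC-Sep this is applied twice, once per candidate half-life,
    and both corrected images are fed to the separation network.
    """
    factors = 2.0 ** (np.asarray(frame_mid_times, dtype=float) / half_life)
    return dynamic_image * factors[:, None, None]
```

When the two tracers have different half-lives, only one of the two corrections can be exact for each tracer, which is why the uncorrected image is also kept as an input.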
Loss functions
The loss function of a single output is a weighted summation of the mean squared error (MSE) and the structural similarity (SSIM) index [43] between the output and the label:

where \({\hat{x}}\) is the predicted dynamic output and x is the label. The coefficient \(\beta\) balances the MSE and SSIM terms.
The total loss of the entire model is composed of the loss of the FBP-Net and the loss of the separation part:

\(L_{total}=\lambda _{FBP}L_{FBP}+\lambda _{Sep}L_{Sep}\)
\(\lambda _{FBP}\) and \(\lambda _{Sep}\) are the weights of the two losses. The above-mentioned parameter \(\beta\) takes different values in the reconstruction part (\(\beta _{FBP}\)) and the separation part (\(\beta _{Sep}\)).
For the FBPnet-Sep (Fig. 1b) and FBPnet-DC-Sep (Fig. 1d) models, \(L_{FBP}\) serves as the auxiliary loss and \(L_{Sep}\) as the main loss, which are formulated as:

\(L_{FBP}=L({\hat{I}}_{dual},I_{dual}),\quad L_{Sep}=L({\hat{I}}_1,I_1)+L({\hat{I}}_2,I_2)\)
\(I_{dual}\), \(I_1\) and \(I_2\) are images of dual-tracer mixture, Tracer 1 and Tracer 2 respectively. Here, the coefficients in loss function are set as: \(\beta _{FBP}=0.5\), \(\beta _{Sep}=0.95\), \(\lambda _{FBP}=1\), \(\lambda _{Sep}=10\).
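The loss design can be sketched in PyTorch as follows. Two points are illustrative assumptions rather than the paper's exact formulation: the per-output weighting convention (here \(\beta\) is placed on the MSE term), and the use of a simplified global-statistics SSIM in place of the standard windowed SSIM.

```python
import torch

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified global-statistics SSIM: one mean/variance/covariance over the
    # whole tensor, an illustrative stand-in for the windowed SSIM of [43].
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def single_output_loss(pred, label, beta):
    # Assumed convention: beta weights MSE against an SSIM dissimilarity term.
    mse = torch.mean((pred - label) ** 2)
    return beta * mse + (1 - beta) * (1 - ssim_global(pred, label))

def total_loss(I_dual_hat, I_dual, I1_hat, I1, I2_hat, I2,
               beta_fbp=0.5, beta_sep=0.95, lam_fbp=1.0, lam_sep=10.0):
    # FBPnet-Sep setting: L_FBP is the auxiliary loss, L_Sep the main loss.
    L_fbp = single_output_loss(I_dual_hat, I_dual, beta_fbp)
    L_sep = (single_output_loss(I1_hat, I1, beta_sep)
             + single_output_loss(I2_hat, I2, beta_sep))
    return lam_fbp * L_fbp + lam_sep * L_sep
```

With identical predictions and labels both terms vanish, so the sketch is at least consistent with the stated roles of the coefficients.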
As for the Sep-FBPnet model (Fig. 1c), the \(L_{Sep}\) serves as the auxiliary loss and \(L_{FBP}\) as the main loss, which are formulated as:
\(S_1\) and \(S_2\) are sinograms of Tracer 1 and Tracer 2. Here, the coefficients are set as: \(\beta _{FBP}=0.5\), \(\beta _{Sep}=0.99\), \(\lambda _{FBP}=1\), \(\lambda _{Sep}=1\).
Experiments
Experimental settings
In this study, we conducted four experiments to verify the capability of the proposed method on multiple simulated datasets. The experimental settings and tracer combinations are listed in Tables 1 and 2.
In Experiment 1, we compared the FBPnet-Sep model, which separates in the image domain, with two comparison methods: the Sep-FBPnet model, which separates in the sinogram domain, and our previously proposed Multi-task CNN, which directly maps the dual-tracer sinogram to two single-tracer images. The FBPnet-Sep model, Sep-FBPnet model and Multi-task CNN contain 0.64 M, 0.66 M and 1.31 M parameters respectively. All three models were trained and tested on simulated \(^{18}\)F-FDG/\(^{11}\)C-FMZ dynamic PET data. As shown in Table 2, \(^{18}\)F-FDG and \(^{11}\)C-FMZ have different half-lives and different types of kinetic characteristics [44]. Single-tracer images inferred by these three methods were also compared to those reconstructed from noisy single-tracer sinograms by maximum likelihood expectation maximization (MLEM) with 50 iterations.
In Experiment 2, we conducted an ablation study of the FBPnet-Sep model on the same \(^{18}\)F-FDG/\(^{11}\)C-FMZ dataset. The proposed FBPnet-Sep model included basic convolution modules (Conv), the Inception module (Inc) and the channel attention module (CA) in its separation part. To investigate the contribution of the Inception and channel attention modules, we compared the original FBPnet-Sep(Conv+Inc+CA) model with the FBPnet-Sep(Conv), FBPnet-Sep(Conv+Inc) and FBPnet-Sep(Conv+CA) models. In addition, to confirm the superiority of deep learning-based reconstruction over traditional iterative reconstruction, we further compared the FBPnet-Sep(Conv+Inc+CA) model with the OSEM-Sep(Conv+Inc+CA) approach, which reconstructed the dual-tracer image by the ordered-subset expectation maximization (OSEM) algorithm [45] with 6 iterations and 5 subsets.
In Experiment 3, we investigated the generalization of the FBPnet-Sep model to several dose levels. The model trained in Experiment 1 was tested by \(^{18}\)F-FDG/\(^{11}\)C-FMZ data of 1/2, 1/3 and 1/5 standard dose. The low-dose PET data were obtained by event reduction of test data used in Experiment 1.
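The paper obtains its low-dose test data by event reduction. One standard way to implement this is binomial thinning, which preserves Poisson noise statistics: keeping each event with probability p turns a Poisson(\(\lambda\)) count into Poisson(\(p\lambda\)), i.e., the statistics of a genuinely lower-dose scan. A sketch, assuming the sinograms store integer event counts:

```python
import numpy as np

def reduce_dose(sinogram_counts, fraction, rng=None):
    """Simulate a lower injected dose by binomial thinning of detected events.

    sinogram_counts: integer-valued array of detected events per bin.
    fraction: retained dose fraction, e.g. 0.5 for half dose.
    """
    rng = rng or np.random.default_rng(0)
    # each event is kept independently with probability `fraction`
    return rng.binomial(sinogram_counts.astype(np.int64), fraction)
```

This is one plausible implementation of the stated "event reduction"; the paper does not specify its exact procedure.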
In Experiment 4, data from four tracer combinations, representing different relationships between the two tracers, were used together to train the models. As listed in Table 2, \(^{11}\)C-FMZ differs from \(^{18}\)F-FDG in both half-life and kinetic type. \(^{11}\)C-MET differs in half-life from \(^{18}\)F-FDG, while \(^{18}\)F-AV45 differs in kinetic type. \(^{18}\)F-FLT has the same half-life and kinetic type as \(^{18}\)F-FDG. To deal with multiple tracer combinations, we improved the original FBPnet-Sep model by taking decay correction into account, obtaining the FBPnet-DC-Sep model. The two models were tested and compared.
Data simulation
Phantoms
The two-dimensional brain phantoms used for data simulation were modified from the 3D Zubal brain phantom [46]. We chose 40 slices with different structures. The 40 phantoms are sized 128 pixels \(\times\) 128 pixels, and each contains up to five regions of interest (ROIs). The average sizes of ROI 1 to ROI 5 are 1393, 1427, 78, 110 and 130 pixels. Representative phantoms are shown in Fig. 2.
Generation of dynamic activity images
The dynamic PET images were generated based on the two-tissue compartment model [44], which is a widely used kinetic model of PET tracers. It is described as:

\(\frac{dC_1(t)}{dt}=K_1C_P(t)-(k_2+k_3)C_1(t)+k_4C_2(t)\)

\(\frac{dC_2(t)}{dt}=k_3C_1(t)-k_4C_2(t)\)

\(C_T(t)=C_1(t)+C_2(t)\)

\(C_P(t)\) is the plasma TAC, i.e., the plasma input function. \(C_T(t)\) is the tissue TAC, which is the summation of the TACs of the two compartments, \(C_1(t)\) and \(C_2(t)\). The two compartments represent different metabolic states of the tracer in tissue. The rates of exchange between compartments (including the blood vessel) depend on the rate constants \(K_1\), \(k_2\), \(k_3\) and \(k_4\), also known as kinetic parameters. The values of these parameters are related to the pharmacokinetics of the tracer and the physiological characteristics of its molecular target. When \(k_3\gg k_4\), the transfer from Compartment 1 to Compartment 2 is regarded as irreversible, and the tracer is therefore considered to have irreversible kinetics. Otherwise, the binding between the tracer and its target is reversible.
The Compartment Model Kinetic Analysis Tool (COMKAT) [47] provides numerical solutions of compartment models. Before solving Eqs. (9)–(11), the scanning protocol, input function and kinetic parameters of each pixel were determined. As an example, the \(^{18}\)F-FDG dynamic scan was designed with a duration of 60 min divided into 26 frames (15 s \(\times\) 8, 60 s \(\times\) 8, 300 s \(\times\) 10). The input function of \(^{18}\)F-FDG is modeled as:

\(C_P(t)=(A_1t-A_2-A_3)e^{\lambda _1t}+A_2e^{\lambda _2t}+A_3e^{\lambda _3t}\)
According to experimental data of human subjects [48], \(A_1=851.1 \mu Ci/mL\), \(A_2=21.88 \mu Ci/mL\), \(A_3=20.81 \mu Ci/mL\), \(\lambda _1=-4.134 min^{-1}\), \(\lambda _2=-0.1191 min^{-1}\), \(\lambda _3=-0.01043 min^{-1}\). To mimic individual differences in physiological states, Gaussian randomization was applied to the parameters of the input function as well as the kinetic parameters of each ROI. Mean values of the parameters were taken from previous studies [18, 48], and the standard deviations were set to 10% of the mean values. In total, 22 groups of physiological parameters were simulated. The output of the compartment model was computed by COMKAT for each pixel to form the dynamic activity image of \(^{18}\)F-FDG.
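A single noise-free tissue TAC of this kind can be reproduced with a generic ODE solver instead of COMKAT. The sketch below assumes a Feng-type input function consistent with the parameter values quoted above, and uses illustrative kinetic parameters (not the paper's per-ROI values):

```python
import numpy as np
from scipy.integrate import odeint

def feng_input(t, A1=851.1, A2=21.88, A3=20.81,
               l1=-4.134, l2=-0.1191, l3=-0.01043):
    """Feng-type plasma input function; t in minutes, output in uCi/mL."""
    return ((A1 * t - A2 - A3) * np.exp(l1 * t)
            + A2 * np.exp(l2 * t) + A3 * np.exp(l3 * t))

def two_tissue_tac(times, K1, k2, k3, k4):
    """Solve the two-tissue compartment ODEs and return C_T = C_1 + C_2."""
    def rhs(c, t):
        c1, c2 = c
        cp = feng_input(t)
        return [K1 * cp - (k2 + k3) * c1 + k4 * c2,  # dC1/dt
                k3 * c1 - k4 * c2]                   # dC2/dt
    c = odeint(rhs, [0.0, 0.0], times)
    return c.sum(axis=1)

times = np.linspace(0.0, 60.0, 241)  # 60-min scan, 0.25-min grid
tac = two_tissue_tac(times, K1=0.1, k2=0.1, k3=0.05, k4=0.001)
```

In the paper's pipeline, such undecayed tissue curves are additionally multiplied by the tracer's decay factor and integrated over the frame durations to form the dynamic images.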
Images of \(^{11}\)C-FMZ, \(^{11}\)C-MET, \(^{18}\)F-AV45 and \(^{18}\)F-FLT were generated in the same way, following the same 60-min, 26-frame protocol. The input functions and the empirical values of the kinetic parameters differed among tracers and were determined according to the corresponding studies [18, 48,49,50,51,52]. The dual-tracer activity images were obtained by adding the two single-tracer images. All images were sized 128 pixels \(\times\) 128 pixels \(\times\) 26 frames.
Generation of dynamic sinograms
The dynamic single-tracer sinograms were obtained by projecting the single-tracer activity images using the Michigan Image Reconstruction Toolbox [53]. The geometry of the Inveon PET/CT scanner (Siemens) was simulated. Subsequently, 20% random coincidence events were added to the projections, and Poisson noise was simulated. The dual-tracer sinograms were the summation of the two single-tracer sinograms. All sinograms were sized 128 bins \(\times\) 160 angles \(\times\) 26 frames.
Data preprocessing
For each tracer combination, a total of 880 groups (22 sets of physiological parameters \(\times\) 40 phantoms) of dynamic PET data were generated. Each group of matched dynamic data consisted of the noisy dual-tracer sinogram (\(S_{dual}\)), two noisy single-tracer sinograms (\(S_1\), \(S_2\)), noise-free dual-tracer activity image (\(I_{dual}\)) and two noise-free single-tracer activity images (\(I_1\), \(I_2\)). In each group, sinograms were scaled by dividing max(\(S_{dual}\))/3, and images were scaled by dividing max(\(S_{dual}\))/150.
Training details
The 880 groups of data were randomly split into training and test datasets at a ratio of 10:1 according to parameter sets. In other words, all 40 phantom slices were included in both datasets, while the physiological parameters were not repeated. The sample sizes are listed in Table 1. Note that in the training of the FBPnet-DC-Sep model in Experiment 4, the data were augmented by including the \(^{11}\)C-FMZ/\(^{18}\)F-FDG and \(^{11}\)C-MET/\(^{18}\)F-FDG combinations, obtained by switching the two tracers of the \(^{18}\)F-FDG/\(^{11}\)C-FMZ and \(^{18}\)F-FDG/\(^{11}\)C-MET combinations.
Network training was performed on PyTorch 1.11 with an NVIDIA TITAN RTX graphics card. In the training session of each experiment, the FBP-Net and the separation network were pretrained for 300 and 100 epochs respectively, using the training dataset listed in Table 1, both with a batch size of 4 and a learning rate of 0.0001. The entire models were then trained for 100 epochs from the pretrained network parameters on the same dataset, with a batch size of 16 and a learning rate of 0.0001. No early stopping or regularization was used during training.
Evaluative metrics
The performance of different methods was quantitatively evaluated by MSE, SSIM and peak signal-to-noise ratio (PSNR) between the predictions and labels. In Experiment 1, bias and standard deviation of ROI TACs and images were also evaluated.
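These image-quality metrics are standard; for reference, MSE and PSNR can be computed as below. Taking the label's maximum as the peak value is one common convention for PET images and is an assumption here (the paper does not state its choice of peak):

```python
import numpy as np

def mse(pred, label):
    """Mean squared error between a predicted image and its label."""
    return np.mean((pred - label) ** 2)

def psnr(pred, label):
    """Peak signal-to-noise ratio in dB, with the label maximum as the peak."""
    peak = label.max()
    return 10.0 * np.log10(peak ** 2 / mse(pred, label))
```

SSIM additionally compares local luminance, contrast and structure, and is typically computed with a dedicated windowed implementation rather than global statistics.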
In Experiment 4, we also estimated macro-parameters from the predicted ROI TACs and compared them to those derived from the label TACs. For irreversible tracers (\(^{18}\)F-FDG, \(^{11}\)C-MET, \(^{18}\)F-FLT), the net uptake rate \(K_i\) was estimated by the Patlak plot [54]. For reversible tracers (\(^{11}\)C-FMZ, \(^{18}\)F-AV45), the total distribution volume \(V_T\) was estimated by the Logan plot [54]. The parameters were computed in COMKAT using data of the last 10 frames, and the relative errors of the estimated macro-parameters were calculated.
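The Patlak estimate of \(K_i\) reduces to a linear fit in transformed coordinates: beyond the equilibration time, \(C_T(t)/C_P(t)\) is linear in the "stretched time" \(\int_0^t C_P(\tau)d\tau / C_P(t)\), with slope \(K_i\). A minimal NumPy sketch, assuming sampled TACs and a user-chosen start frame (COMKAT's implementation details differ):

```python
import numpy as np

def patlak_ki(times, tissue_tac, plasma_tac, start_frame):
    """Estimate the net uptake rate Ki from a tissue and plasma TAC."""
    # cumulative integral of the plasma TAC (trapezoidal rule)
    cum_cp = np.concatenate([[0.0], np.cumsum(
        0.5 * (plasma_tac[1:] + plasma_tac[:-1]) * np.diff(times))])
    x = (cum_cp / plasma_tac)[start_frame:]     # stretched time
    y = (tissue_tac / plasma_tac)[start_frame:] # normalized tissue activity
    slope, _intercept = np.polyfit(x, y, 1)     # slope of the linear segment
    return slope
```

The Logan plot for reversible tracers is analogous, with a different coordinate transform whose slope gives \(V_T\).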
Results
Experiment 1
Figure 3 displays the single-tracer images reconstructed by MLEM, the FBPnet-Sep model, the Sep-FBPnet model and Multi-task CNN, among which the results of the proposed FBPnet-Sep model show the clearest details at the boundaries of ROIs.
Table 3 lists the mean values of the metrics. For all methods, the predictions of \(^{18}\)F-FDG images are more accurate than those of \(^{11}\)C-FMZ images. For both tracers, the FBPnet-Sep model has the highest SSIM and PSNR and the lowest MSE compared to the other methods. Since noise-free single-tracer images were used as labels, all three deep learning methods show superior performance to MLEM; MLEM was therefore not included in the following comparisons. Figure 4 displays the frame-wise metrics. For both \(^{18}\)F-FDG and \(^{11}\)C-FMZ, the superiority of the FBPnet-Sep model lasts throughout the scan.
Table 4 lists the mean bias and standard deviations of the ROI TACs predicted by the different models. For \(^{18}\)F-FDG images, the FBPnet-Sep model shows the lowest bias in all ROIs except ROI 1, while the other two models show high bias in the small ROIs (ROI 3–5). The FBPnet-Sep model reconstructs images with the lowest variation in ROI 1, 2 and 4. As for \(^{11}\)C-FMZ images, the proposed FBPnet-Sep model shows the lowest bias in ROI 1, 2 and 3, but higher bias in ROI 4 and 5; the other two methods are highly biased in most ROIs. The FBPnet-Sep model also has the lowest variation in all ROIs. Figure 5 plots the mean bias and standard deviation of images predicted by these methods, among which the FBPnet-Sep method reconstructs images with the lowest bias and standard deviation.
Experiment 2
Figure 6 displays representative single-tracer images predicted by the FBPnet-Sep model and its ablation models. The full FBPnet-Sep(Conv+Inc+CA) model reconstructs images with the best quality, followed by FBPnet-Sep(Conv+Inc) and FBPnet-Sep(Conv+CA), while the basic FBPnet-Sep(Conv) model reconstructs images with poorer quality. Images predicted by the OSEM-Sep(Conv+Inc+CA) approach are severely blurred.
Quantitative evaluations are listed in Table 5. For all models, the predicted \(^{18}\)F-FDG images have higher SSIM and PSNR than the \(^{11}\)C-FMZ images, and lower MSE except for the FBPnet-Sep(Conv) model. For \(^{18}\)F-FDG, the FBPnet-Sep(Conv+Inc+CA) model shows the best performance, consistent with the observations in Fig. 6. For \(^{11}\)C-FMZ, the FBPnet-Sep(Conv+Inc+CA) and FBPnet-Sep(Conv+Inc) models show better performance than the other models.
Experiment 3
Table 6 records the quantitative evaluations of the predicted images under different dose levels. The counts at standard dose were around \(10^7\). Consistent with the former experiments, the reconstructed \(^{18}\)F-FDG images are more accurate than the \(^{11}\)C-FMZ images. Although image quality degrades slightly with decreasing dose, the reconstructions of both tracers remain acceptable under low-dose conditions.
Experiment 4
Table 7 displays the metrics of the predicted images for the four tracer combinations. Both the FBPnet-Sep and FBPnet-DC-Sep models were capable of dealing with multiple tracer combinations, and in most cases the FBPnet-DC-Sep model exceeded the FBPnet-Sep model. Figure 7 further plots the metrics of the FBPnet-DC-Sep predictions. In each tracer combination, the image quality of the \(^{18}\)F-FDG predictions was better than that of Tracer 2, except for the PSNR of the \(^{18}\)F-AV45 images. Among the four tracers used as Tracer 2, \(^{18}\)F-FLT had the lowest SSIM and PSNR and the highest MSE.
Figure 8 plots the representative ROI TACs extracted from the predicted images. ROI 1 and ROI 5 were chosen to represent large ROIs and small ROIs respectively. In all tracer combinations, the TACs predicted by FBPnet-DC-Sep model fitted better to the ground truth than TACs predicted by FBPnet-Sep model.
Macro-parameters of all tracers and all ROIs were estimated using the TACs derived from the label images and the predicted images. Figure 9 plots the average parameters of the test dataset. For \(^{18}\)F-FDG, the \(K_i\) derived from the predicted images were close to the ground truth, with no obvious difference between the FBPnet-DC-Sep and FBPnet-Sep models; similar results were found for the \(K_i\) of \(^{18}\)F-FLT. In the subplots of \(^{11}\)C-FMZ and \(^{18}\)F-AV45, the \(V_T\) derived from the FBPnet-DC-Sep model were more accurate than those from the FBPnet-Sep model. However, the \(^{11}\)C-MET \(K_i\) derived from the FBPnet-DC-Sep model were more biased. Table 8 lists the average relative errors of the estimated macro-parameters. The results were generally consistent with Fig. 9, showing that the FBPnet-DC-Sep model was comparable to or better than the FBPnet-Sep model for all tracers but \(^{11}\)C-MET. However, the accuracy of the macro-parameters was sensitive to tracers and ROIs.
Discussion
In this study, we proposed a CNN-based approach for the separation of simultaneously injected dual-tracer PET imaging. The network incorporated the Inception module and channel attention module. The Inception module used \(H\times\)1 and 1\(\times W\) kernels to extract global spatial features. When applied to sinograms, it extracted features by projection angles and distance bins off the center of view. When applied to images, it extracted features by rows and columns. And the attention vector learned by channel attention could also be regarded as a global filter in time dimension. The separation network was connected to FBP-Net to predict single-tracer images from dual-tracer sinogram.
In Experiment 1, the proposed FBPnet-Sep model was first compared with Multi-task CNN. The Multi-task CNN had an encoder-decoder framework, while the FBPnet-Sep model adopted a more interpretable reconstruction-separation framework. The results in Figs. 3, 4, 5 and Tables 3 and 4 showed that images predicted by the FBPnet-Sep model had better quality than those predicted by Multi-task CNN, which might be attributed to using Inception and channel attention to process information globally rather than relying entirely on small convolution kernels.
Additionally, we compared image separation (FBPnet-Sep model) with sinogram separation (Sep-FBPnet model). The results showed that the FBPnet-Sep model performed better than the Sep-FBPnet model, which might be due to differences in the tasks and data distributions. In the Sep-FBPnet model, the dual-tracer sinogram \(S_{dual}\) was separated to obtain the single-tracer sinograms \(S_1\) and \(S_2\), both containing noise, and separating the random noise was beyond the capability of the model. Apart from that, \(S_1\) and \(S_2\) came from different distributions; when they were simultaneously fed into the FBP-Net, the distribution of the input became decentralized, making reconstruction more difficult. Conversely, in the FBPnet-Sep model, the FBP-Net reconstructed the dual-tracer image \(I_{dual}\) from \(S_{dual}\), whose distribution was relatively centralized. Besides, the reconstructed \(I_{dual}\) had already been denoised, making it easier for the separation network to process.
In Experiment 2, the ablation study of the FBPnet-Sep model was conducted. We first studied on the separation part. From the results in Fig. 6 and Table 5, adding Inception module or channel attention into the basic convolutional separation network improved the model performance, and incorporating both modules achieved best performance. These results demonstrated the effectiveness of the Inception and channel attention modules. In addition, we also compared the FBPnet-Sep and OSEM-Sep approaches. The obvious superiority of the former indicated the necessity of deep learning-based reconstruction instead of traditional reconstruction algorithm.
Radiation safety and the restriction on injected doses are among the concerns in dual-tracer PET imaging. In Experiment 3, we investigated the generalization of the FBPnet-Sep model to low-dose data. As shown in Table 6, the FBPnet-Sep model could directly generalize to data of 1/2, 1/3 and 1/5 standard dose without retraining.
According to Eq. (2), temporal changes in radioactivity concentrations depend on both the physical radioactive decay of the tracer, characterized by its half-life, and the physiological process the tracer is involved in, characterized by the kinetic parameters. In single-tracer PET imaging, decay correction is performed during reconstruction. In dual-tracer PET imaging, however, decay correction cannot be conducted when the two tracers have different half-lives. In that case, deep learning methods for dual-tracer PET reconstruction need to learn the hidden information about radioactive decay in addition to the kinetic characteristics.
In Experiment 4, we investigated the capability of the proposed method to deal with multiple tracer combinations. To cope with tracer pairs having different half-lives, we also proposed and tested the FBPnet-DC-Sep model. As displayed in Table 7 and Fig. 7, both methods could well predict single-tracer images of multiple tracer combinations, with the FBPnet-DC-Sep model showing better performance in most cases. According to Figs. 8, 9 and Table 8, both methods could derive ROI TACs and macro-parameters with acceptable accuracy, with the FBPnet-DC-Sep performing slightly better. These results indicated the effectiveness of the decay correction module. Although the corrections were biased, the processed images provided hints of decay information, which might make it easier for the separation network to extract useful features.
The current study still has several limitations. As shown in the results, even though the method could separate the two tracers, the derived TACs and macro-parameters were not accurate enough, which limits the application of dual-tracer PET imaging in further quantitative analysis. In addition, due to the lack of data from real simultaneous dual-tracer PET scans, the proposed method was verified only on simulated data. The scarcity of training data from real PET scans also restricts the current study, as well as other deep learning-based separation methods, to separating images slice by slice. Another limitation of the proposed method is that it cannot be generalized to other phantoms, as shown in the supplementary material.
The acquisition and utilization of training data from real dual-tracer PET scans face many challenges, such as the shortage of paired data, image misalignment and image noise. In future studies, incorporating the kinetic model into deep learning methods for dual-tracer reconstruction and separation should be considered. A hybrid of data-driven and model-driven methods could increase network interpretability, reduce the need for training data and be less sensitive to noise. It could also be readily applied to training with ROI TACs, to improve the accuracy of recovered ROI TACs and macro-parameters. Furthermore, the generalization of the reconstruction methods to data from new distributions should also be investigated.
Conclusions
We proposed a CNN-based model incorporating global spatial information and channel attention for the signal separation of dual-tracer PET. The separation network was extended to the FBPnet-Sep model, which predicts single-tracer images from the dual-tracer sinogram. The FBPnet-Sep model was shown to be superior to the previously proposed Multi-task CNN, and the effectiveness of the Inception and channel attention modules was verified. Moreover, the FBPnet-Sep model could be applied to low-dose data and to multiple tracer combinations. Therefore, the FBPnet-Sep model can be considered a potential method for dual-tracer PET reconstruction.
Availability of data and materials
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- PET: Positron emission tomography
- TAC: Time-activity curve
- CNN: Convolutional neural network
- FBP: Filtered back-projection
- MSE: Mean squared error
- SSIM: Structural similarity
- MLEM: Maximum likelihood expectation maximization
- OSEM: Ordered-subset expectation maximization
- ROI: Region of interest
- PSNR: Peak signal-to-noise ratio
References
Mertens K, Slaets D, Lambert B, Acou M, De Vos F, Goethals I. PET with 18F-labelled choline-based tracers for tumour imaging: a review of the literature. Eur J Nucl Med Mol Imaging. 2010;37(11):2188–93. https://doi.org/10.1007/s00259-010-1496-z.
de Zwart PL, van Dijken BRJ, Holtman GA, Stormezand GN, Dierckx RAJO, van Laar PJ, et al. Diagnostic accuracy of PET tracers for the differentiation of tumor progression from treatment-related changes in high-grade glioma: a systematic review and meta-analysis. J Nucl Med. 2020;61(4):498–504. https://doi.org/10.2967/jnumed.119.233809.
Okamura N, Harada R, Ishiki A, Kikuchi A, Nakamura T, Kudo Y. The development and validation of tau PET tracers: current status and future directions. Clin Transl Imaging. 2018;6(4):305–16. https://doi.org/10.1007/s40336-018-0290-y.
Firouzian A, Whittington A, Searle GE, Koychev I, Zamboni G, Lovestone S, et al. Imaging Aβ and tau in early stage Alzheimer’s disease with [18F]AV45 and [18F]AV1451. EJNMMI Res. 2018;8(1):19. https://doi.org/10.1186/s13550-018-0371-y.
Michalski K, Ruf J, Goetz C, Seitz AK, Buck AK, Lapa C, et al. Prognostic implications of dual tracer PET/CT: PSMA ligand and [18F]FDG PET/CT in patients undergoing [177Lu]PSMA radioligand therapy. Eur J Nucl Med Mol Imaging. 2021;48(6):2024–30. https://doi.org/10.1007/s00259-020-05160-8.
Andreyev A, Celler A. Dual-isotope PET using positron-gamma emitters. Phys Med Biol. 2011;56(14):4539. https://doi.org/10.1088/0031-9155/56/14/020.
Fukuchi T, Okauchi T, Shigeta M, Yamamoto S, Watanabe Y, Enomoto S. Positron emission tomography with additional γ-ray detectors for multiple-tracer imaging. Med Phys. 2017;44(6):2257–66. https://doi.org/10.1002/mp.12149.
Fukuchi T, Shigeta M, Haba H, Mori D, Yokokita T, Komori Y, et al. Image reconstruction method for dual-isotope positron emission tomography. J Instrum. 2021;16(01):P01035. https://doi.org/10.1088/1748-0221/16/01/P01035.
Pratt EC, Lopez-Montes A, Volpe A, Crowley MJ, Carter LM, Mittal V, et al. Simultaneous quantitative imaging of two PET radiotracers via the detection of positron–electron annihilation and prompt gamma emissions. Nat Biomed Eng. 2023;7(8):1028–39. https://doi.org/10.1038/s41551-023-01060-y.
Kadrmas DJ, Hoffman JM. Methodology for quantitative rapid multi-tracer PET tumor characterizations. Theranostics. 2013;3:757–73. https://doi.org/10.7150/thno.5201.
Huang SC, Carson RE, Hoffman EJ, Kuhl DE, Phelps ME. An investigation of a double-tracer technique for positron computerized tomography. J Nucl Med. 1982;23(9):816–22.
Figueiras FP, Jiménez X, Pareto D, Gómez V, Llop J, Herance R, et al. Simultaneous dual-tracer PET imaging of the rat brain and its application in the study of cerebral ischemia. Mol Imaging Biol. 2011;13(3):500–10. https://doi.org/10.1007/s11307-010-0370-5.
Koeppe RA, Raffel DM, Snyder SE, Ficaro EP, Kilbourn MR, Kuhl DE. Dual-[11C]tracer single-acquisition positron emission tomography studies. J Cereb Blood Flow Metab. 2001;21(12):1480–92. https://doi.org/10.1097/00004647-200112000-00013.
Rust TC, Kadrmas DJ. Rapid dual-tracer PTSM+ATSM PET imaging of tumour blood flow and hypoxia: a simulation study. Phys Med Biol. 2005;51(1):61. https://doi.org/10.1088/0031-9155/51/1/005.
Black NF, McJames S, Kadrmas DJ. Rapid multi-tracer PET tumor imaging with 18F-FDG and secondary shorter-lived tracers. IEEE Trans Nucl Sci. 2009;56(5):2750–8. https://doi.org/10.1109/TNS.2009.2026417.
Joshi AD, Koeppe RA, Fessier JA, Kilbourn MR. Signal separation and parameter estimation in noninvasive dual-tracer PET scans using reference-region approaches. J Cereb Blood Flow Metab. 2009;29(7):1346–57. https://doi.org/10.1038/jcbfm.2009.53.
Kadrmas DJ, Rust TC, Hoffman JM. Single-scan dual-tracer FLT+FDG PET tumor characterization. Phys Med Biol. 2013;58(3):429. https://doi.org/10.1088/0031-9155/58/3/429.
Cheng X, Li Z, Liu Z, Navab N, Huang SC, Keller U, et al. Direct parametric image reconstruction in reduced parameter space for rapid multi-tracer PET imaging. IEEE Trans Med Imaging. 2015;34(7):1498–512. https://doi.org/10.1109/TMI.2015.2403300.
Zhang JL, Morey AM, Kadrmas DJ. Application of separable parameter space techniques to multi-tracer PET compartment modeling. Phys Med Biol. 2016;61(3):1238. https://doi.org/10.1088/0031-9155/61/3/1238.
Gao F, Liu H, Jian Y, Shi P. Dynamic dual-tracer PET reconstruction. In: Prince JL, Pham DL, Myers KJ, editors. Information processing in medical imaging. Berlin: Springer; 2009. p. 38–49.
Kadrmas DJ, Rust TC. Feasibility of rapid multitracer PET tumor imaging. IEEE Trans Nucl Sci. 2005;52(5):1341–7. https://doi.org/10.1109/TNS.2005.858230.
El Fakhri G, Trott CM, Sitek A, Bonab A, Alpert NM. Dual-tracer PET using generalized factor analysis of dynamic sequences. Mol Imaging Biol. 2013;15(6):666–74. https://doi.org/10.1007/s11307-013-0631-1.
Verhaeghe J, Reader AJ. Simultaneous water activation and glucose metabolic rate imaging with PET. Phys Med Biol. 2013;58(3):393. https://doi.org/10.1088/0031-9155/58/3/393.
Taheri N, Crom BL, Bouillot C, Chérel M, Costes N, Gouard S, et al. Design of a generic method for single dual-tracer PET imaging acquisition in clinical routine. Phys Med Biol. 2023;68(8): 085016. https://doi.org/10.1088/1361-6560/acc723.
Ding W, Yu J, Zheng C, Fu P, Huang Q, Feng DD, et al. Machine learning-based noninvasive quantification of single-imaging session dual-tracer 18F-FDG and 68Ga-DOTATATE dynamic PET-CT in oncology. IEEE Trans Med Imaging. 2022;41(2):347–59. https://doi.org/10.1109/TMI.2021.3112783.
Ruan D, Liu H. Separation of a mixture of simultaneous dual-tracer PET signals: a data-driven approach. IEEE Trans Nucl Sci. 2017;64(9):2588–97. https://doi.org/10.1109/TNS.2017.2736644.
Xu J, Liu H. Deep-learning-based separation of a mixture of dual-tracer single-acquisition PET signals with equal half-lives: a simulation study. IEEE Trans Radiat Plasma Med Sci. 2019;3(6):649–59. https://doi.org/10.1109/TRPMS.2019.2897120.
Wan Y, Ye H, Liu H. Deep-learning based joint estimation of dual-tracer PET image activity maps and clustering of time activity curves. In: Bosmans H, Zhao W, Yu L, editors. Medical imaging 2021: physics of medical imaging, vol. 11595. International Society for Optics and Photonics. SPIE; 2021. p. 115953T. https://doi.org/10.1117/12.2580873.
Qing M, Wan Y, Huang W, Xu Y, Liu H. Separation of dual-tracer PET signals using a deep stacking network. Nucl Instrum Methods Phys Res Sect A. 2021;1013: 165681. https://doi.org/10.1016/j.nima.2021.165681.
Tong J, Wang C, Liu H. Temporal information-guided dynamic dual-tracer PET signal separation network. Med Phys. 2022;49(7):4585–98. https://doi.org/10.1002/mp.15566.
Lian D, Li Y, Liu H. Spatiotemporal attention constrained deep learning framework for dual-tracer PET imaging. In: Yang G, Aviles-Rivero A, Roberts M, Schönlieb CB, editors. Medical image understanding and analysis. Cham: Springer; 2022. p. 87–100.
Pan B, Marsden PK, Reader AJ. Dual-tracer PET image separation by deep learning: a simulation study. Appl Sci. 2023;13(7):66. https://doi.org/10.3390/app13074089.
Xu J, Liu H. Three-dimensional convolutional neural networks for simultaneous dual-tracer PET imaging. Phys Med Biol. 2019;64(18): 185016. https://doi.org/10.1088/1361-6560/ab3103.
Zeng F, Fang J, Muhashi A, Liu H. Direct reconstruction for simultaneous dual-tracer PET imaging based on multi-task learning. EJNMMI Res. 2023;13(1):7. https://doi.org/10.1186/s13550-023-00955-w.
Wang C, Fang J, Liu H. Direct reconstruction and separation for triple-tracer PET imaging based on three-dimensional encoder-decoder network. In: Yu L, Fahrig R, Sabol JM, editors. Medical imaging 2023: physics of medical imaging, vol. 12463. International Society for Optics and Photonics. SPIE; 2023. p. 124632P. https://doi.org/10.1117/12.2653876.
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR); 2015. p. 1–9. https://doi.org/10.1109/CVPR.2015.7298594.
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the Inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR); 2016. p. 2818–26. https://doi.org/10.1109/CVPR.2016.308.
Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhutdinov R, et al. Show, attend and tell: neural image caption generation with visual attention. arXiv preprint; 2015. https://doi.org/10.48550/arXiv.1502.03044.
Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition; 2018. p. 7132–41. https://doi.org/10.1109/CVPR.2018.00745.
Wang X, Girshick R, Gupta A, He K. Non-local neural networks. arXiv preprint; 2017. https://doi.org/10.48550/arXiv.1711.07971.
Woo S, Park J, Lee JY, Kweon IS. CBAM: convolutional block attention module. arXiv preprint; 2018. https://doi.org/10.48550/arXiv.1807.06521.
Wang B, Liu H. FBP-Net for direct reconstruction of dynamic PET images. Phys Med Biol. 2020;65(23): 235008. https://doi.org/10.1088/1361-6560/abc09d.
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13(4):600–12. https://doi.org/10.1109/TIP.2003.819861.
Gunn RN, Gunn SR, Cunningham VJ. Positron emission tomography compartmental models. J Cereb Blood Flow Metab. 2001;21(6):635–52. https://doi.org/10.1097/00004647-200106000-00002.
Hudson HM, Larkin RS. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans Med Imaging. 1994;13(4):601–9. https://doi.org/10.1109/42.363108.
Zubal IG, Harrell CR, Smith EO, Rattner Z, Gindi G, Hoffer PB. Computerized three-dimensional segmented human anatomy. Med Phys. 1994;21(2):299–302. https://doi.org/10.1118/1.597290.
Muzic RF Jr, Cornelius S. COMKAT: compartment model kinetic analysis tool. J Nucl Med. 2001;42(4):636–45.
Feng D, Huang SC, Wang X. Models for computer simulation studies of input functions for tracer kinetic modeling with positron emission tomography. Int J Biomed Comput. 1993;32(2):95–110. https://doi.org/10.1016/0020-7101(93)90049-C.
Wang B, Ruan D, Liu H. Noninvasive estimation of macro-parameters by deep learning. IEEE Trans Radiat Plasma Med Sci. 2020;4(6):684–95. https://doi.org/10.1109/TRPMS.2020.2979017.
Koeppe RA, Holthoff VA, Frey KA, Kilbourn MR, Kuhl DE. Compartmental analysis of [11C]Flumazenil kinetics for the estimation of ligand transport rate and receptor distribution using positron emission tomography. J Cereb Blood Flow Metab. 1991;11(5):735–44. https://doi.org/10.1038/jcbfm.1991.130.
Ottoy J, Verhaeghe J, Niemantsverdriet E, Wyffels L, Somers C, Roeck ED, et al. Validation of the semiquantitative static SUVR method for 18F-AV45 PET by pharmacokinetic modeling with an arterial input function. J Nucl Med. 2017;58(9):1483–9. https://doi.org/10.2967/jnumed.116.184481.
Ullrich R, Backes H, Li H, Kracht L, Miletic H, Kesper K, et al. Glioma proliferation as assessed by 3′-Fluoro-3′-Deoxy-l-Thymidine positron emission tomography in patients with newly diagnosed high-grade glioma. Clin Cancer Res. 2008;14(7):2049–55. https://doi.org/10.1158/1078-0432.CCR-07-1553.
Fessler JA. Michigan image reconstruction toolbox. https://web.eecs.umich.edu/~fessler/code/.
Logan J. Graphical analysis of PET data applied to reversible and irreversible tracers. Nucl Med Biol. 2000;27(7):661–70. https://doi.org/10.1016/S0969-8051(00)00137-2.
Acknowledgements
Not applicable.
Funding
This work was supported in part by the National Key Research and Development Program of China (No: 2020AAA0109502), by the National Natural Science Foundation of China (No: U1809204, 61525106, 61427807) and by the Talent Program of Zhejiang Province (No: 2021R51004).
Author information
Authors and Affiliations
Contributions
Study design: JF, FZ, HL. Data simulation and network training: JF, FZ. Results analysis and manuscript writing: JF. Manuscript revision: HL. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fang, J., Zeng, F. & Liu, H. Signal separation of simultaneous dual-tracer PET imaging based on global spatial information and channel attention. EJNMMI Phys 11, 47 (2024). https://doi.org/10.1186/s40658-024-00649-9