Automated mPCa lesion segmentation with self-configuring nnU-Net framework

A deep learning-based framework for automated segmentation of metastatic prostate cancer (mPCa) lesions in whole-body [68Ga]Ga-PSMA-11 PET/CT images, with the goal of extracting patient-level prognostic biomarkers.

AI-tool features

Disease model: metastatic prostate cancer (mPCa)

Imaging Task: Lesion segmentation from whole-body [68Ga]Ga-PSMA-11 PET/CT images

AI-method: nnU-Net, a self-configuring biomedical image segmentation framework that automates key aspects of the segmentation pipeline according to a set of task-agnostic heuristics.

Architecture: 3D U-Net cascade consisting of two 3D U-Nets

Data Populations: Three hundred thirty-seven (N = 337) [68Ga]Ga-PSMA-11 PET/CT images were retrieved from a cohort of biochemically recurrent PCa patients. Before model training, PSMA-negative patient scans (n = 53) were separated from the total dataset and reserved solely as negative controls for model testing, to mitigate the already large class imbalance in the dataset. Of the remaining PSMA-positive scans (n = 284), approximately 25% (n = 75) were randomly assigned to the test set, while the rest (n = 209) were used for model training. The random split was performed at the patient level, so that no patient appeared in both the training and testing sets; such cross-over would constitute a form of data leakage and could bias the results.
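For illustration, a minimal sketch of such a patient-level split is given below. The record fields (patient_id, psma_positive) and the splitting function are hypothetical assumptions, not the study's actual code.

```python
# Minimal sketch of a patient-level train/test split.
# The metadata fields used here are illustrative assumptions.
import random

def patient_level_split(scans, test_fraction=0.25, seed=42):
    """Split scan records into train/test sets by patient ID so that
    no patient contributes scans to both sets (avoids data leakage)."""
    patient_ids = sorted({s["patient_id"] for s in scans})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = round(len(patient_ids) * test_fraction)
    test_ids = set(patient_ids[:n_test])
    train = [s for s in scans if s["patient_id"] not in test_ids]
    test = [s for s in scans if s["patient_id"] in test_ids]
    return train, test

# PSMA-negative scans would be held out first, mirroring the study design:
# positives = [s for s in scans if s["psma_positive"]]
# train, test = patient_level_split(positives, test_fraction=0.25)
```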

Training Method: Prior to input into the cascade network, patient CT scans were resampled into the same coordinate space as the PET images using B-spline interpolation, and PET scans were converted to SUVbodyweight. The first 3D U-Net in the cascade was trained on down-sampled PET and CT images (patch size = 80 × 80 × 224, voxel resolution = 5.22 × 5.22 × 2 mm³) so as to incorporate more contextual information from the images, and generated a coarse segmentation map. This segmentation map then served as a third input channel (along with the PET and CT images) to the second 3D U-Net, whose PET and CT inputs were at full resolution (patch size = 96 × 96 × 256, voxel resolution = 4.07 × 4.07 × 2 mm³). The second U-Net yielded the final volumetric segmentation.
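For illustration, the sketch below shows these two preprocessing steps using SimpleITK; the library choice, default values, and parameter names are assumptions made for the example, not the authors' published code.

```python
# Sketch of the preprocessing: CT resampled onto the PET grid with
# B-spline interpolation, and PET converted to SUV (body weight).
import SimpleITK as sitk

def resample_ct_to_pet(ct: sitk.Image, pet: sitk.Image) -> sitk.Image:
    """Resample the CT into the PET coordinate space (identity transform,
    B-spline interpolation, air value -1000 HU outside the CT volume)."""
    return sitk.Resample(ct, pet, sitk.Transform(),
                         sitk.sitkBSpline, -1000.0, ct.GetPixelID())

def pet_to_suv_bw(pet_bqml: sitk.Image, injected_dose_bq: float,
                  weight_kg: float) -> sitk.Image:
    """Convert a decay-corrected PET image in Bq/mL to SUV (body weight):
    SUV = activity concentration * body weight [g] / injected dose [Bq]."""
    pet = sitk.Cast(pet_bqml, sitk.sitkFloat32)
    return pet * (weight_kg * 1000.0 / injected_dose_bq)
```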

The U-Nets configured by the nnU-Net framework share many of the same characteristics as the original U-Net design, with minor modifications such as the use of instance normalisation and leaky ReLU as the activation function. Both components of the cascade network were trained using five-fold cross-validation, where each fold was trained for a total of 1000 epochs using stochastic gradient descent with an initial learning rate of 0.01 that decayed to zero by the last training epoch. The Dice and binary cross-entropy loss functions were summed with equal weighting to form the final loss function used throughout training. Further details about the nnU-Net design choices and empirical pipeline configurations based on dataset properties can be found in the nnU-Net reference paper and the associated GitHub repository. Models were trained on an NVIDIA Titan RTX GPU using PyTorch version 1.10.
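The combined loss can be sketched in PyTorch as below; this is an illustrative re-implementation, not the nnU-Net code itself. The learning-rate decay is noted in a comment (nnU-Net's default is a polynomial schedule that reaches zero at the final epoch).

```python
# Sketch of the equally weighted soft Dice + binary cross-entropy loss.
import torch
import torch.nn as nn

class DiceBCELoss(nn.Module):
    def __init__(self, smooth: float = 1e-5):
        super().__init__()
        self.smooth = smooth
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, logits: torch.Tensor, target: torch.Tensor):
        # target: float tensor of 0s/1s with the same shape as logits
        probs = torch.sigmoid(logits)
        dims = tuple(range(1, target.ndim))  # sum over all but batch dim
        intersect = (probs * target).sum(dim=dims)
        denom = probs.sum(dim=dims) + target.sum(dim=dims)
        dice = (2 * intersect + self.smooth) / (denom + self.smooth)
        return (1 - dice).mean() + self.bce(logits, target)

# Polynomial learning-rate decay from 0.01 to zero over 1000 epochs
# (nnU-Net's default exponent is 0.9):
# lr(epoch) = 0.01 * (1 - epoch / 1000) ** 0.9
```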

Evaluation Method: Segmentation pipeline performance was assessed by way of voxel-level comparisons between the fully automatic segmentation mask output and the corresponding ground-truth manual segmentation for each testing set scan. Evaluation metrics were calculated at three different levels to assess the model's ability to perform three different computer vision tasks:

  • patient-level classification (metrics: accuracy, sensitivity, positive predictive value, specificity, and negative predictive value)
  • lesion-level detection (metrics: positive predictive value, sensitivity, and F1 score, i.e. the harmonic mean of positive predictive value and sensitivity)
  • voxel-level (network) segmentation (metrics: Dice similarity coefficient, sensitivity, and positive predictive value; see the sketch after this list)
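As a reference for the voxel-level metrics, a minimal sketch is given below, assuming binary NumPy masks of identical shape (and at least one positive voxel in each).

```python
# Minimal sketch of the voxel-level metrics (Dice, sensitivity, PPV)
# between a predicted and a ground-truth binary mask.
import numpy as np

def voxel_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """pred, truth: boolean arrays of identical shape; assumes both
    masks are non-empty (empty masks need explicit handling)."""
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),  # a.k.a. recall
        "ppv": tp / (tp + fp),          # a.k.a. precision
    }
```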

Quantitative imaging biomarkers were extracted from the automatically generated segmentations and assessed for their potential to stratify patients by overall survival (a computation sketch follows the list):

  • total lesional volume (TLVauto), quantified by summing the volumes of all positive voxels identified in the automated segmentations, and
  • total lesional uptake (TLUauto), calculated by summing the SUVs of the identified positive voxels.
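A minimal sketch of these two biomarker computations follows, assuming a binary lesion mask and a co-registered SUV image as NumPy arrays; the function and variable names are illustrative.

```python
# Sketch of the two patient-level biomarkers extracted from the
# automated segmentation (names and array layout are assumptions).
import numpy as np

def tlv_tlu(mask: np.ndarray, suv: np.ndarray,
            voxel_volume_ml: float) -> tuple[float, float]:
    """Total lesional volume (mL) and total lesional uptake (sum of SUVs)
    over all positive voxels in the segmentation mask."""
    lesion = mask.astype(bool)
    tlv = float(lesion.sum()) * voxel_volume_ml  # TLVauto
    tlu = float(suv[lesion].sum())               # TLUauto
    return tlv, tlu
```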

Clinical potential: The nnU-Net framework has demonstrated considerable generalisation potential, achieving state-of-the-art results across a wide variety of biomedical image segmentation tasks.

Reference Links

Source Code:

  • GitHub page of the self-configuring nnU-Net framework: https://github.com/MIC-DKFZ/nnUNet

Publications:
