NA-MIC Project Weeks

Lumbar Spine Segmentation using MONAI Label

Key Investigators

Project Description

Our goal is to have a trained model in MONAI that is able to segment five lumbar vertebrae and four intervertebral discs, working with the 3D volumes of a public dataset.


Train and deploy a neural network to segment lumbar vertebral bodies and intervertebral discs from 3D MRI volumes of the lumbar region.

  1. Objective A. Prepare the public dataset of choice, using the T2 volumes.
  2. Objective B. Train a MONAI model to segment the 5 lumbar vertebrae (L1 to L5) and the 4 corresponding intervertebral discs.
  3. Objective C. Deploy the trained model to another computer and be able to run it to perform segmentations without being connected to a server.

Approach and Plan

  1. Choose a public dataset
  2. Get the files ready for MONAI.
  3. Install the MONAI server locally
  4. Train the model

Next Steps

  1. The chosen dataset is "Multi-scanner and multi-modal lumbar vertebral body and intervertebral disc segmentation database", which covers 39 patients. From all the images provided, we selected 51: the ones with a T2 volume.
  2. Convert the selected T2 images to .nii format.
  3. Unify the 9 segments into a single file (5 segments for vertebrae and 4 for discs)

Past issues


T2 volume with the mask file, showing the 9 segments:

The segmentation, with the corresponding labels, as seen in 3D Slicer


Using the public dataset, once all segments had been placed in the same .nii file with the right segment IDs (1 to 5 for the vertebrae and 6 to 9 for the discs, both numbered from top to bottom), we trained the model using different setups.
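The merging step described above can be sketched as follows. This is a minimal illustration, not the script used in the project: the segment names in `SEGMENT_IDS` and the function `merge_segments` are hypothetical, and the masks are assumed to be already loaded as NumPy arrays of the same shape (reading and writing .nii files is left out).

```python
import numpy as np

# Hypothetical segment names; the IDs follow the scheme described above
# (1-5 vertebrae, 6-9 discs, both numbered from top to bottom).
SEGMENT_IDS = {
    "L1": 1, "L2": 2, "L3": 3, "L4": 4, "L5": 5,
    "disc_1": 6, "disc_2": 7, "disc_3": 8, "disc_4": 9,
}


def merge_segments(masks):
    """Combine per-segment binary masks into one integer label map.

    masks: dict mapping a segment name to a boolean (or 0/1) array.
    Background stays 0. If two masks overlap, the segment that comes
    later in SEGMENT_IDS overwrites the earlier one.
    """
    shapes = {np.asarray(m).shape for m in masks.values()}
    if len(shapes) != 1:
        raise ValueError("all masks must share one shape")
    label_map = np.zeros(shapes.pop(), dtype=np.uint8)
    for name, seg_id in SEGMENT_IDS.items():
        if name in masks:
            label_map[np.asarray(masks[name]) > 0] = seg_id
    return label_map
```

The resulting array can then be saved as a single label volume next to the corresponding T2 image.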

The tags were configured like this in the segmentation.py file:

These were commented out or left as shown, depending on the number of tags used in each training attempt. One issue we found is that tags must always be numbered starting at 1: when we tried to train using only tags 6 to 9, we got an error and the model wouldn’t train. This issue is expected to be fixed in the code soon.
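Until that fix lands, one possible workaround is to renumber the selected tags so that they start at 1 before training. The sketch below is an assumption about how this could be done, not something taken from MONAI Label: `renumber_from_one` and `apply_mapping` are hypothetical helpers, and the voxel values are shown as a flat list for brevity.

```python
def renumber_from_one(selected):
    """Map an arbitrary set of original tag IDs to contiguous IDs 1..N.

    Returns a dict {original_id: new_id}, keeping ascending order.
    """
    ordered = sorted(set(selected))
    return {orig: new for new, orig in enumerate(ordered, start=1)}


def apply_mapping(voxels, mapping):
    """Relabel voxel values; IDs not in the mapping become background (0)."""
    return [mapping.get(v, 0) for v in voxels]


# Training on the discs only: original tags 6..9 become 1..4.
disc_mapping = renumber_from_one([6, 7, 8, 9])
relabeled = apply_mapping([0, 6, 9, 3], disc_mapping)
```

The same mapping has to be inverted after inference if the predictions are to be merged back with the original 1 to 9 numbering.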

Al Khalil dataset

2000 epochs and 9 tags

The accuracy was too low (23%), and the results weren’t good. Note also that we had only 32 images for training and 8 for validation. So we went for fewer tags.
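For reference, a 32/8 split corresponds to holding out 20% of 40 cases. How the split was actually produced is not stated here, so the following is only a sketch of a reproducible split under that assumption; the case identifiers and the function `split_train_val` are hypothetical.

```python
import random


def split_train_val(case_ids, val_fraction=0.2, seed=0):
    """Shuffle case IDs with a fixed seed and hold out a validation share."""
    ids = sorted(case_ids)
    random.Random(seed).shuffle(ids)
    n_val = round(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]


cases = [f"case_{i:02d}" for i in range(40)]  # hypothetical identifiers
train_cases, val_cases = split_train_val(cases)
```

Fixing the seed keeps the same validation cases across training attempts, which makes accuracy figures from runs with different numbers of tags easier to compare.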

2000 epochs and 5 tags

This time we segment only bone: tags 1 to 5, corresponding to the vertebrae.

Visual results are better, but the accuracy is still too low (51%).

2000 epochs and 2 tags

To determine whether the poor results were caused by an insufficient number of images, we trained the model on only two tags: vertebrae L1 and L2.

Accuracy was 52%, almost the same as when aiming for 5 tags.

A possible approach then would be to train five different networks, each for two elements, and use them to segment volumes from another dataset. These segmentations would then be merged into a single file (as we did when preparing the Al Khalil dataset) to obtain a larger segmented dataset with the 9 tags. This idea was put on hold in favor of further training with different approaches.

CHU dataset

We tried another public dataset, "Annotated T2-weighted MR images of the Lower Spine" (CHU for short). It contains only bone, and all seven vertebrae (the last 2 thoracic and the 5 lumbar) share a single tag.

The dataset comprises 23 images, which we processed by hand, separating the 7 vertebrae into 7 segments (numbered from top to bottom).

2000 epochs and 5 tags

Good accuracy (92%) and acceptable visual results, but there is evident confusion in one of the vertebrae, where tags are mixed.

We ran this model on one of the images from the Al Khalil database and got this result, caused by the different orientation of the x, y, z axes in each dataset:

2000 epochs and 1 tag

This uses a single tag for all the vertebrae together. (The image shows only the last segment, but the model segmented all 7 vertebrae.)

Accuracy 94%.


Andrés Diaz-Pinto suggested training with the DeepEdit module instead of the segmentation module we had been using until then.

The computer we have been using for this has an RTX 3070, and we tailored the memory usage to a 128x128x128 training image size:

200 epochs and 9 tags

Training took 22 hours (about 6.6 minutes per epoch). Again, we had 32 training and 8 validation images, and 9 tags.

Considering the low number of images used for training, the results are good, but not good enough to start segmenting other images. Accuracy is 80%.

Work in progress

Background and References


Many thanks to all those who stopped by the Discord channel to contribute their knowledge, and special thanks to Andrés Diaz-Pinto for his availability and patience in helping us configure the models.