NA-MIC Project Weeks

Conversion of bone marrow smear dataset from MIRAX format into DICOM

Key Investigators

Project Description

As the DICOM standard is increasingly used in digital pathology imaging, conversion of available datasets from proprietary formats into DICOM format can make the data more FAIR and improve transparency and reproducibility of research conducted with these data. For this reason, the NCI Imaging Data Commons (IDC) hosts all its data in DICOM format.

A set of bone marrow smear whole slide images (WSIs) available in MIRAX (.mrxs) format is to be ingested into the IDC. For that purpose, the images need to be converted into DICOM (.dcm) along with all available image and clinical metadata. In addition, this dataset contains extensive deep-learning-generated nuclei annotations (bounding boxes) that should also be converted into DICOM in a suitable way.

Objective

  1. Objective A: Have a working script for the conversion of the complete set of bone marrow smear WSI into DICOM format based on wsidicomizer.
  2. Objective B: Include clinical metadata in an IDC-conformant way.
  3. Objective C (optional): Have a script that converts the nuclei annotations into DICOM. Consider this issue: https://github.com/imi-bigpicture/wsidicomizer/issues/56

Approach and Plan

Objective A

  1. Implement and verify code for basic conversion of the .mrxs files as is into .dcm.
  2. Investigate the metadata that are filled in automatically (including pixel spacing). wsidicomizer’s default data can be found here; an overview of the attributes of the VL Whole Slide Microscopy IOD is here.
  3. Add code to ingest metadata that are not obtained from the .mrxs files and to correct any wrongly estimated metadata (via the wsidicom API or a JSON file).
  4. Verify correct conversion with dciodvfy on every file and dcentvfy on every set of files in a series.
  5. Have a few successfully converted samples and be ready to run the code on the complete collection.
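
Steps 1 and 4 could be sketched roughly as follows. This is a minimal sketch, not the project's actual script: the helper names (`plan_conversions`, `convert_slide`, `count_messages`, `verify_file`) and the one-folder-per-slide layout are assumptions made for illustration. It further assumes that `WsiDicomizer.convert` works with default arguments for these files and that `dciodvfy` is on the `PATH` and prefixes its messages with `Error` or `Warning`.

```python
import subprocess
from pathlib import Path


def plan_conversions(input_dir, output_root):
    """Pair every .mrxs file with its own output folder (assumed layout)."""
    input_dir, output_root = Path(input_dir), Path(output_root)
    return [(src, output_root / src.stem) for src in sorted(input_dir.glob("*.mrxs"))]


def convert_slide(src, dst):
    """Convert one slide and return whatever WsiDicomizer.convert returns."""
    # Imported lazily so the other helpers work without wsidicomizer installed.
    from wsidicomizer import WsiDicomizer

    dst.mkdir(parents=True, exist_ok=True)
    # Assumption: default arguments suffice. Extra keyword arguments
    # (for example metadata or a tile size) may be needed in practice.
    return WsiDicomizer.convert(str(src), str(dst))


def count_messages(report):
    """Count error and warning lines in a dciodvfy report."""
    lines = [line.strip() for line in report.splitlines()]
    return {
        "errors": sum(line.startswith("Error") for line in lines),
        "warnings": sum(line.startswith("Warning") for line in lines),
    }


def verify_file(dcm_path):
    """Run dciodvfy on one file and summarize its report."""
    result = subprocess.run(
        ["dciodvfy", str(dcm_path)], capture_output=True, text=True, check=False
    )
    # Both streams are scanned so no message is missed.
    return count_messages(result.stderr + "\n" + result.stdout)
```

A driver loop would call `convert_slide` for each pair from `plan_conversions` and then `verify_file` on each resulting .dcm file. The per-series `dcentvfy` check, which takes all files of a series at once, is left out of this sketch.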

Objective B

  1. Prepare the additional clinical and lab data as a table so that they can be ingested into IDC as a BigQuery table.
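
One way to prepare such a table is sketched below. This is only an illustration under stated assumptions: IDC's actual ingestion requirements are not described here, the helper names are hypothetical, and the sketch restricts column names to letters, digits, and underscores (not starting with a digit), which is the conservative naming rule for BigQuery columns. It does not handle two headers that clean to the same name.

```python
import csv
import re


def bigquery_column_name(name):
    """Map a free-text header to letters, digits, and underscores only."""
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip()).strip("_")
    # Names must not be empty or start with a digit, so prefix an underscore.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def write_clinical_table(rows, out_path):
    """Write a list of dicts (one per record) as CSV with cleaned headers."""
    headers = list(rows[0])
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([bigquery_column_name(h) for h in headers])
        for row in rows:
            # Missing values are written as empty cells.
            writer.writerow([row.get(h, "") for h in headers])
```

For example, a hypothetical header such as `Age (years)` would become `Age_years`, and `2nd biopsy` would become `_2nd_biopsy`.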

Objective C (optional):

  1. Discuss and decide in what way available annotations can be best encoded in DICOM.
  2. Implement an annotation conversion pipeline based on the IDC annotation conversion code by Chris Bridge.
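
Whichever encoding is chosen, the bounding boxes will probably need to be turned into explicit vertex coordinates first. The sketch below shows one candidate preprocessing step, not a decided approach. It assumes that each box is given as `(x_min, y_min, width, height)` in pixel coordinates of the base level, and that the target is a rectangle described by four corners in the order top-left, top-right, bottom-right, bottom-left. Both assumptions would need to be checked against the dataset and the chosen DICOM object. The function names are hypothetical.

```python
def box_to_corners(x_min, y_min, width, height):
    """Return the four corners of an axis-aligned box, clockwise from top-left."""
    return [
        (x_min, y_min),
        (x_min + width, y_min),
        (x_min + width, y_min + height),
        (x_min, y_min + height),
    ]


def boxes_to_corner_lists(boxes):
    """Convert many (x_min, y_min, width, height) boxes to corner lists."""
    return [box_to_corners(*box) for box in boxes]
```

The resulting corner lists could then be passed to whatever library writes the final annotation objects.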

Progress and Next Steps

Objective A:

Objective B:

Objective C:

Next steps:

Illustrations

Example image of bone marrow smears. Taken from: https://doi.org/10.1177/1040638712452731.

Background and References

Background reading:

Further resources: