NA-MIC Project Weeks

Docker-based system to assess challenge submissions

Key Investigators

Presenter location: Online

Project Description

Our project is focused on developing a Docker-based submission mechanism for challenge participants. To maintain fairness and make sure that the test set is not used in the training process, the test set will not be released to the participants. Instead, participants will be required to containerize their methods using Docker and submit their Docker containers for evaluation.

Docker provides an excellent solution for running algorithms in isolated environments known as containers. In our project, we will leverage Docker to create a container that replicates the participants’ pipeline requirements and executes their inference script. By encapsulating the entire environment within a container, we can ensure consistent execution and reproducibility.
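As a minimal sketch of what such an isolated run could look like, the helper below assembles a `docker run` command for a submitted image. The function name, the image tag, the memory limit, and the `/input` and `/output` mount points are assumptions made for illustration; they are not the finalized challenge interface.

```python
from pathlib import Path


def build_docker_run_command(image, input_dir, output_dir, memory="8g"):
    """Assemble a `docker run` invocation for an isolated inference run."""
    input_dir = Path(input_dir).resolve()
    output_dir = Path(output_dir).resolve()
    return [
        "docker", "run",
        "--rm",                          # remove the container when it exits
        "--network", "none",             # no network access during inference
        "--memory", memory,              # cap the container's memory (assumed limit)
        "-v", f"{input_dir}:/input:ro",  # hidden test data, mounted read-only
        "-v", f"{output_dir}:/output",   # predictions are written here
        image,
    ]


# Hypothetical image tag and folders, for illustration only
cmd = build_docker_run_command("participant-algorithm:latest", "test-data", "predictions")
print(" ".join(cmd))
```

On a machine with Docker installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Mounting the input read-only and disabling the network are one way to keep the test set from leaving the evaluation machine.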


Approach and Plan

Progress and Next Steps

We created a baseline algorithm to assist participants with their submissions, using evalutils to develop a code template that participants can customize with their own algorithms. We will work with Grand Challenge to define the input and output interface standards for participants, which will let us give clear instructions on how to format and provide the necessary data. Participants must follow the guidelines for building their Docker containers; we will link to these guidelines on the original challenge website.
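To give a feel for what a customizable template might look like, here is a minimal sketch of an inference entry point. It is not the actual baseline: the `predict` placeholder, the `run_inference` name, and the `/input` and `/output` defaults are assumptions, and the real interface will follow the standards agreed with Grand Challenge.

```python
from pathlib import Path


def predict(data: bytes) -> bytes:
    """Placeholder for the participant's model; here it simply echoes its input."""
    return data


def run_inference(input_dir="/input", output_dir="/output"):
    """Apply `predict` to every file in input_dir and write the results to output_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for case_path in sorted(p for p in input_dir.iterdir() if p.is_file()):
        result = predict(case_path.read_bytes())
        # Keep the case file name so each output can be matched to its input
        out_path = output_dir / case_path.name
        out_path.write_bytes(result)
        written.append(out_path.name)
    return written
```

Inside the container, the entry point would call `run_inference()` with the default paths; participants would only replace the body of `predict`.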

Evaluator container:

To generate the project structure we will use evalutils. Evalutils contains a project generator based on CookieCutter that we can use to generate the boilerplate for our evaluation. With evalutils installed locally, the project can be generated with:

pip install evalutils && evalutils init evaluation LNQ2023

We can also generate the project with Docker by running a container and sharing our current user id:

docker run -it --rm -u `id -u` -v $(pwd):/usr/src/myapp -w /usr/src/myapp python:3 bash -c "pip install evalutils && evalutils init evaluation LNQ2023"

Either of these commands will generate a folder called LNQ2023 with everything we need to get started.

The .gitattributes file at the root of the repository specifies all the files that should be tracked by git-lfs. By default, all files in the ground-truth and test directories are configured to be tracked by git-lfs, but they will only be registered once the git-lfs extension is installed on the system and the git lfs install command has been issued inside the generated repository.
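To check which paths a .gitattributes file routes through git-lfs, a small parser is enough. The sketch below is illustrative only: the function name is made up, and the sample content is an assumption about what such a file typically contains, not a copy of the generated file.

```python
def lfs_tracked_patterns(gitattributes_text):
    """Return the path patterns that a .gitattributes file assigns to git-lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attributes = line.split()
        if "filter=lfs" in attributes:
            patterns.append(pattern)
    return patterns


# Illustrative content, not copied from the generated repository
example = """\
# files handled by git-lfs
ground-truth/* filter=lfs diff=lfs merge=lfs -text
test/* filter=lfs diff=lfs merge=lfs -text
README.md text
"""
print(lfs_tracked_patterns(example))  # ['ground-truth/*', 'test/*']
```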

The structure of the project will be:

└── LNQ2023
    ├── build.sh            # Builds your evaluation container
    ├── Dockerfile          # Defines how to build your evaluation container
    ├── evaluation.py       # Contains your evaluation code - this is where you will extend the Evaluation class
    ├── export.sh           # Exports your container to a .tar file for use on grand-challenge.org
    ├── .gitattributes      # Define which files git should put under git-lfs
    ├── .gitignore          # Define which files git should ignore
    ├── ground-truth        # A folder that contains your ground truth annotations
    │   └── reference.csv   # In this example the ground truth is a csv file
    ├── README.md           # For describing your evaluation to others
    ├── requirements.txt    # The python dependencies of your evaluation container - add any new dependencies here
    ├── test                # A folder that contains an example submission for testing
    │   └── submission.csv  # In this example the participants will submit a csv file
    └── test.sh             # A script that runs your evaluation container on the test submission
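A quick way to confirm that the generator produced this layout is to check for each expected file. The helper below is a small sketch; its name is hypothetical and the file list is taken directly from the tree above.

```python
from pathlib import Path

# Files listed in the project tree above
EXPECTED_FILES = [
    "build.sh", "Dockerfile", "evaluation.py", "export.sh", ".gitattributes",
    ".gitignore", "ground-truth/reference.csv", "README.md", "requirements.txt",
    "test/submission.csv", "test.sh",
]


def missing_project_files(project_dir):
    """Return the expected files that are absent from the generated project."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_project_files("LNQ2023")` from the directory where the project was generated should return an empty list if nothing is missing.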

evaluation.py is the file where we will extend the Evaluation class and implement the evaluation for our challenge. In this file, a new class has been created, and it is instantiated and run with:

if __name__ == "__main__":
    Myproject().evaluate()

This is all that is needed for evalutils to perform the evaluation and generate the output for each new submission.
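As a rough illustration of what an evaluation run involves, the sketch below scores each case, averages every metric, and writes the result to a JSON file. It is a simplified stand-in written with the standard library, not evalutils' own implementation; the `evaluate` and `accuracy` functions and the output layout with "case" and "aggregates" keys are assumptions.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


def evaluate(cases, score_case, output_path):
    """Score every case, average each metric, and save the result as JSON."""
    case_results = [score_case(case) for case in cases]
    metric_names = case_results[0].keys() if case_results else []
    aggregates = {name: mean(r[name] for r in case_results) for name in metric_names}
    metrics = {"case": case_results, "aggregates": aggregates}
    Path(output_path).write_text(json.dumps(metrics, indent=2))
    return metrics


def accuracy(case):
    """Toy per-case metric: fraction of labels that match the ground truth."""
    ground_truth, prediction = case
    correct = sum(g == p for g, p in zip(ground_truth, prediction))
    return {"accuracy": correct / len(ground_truth)}


# Two toy cases given as (ground truth, prediction) pairs
cases = [([1, 0, 1, 1], [1, 0, 1, 1]), ([1, 0, 1, 1], [1, 1, 0, 1])]
with tempfile.TemporaryDirectory() as tmp:
    metrics = evaluate(cases, accuracy, Path(tmp) / "metrics.json")
print(metrics["aggregates"])  # {'accuracy': 0.75}
```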

Background and References

The generated code for segmentation tasks:

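import SimpleITK
from evalutils import ClassificationEvaluation
from evalutils.io import SimpleITKLoader
from evalutils.validators import (
    NumberOfCasesValidator,
    UniqueImagesValidator,
    UniquePathIndicesValidator,
)


class Myproject(ClassificationEvaluation):
    def __init__(self):
        # Load images with SimpleITK and validate the submission before scoring.
        # num_cases is a placeholder; it must match the number of ground truth cases.
        super().__init__(
            file_loader=SimpleITKLoader(),
            validators=(
                NumberOfCasesValidator(num_cases=2),
                UniquePathIndicesValidator(),
                UniqueImagesValidator(),
            ),
        )

    def score_case(self, *, idx, case):
        gt_path = case["path_ground_truth"]
        pred_path = case["path_prediction"]

        # Load the images for this case
        gt = self._file_loader.load_image(gt_path)
        pred = self._file_loader.load_image(pred_path)

        # Check that they're the right images
        assert self._file_loader.hash_image(gt) == case["hash_ground_truth"]
        assert self._file_loader.hash_image(pred) == case["hash_prediction"]

        # Cast to the same integer type; the overlap filter expects label images
        caster = SimpleITK.CastImageFilter()
        caster.SetOutputPixelType(SimpleITK.sitkUInt8)
        gt = caster.Execute(gt)
        pred = caster.Execute(pred)

        # Score the case
        overlap_measures = SimpleITK.LabelOverlapMeasuresImageFilter()
        overlap_measures.Execute(gt, pred)

        return {
            # 'ASSD' is not provided by LabelOverlapMeasuresImageFilter (it has no
            # GetASSD method), so it has to be computed separately and added here.
            'DiceCoefficient': overlap_measures.GetDiceCoefficient(),
        }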
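
To make the two metrics named above concrete, the following sketch computes the Dice coefficient and the average symmetric surface distance (ASSD) on small 2D binary masks in plain Python. It is a didactic illustration under stated assumptions (4-neighbourhood surfaces, brute-force nearest-point search, non-empty masks), not the code used in the evaluation container, and the function names are hypothetical.

```python
from math import hypot


def dice(gt, pred):
    """Dice coefficient of two non-empty binary masks given as equally sized 2D lists."""
    overlap = sum(g & p for g_row, p_row in zip(gt, pred) for g, p in zip(g_row, p_row))
    return 2 * overlap / (sum(map(sum, gt)) + sum(map(sum, pred)))


def surface(mask):
    """Foreground pixels that touch the background or the image border (4-neighbourhood)."""
    rows, cols = len(mask), len(mask[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(not (0 <= nr < rows and 0 <= nc < cols) or not mask[nr][nc]
                   for nr, nc in neighbours):
                points.append((r, c))
    return points


def assd(gt, pred, spacing=(1.0, 1.0)):
    """Average symmetric surface distance between two non-empty binary masks."""
    s_gt, s_pred = surface(gt), surface(pred)

    def nearest(p, others):
        # Distance from p to the closest point in others, scaled by pixel spacing
        return min(hypot((p[0] - q[0]) * spacing[0], (p[1] - q[1]) * spacing[1])
                   for q in others)

    total = sum(nearest(p, s_pred) for p in s_gt) + sum(nearest(q, s_gt) for q in s_pred)
    return total / (len(s_gt) + len(s_pred))


# A 2x2 square and the same square shifted one pixel to the right
gt = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
pred = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
print(dice(gt, pred), assd(gt, pred))  # 0.5 0.5
```

Dice rewards overlapping volume, while ASSD measures how far apart the two boundaries are on average, so the two metrics complement each other.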

The next steps are building and testing the evaluation container, exporting it, and working on the algorithm container.