XNAT Machine Learning Development Plugin

This plugin supports running training experiments on models for machine learning in XNAT. The workflow currently supports NVIDIA Clara and other TLT models.

Building

To build the XNAT Machine Learning plugin:

If you haven't already, clone this repository and cd to the newly cloned folder.
Build the plugin: ./gradlew jar (on Windows, you can use the batch file: gradlew.bat jar). This should build the plugin in the file build/libs/xnat-template-plugin-1.0.0.jar (the version may differ based on updates to the code).
Copy the plugin jar to your plugins folder, substituting the name of the jar file the build actually produced: cp build/libs/ml-plugin-1.0.0.jar /data/xnat/home/plugins

You'll also need the XNAT Datasets Plugin installed.

Quick Start

The following examples:

Include the server address http://xnat. You should replace this with the site URL for your deployed XNAT system.
Use httpie to demonstrate how calls might work

Create a dataset definition

A dataset definition is analogous to a query or stored search: it specifies the criteria data should meet to be included in the resolved dataset, but does not itself indicate any particular file or files. A dataset definition consists of the following properties:

Project
Label
Description (optional)
Criteria, which is a list of criterion objects

The criteria are the primary content in the definition. Each criterion itself consists of two properties:

Resolver indicates the implementation that should interpret the criterion
Payload is the data that the resolver interprets

Currently the main type of dataset definition uses a resolver named TaggedResourceMap. This resolver takes one or more resource tag fields, each of which specifies a tag value and values for the following properties:

Property	Description	Table
SeriesDescription	Searches image scan attributes type, series_description, and series_class	xnat_imagescandata
ResourceLabel	Searches the resource label for matching scans	xnat_abstractresource
ResourceFormat	Searches the resource format for matching scans	xnat_resource
ResourceContent	Searches the resource content for matching scans	xnat_resource

The format of the search values implies the type of comparison used:

If a value contains the character '%' by itself, the search uses a LIKE comparison. If you want to include the actual '%' character without using a LIKE, escape it with another '%', e.g. X%%Y.
If a value starts and ends with the character '/', the search uses a regular expression comparison.
If a value starts with '/' and ends with '/i', the search uses a case-insensitive regular expression comparison.
Any other value is treated as a literal search, i.e. attribute = 'value'.
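
The comparison rules above can be sketched as a small classifier. This is an illustration of the rules as described here, not the plugin's actual implementation: the function name and the returned labels are hypothetical, and how an escaped '%%' is handled inside a LIKE value is not specified above, so LIKE values are returned unchanged.

```python
from __future__ import annotations


def classify_search_value(value: str) -> tuple[str, str]:
    """Return a (comparison, operand) pair for a search value.

    Hypothetical helper that mirrors the documented rules only.
    """
    # /pattern/i -> case-insensitive regular expression
    if len(value) >= 3 and value.startswith("/") and value.endswith("/i"):
        return "regex_i", value[1:-2]
    # /pattern/ -> case-sensitive regular expression
    if len(value) >= 2 and value.startswith("/") and value.endswith("/"):
        return "regex", value[1:-1]
    # A '%' that is not part of an escaped '%%' pair means LIKE.
    if "%" in value.replace("%%", ""):
        return "like", value
    # Anything else is a literal match; '%%' collapses to a single '%'.
    return "literal", value.replace("%%", "%")
```

For example, "T1%" is classified as a LIKE search, "/T1./i" as a case-insensitive regular expression with the pattern T1., and "X%%Y" as a literal search for X%Y.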

Here's a sample dataset definition:

{
    "project": "AbSegCt",
    "label": "AbSegCt_training_data",
    "description": "This is a definition for data for training the AbSegCt segmentation model",
    "criteria": [
        {
            "resolver": "TaggedResourceMap",
            "payload": {
                "Images": {
                    "tag": "image",
                    "SeriesDescription": ["T1%"],
                    "ResourceFormat": ["NIFTI"],
                    "ResourceContent": ["/T1./i"],
                    "ResourceLabel": ["/nifti/i"]
                },
                "Labels": {
                    "tag": "label",
                    "SeriesDescription": ["Segment%"],
                    "ResourceFormat": ["NIFTI"],
                    "ResourceContent": ["/Segmentat.{3}/i"],
                    "ResourceLabel": ["/nifti/i"]
                }
            }
        }
    ]
}

If you save this JSON to a file named absegct-dataset-definition.json, you can create the definition object in XNAT with a call like this:

$ cat absegct-dataset-definition.json | http --session=username POST http://xnat/xapi/sets/definitions
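
If you prefer to generate the definition file rather than write it by hand, a script along these lines could produce the same JSON. This is a minimal sketch: the helper names tagged_resource and build_definition are hypothetical, and it assumes that properties you leave empty can simply be omitted from the payload.

```python
import json


def tagged_resource(tag, series_description=(), resource_format=(),
                    resource_content=(), resource_label=()):
    """Build one TaggedResourceMap entry (hypothetical helper)."""
    entry = {"tag": tag}
    properties = (
        ("SeriesDescription", series_description),
        ("ResourceFormat", resource_format),
        ("ResourceContent", resource_content),
        ("ResourceLabel", resource_label),
    )
    for name, values in properties:
        if values:  # assumption: empty properties may be left out
            entry[name] = list(values)
    return entry


def build_definition(project, label, resources, description=None):
    """Assemble a dataset definition with a single TaggedResourceMap criterion."""
    definition = {"project": project, "label": label}
    if description:
        definition["description"] = description
    definition["criteria"] = [
        {"resolver": "TaggedResourceMap", "payload": dict(resources)}
    ]
    return definition


definition = build_definition(
    "AbSegCt",
    "AbSegCt_training_data",
    {
        "Images": tagged_resource("image", ["T1%"], ["NIFTI"], ["/T1./i"], ["/nifti/i"]),
        "Labels": tagged_resource("label", ["Segment%"], ["NIFTI"],
                                  ["/Segmentat.{3}/i"], ["/nifti/i"]),
    },
    description="This is a definition for data for training the AbSegCt segmentation model",
)
print(json.dumps(definition, indent=4))
```

Redirecting the printed output to absegct-dataset-definition.json gives a file you can POST as shown above.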

Create a dataset

A dataset is the result obtained from resolving a dataset definition at a particular point in time. The contents of a particular dataset don't change based on new data being added or existing data being renamed, moved, or deleted. To resolve a dataset definition, you can POST to a REST endpoint identifying a particular definition:

http://xnat/xapi/sets/definitions/_id_, where id is the experiment ID for the definition you wish to resolve
http://xnat/xapi/sets/definitions/projects/_projectId_/_idOrLabel_, where projectId is the project containing the definition and idOrLabel is the experiment ID or the label within the project for the definition you wish to resolve

These calls would look similar to those below:

$ http --session=username POST http://xnat/xapi/sets/definitions/XNAT_E00101
$ http --session=username POST http://xnat/xapi/sets/definitions/projects/AbSegCt/AbSegCt_training_data

Create a new model

To create a new model, you can POST to the REST endpoint http://xnat/xapi/ml/models/model/_PROJECT_/_MODEL_, where:

PROJECT indicates the project in which the model should be created
MODEL indicates the label for the new model object

There are two ways you can submit the actual files that compose the model:

Set the content type to multipart/form-data and add each file to the "form" request with the name modelFile
Set the content type to application/zip and the request body to a zip file containing all of the files

From the command-line, these calls might look like this:

$ http --session=username --form http://xnat/xapi/ml/models/model/AbSegCT/model_1 modelFile@checkpoint modelFile@model.ckpt.data-00000-of-00001 modelFile@model.ckpt.index modelFile@model.ckpt.meta modelFile@model.fzn.pb modelFile@model.trt.pb
$ http --session=username POST http://xnat/xapi/ml/models/model/AbSegCT/model_2 Content-Type:application/zip < model.zip
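
For the zip option, the archive can be assembled with a short script. The sketch below uses Python's standard zipfile module. The function name zip_model_files is hypothetical, and storing each file at the root of the archive under its base name is an assumption about what the endpoint expects.

```python
import zipfile
from pathlib import Path


def zip_model_files(files, archive="model.zip"):
    """Bundle the given model files into a single zip archive.

    Each file is stored at the archive root under its base name
    (an assumption, not something the endpoint is documented to require).
    """
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for file in files:
            path = Path(file)
            bundle.write(path, arcname=path.name)
    return Path(archive)


# Example call, using the file names from the multipart example above:
# zip_model_files(["checkpoint", "model.ckpt.index", "model.ckpt.meta"])
```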

Create a training configuration

Question: The model is not referenced by the training configuration. Should it be? How independent of a model is the training configuration? Can a training configuration be used for more than one model?

In addition to the standard project and label properties, a training configuration brings together a few different items:

The configuration to be used when launching training sessions for the model
Any fields within the configuration that may be parameterized at launch
The ID of the resolved dataset
A JSON template that contains a wrapper for the dataset
Parameters for partitioning the dataset files

A training configuration can be most easily created by POSTing a JSON body that looks something like this:

{
    "project": "project",
    "label": "label",
    "collectionId": "XNAT_E00102",
    "configuration": {
        "parameterizable": ["epochs", "multi_gpu", "learning_rate"],
        "template": { ... }
    },
    "dataset": {
        "template": { ... },
        "parameterizable": {
            "training": 70,
            "validation": 20,
            "test": 10
        }
    }
}

If you specify both project and label in the training configuration JSON, you can simply POST the JSON to the REST endpoint http://xnat/xapi/ml/config. You can omit the project and label fields from the JSON to make the same configuration template easier to reuse, but you then need to specify these values in the REST URL:

$ cat config_train.json | http --session=username POST http://xnat/xapi/ml/config
$ cat config_train.json | http --session=username POST http://xnat/xapi/ml/config/project/AbSegCT/AbSegCT_config_train

Note that the second form of this REST call uses the values for project and label from the URL, even if these have different values in the POSTed object!
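
Because the URL silently wins, it can be worth checking a configuration before you POST it. The following sketch restates the precedence rule described above. The function name is hypothetical, and the check runs on the client side only: it does not call XNAT.

```python
def effective_project_and_label(body, url_project=None, url_label=None):
    """Return the (project, label) pair that applies to a POSTed configuration.

    Values from the URL take precedence over values in the JSON body,
    as described above. Hypothetical client-side helper.
    """
    project = url_project if url_project is not None else body.get("project")
    label = url_label if url_label is not None else body.get("label")
    if not project or not label:
        raise ValueError("project and label must be given in the URL or the JSON body")
    return project, label


config = {"project": "project", "label": "label", "collectionId": "XNAT_E00102"}

# First form: values come from the body.
print(effective_project_and_label(config))
# Second form: values in the URL replace those in the body.
print(effective_project_and_label(config, "AbSegCT", "AbSegCT_config_train"))
```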

Both template fields can be inserted as literal JSON (i.e. no encoding required). These templates are usually tightly tied to the model and training algorithm and should be specified by the developer(s) of the model to be trained. When the configuration is rendered, the configuration template is delivered exactly as specified, except that values are substituted for any fields the user specifies at launch time. The dataset template, by contrast, is rendered by adding elements for each of the fields specified in the dataset's parameterizable field. Given the following dataset configuration:

"dataset": {
    "template": {
        "name": "AbSegCt",
        "quantitative": [
            0,
            1
        ],
        "licence": "CC-BY-SA 4.0",
        "labels": {
            "1": "PZ",
            "2": "TZ",
            "0": "background"
        },
        "release": "1.0 04/05/2018",
        "modality": {
            "1": "ADC",
            "0": "T2"
        },
        "tensorImageSize": "4D",
        "reference": "Miskatonic University",
        "description": "Abdominal segmentation"
    },
    "parameterizable": {
        "training": 70,
        "validation": 20,
        "test": 10
    }
}

The resulting rendered dataset would look like this (the actual image lists are truncated to a single session for readability):

{
    "name": "AbSegCt",
    "quantitative": [
        0,
        1
    ],
    "licence": "CC-BY-SA 4.0",
    "labels": {
        "1": "PZ",
        "2": "TZ",
        "0": "background"
    },
    "release": "1.0 04/05/2018",
    "modality": {
        "1": "ADC",
        "0": "T2"
    },
    "tensorImageSize": "4D",
    "reference": "Miskatonic University",
    "description": "Abdominal segmentation",
    "training": [
        {
            "image": "/data/xnat/archive/prostate/arc001/prostate_45_MR_01/SCANS/1/NIFTI/prostate_45.nii.gz",
            "label": "/data/xnat/archive/prostate/arc001/prostate_45_MR_01/SCANS/2/NIFTI/prostate_45.nii.gz"
        }
    ],
    "numTraining": 70,
    "validation": [
        {
            "image": "/data/xnat/archive/prostate/arc001/prostate_45_MR_01/SCANS/1/NIFTI/prostate_45.nii.gz",
            "label": "/data/xnat/archive/prostate/arc001/prostate_45_MR_01/SCANS/2/NIFTI/prostate_45.nii.gz"
        }
    ],
    "numValidation": 20,
    "test": [
        {
            "image": "/data/xnat/archive/prostate/arc001/prostate_45_MR_01/SCANS/1/NIFTI/prostate_45.nii.gz",
            "label": "/data/xnat/archive/prostate/arc001/prostate_45_MR_01/SCANS/2/NIFTI/prostate_45.nii.gz"
        }
    ],
    "numTest": 10
}

The number of images in each partition, reported in the numTraining, numValidation, and numTest values, is determined by the values set in the dataset's parameterizable field:

If the values for those parameters add up to 100, they are taken as percentages and the dataset is partitioned based on those percentages, regardless of the number of images in the dataset. In this case, a dataset with 500 images would have a training partition with 350 images, a validation partition with 100 images, and a test partition with 50 images.
If the values for the parameters don't add up to 100, they must add up to the same value as the total number of images in the dataset (note that there might be multiple tags such as image and label for an image: these are considered to be part of a single image).
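
The two partitioning rules can be sketched as follows. This illustrates the behavior described above and is not the plugin's code: the function names are hypothetical, the handling of totals that do not divide evenly (largest remainder) is an assumption, and items are assigned to partitions in their original order, whereas the plugin may select them differently.

```python
def partition_counts(total_images, parameterizable):
    """Turn the parameterizable values into an image count per partition."""
    values_sum = sum(parameterizable.values())
    if values_sum == 100:
        # Percentages: scale to the dataset size, rounding down first.
        shares = {name: total_images * pct for name, pct in parameterizable.items()}
        counts = {name: share // 100 for name, share in shares.items()}
        # Assumption: leftover images go to the largest fractional remainders.
        leftover = total_images - sum(counts.values())
        by_remainder = sorted(shares, key=lambda name: shares[name] % 100, reverse=True)
        for name in by_remainder[:leftover]:
            counts[name] += 1
        return counts
    if values_sum == total_images:
        # Absolute counts that cover the whole dataset.
        return dict(parameterizable)
    raise ValueError("values must add up to 100 or to the number of images")


def render_dataset(template, items, parameterizable):
    """Add a list and a num<Partition> count for each partition to the template."""
    counts = partition_counts(len(items), parameterizable)
    rendered = dict(template)
    start = 0
    for name, count in counts.items():
        rendered[name] = items[start:start + count]
        rendered["num" + name[0].upper() + name[1:]] = count
        start += count
    return rendered


print(partition_counts(500, {"training": 70, "validation": 20, "test": 10}))
```

With 500 images and the 70/20/10 split, this reproduces the 350, 100, and 50 image partitions from the example above.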

Launching a training session

Once you have a model, its training configuration, and a dataset, you can begin to train the model. You may launch multiple training sessions simultaneously or serially for the same model, varying the configuration parameters each time to fine-tune the training outcome. A training session launch request can take the following attributes:

Property	Description
label	The label for the training session. This is intended to be human readable and can be used to make it easy to distinguish training sessions, e.g. "session epochs 50 learning rate 0.4" and "session epochs 50 learning rate 0.6".
processingId	A unique processing ID for the session. This is used internally by XNAT for things like routing requests to containers running the training session or generating processing data to allow monitoring training progress.
modelId	The ID of the model to be trained.
configurationId	The ID of the training configuration to be used for training.
username	The username of the user requesting the training session.
parameters	Any parameters and arguments for the training session.
sessionId	The training session ID. This is optional and can be used when updating a training session that has been queued but not yet launched.

The REST endpoint to launch a training session is http://xnat/xapi/ml/train/launch:

$ http --session=username POST http://xnat/xapi/ml/train/launch processingId="abSegCt-model-20200413153659" label="AbSegCt model epochs 50 LR 0.5 multi-gpu" modelId=XNAT_E00100 configurationId=XNAT_E00101 parameters:='{"epochs": "50", "learning_rate": "0.5", "multi_gpu": "true"}'
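
When launching several sessions with different parameters, it may be easier to generate the request body in a script. The sketch below builds a body with the same attributes as the call above. The function name is hypothetical, and two details are inferred from the example rather than documented: the processing ID format (a prefix plus a timestamp) and passing every parameter value as a string.

```python
import json
from datetime import datetime


def build_launch_request(model_id, configuration_id, label, parameters,
                         id_prefix, now=None):
    """Build the JSON body for a training session launch (hypothetical helper)."""
    # Assumption: a prefix plus a yyyyMMddHHmmss timestamp is unique enough.
    timestamp = (now or datetime.now()).strftime("%Y%m%d%H%M%S")

    def as_text(value):
        # Assumption: parameter values are sent as strings, as in the example.
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    return {
        "processingId": f"{id_prefix}-{timestamp}",
        "label": label,
        "modelId": model_id,
        "configurationId": configuration_id,
        "parameters": {name: as_text(value) for name, value in parameters.items()},
    }


request = build_launch_request(
    "XNAT_E00100",
    "XNAT_E00101",
    "AbSegCt model epochs 50 LR 0.5 multi-gpu",
    {"epochs": 50, "learning_rate": 0.5, "multi_gpu": True},
    id_prefix="abSegCt-model",
)
print(json.dumps(request, indent=4))
```

The printed JSON can be saved to a file and piped to the launch endpoint in the same way as the other examples.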
