NA-MIC Project Weeks

AMP SCZ: Combining baseline and longitudinal information for prediction of psychosis conversion

Key Investigators

Presenter location: In-person

Project Description

This project is part of the AMP SCZ program, an initiative for early detection of risk for schizophrenia.

A key goal in AMP SCZ is to predict which patients who initially present with mild or sub-threshold symptoms will eventually develop psychosis. Most predictive models are based on data acquired at the patient's first medical visit (the baseline visit). An important question is how much is gained by following patients over time (longitudinal data), and what a principled way to combine baseline and longitudinal information would be.

In this project we will implement predictive models that use both baseline and longitudinal information for psychosis prediction. This project builds on a previous one, in which we implemented an approach called “joint modeling”, which had important limitations. For this project, we will implement a method that combines two approaches, multiple kernel learning (MKL) and dynamic time warping (DTW), with the following objectives:


  1. Implement a Python-based version of MKL-DTW longitudinal models that follows common machine learning best practices (separate train/test sets, scikit-learn-compatible methods).
  2. Quantify the advantage of longitudinal models over baseline predictors in a legacy dataset.

Approach and Plan

  1. Write a Python estimator of kernel distances based on DTW.
  2. Write an extension of the MKL package MKLpy that can integrate DTW kernels for longitudinal modalities with traditional kernels for baseline modalities.
  3. Benchmark performance on a legacy dataset.
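
Step 1 of the plan can be illustrated with a minimal sketch, which is not the project's implementation. It assumes each patient is a list of visits, each visit is a list of numeric features of equal length, and the DTW distance is turned into a similarity with `exp(-gamma * d)`. The names `dtw_distance`, `dtw_kernel_matrix` and `gamma` are hypothetical. Taking separate row and column sets lets the same function build the train-by-train and test-by-train matrices, which keeps training and test data apart.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two multivariate sequences.

    a and b are lists of visits; each visit is a list of feature values.
    The two sequences may contain different numbers of visits.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two visits
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat visit b[j-1]
                cost[i][j - 1],      # repeat visit a[i-1]
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


def dtw_kernel_matrix(rows, cols, gamma=1.0):
    """Similarity matrix with entries exp(-gamma * DTW(row, col)).

    gamma is an illustrative bandwidth parameter (an assumption of this sketch).
    """
    return [[math.exp(-gamma * dtw_distance(r, c)) for c in cols] for r in rows]


# Toy data: one feature per visit, varying numbers of visits per patient
train_seqs = [
    [[0.0], [1.0], [2.0]],
    [[0.0], [2.0]],
    [[1.0], [1.0], [1.0], [1.0]],
]
test_seqs = [[[0.0], [1.0], [1.0], [2.0]]]

K_train = dtw_kernel_matrix(train_seqs, train_seqs)  # n_train x n_train
K_test = dtw_kernel_matrix(test_seqs, train_seqs)    # n_test x n_train
```

Kernels of this form are not guaranteed to be positive semi-definite, so a real implementation may need to check or correct the resulting matrices before passing them to a kernel method.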

Progress and Next Steps

  1. We implemented a number of similarity measures for multivariate longitudinal sequences.
  2. We implemented the extension of multiple kernel learning to use these kernels in longitudinal datasets.
  3. We curated a dataset from a semi-public source (NIH) with cross-sectional and longitudinal information.
  4. We tried using the curated dataset to validate the new prediction method. We are currently finding some issues with the samples, which we are fixing.
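
The idea behind using several kernels together can be sketched as a weighted sum of kernel matrices, one per modality. This is a simplified illustration under stated assumptions, not the extension written for this project: an MKL algorithm would learn the weights from training data, whereas here they are supplied by hand (uniform by default). The function `combine_kernels` and the toy matrices are hypothetical.

```python
def combine_kernels(kernels, weights=None):
    """Weighted sum of equally shaped kernel matrices (lists of lists).

    With no weights given, every kernel gets the same weight. A real MKL
    method would estimate the weights instead (assumption of this sketch).
    """
    if weights is None:
        weights = [1.0 / len(kernels)] * len(kernels)
    if len(weights) != len(kernels):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    n_rows, n_cols = len(kernels[0]), len(kernels[0][0])
    combined = [[0.0] * n_cols for _ in range(n_rows)]
    for w, kernel in zip(weights, kernels):
        for i in range(n_rows):
            for j in range(n_cols):
                combined[i][j] += w * kernel[i][j]
    return combined


# Toy kernels for two patients: one from baseline features,
# one from longitudinal sequences (values are made up for illustration)
K_baseline = [[1.0, 0.25], [0.25, 1.0]]
K_longitudinal = [[1.0, 0.75], [0.75, 1.0]]

K_uniform = combine_kernels([K_baseline, K_longitudinal])
K_baseline_only = combine_kernels([K_baseline, K_longitudinal], weights=[1.0, 0.0])
```

Restricting the weights to non-negative values keeps the combination positive semi-definite whenever each input kernel is. Comparing a baseline-only weighting against a combined one mirrors the baseline versus longitudinal comparison in the objectives.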

Next steps:

  1. Fix the issues with the proposed dataset.
  2. Find a new dataset to make longitudinal predictions in a clinically useful scenario (e.g. few visits).
