

Open Science: Preregistrations & Tools

TIMESPAN is committed to transparency, reproducibility, and responsible sharing of knowledge. This page highlights two key elements of our open science approach:

  1. Preregistered Protocols – outlining our study designs before data analysis begins

  2. Open Tools & Code – enabling reuse and review of our models and methods

 

Preregistered Protocols

To ensure research transparency and reduce bias, several TIMESPAN studies have been preregistered on the Open Science Framework (OSF), a public platform. You can read more about preregistration in this dedicated blog post. These registrations detail study objectives, methods, and analysis plans prior to data collection or analysis.

Study title | Date registered | Link
ADHD treatment discontinuation across the lifespan: a multi-national study | December 2022 | https://osf.io/py4s7
How does ADHD/ADHD medication influence medication persistence for pharmacological treatment of hypertension | May 2024 | https://osf.io/s93cq
How does ADHD/ADHD medication influence medication persistence for pharmacological treatment of type 2 diabetes | May 2024 | https://osf.io/mu4nw
Clinical modifiers of ADHD treatment discontinuation across the lifespan: a multi-national study | June 2024 | https://osf.io/q6eah
ADHD treatment discontinuation after a cardiovascular disease diagnosis: A multi-national study | April 2025 | https://osf.io/jcqhw
ADHD/ADHD medication and risk of major adverse cardiac events and all-cause mortality in adults with type 2 diabetes: a multi-national study | February 2025 | https://osf.io/fza9b
ADHD/ADHD medication and risk of major adverse cardiovascular events and all-cause mortality in adults initiating pharmacotherapy for hypertension without established cardiovascular disease | February 2025 | http://osf.io/rhu6j

 

Open Tools and Code

In addition to sharing study protocols, TIMESPAN develops and publishes tools to foster reproducibility, particularly in the areas of machine learning and pharmacoepidemiology.

WP6 Deep Learning Models

As part of Work Package 6, we created deep learning neural network (DLNN) models.

D6.1 – Data Structure DLNNs

The main objective of D6.1 is to create innovative data-structure DLNNs that predict cardiometabolic outcomes and treatment discontinuation from registry and clinical data. Our machine learning and deep learning framework for this objective is complete, and the code is freely available in a GitHub repository.

All code is written in Python, using the Scikit-learn (Pedregosa et al., 2012), Keras (Charles, 2013) and TensorFlow (Abadi et al., 2016; GoogleResearch, 2015) libraries.

Briefly, this repository contains files for the following tasks:

  1. Read input tabular data (including generating training, validation and testing subsets, scaling features, and binarizing targets, i.e. our outcomes of interest such as cardiometabolic diagnoses or events)
  2. PCA feature reduction: a commonly used feature reduction and engineering method
  3. Commonly used Scikit-learn models (including ensemble models) for tabular data
  4. Scikit-learn model hyperparameter search (covering a wide range of models and hyperparameters, and all of the commonly used search algorithms)
  5. Multilayer perceptron (MLP) model: a neural network model suitable for tabular data
  6. Hyperopt search for MLP: a hyperparameter search for the MLP models using Hyperopt (http://hyperopt.github.io/hyperopt/)
  7. Ensemble-MLP model: generates an ensemble of MLP models and stabilized predictions
  8. Seq2Seq model with GRUs (Dey & Salem, 2017; Wu et al., 2016): a longitudinal neural network model that takes time-series data as input and predicts future events or series of events (Y. Zhang-James, Hess, et al., 2021)
  9. Feature importance analysis: a collection of methods to examine and extract feature importance scores for a variety of models
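
To give a feel for how steps 1 to 4 fit together, here is a minimal sketch using Scikit-learn. It is not taken from the TIMESPAN repository: the synthetic data, the hypothetical risk score, the choice of a random forest, and the grid values are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a tabular extract (assumption): 600 rows, 12 numeric features.
X = rng.normal(size=(600, 12))

# Hypothetical continuous risk score, binarized into a 0/1 outcome at its median.
risk = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600)
y = (risk > np.median(risk)).astype(int)

# Step 1: hold out 20% for testing, then split the rest 75/25 into
# training and validation subsets (60/20/20 overall).
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_dev, y_dev, test_size=0.25, stratify=y_dev, random_state=0
)

# Steps 1-3: scaling, PCA and a classifier chained in one pipeline, so the
# scaler and PCA are only ever fitted on the training folds.
pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        ("pca", PCA()),
        ("model", RandomForestClassifier(random_state=0)),
    ]
)

# Step 4: a deliberately small grid search with 3-fold cross-validation.
param_grid = {
    "pca__n_components": [4, 8],
    "model__n_estimators": [50, 100],
    "model__max_depth": [3, None],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

# With scoring="roc_auc", score() reports ROC AUC of the refitted best pipeline.
val_auc = search.score(X_val, y_val)
print("best parameters:", search.best_params_)
print("validation ROC AUC:", round(val_auc, 3))
```

The test subset is left untouched here on purpose: in this kind of workflow it would normally be scored once, after model selection on the validation subset is finished.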

For more information, you can access our public deliverable report on D6.1 here.

D6.2 – Genomic DLNNs

The main objective of D6.2 is to create innovative deep learning neural networks (DLNNs) that use convolutional layers for genomic data. Our machine learning and deep learning framework for this objective is now complete, and the code is freely available in a GitHub repository.

All code is written in Python and R, using the Scikit-learn (Pedregosa et al., 2012), Keras (Charles, 2013) and TensorFlow (Abadi et al., 2016; GoogleResearch, 2015) libraries.

Briefly, this repository contains the following files:

  1. Generation of context-informed data matrix: adding genomic annotations
  2. Generation of context-informed data matrix: correlation finding
  3. Generation of context-informed data matrix: creating genomic input for CNN
  4. Matched pairing
  5. Genomic CNN including Keras-Tuner search for selecting optimal hyperparameters
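
This page does not describe how the matched pairing step is implemented. As one common reading of the term, the sketch below shows greedy 1:1 case-control matching on sex and age. The `Person` record, the `match_pairs` helper, the matching variables, and the five-year age tolerance are all hypothetical and are not taken from the TIMESPAN repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical minimal record used only for this illustration."""
    pid: str
    is_case: bool
    sex: str
    age: int


def match_pairs(people, max_age_gap=5):
    """Greedy 1:1 matching: each case receives the closest-age unused
    control of the same sex, provided the age gap is within max_age_gap."""
    cases = sorted((p for p in people if p.is_case), key=lambda p: p.age)
    controls = [p for p in people if not p.is_case]
    used = set()
    pairs = []
    for case in cases:
        candidates = [
            c
            for c in controls
            if c.pid not in used
            and c.sex == case.sex
            and abs(c.age - case.age) <= max_age_gap
        ]
        if not candidates:
            continue  # cases without an eligible control stay unmatched
        # Ties on age gap are broken by identifier to keep the result deterministic.
        best = min(candidates, key=lambda c: (abs(c.age - case.age), c.pid))
        used.add(best.pid)
        pairs.append((case.pid, best.pid))
    return pairs


cohort = [
    Person("case1", True, "F", 34),
    Person("case2", True, "M", 52),
    Person("case3", True, "F", 70),
    Person("ctrl1", False, "F", 36),
    Person("ctrl2", False, "F", 33),
    Person("ctrl3", False, "M", 55),
    Person("ctrl4", False, "M", 20),
]
pairs = match_pairs(cohort)
print(pairs)  # [('case1', 'ctrl2'), ('case2', 'ctrl3')]
```

Greedy matching is simple and order-dependent; optimal matching algorithms can pair more cases but are harder to read in a short example.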

For more information, you can access our public deliverable report on D6.2 here.

Why This Matters

By preregistering studies and publishing code and tools, we aim to:

  • Increase research transparency and reproducibility

  • Support collaboration and knowledge transfer

  • Enable other researchers and stakeholders to build upon TIMESPAN’s work