Blog Posts

At this year’s ICML, some interesting work on Neural Processes was presented. In this blog post, I discuss what Neural Processes are and how they behave as a prior over functions.
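For a rough sense of the construction (generic Neural Process notation, not tied to any specific paper from the conference), a Neural Process places a distribution over functions through a global latent variable z:

    p(y_{1:n} \mid x_{1:n}) = \int p(z) \prod_{i=1}^{n} p(y_i \mid g_\theta(x_i, z)) \, dz

Here g_\theta is a neural network and z is inferred from a context set via a permutation-invariant encoder; sampling z and decoding over a grid of inputs yields random function draws, i.e. samples from the induced prior over functions.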

Selected Work

Genetic perturbations are key to understanding how genes regulate cell behavior, yet the ability to predict responses to these perturbations remains a significant challenge. While numerous generative models have been developed, most lack the capability to generalize to perturbations not encountered during training. To alleviate this limitation, we introduce a novel methodology that incorporates prior knowledge through embeddings derived from LLMs, effectively informing our predictive models with a deeper biological context.
Accepted to MLGenX workshop at ICLR 2024
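As a minimal sketch of the general idea (illustrative only; the architecture, names, and dimensions below are assumptions, not the model from the paper), a predictor can condition on pre-computed LLM embeddings of the perturbed gene, which makes unseen perturbations encodable at test time:

    import torch
    import torch.nn as nn

    class PerturbationPredictor(nn.Module):
        """Map (control expression, LLM embedding of the perturbed gene)
        to predicted post-perturbation expression. Illustrative sketch."""
        def __init__(self, n_genes: int, emb_dim: int, hidden: int = 256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_genes + emb_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, n_genes),
            )

        def forward(self, control_expr, gene_embedding):
            # Concatenate the cell state with prior biological knowledge about
            # the perturbed gene; embeddings exist for all genes, so
            # perturbations unseen during training can still be represented.
            h = torch.cat([control_expr, gene_embedding], dim=-1)
            return control_expr + self.net(h)  # predict the expression shift

    # Hypothetical usage with embeddings precomputed from gene descriptions:
    model = PerturbationPredictor(n_genes=2000, emb_dim=1536)
    prediction = model(torch.randn(8, 2000), torch.randn(8, 1536))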

Here, we investigate the capabilities of Multimodal Variational Autoencoders, such as MVAE and MMVAE, to reliably infer private and shared latent factors from multi-omics data. In particular, we highlight a challenging problem setting where modality-specific variation dominates the shared signal. Taking a cross-modal prediction perspective, we discuss and demonstrate the limitations of existing models, and propose a modification that makes them more robust to modality-specific variation.
Accepted to MLCB (2023)
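In generic notation (not the paper’s own), the cross-modal prediction criterion can be written as follows: given modalities x^{(1)} and x^{(2)} with shared latent z_s and private latents z_1, z_2, predict one modality from the other through the shared factors alone,

    \hat{x}^{(2)} = \mathbb{E}_{q(z_s \mid x^{(1)})} [ p(x^{(2)} \mid z_s, z_2) ], \qquad z_2 \sim p(z_2)

with the private latent drawn from its prior, so only shared information flows across modalities. When modality-specific variation dominates, q(z_s \mid x^{(1)}) tends to absorb private structure and cross-modal predictions degrade; this is the failure mode examined above.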

Due to the black-box nature of VAEs, their utility for healthcare and genomics applications has been limited. Here, we focus on characterising the sources of variation in Conditional VAEs. Our goal is to provide a feature-level variance decomposition, separating out the marginal additive effects of latent variables z and fixed inputs c from their non-linear interactions. We propose to achieve this through what we call Neural Decomposition: an adaptation of the functional ANOVA decomposition from classical statistics to deep learning models.
Accepted to AISTATS (2020)
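In symbols, the decomposition of each feature-level decoder f takes the functional ANOVA form

    f(z, c) = f_0 + f_z(z) + f_c(c) + f_{zc}(z, c)

where each term is parameterised by a neural network; identifiability requires the non-constant terms to integrate to zero over their inputs (e.g. \int f_z(z) \, dz = 0), and enforcing these constraints is what allows the variance attributable to z, c, and their interaction to be separated.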

It would be desirable to construct a joint modelling framework for simultaneous dimensionality reduction and clustering of features. Here, we focus on embedding such capabilities within the Variational Autoencoder (VAE) framework. Specifically, we introduce a probabilistic clustering prior within the VAE decoding model which lets us learn a one-hot basis function representation. For scenarios where not all features are aligned, we develop an extension to handle translation-invariant basis functions.
Accepted to AISTATS (2020)
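One way to write such a decoder (a sketch of the idea, not necessarily the exact parameterisation): each feature d of the output is generated by exactly one of K shared basis functions,

    x_d \approx \sum_{k=1}^{K} s_{dk} f_k(z), \qquad s_d \ \text{one-hot}, \quad s_d \sim \text{Categorical}(\pi)

so inference over the assignments s_d clusters the features, while z provides the shared low-dimensional embedding. The translation-invariant extension additionally pairs each feature with a learned shift of its basis function, so that unaligned features can still share a cluster.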

The interpretation of high-dimensional data typically requires the use of dimensionality reduction techniques. However, in many real-world problems the extracted low-dimensional representations may not be sufficient to aid interpretation on their own, and it would be desirable to interpret the model in terms of the original features themselves. Our goal is to characterise how feature-level variation depends on latent low-dimensional representations, external covariates, and non-linear interactions between the two. Here, we propose to achieve this through a structured kernel decomposition in a hybrid Gaussian Process model which we call the Covariate Gaussian Process Latent Variable Model (c-GPLVM).
Accepted to ICML (2019)
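Concretely, the structured kernel decomposition takes an additive-plus-interaction form,

    k((z, c), (z', c')) = k_z(z, z') + k_c(c, c') + k_z(z, z') k_c(c, c')

so the GP prior splits feature-level variation into a latent-only effect, a covariate-only effect, and their interaction; dropping the product term recovers a purely additive model.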

Here, we introduce an augmented ensemble MCMC technique to improve on existing poorly mixing samplers for factorial HMMs. This is achieved by combining parallel tempering and an auxiliary variable scheme to exchange information between the chains in an efficient way.
Accepted to AISTATS (2019)
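For intuition, the replica-swap half of such a scheme looks roughly as follows; this is a generic parallel-tempering sketch, and the auxiliary-variable exchange moves from the paper are not shown.

    import numpy as np

    def swap_step(states, log_post, betas, rng):
        """One sweep of replica-swap moves. Chain i targets the tempered
        density proportional to exp(betas[i] * log_post(x))."""
        for i in range(len(states) - 1):
            # Metropolis ratio for swapping states at neighbouring temperatures
            log_alpha = (betas[i] - betas[i + 1]) * (
                log_post(states[i + 1]) - log_post(states[i])
            )
            if np.log(rng.uniform()) < log_alpha:
                states[i], states[i + 1] = states[i + 1], states[i]
        return states

In the factorial HMM setting, each state would hold a full configuration of the hidden chains, and within-chain Gibbs updates alternate with these swaps, so that moves made at flatter (low-beta) temperatures can propagate to the target chain.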

Here, we explore whether accurate complex trait predictions can be achieved in practice. Using a genome-sequenced population of ∼7,000 yeast strains of high but varying relatedness, we consider a variety of models for predicting growth traits from various sources of information (family information, genetic variants, and growth in other environments).
In Nature Communications (2016)
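A representative baseline in this model family is the standard mixed-model (genomic BLUP) formulation, shown here for orientation rather than as the paper’s final model:

    y = X\beta + g + \varepsilon, \qquad g \sim \mathcal{N}(0, \sigma_g^2 K), \quad \varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2 I)

where K is a genetic relatedness (kinship) matrix estimated from the sequence data; family information and growth in other environments can enter through the choice of K or as additional fixed and random effects.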

Here, we propose a statistical method for identifying differentially methylated regions, based on the minimum description length (MDL) principle. Our method is available as an R package.
In Bioinformatics (2016)
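The MDL principle here amounts to a two-part code: among candidate segmentations M of the genome into regions, choose the one minimising the total description length,

    \hat{M} = \arg\min_M [ L(M) + L(D \mid M) ]

that is, the cost of describing the segmentation plus the cost of the methylation data given that segmentation, so region boundaries are introduced only where they pay for themselves in compression. (This is the principle in outline, not the package’s exact coding scheme.)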

More Projects

No-U-Turn sampler

NUTS implementation in R

Non-parametric mixture models

Rcpp implementation for DP and MFM mixtures

Bayesian logistic regression via Polya-Gamma latent variables

R package implementing the Polya-Gamma augmentation scheme
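The augmentation rests on the Pólya-Gamma integral identity of Polson, Scott and Windle (2013),

    (e^\psi)^a / (1 + e^\psi)^b = 2^{-b} e^{\kappa \psi} \int_0^\infty e^{-\omega \psi^2 / 2} p(\omega) \, d\omega, \qquad \kappa = a - b/2

where p(\omega) is the PG(b, 0) density. With \psi_i = x_i^T \beta, the logistic likelihood becomes conditionally Gaussian in \beta, yielding a two-block Gibbs sampler: draw \omega_i \mid \beta \sim PG(1, x_i^T \beta) for each observation, then draw \beta \mid \omega, y from a multivariate normal with precision X^T \Omega X + B^{-1} (where \Omega = diag(\omega), B is the prior covariance, and \kappa_i = y_i - 1/2 enters the mean). This alternation is the scheme the package implements.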

Teaching

Course on Data Science and Visualisation

Here you can find course material (in Estonian!) on Data Science and Visualisation, which we created together with Tanel Pärnamaa. This course, “Statistiline andmeteadus ja visualiseerimine” (“Statistical Data Science and Visualisation”), is centered around a number of interesting case studies and focuses on teaching good data science practices in R by applying statistical methods to solve real-life problems.

Teaching Experience