The gut microbiome is extremely diverse, and interactions among its bacterial members are not limited to pairs: they can also occur in larger groups, where, for example, the presence and activity of one strain affects how two others interact with each other. The purpose of this thesis is to computationally investigate the ecological interactions of a synthetic gut bacterial community with respect to metabolic outcomes, using state-of-the-art statistical and machine learning approaches. You can read the full proposal here.

Many diseases are multifactorial in origin, meaning that they are caused by a combination of genetic and environmental components. These independent genetic factors are often common, occurring frequently in the absence of disease, and therefore cannot yet be used to predict disease. The role that the gut microbiome plays in these diseases remains unexplored.
The purpose of this M.Sc. thesis is to explore the relationship between microbiome composition and cardiovascular disease using state-of-the-art machine learning and statistical methods. You can read the full proposal here.

The purpose of this M.Sc. thesis proposal is to evaluate how well different representations of molecular structures predict the transcriptional response of human gut pathogens to different types of chemical stress. To this end, we will rely on data generated as part of the StressRegNet consortium, where the expression of specific genes in Salmonella enterica and Campylobacter jejuni has been measured in response to various chemical compounds.
The student will compare commonly used chemical representations, such as the extended connectivity fingerprint (ECFP4) and chemical descriptors, with our previously described pre-trained chemical representation MolE. You can read the full proposal here.

The main objective of this M.Sc. thesis proposal is to develop a statistical pipeline that detects chemical stress that significantly increases (or decreases) the expression of bacterial pathogen genes. To accomplish this, we will make use of the data generated by the StressRegNet consortium.
Once “hits” are determined, the student will analyze the similarity between compounds based on their measured effects on gene expression, using various clustering and network inference algorithms. In parallel, similar methods will be used to detect genetic regulatory circuits. You can read the full proposal here.
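
As a rough illustration of what the hit-calling step could look like, the sketch below adjusts per-compound p-values with the Benjamini-Hochberg procedure and labels each significant compound as increasing or decreasing expression. The choice of Benjamini-Hochberg, the function names, the 0.05 threshold, and the log2 fold change input are all assumptions made for this example; the proposal does not specify the test or the correction that will be used.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_hits(log2_fold_changes, p_values, alpha=0.05):
    """Hypothetical helper: list (index, direction) for significant compounds."""
    adjusted = benjamini_hochberg(p_values)
    hits = []
    for i, (lfc, q) in enumerate(zip(log2_fold_changes, adjusted)):
        if q <= alpha:
            hits.append((i, "up" if lfc > 0 else "down"))
    return hits


if __name__ == "__main__":
    # Made-up values for four compounds and one reporter gene.
    lfc = [1.5, -0.8, 0.2, -2.0]
    pvals = [0.001, 0.002, 0.5, 0.003]
    print(call_hits(lfc, pvals))
```

The resulting hit list could then feed the clustering and network inference steps, for example as a compound-by-gene matrix of up, down, and no-effect calls.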

The main objective of this M.Sc. thesis proposal is to explore the use of the SparseMax function in the graph attention network (GAT) framework to generate an interpretable representation of molecular structures for predicting a given outcome. Benchmarks will include various regression and classification tasks from MoleculeNet. We will also apply the model to the task of predicting antimicrobial activity in the microbiome.
The student will determine whether the use of SparseMax improves predictive performance over regular GATs. You can read the full proposal here.
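
To make the difference from the usual softmax concrete, here is a minimal standalone sketch of the sparsemax transformation applied to a vector of attention scores. It is written in plain Python for readability and is not tied to any GAT library; in a real model the same operation would replace the softmax over each node's neighbor scores. The function names and example scores are chosen for illustration only.

```python
import math


def softmax(scores):
    """Standard softmax: every entry receives a strictly positive weight."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of the scores onto the probability simplex."""
    if not scores:
        raise ValueError("scores must be non-empty")
    sorted_scores = sorted(scores, reverse=True)
    cumulative = 0.0
    support_size = 0
    support_sum = 0.0
    # Find the largest k with 1 + k * z_(k) > sum of the k largest scores.
    for k, value in enumerate(sorted_scores, start=1):
        cumulative += value
        if 1.0 + k * value > cumulative:
            support_size = k
            support_sum = cumulative
    # Threshold that makes the clipped scores sum to one.
    tau = (support_sum - 1.0) / support_size
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    attention_scores = [1.0, 0.8, 0.1]  # made-up scores for three neighbors
    print("softmax  :", softmax(attention_scores))
    print("sparsemax:", sparsemax(attention_scores))
```

Because sparsemax can assign exactly zero weight to low-scoring neighbors, the nonzero attention weights point to a small set of atoms or bonds, which is the property that motivates its use for interpretability here.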

Alpha diversity indices summarize a complex microbiome sample in a single measure that characterizes the structure of the community in (microbial) ecology. To date, a wide variety of alpha diversity measures have been developed, ranging from traditional diversity estimates from macro-ecology to recently developed microbiome-specific measures. With this M.Sc. thesis, we would like to review the appropriateness of traditional and new alpha diversity measures for microbiome data from a statistical point of view, and to compare these diversity measures on mock and clinical microbiome data. You can read the full proposal here.
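
For orientation, three of the traditional indices can be computed from a vector of taxon counts as in the sketch below. The selection of indices (observed richness, Shannon entropy with natural logarithm, and the Gini-Simpson index) and the example counts are assumptions made for illustration; the proposal does not list which measures will be compared.

```python
import math


def _proportions(counts):
    """Relative abundances of the taxa that are present in the sample."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one read")
    return [c / total for c in counts if c > 0]


def richness(counts):
    """Observed richness: number of taxa with a nonzero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon entropy (natural log) of the relative abundances."""
    return -sum(p * math.log(p) for p in _proportions(counts))


def gini_simpson(counts):
    """Gini-Simpson index: 1 minus the sum of squared relative abundances."""
    return 1.0 - sum(p * p for p in _proportions(counts))


if __name__ == "__main__":
    even_sample = [10, 10, 10, 10]    # made-up counts, four equally abundant taxa
    dominated_sample = [37, 1, 1, 1]  # same richness, one dominant taxon
    for name, sample in [("even", even_sample), ("dominated", dominated_sample)]:
        print(name, richness(sample), shannon(sample), gini_simpson(sample))
```

Both example samples have the same richness but different Shannon and Gini-Simpson values, which illustrates why the choice of index matters when samples are compared.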

The primary objective of this thesis is to implement and benchmark the false discovery rate (FDR)-controlled variable selection procedure for the sparse log-contrast model, a constrained sparse regression model. This procedure is designed to identify microbial predictors while controlling the rate of false discoveries, thereby enhancing the reliability of the results. You can read the full proposal here.
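
For readers unfamiliar with the model, the sparse log-contrast regression is commonly written as an L1-penalized least-squares problem with a zero-sum constraint on the coefficients. The notation below is an assumption made for this sketch and may differ from the proposal: y is the response for n samples, X is the n-by-p matrix of strictly positive relative abundances, and lambda is the sparsity penalty.

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}}
\; \frac{1}{2n} \,\bigl\| y - \log(X)\,\beta \bigr\|_2^2
\;+\; \lambda \,\| \beta \|_1
\qquad \text{subject to} \qquad \sum_{j=1}^{p} \beta_j = 0 .
```

The zero-sum constraint makes the fit invariant to rescaling each sample's abundances, which respects the compositional nature of the data, while the L1 penalty selects a small set of taxa. The FDR-controlled procedure then concerns which of the selected taxa can be reported as discoveries.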
“Biomedical Statistics and Data Science”