Bayesian clustering, Dirichlet process mixture modelling and metabolic profile landscape analysis of fat and lipid biomarkers derived from large-scale lipidomics

Type of project

Competition funded PhD projects


Contact Dr James Smith to discuss this project further informally.

Project description

A recent paper in The Lancet Diabetes & Endocrinology (2018; S2213-8587(18)30051-2) demonstrated that supervised stratification, using six variables taken from a survey of the literature, can discriminate between individuals in a heterogeneous population of type 2 diabetics, clustering them into five clinical groups. In reality, there are many useful biomarkers that can describe the health and nutritional status of individuals and stratify patient populations with different forms of diabetes, metabolic syndrome and cardiovascular disease. This work will explore multivariate landscapes: how combinations of fats (nearly 40 covariates typically observed in large-scale lipidomics data) can be used as status biomarkers. Fats are the building blocks of acyl-lipids. Singlet, duplet and triplet combinations form individual lipids that contribute to pools.

The mathematical question to be answered is how to formulate mixture models that best represent the explicit contributions of lipid subpopulations within observed pools. Only the building blocks (fats) and their final lipid pools are observable, leaving many of the duplet and triplet combinations of lipids to be estimated. This PhD project aims to develop a formal compositional definition for the lipidome, previously defined loosely as an individual's fat and lipid profile circulating in the blood. Plate notation is useful for describing multivariate Gaussian mixture models, and the hope is that the hierarchy of fats within the lipid pools could be described as compositional mixture models. Another approach is epidemiological, using directed acyclic graphs (exposure-outcome causal graphical models) to illustrate how components combine into lipids and how lipids that share equivalent configurations form pools. Graphical models illustrate assumptions about population data and, depending on their detail, can be used as sufficient-component cause models to illustrate specific hypotheses about underlying mechanisms (for example, gene expression for fatty acid elongation and desaturation).
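As a point of reference only, the generic finite multivariate Gaussian mixture that plate notation is commonly used to depict can be written as below. The symbols are assumptions introduced here for illustration and are not taken from the project: x_i is the observed fat profile of individual i, K is the number of components, and pi_k, mu_k and Sigma_k are the weight, mean and covariance of component k. How the lipid pool hierarchy would enter such a model is the open question of the project and is not represented.

```latex
\begin{align}
p(\mathbf{x}_i \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma})
  &= \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right), \\
\pi_k &\ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1 .
\end{align}
```

The constraint on the weights means they form a composition (a point on the simplex), which is one reason a compositional reading of the mixture is natural.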

Using these models to predict reliable diagnostic or status biomarkers from the latent (hidden) distribution structure of fat and lipid combinations would be a major advance. In this research project, we want to go further and use profile regression, an alternative to standard regression models, because we wish to make inference beyond the main effects in the lipid profiles, with potentially correlated fats and lipids as the covariates. This involves Bayesian clustering using Dirichlet process mixture models (DPMMs), e.g. with the R package PReMiuM. The dependence structure can then reveal which covariates (fats) actively drive the mixture components and which share characteristics common to all lipid components and pools.
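To give a feel for why a Dirichlet process mixture does not need the number of clusters fixed in advance, the following minimal sketch samples a partition from the Chinese restaurant process, the prior over cluster assignments that a DPMM induces. It is an illustration only, not PReMiuM's interface or the project's method: the function name `crp_assignments`, the concentration parameter value and the sample size are all hypothetical choices, and no lipid data or likelihood is involved.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values make
    opening a new cluster more likely. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    labels = []   # labels[i] = cluster index assigned to item i
    counts = []   # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels, counts


# Example: 200 hypothetical individuals, concentration alpha = 1.0
labels, counts = crp_assignments(n_items=200, alpha=1.0, seed=42)
print(len(counts), "clusters; sizes:", sorted(counts, reverse=True))
```

In a full DPMM, each cluster would also carry its own parameters (for example a mean and covariance over the fat covariates), and the assignments would be updated against the data rather than drawn from the prior alone.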

Entry requirements

Applications are invited from candidates with or expecting a minimum of a UK upper second class honours degree (2:1) and/or a Master's degree in a relevant science subject such as (ideally) mathematics, biostatistics, epidemiology, computational biology, quantitative biology, analytical chemistry, food science & nutrition, biochemical engineering.

How to apply

Formal applications for research degree study should be made online through the university's website. Please state clearly in the research information section that the PhD you wish to be considered for is 'Bayesian clustering, Dirichlet process mixture modelling and metabolic profile landscape analysis of fat and lipid biomarkers derived from large-scale lipidomics', and name Dr James Smith as your proposed supervisor.

If English is not your first language, you must provide evidence that you meet the University's minimum English Language requirements.

We welcome scholarship applications from all suitably-qualified candidates, but UK black and minority ethnic (BME) researchers are currently under-represented in our Postgraduate Research community, and we would therefore particularly encourage applications from UK BME candidates. All scholarships will be awarded on the basis of merit.