Identifying Biomarkers from Multi-source, Multi-way Data


Project Summary
In medical research, a growing number of high-content platforms and technologies are used to measure diverse but related information. Examples include sequencing of the genome, epigenome, transcriptome and translatome, metabolite profiling, and imaging modalities. Moreover, data from the same high-content platform are often measured over multiple dimensions, such as multiple tissues, body regions, or developmental time points. We refer to data measured over multiple platforms or technologies as multi-source, and data measured over multiple dimensions as multi-way. Many modern biomedical studies collect data that are both multi-source and multi-way, meaning multi-way data are collected from multiple platforms. Multi-source multi-way data has enormous potential to capture and synthesize every facet of a complex biological system. However, to date there has been little methodology developed for fully integrative analysis of such data. We will focus on developing methods to identify biomarkers for a clinical outcome from multi-source multi-way data. Biomarkers are often used as a surrogate for disease progression or as an endpoint for clinical trials, and so their precision in capturing a given medical phenomenon is crucial. We propose to develop new composite biomarker methods that identify patterns across multiple sources of data, and multiple dimensions, that are associated with a clinical outcome. Our central hypothesis is that a fully integrated and multivariate approach will yield more precise biomarkers and simplify their interpretation. The novel product of this project will be a suite of methods extending common biomarker tasks to the multi-source multi-way context, including dimension reduction (Aim 1a), missing value imputation (Aim 1b), high-dimensional prediction (Aim 2) and dependent hypothesis testing (Aim 3).
This work is motivated by our involvement in several ongoing collaborative translational projects with rich multi-source multi-way data, including biomarker discovery for the development of lung cancer in chronic obstructive pulmonary disease patients, for the progression of neurodegenerative disorders such as Friedreich's Ataxia, and for brain iron deficiency in infants. We will apply and rigorously assess our multi-source multi-way approaches on these applications. All methods will be implemented in free, open-source and easily accessible software to facilitate their use by other researchers and practitioners.

Lock, Eric F
University of Minnesota