Bayesian Dimension Reduction and Prediction With Multiple Datasets - [electronic resource]
- Material type
- Dissertation file (foreign)
- Last processed
- 20240214101240
- ISBN
- 9798379958459
- DDC
- 574
- Title/Author
- Bayesian Dimension Reduction and Prediction With Multiple Datasets - [electronic resource]
- Publication details
- [S.l.] : University of Minnesota, 2023
- Publication details
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical description
- 1 online resource (152 p.)
- Note
- Source: Dissertations Abstracts International, Volume: 85-02, Section: B.
- Note
- Advisor: Lock, Eric F.
- Dissertation note
- Thesis (Ph.D.)--University of Minnesota, 2023.
- Use restriction note
- This item must not be sold to any third party vendors.
- Abstract/Summary
- Biomedical investigators are increasingly able to collect multiple sources of omics data in pursuit of understanding disease pathogenesis. Integrative factorization methods for multi-omic datasets have been developed to reveal latent biological patterns driving variation among the observations. However, few methods can accommodate prediction of clinical or biological outcomes within datasets having this complex structure. In Chapter 2, we propose a framework for dimension reduction and prediction in the context of multi-omic, multi-cohort (bidimensional) datasets. We also extend the oft-used Bayesian variable selection approach, the spike-and-slab prior, to accommodate hierarchical variable selection across multiple regression models. We apply this framework to multi-omic data from The Cancer Genome Atlas to predict overall survival across disparate cancer types, and we identify multi-omic biological patterns related to survival that persist across multiple cancers. In Chapter 3, we propose a Bayesian framework to perform either integrative factorization or simultaneous factorization and prediction, which we term Bayesian Simultaneous Factorization and Prediction (BSFP). BSFP concurrently estimates latent factors driving variation within and across omics datasets while estimating their effects on an outcome, providing a complete framework for uncertainty. We show via simulation the importance of accounting for uncertainty in the estimated factorization within the predictive model and the flexibility of this framework for multiple imputation. We also apply BSFP to metabolomic and proteomic data to predict lung function decline among individuals living with HIV. Finally, in Chapter 4, we extend the framework described in Chapter 3 to accommodate simultaneous factorization and prediction using bidimensional data, i.e., across multiple omics sources and multiple sample cohorts, which we term multi-cohort BSFP, or MCBSFP. We evaluate the performance of this framework in recovering latent variation structures via simulation, and we use this model to reanalyze the proteomic and metabolomic data from the study considered in Chapter 3.
- Subject heading
- Biostatistics.
- Subject heading
- Oncology.
- Subject heading
- Bioinformatics.
- Keywords
- Multi-omics
- Added author
- University of Minnesota Biostatistics
- Host item entry
- Dissertations Abstracts International. 85-02B.
- Host item entry
- Dissertation Abstract International