Estimation and Optimization of Information Measures with Applications to Fairness and Differential Privacy [electronic resource]
- Material Type
- Dissertation file (overseas)
- Last Processed
- 20240214100500
- ISBN
- 9798379604790
- DDC
- 519
- Title/Author
- Estimation and Optimization of Information Measures with Applications to Fairness and Differential Privacy [electronic resource]
- Publication
- [S.l.] : Harvard University, 2023
- Publication
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical Description
- 1 online resource (373 p.)
- Note
- Source: Dissertations Abstracts International, Volume: 84-12, Section: A.
- Note
- Advisor: Calmon, Flavio.
- Dissertation Note
- Thesis (Ph.D.)--Harvard University, 2023.
- Restrictions on Use Note
- This item must not be sold to any third party vendors.
- Abstract
- My dissertation solves three theoretical problems on optimizing and estimating information measures, and builds on this theory to introduce novel practical algorithms for: 1) optimal mechanism design for differential privacy (DP); 2) optimal group-fairness enhancement in machine learning; and 3) estimation of information measures from data using sample moments. Information measures (in particular, f-divergences) provide a rigorous way to tackle several real-world problems, for example: 1) quantifying the degree of privacy afforded by data-releasing mechanisms, using the hockey-stick divergence; 2) correcting trained machine learning (ML) classifiers for group fairness, by optimizing cross-entropy; and 3) detecting new dependencies between pairs of natural phenomena, by estimating mutual information from data. Herein, we put forth mathematically grounded approaches to these three practical problems. In the first third of the dissertation, we design optimal DP mechanisms in the large-composition regime, and we derive a fast and accurate DP accountant for that regime via the method of steepest descent from mathematical physics. We prove that the privacy parameter is equivalent to a KL-divergence term, and we then provide solutions to the ensuing minmax KL-divergence problem. In the second third, we generalize the ubiquitous concept of information projection to the case of conditional distributions, which we term model projection. We derive explicit formulas for model projection, as well as a parallelizable algorithm that computes it efficiently and at scale. We instantiate our model-projection theory in the domain of group-fair ML, thereby obtaining an optimal multi-class fairness-enhancement method that runs in seconds on datasets of more than 1 million samples. In the last third, we derive the functional form of the relationship between information measures and the underlying moments. Plugging the sample moments of the data into our new moments-based formulas, we estimate mutual information and differential entropy efficiently and robustly against affine transformations of the samples.
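As background for the moments-based estimation described in the last part of the abstract, the sketch below illustrates, under an assumption not taken from the dissertation, how an information measure can be computed from sample moments: if the data are assumed jointly Gaussian, mutual information is a closed-form function of second moments, and the resulting estimate is invariant to invertible affine transformations applied to each block of samples. The function name and the Gaussian assumption are illustrative; this is not the dissertation's general moments-based estimator.

```python
import numpy as np

def gaussian_mutual_information(x, y):
    """Estimate I(X; Y) in nats from sample second moments,
    under the (illustrative) assumption that (X, Y) is jointly Gaussian.

    x: (n, dx) array of samples of X; y: (n, dy) array of samples of Y.
    """
    xy = np.hstack([x, y])                 # stacked samples, shape (n, dx + dy)
    dx = x.shape[1]
    cov = np.cov(xy, rowvar=False)         # joint sample covariance matrix
    # For jointly Gaussian variables:
    # I(X; Y) = 0.5 * [log det(cov_X) + log det(cov_Y) - log det(cov_XY)]
    _, logdet_joint = np.linalg.slogdet(cov)
    _, logdet_x = np.linalg.slogdet(cov[:dx, :dx])
    _, logdet_y = np.linalg.slogdet(cov[dx:, dx:])
    return 0.5 * (logdet_x + logdet_y - logdet_joint)

# Example: 1-D Gaussians with correlation rho have
# I(X; Y) = -0.5 * log(1 - rho**2), about 0.511 nats at rho = 0.8.
rng = np.random.default_rng(0)
rho = 0.8
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=100_000)
print(gaussian_mutual_information(z[:, :1], z[:, 1:]))
```

Because the estimate depends on the samples only through log-determinant ratios of covariance blocks, rescaling or linearly mixing the coordinates within X (or within Y) leaves it unchanged, which is the affine-robustness property the abstract refers to.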
- Subject
- Applied mathematics.
- Subject
- Information science.
- Keyword
- Estimation
- Keyword
- F-divergence
- Keyword
- Group-fairness
- Keyword
- Moments
- Added Author
- Harvard University Engineering and Applied Sciences - Applied Math
- Host Item Entry
- Dissertations Abstracts International. 84-12A.
- Host Item Entry
- Dissertations Abstracts International
- Electronic Location and Access
- Full text is available after login.