Privacy and Efficiency in Personalized Decision-Making and Recommendation [electronic resource]
- Material type
- Dissertation file (overseas)
- Last processed date/time
- 2024-02-14 10:16:25
- ISBN
- 9798380469920
- DDC
- 658
- Title/Author
- Privacy and Efficiency in Personalized Decision-Making and Recommendation [electronic resource]
- Publication details
- [S.l.] : Stanford University, 2023
- Publication details
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical description
- 1 online resource (218 p.)
- Notes
- Source: Dissertations Abstracts International, Volume: 85-04, Section: B.
- Notes
- Advisor: Athey, Susan.
- Dissertation note
- Thesis (Ph.D.)--Stanford University, 2023.
- Use restriction note
- This item must not be sold to any third party vendors.
- Abstract
- In the current digital era, marked by the ubiquity of individual-level data and sophisticated artificial intelligence systems highly capable of exploiting data heterogeneity, data-driven personalized decision-making and recommendation systems have become prevalent in providing customized services and experiences to individuals. These bespoke systems, however, continue to present considerable challenges regarding their deployment in adaptive, heterogeneous, and privacy-sensitive settings. This dissertation presents three research projects that delve into some of these critical issues, offering insights and novel solutions aimed at enhancing the privacy and efficiency of personalized decision-making and recommendation systems.

Chapter 1 investigates the challenges of model learning in contextual bandits for adaptive decision-making and presents a method to improve data efficiency and robustness to model misspecification in this online setting. Contextual bandit algorithms often estimate reward models to inform decision-making. However, true rewards can contain action-independent redundancies that are not relevant for decision-making. We show it is more data-efficient to estimate any function that explains the reward differences between actions, that is, the treatment effects. Motivated by this observation, and building on recent work on oracle-based bandit algorithms, we provide a universal reduction of contextual bandits to general-purpose heterogeneous treatment effect estimation, and we design a simple and computationally efficient algorithm based on this reduction. Our theoretical and experimental results demonstrate that heterogeneous treatment effect estimation in contextual bandits offers practical advantages over reward estimation, including more efficient model estimation and greater robustness to model misspecification.

In Chapter 2, we consider heterogeneous data adaptation and privacy in decision-making informed by historical observational data. Specifically, we study the problem of learning personalized decision policies from observational bandit feedback data drawn from heterogeneous data sources. We examine this problem in the federated setting, where a central server aims to learn a policy on data distributed across the heterogeneous sources without exchanging their raw data due to privacy concerns. We present a federated policy learning algorithm based on the aggregation of local policies trained with doubly robust offline policy evaluation and learning strategies. We provide a novel regret analysis for our approach that establishes a finite-sample upper bound on a notion of global regret across a distribution of clients. In addition, for any individual client, we establish a corresponding local regret upper bound characterized by the presence of distribution shift relative to all other clients. We support our theoretical findings with experimental results. Our analysis and experiments provide insights into the value of heterogeneous client participation in federation for policy learning in heterogeneous settings.

Lastly, in Chapter 3, we pivot from the online and offline policy learning methods of the first two chapters to data privacy in recommender systems. We propose a novel approach for developing privacy-preserving large-scale recommender systems using differentially private (DP) large language models (LLMs), which overcomes certain challenges and limitations of DP training for these complex systems. This method is particularly well suited for the emerging area of LLM-based recommender systems, but can be readily employed by any recommender system that processes representations of natural language inputs. Our approach involves using DP training methods to fine-tune a publicly pre-trained LLM on a query generation task.
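The doubly robust offline policy evaluation strategy mentioned in the Chapter 2 summary can be illustrated with a minimal sketch. This is a generic textbook-style estimator written for illustration, not code from the dissertation; the function name, argument names, and data layout are all assumptions:

```python
import numpy as np

def doubly_robust_value(actions, rewards, propensities, q_hat, pi):
    """Doubly robust off-policy estimate of the value of a target policy.

    actions:      (n,) indices of the actions logged by the behavior policy
    rewards:      (n,) observed rewards for the logged actions
    propensities: (n,) behavior policy's probability of each logged action
    q_hat:        (n, k) estimated reward model, one column per action
    pi:           (n, k) target policy's action probabilities per context
    """
    n, k = q_hat.shape
    # Direct-method term: expected reward under pi according to the model.
    dm = (pi * q_hat).sum(axis=1)
    # Importance-weighted correction using the logged action only; it
    # vanishes in expectation when the reward model q_hat is correct.
    pi_logged = pi[np.arange(n), actions]
    q_logged = q_hat[np.arange(n), actions]
    correction = pi_logged / propensities * (rewards - q_logged)
    return float(np.mean(dm + correction))
```

The estimator is "doubly robust" in the sense that it remains consistent if either the reward model or the logged propensities are correct; a federated variant would compute such estimates locally on each client before aggregation.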
- Subject heading
- Decision making.
- Subject heading
- Computer science.
- Subject heading
- Recommender systems.
- Added author
- Stanford University.
- Source record
- Dissertations Abstracts International. 85-04B.
- Source record
- Dissertations Abstracts International
- Electronic location and access
- Full text available after login.