Reinforcement Learning From Static Datasets: Algorithms, Analysis, and Applications- [electronic resource]
- Material type
- Dissertation file (foreign)
- Last processed
- 20240214101657
- ISBN
- 9798380877275
- DDC
- 004
- Author
- Kumar, Aviral.
- Title/Author
- Reinforcement Learning From Static Datasets: Algorithms, Analysis, and Applications - [electronic resource]
- Publication
- [S.l.] : University of California, Berkeley, 2023
- Publication
- Ann Arbor : ProQuest Dissertations & Theses, 2023
- Physical description
- 1 online resource (385 p.)
- Notes
- Source: Dissertations Abstracts International, Volume: 85-06, Section: B.
- Notes
- Advisor: Levine, Sergey.
- Dissertation note
- Thesis (Ph.D.)--University of California, Berkeley, 2023.
- Use restriction note
- This item must not be sold to any third party vendors.
- Abstract
- Reinforcement learning (RL) provides a formalism for learning-based control. By attempting to learn behavioral policies that optimize a user-specified reward function, RL methods have been able to acquire novel decision-making strategies that can outperform the best humans, even with highly complex dynamics and even when the space of all possible outcomes is huge (e.g., robotic manipulation, chip floorplanning). Yet RL has had limited applicability compared to standard machine learning (ML) in real-world scenarios. Why? The central issue with RL is that it relies crucially on running large amounts of trial-and-error active data collection to learn policies. Unfortunately, in the real world, active data collection is generally very expensive (e.g., running wet lab experiments for drug design) and/or dangerous (e.g., robots operating around humans), and accurate simulators are hard to build. Overall, this means that while RL carries the potential to broadly unlock ML in real-world decision-making problems, we are unable to realize this potential via current RL techniques. To realize this potential of RL, in this dissertation we develop an alternate paradigm that aims to utilize static datasets of experience for learning policies. Such a "dataset-driven" paradigm broadens the applicability of RL to a variety of decision-making problems where historical datasets already exist or can be collected via domain-specific strategies. It also brings the scalability and reliability benefits that modern supervised and unsupervised ML methods enjoy into RL. That said, instantiating this paradigm is challenging, as it requires reconciling the static nature of learning from a dataset with the traditionally active nature of RL, which results in challenges of distributional shift, generalization, and optimization.
After theoretically and empirically understanding these challenges, we develop algorithmic ideas for addressing them and discuss several extensions that convert these ideas into practical methods capable of training modern high-capacity neural network function approximators on large and diverse datasets. Finally, we show how these techniques can enable us to pre-train generalist policies for real robots and video games and enable fast and efficient hardware accelerator design.
- General subject
- Computer science.
- General subject
- Robotics.
- Keyword
- Control
- Keyword
- Decision-making
- Other author
- University of California, Berkeley Electrical Engineering & Computer Sciences
- Source record
- Dissertations Abstracts International. 85-06B.
- Source record
- Dissertations Abstracts International
- Electronic location and access
- Full text available after login.