
Deep Generative Models for Decision-Making and Control [electronic resource]
Material type
Dissertation file (foreign)
Last processed
2024-02-14 10:05:00
ISBN  
9798380380911
DDC  
629.8
Author
Janner, Michael.
Title/Author
Deep Generative Models for Decision-Making and Control [electronic resource]
Publication
[S.l.] : University of California, Berkeley, 2023
Publication
Ann Arbor : ProQuest Dissertations & Theses, 2023
Physical description
1 online resource (98 p.)
Note
Source: Dissertations Abstracts International, Volume: 85-03, Section: B.
Note
Advisor: Levine, Sergey.
Dissertation note
Thesis (Ph.D.)--University of California, Berkeley, 2023.
Restrictions on use
This item must not be sold to any third party vendors.
Abstract
Deep model-based reinforcement learning methods offer a conceptually simple approach to the decision-making and control problem: use learning for the purpose of estimating an approximate dynamics model, and offload the rest of the work to classical trajectory optimization. However, this combination has a number of empirical shortcomings, limiting the usefulness of model-based methods in practice. The dual purpose of this thesis is to study the reasons for these shortcomings and to propose solutions for the uncovered problems.

We begin by generalizing the dynamics model itself, replacing the standard single-step formulation with a model that predicts over probabilistic latent horizons. The resulting model, trained with a generative reinterpretation of temporal difference learning, leads to infinite-horizon variants of the procedures central to model-based control, including the model rollout and model-based value estimation.

Next, we show that poor predictive accuracy of commonly-used deep dynamics models is a major bottleneck to effective planning, and describe how to use high-capacity sequence models to overcome this limitation. Framing reinforcement learning as sequence modeling simplifies a range of design decisions, allowing us to dispense with many of the components normally integral to reinforcement learning algorithms. However, despite their predictive accuracy, such sequence models are limited by the search algorithms in which they are embedded. As such, we demonstrate how to fold the entire trajectory optimization pipeline into the generative model itself, such that sampling from the model and planning with it become nearly identical. The culmination of this endeavor is a method that improves its planning capabilities, and not just its predictive accuracy, with more data and experience. Along the way, we highlight how inference techniques from the contemporary generative modeling toolbox, including beam search, classifier-guided sampling, and image inpainting, can be reinterpreted as viable planning strategies for reinforcement learning problems.
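The closing claims of the abstract, that sampling from the generative model and planning with it become nearly identical, and that classifier-guided sampling and image inpainting double as planning strategies, can be illustrated with a toy sketch. The Python snippet below is a minimal illustration under stated assumptions, not the thesis's implementation: `denoise` is a hand-rolled stand-in for a learned trajectory denoiser, the value function is a hypothetical quadratic reward on the final state, and all constants are arbitrary. It shows only the shape of the idea: each reverse step denoises the whole trajectory, a value gradient nudges samples toward high return (classifier-style guidance), and clamping the first state conditions the plan on the current observation (inpainting).

    import numpy as np

    HORIZON, STATE_DIM = 16, 2    # trajectory length, per-step state size
    STEPS, GUIDE_SCALE = 50, 0.1  # reverse-process steps, guidance strength
    GOAL = np.array([1.0, 1.0])   # state the toy value function rewards

    def denoise(traj, t):
        # Stand-in for a learned denoiser: pull the sample toward a simple
        # straight-line trajectory prior, more strongly at noisier steps.
        prior = np.linspace(traj[0], traj[-1], HORIZON)
        keep = 1.0 - t / STEPS    # noisy early steps keep little of the sample
        return keep * traj + (1.0 - keep) * prior

    def value_grad(traj):
        # Gradient of a toy quadratic value, -||s_T - GOAL||^2, with respect
        # to the trajectory; only the final state is scored here.
        grad = np.zeros_like(traj)
        grad[-1] = -2.0 * (traj[-1] - GOAL)
        return grad

    rng = np.random.default_rng(0)
    traj = rng.normal(size=(HORIZON, STATE_DIM))   # planning starts from noise

    for t in reversed(range(STEPS)):
        traj = denoise(traj, t)                    # generative (denoising) step
        traj += GUIDE_SCALE * value_grad(traj)     # classifier-style guidance
        traj[0] = np.zeros(STATE_DIM)              # inpaint known start state
        if t > 0:
            traj += 0.01 * rng.normal(size=traj.shape)  # reverse-process noise

    print("planned final state:", traj[-1])        # drifts toward GOAL

Because planning here is nothing more than guided sampling, improving the generative model with more data directly improves the resulting plans, which is the abstract's central point.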
Subject
Robotics.
Subject
Computer science.
Keyword
Reinforcement learning
Keyword
Dynamics model
Keyword
Single-step formulation
Added author
University of California, Berkeley Computer Science
Host item entry
Dissertations Abstracts International. 85-03B.
Electronic location and access
Full text is available after logging in.

Holdings

Registration no. TF06711 (electronic book)
