Statistical Mechanics of Bayesian Inference and Learning in Neural Networks
Detailed Information
- Material Type
- Dissertation (Western)
- Last Processed
- 2025-02-11 15:11:32
- ISBN
- 9798382782546
- DDC
- 530.1
- Title/Author
- Statistical Mechanics of Bayesian Inference and Learning in Neural Networks
- Publication
- [S.l.] : Harvard University, 2024
- Publication
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 908 p
- Notes
- Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
- Notes
- Advisor: Pehlevan, Cengiz.
- Dissertation Note
- Thesis (Ph.D.)--Harvard University, 2024.
- Abstract
- This thesis collects a few of my essays towards understanding representation learning and generalization in neural networks. I focus on the model setting of Bayesian learning and inference, where the problem of deep learning is naturally viewed through the lens of statistical mechanics. First, I consider properties of freshly-initialized deep networks, with all parameters drawn according to Gaussian priors. I provide exact solutions for the marginal prior predictive of networks with isotropic priors and linear or rectified-linear activation functions. I then study the effect of introducing structure to the priors of linear networks from the perspective of random matrix theory. Turning to memorization, I consider how the choice of nonlinear activation function affects the storage capacity of treelike neural networks. Then, we come at last to representation learning. I study the structure of learned representations in Bayesian neural networks at large but finite width, which are amenable to perturbative treatment. I then show how the ability of these networks to generalize when presented with unseen data is affected by representational flexibility, through precise comparison to models with frozen, random representations. In the final portion of this thesis, I bring a geometric perspective to bear on the structure of neural network representations. I first consider how the demand of fast inference shapes optimal representations in recurrent networks. Then, I consider the geometry of representations in deep object classification networks from a Riemannian perspective. In total, this thesis begins to elucidate the structure and function of optimally distributed neural codes in artificial neural networks. [An illustrative sketch of the Gaussian-prior setting described here appears after this record.]
- Subject Heading
- Theoretical physics
- Subject Heading
- Neurosciences
- Subject Heading
- Statistical physics
- Subject Heading
- Biophysics
- Keywords
- Deep learning
- Keywords
- Random matrices
- Added Author
- Harvard University Physics
- Host Item Entry
- Dissertations Abstracts International. 85-12B.
- Electronic Location and Access
- Full text is available after login.
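
The first technical thread in the abstract above, the marginal prior predictive of a freshly initialized network whose weights are drawn from isotropic Gaussian priors, can be made concrete with a small Monte Carlo sketch. This is an editorial illustration, not code from the thesis: the function name, the 1/sqrt(fan-in) prior variance, and the depth, width, and sample-count defaults are all assumptions chosen for the demo.

```python
# Editorial illustration (not from the thesis): Monte Carlo samples from the
# marginal prior predictive of a deep network whose weights are drawn i.i.d.
# from isotropic Gaussian priors. Depth, width, and the 1/sqrt(fan-in) weight
# scaling are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def prior_predictive_samples(x, depth=3, width=100, n_samples=5000, relu=False):
    """Draw samples of the scalar output f(x) under Gaussian weight priors."""
    out = np.empty(n_samples)
    for s in range(n_samples):
        h = x
        for _ in range(depth):
            # Isotropic Gaussian prior on each weight matrix, variance 1/fan-in.
            W = rng.normal(0.0, 1.0 / np.sqrt(h.size), size=(width, h.size))
            h = W @ h
            if relu:
                h = np.maximum(h, 0.0)  # rectified-linear activation
        v = rng.normal(0.0, 1.0 / np.sqrt(width), size=width)  # readout weights
        out[s] = v @ h
    return out

samples = prior_predictive_samples(np.ones(10))
# A single Gaussian readout of a fixed input would be exactly Gaussian; at
# finite width, stacked layers give a heavier-tailed marginal, visible as
# positive excess kurtosis.
excess_kurtosis = np.mean((samples - samples.mean()) ** 4) / samples.var() ** 2 - 3
print(f"excess kurtosis: {excess_kurtosis:.2f}")
```

Increasing `depth` (or setting `relu=True`) makes the printed excess kurtosis grow, one visible signature of the finite-width departure from the infinite-width Gaussian-process limit that the abstract says the thesis characterizes exactly for linear and rectified-linear activations.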
MARC
■008250123s2024 us c eng d
■001000017160898
■00520250211151132
■006m o d
■007cr#unu||||||||
■020 ▼a9798382782546
■035 ▼a(MiAaPQ)AAI31147823
■040 ▼aMiAaPQ▼cMiAaPQ
■0820 ▼a530.1
■1001 ▼aZavatone-Veth, Jacob Andreas.▼0(orcid)0000-0002-4060-1738
■24510▼aStatistical Mechanics of Bayesian Inference and Learning in Neural Networks
■260 ▼a[S.l.]▼bHarvard University▼c2024
■260 1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300 ▼a908 p
■500 ▼aSource: Dissertations Abstracts International, Volume: 85-12, Section: B.
■500 ▼aAdvisor: Pehlevan, Cengiz.
■5021 ▼aThesis (Ph.D.)--Harvard University, 2024.
■520 ▼aThis thesis collects a few of my essays towards understanding representation learning and generalization in neural networks. I focus on the model setting of Bayesian learning and inference, where the problem of deep learning is naturally viewed through the lens of statistical mechanics. First, I consider properties of freshly-initialized deep networks, with all parameters drawn according to Gaussian priors. I provide exact solutions for the marginal prior predictive of networks with isotropic priors and linear or rectified-linear activation functions. I then study the effect of introducing structure to the priors of linear networks from the perspective of random matrix theory. Turning to memorization, I consider how the choice of nonlinear activation function affects the storage capacity of treelike neural networks. Then, we come at last to representation learning. I study the structure of learned representations in Bayesian neural networks at large but finite width, which are amenable to perturbative treatment. I then show how the ability of these networks to generalize when presented with unseen data is affected by representational flexibility, through precise comparison to models with frozen, random representations. In the final portion of this thesis, I bring a geometric perspective to bear on the structure of neural network representations. I first consider how the demand of fast inference shapes optimal representations in recurrent networks. Then, I consider the geometry of representations in deep object classification networks from a Riemannian perspective. In total, this thesis begins to elucidate the structure and function of optimally distributed neural codes in artificial neural networks.
■590 ▼aSchool code: 0084.
■650 4▼aTheoretical physics
■650 4▼aNeurosciences
■650 4▼aStatistical physics
■650 4▼aBiophysics
■653 ▼aDeep learning
■653 ▼aRandom matrices
■653 ▼aTheoretical neuroscience
■653 ▼aStatistical mechanics
■653 ▼aBayesian neural networks
■690 ▼a0753
■690 ▼a0317
■690 ▼a0217
■690 ▼a0786
■690 ▼a0800
■71020▼aHarvard University▼bPhysics.
■7730 ▼tDissertations Abstracts International▼g85-12B.
■790 ▼a0084
■791 ▼aPh.D.
■792 ▼a2024
■793 ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17160898▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service (KERIS).


