Making Neural Network Models More Efficient [electronic resource]
Material type
Dissertation file (overseas)
Last processed
2024-02-14 10:04:45
ISBN  
9798379717254
DDC  
004
Author
Su, Yushan.
Title/Author
Making Neural Network Models More Efficient [electronic resource]
Publication
[S.l.] : Princeton University, 2023
Publication
Ann Arbor : ProQuest Dissertations & Theses, 2023
Physical description
1 online resource (94 p.)
Note
Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
Note
Advisor: Li, Kai.
Dissertation note
Thesis (Ph.D.)--Princeton University, 2023.
Use restriction note
This item must not be sold to any third party vendors.
Abstract
Complex machine learning tasks typically require large neural network models. However, training and inference on neural models require substantial compute power and large memory footprints, and incur significant costs. My thesis studies methods to make neural networks efficient at a relatively low cost.

First, we explore how to utilize CPU servers for training and inference. CPU servers are more readily available, have larger memories, and cost much less than GPUs or hardware accelerators. However, they are much less efficient for training and inference tasks. My thesis studies how to design efficient software kernels for sparse neural networks that allow unstructured pruning to achieve efficient training and inference. Our evaluation shows that our sparse kernels can achieve 6.4x-20.0x speedups for medium sparsities over the commonly used Intel MKL sparse library for CPUs and greatly reduce the performance gap with sparse kernels for GPUs.

Second, we study how to achieve high-throughput inference for large models. We propose PruMUX, a method that combines data multiplexing with model compression. We find that in most cases, PruMUX can achieve better throughput than using each approach alone for a given accuracy loss budget.

Third, we study how to find the best sets of parameters for PruMUX in order to make it practical. We propose Auto-PruMUX, which uses performance modeling based on a set of data points to predict multiplexing parameters for DataMUX and sparsity parameters for a given model compression technique. Our evaluation shows that Auto-PruMUX can successfully find or predict parameters to achieve the best throughput given an accuracy loss budget.

This dissertation also proposes several future research directions in the areas of our studies.
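The techniques summarized in the abstract can be made a little more concrete with a minimal Python sketch (not code from the dissertation): unstructured magnitude pruning of a weight matrix into a CPU-friendly sparse format, followed by a brute-force parameter search in the spirit of Auto-PruMUX that selects a (sparsity, multiplexing width) pair with the best predicted throughput within an accuracy loss budget. The function names, the performance and accuracy models, and all numbers are hypothetical placeholders.

import numpy as np
from scipy.sparse import csr_matrix

def unstructured_prune(weights, sparsity):
    # Zero out the smallest-magnitude entries and store the result in CSR form,
    # the kind of layout a sparse CPU kernel would consume.
    k = int(weights.size * sparsity)
    threshold = np.partition(np.abs(weights).ravel(), k)[k]
    pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)
    return csr_matrix(pruned)

# Hypothetical stand-ins for a fitted performance model; the real approach would
# interpolate from measured (sparsity, width) -> (throughput, accuracy) data points.
def predicted_throughput(sparsity, mux_width):
    return mux_width * (1.0 + 4.0 * sparsity)       # placeholder model

def predicted_accuracy_loss(sparsity, mux_width):
    return 2.0 * sparsity + 0.5 * (mux_width - 1)   # placeholder model

def pick_parameters(budget):
    # Choose the (sparsity, multiplexing width) pair with the highest predicted
    # throughput whose predicted accuracy loss stays within the budget.
    candidates = [(s, w) for s in (0.5, 0.7, 0.9) for w in (1, 2, 4, 8)]
    feasible = [c for c in candidates if predicted_accuracy_loss(*c) <= budget]
    return max(feasible, key=lambda c: predicted_throughput(*c), default=None)

if __name__ == "__main__":
    dense = np.random.randn(512, 512).astype(np.float32)
    sparse_w = unstructured_prune(dense, sparsity=0.9)
    x = np.random.randn(512).astype(np.float32)
    y = sparse_w @ x   # sparse matrix-vector product on the CPU
    print("nonzeros kept:", sparse_w.nnz, "of", dense.size)
    print("chosen (sparsity, mux_width):", pick_parameters(budget=2.0))

A real search would plug in models fitted to profiled throughput and measured task accuracy rather than the placeholders above.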
Subject
Computer science.
Subject
Computer engineering.
Keyword
Neural network
Keyword
Compute power
Keyword
Hardware accelerators
Keyword
Multiplexing parameters
Keyword
Model compression
Added author
Princeton University Computer Science
Host item entry
Dissertations Abstracts International. 84-12B.
Host item entry
Dissertation Abstracts International
Electronic location and access
Full text available after login.

Holdings Information

Holdings
Registration no.  Call no.  Location  Availability  Loan info
TF05910  E-book
