Application of Distance Covariance to Time Series Modeling and Assessing Goodness-of-Fit

Details

Material type
Dissertation (Western)
Last processed
20250211152728
ISBN
9798384017240
DDC
310
Author
Fernandes, Leon.
Title/Author
Application of Distance Covariance to Time Series Modeling and Assessing Goodness-of-Fit
Publication
[Sl] : Columbia University, 2024
Publication
Ann Arbor : ProQuest Dissertations & Theses, 2024
Physical description
125 p
General note
Source: Dissertations Abstracts International, Volume: 86-02, Section: B.
General note
Advisor: Davis, Richard A.
Dissertation note
Thesis (Ph.D.)--Columbia University, 2024.
Abstract
The overarching goal of this thesis is to use distance covariance based methods to extend asymptotic results from the i.i.d. case to general time series settings. Accounting for dependence may make already difficult statistical inference all the more challenging. The distance covariance is an increasingly popular measure of dependence between random vectors that goes beyond linear dependence as described by correlation. It is defined by a squared integral norm of the difference between the joint characteristic function and the product of the marginal characteristic functions, taken with respect to a specific weight function. Distance covariance has the advantage of being able to detect dependence even for uncorrelated data. The energy distance is a closely related quantity that measures distance between distributions of random vectors. These statistics can be used to establish asymptotic limit theory for stationary ergodic time series. The asymptotic results are driven by the limit theory for the empirical characteristic functions.
In this thesis we apply the distance covariance to three problems in time series modeling: (i) Independent Component Analysis (ICA), (ii) multivariate time series clustering, and (iii) goodness-of-fit using residuals from a fitted model. The underlying statistical procedures for each topic use the distance covariance function as a measure of dependence. The distance covariance arises in a different way in each of these topics: first as a measure of independence among the components of a vector, second as a measure of similarity of joint distributions, and third for assessing serial dependence among the fitted residuals. In each of these cases, limit theory is established for the corresponding empirical distance covariance statistics when the data come from a stationary ergodic time series.
For Topic (i) we consider an ICA framework, which is a popular tool used for blind source separation and has found application in fields such as financial time series, signal processing, feature extraction, and brain imaging. The Structural Vector Autoregression (SVAR) model is often the basic model used for modeling macro time series. The residuals in such a model are given by e_t = A S_t, the classical ICA model. In certain applications, one of the components of S_t has infinite variance. This differs from the standard ICA model. Furthermore, the e_t are not observed directly but are only estimated from the SVAR modeling. Many of the ICA procedures require the existence of a finite second or even fourth moment. We derive consistency when using the distance covariance for measuring independence of residuals in the infinite variance case. Extensions to the ICA model with noise, which have a direct application to SVAR models when testing independence of residuals based on their estimated counterparts, are also considered.
In Topic (ii) we propose a novel methodology for clustering multivariate time series data using energy distance. Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure separation between the finite dimensional distributions for the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is applied to obtain the dendrogram. This procedure is completely nonparametric, as the dissimilarities between stationary distributions are calculated directly without making any model assumptions. In order to justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series.
Topic (iii) considers the fundamental and often final step in time series modeling: assessing the quality of fit of a proposed model to the data. Since the underlying distribution of the innovations that generate a model is often not prescribed, goodness-of-fit tests typically take the form of testing the fitted residuals for serial independence. However, these fitted residuals are inherently dependent since they are based on the same parameter estimates, and thus standard tests of serial independence, such as those based on the autocorrelation function (ACF) or auto-distance correlation function (ADCF) of the fitted residuals, need to be adjusted. We apply sample splitting in the time series setting to perform tests of serial dependence of fitted residuals using the sample ACF and ADCF. Here the first f_n of the n data points in the time series are used to estimate the parameters of the model. Tests for serial independence are then based on all n residuals. With f_n = n/2, the ACF and ADCF tests of serial independence often have the same limit distributions as though the underlying residuals were indeed i.i.d. That is, if the first half of the data is used to estimate the parameters and the estimated residuals are computed for the entire data set based on these parameter estimates, then the ACF and ADCF can have the same limit distributions as though the residuals were i.i.d. This procedure ameliorates the need for adjustment in the construction of confidence bounds for both the ACF and ADCF, based on the fitted residuals, in goodness-of-fit testing. We also show that if f_n < n/2 then the asymptotic distributions of the tests stochastically dominate the corresponding asymptotic distributions for the true i.i.d. noise; the stochastic order is reversed when f_n > n/2.
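The sample-splitting procedure described for Topic (iii) can be illustrated with a minimal sketch. This is not the thesis's own code. It assumes a simulated AR(1) series with Gaussian innovations, a least-squares fit without intercept, and the standard V-statistic form of the sample distance correlation. All function names (split_residuals, acf, adcf, dcor) are hypothetical and chosen for illustration only.

```python
import numpy as np

def centered_dist(x):
    """Double-centered matrix of pairwise absolute distances for a 1-D sample."""
    d = np.abs(x[:, None] - x[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def dcor(x, y):
    """Sample distance correlation (V-statistic form) of two equal-length samples."""
    a, b = centered_dist(x), centered_dist(y)
    dcov2 = (a * b).mean()                                # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())      # product of distance std devs
    return float(np.sqrt(max(dcov2, 0.0) / denom)) if denom > 0 else 0.0

def acf(x, h):
    """Sample autocorrelation at lag h >= 1."""
    xc = x - x.mean()
    return float(np.dot(xc[:-h], xc[h:]) / np.dot(xc, xc))

def adcf(x, h):
    """Sample auto-distance correlation at lag h >= 1."""
    return dcor(x[:-h], x[h:])

def split_residuals(x, frac=0.5):
    """Estimate the AR(1) coefficient from the first f_n = floor(frac * n) points,
    then compute residuals over the whole series (n - 1 of them for an AR(1))."""
    f_n = int(frac * len(x))
    head = x[:f_n]
    phi_hat = float(np.dot(head[:-1], head[1:]) / np.dot(head[:-1], head[:-1]))
    resid = x[1:] - phi_hat * x[:-1]
    return phi_hat, resid

# Simulate an AR(1) series (assumed model; parameters are illustrative).
rng = np.random.default_rng(0)
n, phi = 400, 0.6
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

phi_hat, resid = split_residuals(x, frac=0.5)
for h in (1, 2, 3):
    print(h, round(acf(resid, h), 3), round(adcf(resid, h), 3))
```

Changing frac away from 0.5 gives the f_n < n/2 and f_n > n/2 regimes mentioned in the abstract. The sketch only computes the statistics; it does not reproduce the limit distributions or critical values derived in the thesis.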
Subject
Statistics
Subject
Applied mathematics
Subject
Computer science
Subject
Systems science
Keyword
Characteristic function
Keyword
Clustering
Keyword
Distance covariance
Keyword
Goodness-of-fit
Keyword
Independent Component Analysis
Keyword
Time series
Added author
Columbia University Statistics
Host item entry
Dissertations Abstracts International. 86-02B.
Electronic location and access
The full text is available after logging in.

MARC

 008250123s2024        us                              c    eng  d
■001000017163593
■00520250211152728
■006m          o    d                
■007cr#unu||||||||
■020    ▼a9798384017240
■035    ▼a(MiAaPQ)AAI31490505
■040    ▼aMiAaPQ▼cMiAaPQ
■0820  ▼a310
■1001  ▼aFernandes, Leon.
■24510▼aApplication of Distance Covariance to Time Series Modeling and Assessing Goodness-of-Fit
■260    ▼a[Sl]▼bColumbia University▼c2024
■260  1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300    ▼a125 p
■500    ▼aSource: Dissertations Abstracts International, Volume: 86-02, Section: B.
■500    ▼aAdvisor: Davis, Richard A.
■5021  ▼aThesis (Ph.D.)--Columbia University, 2024.
■590    ▼aSchool code: 0054.
■650  4▼aStatistics
■650  4▼aApplied mathematics
■650  4▼aComputer science
■650  4▼aSystems science
■653    ▼aCharacteristic function
■653    ▼aClustering
■653    ▼aDistance covariance
■653    ▼aGoodness-of-fit
■653    ▼aIndependent Component Analysis
■653    ▼aTime series
■690    ▼a0463
■690    ▼a0984
■690    ▼a0364
■690    ▼a0790
■71020▼aColumbia University▼bStatistics.
■7730  ▼tDissertations Abstracts International▼g86-02B.
■790    ▼a0054
■791    ▼aPh.D.
■792    ▼a2024
■793    ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17163593▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service (KERIS).

Holdings
Registration no. TF10237, e-book, available for loan
