R - Time Series - Splitting Data Using The timeSlice Method
Referring to this post: "createTimeSlices function in caret package in R". There, createTimeSlices was suggested as an option for cross-validating when using time series data. I would like to understand how to go about selecting values for 'initialWindow', 'horizon', and 'fixedWindow' in trainControl.
They are defined within caret as follows (?createTimeSlices):
initialWindow - the initial number of consecutive values in each training set sample
horizon - the number of consecutive values in the test set sample
fixedWindow - logical: if FALSE, the training set always starts at the first sample.
Can someone please elaborate further on how to go about selecting the right values for initialWindow and horizon, and on the actual implications of selecting TRUE or FALSE for fixedWindow?
initialWindow: the size of the training set/window for the first modeling iteration. How large it should be depends on the complexity of the model you are fitting, so you have to research the minimum sample size that can be expected to give a reliable fit. Obviously, a larger window is needed for more complex models; see for example Measuring forecast accuracy, p. 6.
fixedWindow: if TRUE, it implies a moving window (always equal to the size of initialWindow); if FALSE, it implies that a growing window (in other words, one that always starts at the first sample) is used to fit the model. In the usual output of models in caret you can observe the sizes of the training samples and whether the window is growing or moving (here with fixedWindow = FALSE, horizon = 1):
Resampling: Rolling Forecasting Origin Resampling (1 held-out with no fixed window)
Summary of sample sizes: 100, 101, 102, 103, 104, 105, ...
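To make the difference between a moving and a growing window concrete, here is a minimal sketch that calls createTimeSlices directly on a toy series. The series length (10) and the values initialWindow = 5 and horizon = 2 are assumptions chosen only to keep the output small; they are not recommendations.

```r
library(caret)

# Toy series of 10 observations (an assumption for illustration);
# createTimeSlices() only uses its length to build index sets.
y <- 1:10

# Moving window: every training set has exactly initialWindow observations.
moving <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = TRUE)

# Growing window: every training set starts at the first observation.
growing <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = FALSE)

# Training-set sizes per resample
moving_sizes  <- unname(sapply(moving$train, length))
growing_sizes <- unname(sapply(growing$train, length))
print(moving_sizes)   # 5 5 5 5
print(growing_sizes)  # 5 6 7 8

# Indices used in the last resample of each scheme
print(moving$train[[4]])   # 4 5 6 7 8
print(growing$train[[4]])  # 1 2 3 4 5 6 7 8
print(moving$test[[4]])    # 9 10
```

In both cases the test indices are the same; only the start of the training window differs.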
horizon: defines how many consecutive steps ahead the model is tested on. The output of the caret model then gives a summary of model accuracy when predicting n steps ahead. The value that should be chosen here depends on the application, i.e. whether short-term or longer-term forecasts are desired. See again Measuring forecast accuracy, p. 7.
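As a rough sketch of how the three arguments are passed to trainControl, the example below fits a linear model to simulated data with a growing window. The data frame dat, its columns, the series length of 120 and the choice of method = "lm" are all assumptions made up for illustration; substitute your own series and model.

```r
library(caret)

set.seed(1)

# Simulated data, purely illustrative: a noisy linear trend of 120 points.
n <- 120
dat <- data.frame(t = seq_len(n))
dat$y <- 0.5 * dat$t + rnorm(n)

# Growing window: start with 100 observations and test 1 step ahead each time.
ctrl <- trainControl(method = "timeslice",
                     initialWindow = 100,
                     horizon = 1,
                     fixedWindow = FALSE)

# Each resample trains on rows 1..k and predicts row k + 1.
fit <- train(y ~ t, data = dat, method = "lm", trControl = ctrl)

# The printed summary reports the resampling scheme and training sample sizes.
print(fit)
```

Increasing horizon evaluates the model further ahead per resample but reduces the number of resamples that fit into the series, so with short series there is a trade-off between the two.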