Correlation Analysis for values D, S, C with flow

Published: Mar 28, 2020 by Hyun Ji Moon

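Find the lag i to use in the model
 D ~ [D's trend, D's season, y_S_shift(i)]
where y_S_shift(i) is y_S shifted by i steps. The residual of D under the fitted model m is
df.epsilon = df.y - m.predict(df)['yhat']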

For this, we suggest the following procedure:

  1. Sort the lags i by corr(df.epsilon, y_S_shift(i)).
  2. Select a number of lags i that have high correlation.
  3. Use them as feature candidates: {y_S_shift(i)}
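
The three steps above could be sketched with pandas roughly as follows. This is a minimal illustration, not the original implementation: the helper names `rank_lags` and `select_candidates`, the `top_k` parameter, and the choice to rank by absolute correlation are all assumptions made for the example.

```python
import pandas as pd


def rank_lags(epsilon: pd.Series, y_s: pd.Series, lags) -> pd.Series:
    """Step 1: corr(epsilon, y_s.shift(i)) for each lag i, strongest first."""
    # Series.corr ignores the NaN rows that shifting introduces.
    corrs = pd.Series({i: epsilon.corr(y_s.shift(i)) for i in lags}, dtype=float)
    # Assumption: "high correlation" means large in absolute value.
    order = corrs.abs().sort_values(ascending=False).index
    return corrs.reindex(order)


def select_candidates(epsilon: pd.Series, y_s: pd.Series, lags, top_k: int = 3) -> dict:
    """Steps 2 and 3: keep the top_k lags and build the shifted feature columns."""
    ranked = rank_lags(epsilon, y_s, lags)
    return {f"y_S_shift({i})": y_s.shift(i) for i in ranked.index[:top_k]}
```

Assuming the S series lives in a column such as `df_S.y` (a hypothetical name), a call like `rank_lags(df.epsilon, df_S.y, range(1, 15))` would return the correlations ordered from strongest to weakest lag.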

The reason for comparing the error (\( y - \widehat{y} \)) with y_S, rather than y_D with y_S, is to remove the inflated correlation that arises when both series share the same seasonality components.
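
A small synthetic check can make this effect visible. In the sketch below, all data is made up: two series share a weekly sine pattern but have independent noise, and the true seasonal component is subtracted directly as a stand-in for `m.predict(df)['yhat']`.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1000)

# Synthetic shared weekly seasonality; the noise terms are independent.
season = 5.0 * np.sin(2 * np.pi * t / 7)
y_D = season + rng.normal(size=t.size)
y_S = season + rng.normal(size=t.size)

# Stand-in for y - yhat: remove the (known) seasonal component from D.
epsilon = y_D - season

raw_corr = np.corrcoef(y_D, y_S)[0, 1]        # inflated by the shared season
resid_corr = np.corrcoef(epsilon, y_S)[0, 1]  # close to zero

print(f"corr(y_D, y_S)     = {raw_corr:.3f}")
print(f"corr(epsilon, y_S) = {resid_corr:.3f}")
```

The raw correlation comes out high even though the two series have no relationship beyond the shared seasonal pattern, while the residual correlation stays near zero.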
