Correlation Analysis for values D, S, C with flow

Published: Mar 28, 2020 by Hyun Ji Moon

Goal: find the shift i to use in the model
 D ~ [D's trend, D's season, y_S_shift(i)]

df['epsilon'] = df['y'] - m.predict(df)['yhat']

For this, we suggest the following:

compute corr(df.epsilon, y_S_shift(i)) for each i and sort the results
select the few i with the highest correlation
feature candidates: {y_S_shift(i)} for the selected i
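
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: `rank_shifts` is a hypothetical helper, the data is synthetic (the residual is built to follow y_S with a lag of 3), `y_S_shift(i)` is taken to mean `y_s.shift(i)` in pandas, and ranking by absolute correlation is one possible reading of "high correlation".

```python
import numpy as np
import pandas as pd


def rank_shifts(epsilon, y_s, shifts):
    """Correlate epsilon with y_S shifted by each i; strongest |corr| first."""
    # Series.corr drops the NaN rows that shift() introduces.
    corrs = pd.Series({i: epsilon.corr(y_s.shift(i)) for i in shifts})
    order = corrs.abs().sort_values(ascending=False).index
    return corrs.reindex(order)


# Synthetic data (assumption): epsilon tracks y_S three steps earlier, plus noise.
rng = np.random.default_rng(0)
y_s = pd.Series(rng.normal(size=200))
epsilon = y_s.shift(3) + 0.1 * pd.Series(rng.normal(size=200))

ranking = rank_shifts(epsilon, y_s, shifts=range(0, 8))
top_shifts = list(ranking.index[:2])  # keep the k best shifts; k=2 is arbitrary
print(ranking)
```

With real data, `epsilon` would be the residual column computed above and `y_s` the observed S series on the same time index; the shifts in `top_shifts` would then become the candidate features.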

The reason for comparing the error (\( y - \widehat{y} \)) with y_S, rather than y_D with y_S, is to remove the inflated correlation that arises when both series share the same seasonal components.
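
A small synthetic check may make this concrete. It is a sketch, not the original analysis: both series are invented, they share a weekly sine pattern but have independent noise, and the true seasonal term stands in for the fitted \( \widehat{y} \).

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(365 * 2)
season = np.sin(2 * np.pi * t / 7)  # weekly pattern shared by both series

# Same seasonality, independent noise: no real link between D and S.
y_d = season + 0.3 * rng.normal(size=t.size)
y_s = season + 0.3 * rng.normal(size=t.size)

# Raw series look strongly correlated purely because of the shared season.
raw_corr = np.corrcoef(y_d, y_s)[0, 1]

# Stand-in for y - yhat: subtract the known seasonal term from D.
epsilon = y_d - season
resid_corr = np.corrcoef(epsilon, y_s)[0, 1]

print(f"corr(y_D, y_S)     = {raw_corr:.2f}")
print(f"corr(epsilon, y_S) = {resid_corr:.2f}")
```

The raw correlation comes out high even though the two series are unrelated beyond their seasonality, while the residual correlation stays near zero, which is why the residual is the safer quantity to screen shifts against.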