Issue
I am struggling to understand the concept of the p-value and the other results returned by the adfuller test.
The code I am using (I found it on Stack Overflow):
import numpy as np
import os
import pandas as pd
import statsmodels.api as sm
import cython
import statsmodels.tsa.stattools as ts
loc = r"C:\Stock Study\Stock Research\Hist Data"
os.chdir(loc)
xl_file1 = pd.ExcelFile("HDFCBANK.xlsx")
xl_file2 = pd.ExcelFile("KOTAKBANK.xlsx")
y1 = xl_file1.parse("Sheet1")
x1 = xl_file2.parse("Sheet1")
x = x1['Close']
y = y1['Close']
def cointegration_test(y, x):
    # Step 1: regress one variable on the other
    ols_result = sm.OLS(y, x).fit()
    # Step 2: obtain the residuals (ols_result.resid)
    # Step 3: apply the Augmented Dickey-Fuller test to see whether
    # the residuals have a unit root
    return ts.adfuller(ols_result.resid)

# Run the test on the two closing-price series and show the result tuple
print(cointegration_test(y, x))
The output:
(-1.8481210964862593, 0.35684591783869046, 0, 1954, {'10%': -2.5675580437891359, '1%': -3.4337010293693235, '5%': -2.863020285222162}, 21029.870846458849)
If I understand the test correctly:
Returned value | Description |
---|---|
adf : float | Test statistic |
pvalue : float | MacKinnon’s approximate p-value based on MacKinnon (1994, 2010) |
usedlag : int | Number of lags used |
nobs : int | Number of observations used for the ADF regression and calculation of the critical values |
critical values : dict | Critical values for the test statistic at the 1%, 5%, and 10% levels. Based on MacKinnon (2010) |
icbest : float | The maximized information criterion if autolag is not None |
resstore : ResultStore, optional | |
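To make the table concrete, here is a minimal sketch that attaches those field names to the tuple printed in the output above. The `AdfResult` class is a hypothetical helper introduced only for illustration; it is not part of statsmodels, and the numbers are copied verbatim from the output rather than recomputed.

```python
from typing import Dict, NamedTuple


class AdfResult(NamedTuple):
    """Hypothetical container naming the six values adfuller returned here."""
    adf: float                         # test statistic
    pvalue: float                      # MacKinnon approximate p-value
    usedlag: int                       # number of lags used
    nobs: int                          # observations used in the ADF regression
    critical_values: Dict[str, float]  # critical values at 1%, 5%, 10%
    icbest: float                      # maximized information criterion


# The tuple from the output above, copied as-is.
raw = (-1.8481210964862593, 0.35684591783869046, 0, 1954,
       {'10%': -2.5675580437891359, '1%': -3.4337010293693235,
        '5%': -2.863020285222162}, 21029.870846458849)

result = AdfResult(*raw)

print(f"test statistic : {result.adf:.4f}")
print(f"p-value        : {result.pvalue:.4f}")
print(f"lags used      : {result.usedlag}")
print(f"observations   : {result.nobs}")
for level in ("1%", "5%", "10%"):
    print(f"critical value at {level:>3}: {result.critical_values[level]:.4f}")
```

Reading the values by name rather than by position makes it harder to mix up the test statistic and the p-value.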
I am unable to completely understand the results and was hoping someone would be willing to explain them in layman's language. All the explanations I am finding are very technical.
My interpretation is: they are cointegrated, i.e. we failed to disprove the null hypothesis (i.e. a unit root exists). The confidence levels are the % numbers.
Am I completely wrong?
Solution
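Part of what you stated in your question is correct. When you applied the adfuller test to your OLS regression residuals, you were checking whether the residuals have a unit root, in other words, whether they are non-stationary. (This is not the same thing as heteroscedasticity, which refers to non-constant variance.)
If the adfuller p-value is lower than a chosen alpha (e.g. 5%), you may reject the null hypothesis (Ho), because a test statistic that extreme would be very unlikely if Ho were true. In your output the p-value is about 0.357, well above 5%, and the test statistic (-1.848) is not below even the 10% critical value (-2.568), so Ho cannot be rejected.
Only when Ho is rejected can the alternative hypothesis (Ha) be accepted, which in this case would be: the residual series is stationary, and that is what would indicate cointegration. Failing to reject Ho means this test gives no evidence that the two series are cointegrated, so the "they are cointegrated" part of your interpretation does not follow. The % numbers are significance levels for the critical values rather than confidence levels.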
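The decision rule can be sketched as two small functions applied to the numbers from the question's output. The function names and the default `alpha=0.05` are assumptions made for illustration. Note also that the ADF test is left-tailed, so "more extreme" means "more negative".

```python
def reject_unit_root(pvalue, alpha=0.05):
    """Return True when the unit-root null (Ho) is rejected at level alpha."""
    return pvalue < alpha


def levels_rejected(adf_stat, critical_values):
    """List the significance levels at which Ho is rejected.

    The ADF test is left-tailed: Ho is rejected only when the test
    statistic is below (more negative than) the critical value.
    """
    return [level for level, crit in critical_values.items() if adf_stat < crit]


# Values copied from the output in the question.
adf_stat = -1.8481210964862593
pvalue = 0.35684591783869046
critical_values = {'10%': -2.5675580437891359,
                   '1%': -3.4337010293693235,
                   '5%': -2.863020285222162}

print(reject_unit_root(pvalue))                    # False
print(levels_rejected(adf_stat, critical_values))  # []
```

Both checks agree: with these numbers the unit-root null is not rejected at any of the three reported levels.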
Here are the two hypotheses:
Ho: the series has a unit root, so it is not stationary. In other words, your residual depends strongly on its own past (i.e. yt depends on yt-1, yt-1 depends on yt-2, and so on), and shocks do not die out.
Ha: the series is stationary (that is normally what we want from regression residuals), and nothing more needs to be done.
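To build intuition for "yt depends on yt-1", here is a rough sketch on simulated data, using only the standard library. It is not the ADF test itself. It only estimates the coefficient rho in yt = rho * yt-1 + noise for a random walk (rho = 1, a unit root) and for a stationary series (rho = 0.5). The helper names, seed, and sample size are all assumptions chosen for illustration.

```python
import random


def simulate_ar1(rho, n, seed):
    """Simulate y_t = rho * y_{t-1} + e_t with standard normal noise e_t."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


def estimate_rho(y):
    """OLS slope (no intercept) from regressing y_t on y_{t-1}."""
    num = sum(cur * prev for cur, prev in zip(y[1:], y[:-1]))
    den = sum(prev * prev for prev in y[:-1])
    return num / den


random_walk = simulate_ar1(rho=1.0, n=2000, seed=1)  # unit root (Ho-like)
stationary = simulate_ar1(rho=0.5, n=2000, seed=1)   # stationary (Ha-like)

rho_rw = estimate_rho(random_walk)
rho_st = estimate_rho(stationary)

print(f"estimated rho, random walk : {rho_rw:.3f}")
print(f"estimated rho, stationary  : {rho_st:.3f}")
```

The estimate for the random walk should come out very close to 1, while the stationary series gives a value clearly below 1. The ADF test formalizes this comparison: it asks whether rho is significantly below 1, using its own critical values because the usual t-distribution does not apply under Ho.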
Answered By - Philipe Riskalla Leal Answer Checked By - Cary Denson (PHPFixing Admin)