How to perform two-sample one-tailed t-test in Python
In Python, we can use ttest_ind from scipy.stats to perform a two-sample one-tailed t-test. Assume that our hypotheses are:
H0 (null hypothesis): mean of P1 >= mean of P2
Ha (alternative hypothesis): mean of P1 < mean of P2
In this case, we know that P1 is drawn from a normal distribution with mean 6 and standard deviation 2, with 400 data points. P2 is drawn from a normal distribution with mean 3 and the same standard deviation and sample size as P1.
How can we interpret the results?
According to Stat Trek, when the null hypothesis is 6 >= 3, the t-score should be about 21.2, with 798 degrees of freedom and a standard error (SE) of 0.1414. The Stat Trek calculator gives us a p-value equal to 1.
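These figures can be reproduced by hand. The sketch below assumes a pooled-variance (equal variance) t-test applied to the stated population parameters (means 6 and 3, standard deviation 2, 400 points per sample); the variable names are illustrative only.

```python
import math
from scipy import stats

# Population parameters assumed from the text above.
mean1, mean2 = 6.0, 3.0
sigma, n = 2.0, 400

# Standard error of the difference between two means (equal variances).
se = math.sqrt(sigma**2 / n + sigma**2 / n)

# Degrees of freedom for the pooled two-sample t-test.
df = 2 * n - 2

# Expected t-score if the sample means equal the population means.
t_score = (mean1 - mean2) / se

# One-tailed p-value for Ha: mean1 < mean2 is the lower-tail probability.
p_value = stats.t.cdf(t_score, df)

print(f"SE={se:.4f}, df={df}, t={t_score:.1f}, p={p_value:.4f}")
```

With real samples the t-score will differ slightly from this idealized value, because the sample means and variances are only estimates.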
You might notice that whether we write ttest_ind(P1, P2) or ttest_ind(P2, P1), the sign of the t-statistic changes but the p-value does not. Why? By default, SciPy's ttest_ind performs a two-tailed test, so the p-value it returns is a two-tailed p-value. SciPy versions before 1.6.0 give us no option to perform a one-tailed two-sample test; later versions accept an alternative argument for this purpose.
Therefore, the correct way to test our null hypothesis in Python is as follows.
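A quick way to see this symmetry is to run the test in both orders on the same samples. The snippet below is a minimal sketch; the seed and the names forward and reverse are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
P1 = rng.normal(6, 2, 400)      # mean 6, standard deviation 2
P2 = rng.normal(3, 2, 400)      # mean 3, standard deviation 2

forward = stats.ttest_ind(P1, P2, equal_var=True)
reverse = stats.ttest_ind(P2, P1, equal_var=True)

# The statistic changes sign, but the two-tailed p-value stays the same.
print(forward.statistic, forward.pvalue)
print(reverse.statistic, reverse.pvalue)
```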
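import numpy as np
from scipy import stats
P1 = np.random.normal(6, 2, 400)  # mean 6, standard deviation 2, 400 points
P2 = np.random.normal(3, 2, 400)  # mean 3, standard deviation 2, 400 points
result = stats.ttest_ind(P1, P2, axis=0, equal_var=True)  # two-tailed by default
print(result)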
You will see a result like the one below; the exact numbers differ from run to run because the samples are random.
Ttest_indResult(statistic=21.374858126615408, pvalue=1.6807582123709593e-80)
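The p-value above is two-tailed. The real (one-tailed) p-value for our null hypothesis P1 >= P2 is computed as follows, where result is the Ttest_indResult object returned by stats.ttest_ind:
real_t_score = result.statistic
real_pvalue = 1 - result.pvalue / 2  # = 1 - 1.6807582123709593e-80 / 2 = 1 - 0.84e-80, which is approximately 1
Because the t-statistic is positive, the one-tailed p-value is 1 - pvalue/2; if it were negative, the one-tailed p-value would be pvalue/2.
As the real p-value is so close to 1, we cannot reject the null hypothesis that P1 >= P2 (6 >= 3).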
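The conversion above depends on the sign of the t-statistic, so it can help to wrap it in a small function. The helper below, one_tailed_pvalue, is a hypothetical name introduced for illustration; it is a sketch that assumes the alternative hypothesis is that the first sample's mean is smaller than the second's. On SciPy 1.6.0 or later, the result can be cross-checked against alternative="less".

```python
import numpy as np
from scipy import stats

def one_tailed_pvalue(t_stat, p_two_sided):
    """One-tailed p-value for Ha: mean(first) < mean(second)."""
    # A negative t supports Ha, so take half of the two-tailed p-value;
    # a positive t contradicts Ha, so take the complement of that half.
    if t_stat < 0:
        return p_two_sided / 2
    return 1 - p_two_sided / 2

rng = np.random.default_rng(0)  # arbitrary seed for repeatability
P1 = rng.normal(6, 2, 400)
P2 = rng.normal(3, 2, 400)

two_sided = stats.ttest_ind(P1, P2, equal_var=True)
manual_p = one_tailed_pvalue(two_sided.statistic, two_sided.pvalue)

# Requires SciPy >= 1.6.0, where the `alternative` argument was added.
builtin = stats.ttest_ind(P1, P2, equal_var=True, alternative="less")

print(manual_p, builtin.pvalue)
```

Both approaches should agree; the manual version is mainly useful when an older SciPy release is the only one available.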