Why Test For Statistical Significance?
So we’ve done all of this analysis and we think we’ve found a relationship between two things that we think we can exploit. Perhaps it’s the motion of a stock or the performance of a team in bad weather. We’ve found what we think is a strong correlation.
We can’t look into the future. Nor can we always see everything that has happened in the past. But we have this data in front of us. How do we know if the relationship between the data that we discovered is coincidence or not? Perhaps it is just the sample that we took that has this relationship and not the population as a whole.
The human brain is good at finding patterns in randomness. We WANT there to be some relationship that we can use, something that makes sense, instead of just randomness. Testing for Statistical Significance helps us judge whether the patterns we are seeing could easily have been produced by chance alone, or in other words whether they may be just wishful thinking.
The “Null Hypothesis”
If you start reading about statistical significance, you will quickly run into the term “Null Hypothesis.” In statistical testing, the Null Hypothesis refers to the case where the elements that you are comparing are not related. Under the Null Hypothesis, any apparent relationship in your results is completely due to chance.
What statistical tests do is determine how likely it would be to see results at least as extreme as yours if the Null Hypothesis were true. They translate that likelihood into a p-value.
What are these p-values?
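When you are reading about statistical significance you will see people refer to “p-values.” The p-value is a (hopefully) small number that indicates how likely it is that you would see a relationship at least as strong as the one you detected if the Null Hypothesis were true. Put another way, it is the probability that pure chance would produce results like yours. Note that this is not quite the same thing as the probability that the Null Hypothesis is true, although the two are often confused.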
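To make the idea concrete, here is a minimal sketch of a permutation test in Python, using only the standard library. It is one simple way to estimate a p-value for a correlation, not the only one. It shuffles one of the two series many times to simulate “no relationship” and counts how often chance alone produces a correlation at least as strong as the one observed. The function names, the sample data, and the number of shuffles are all made up for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate a two-sided p-value by shuffling ys to break any real link."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Count shuffles where chance alone matches or beats what we observed.
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate from ever being exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical data: two series that rise together almost perfectly.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.7, 9.3, 9.8, 11.2, 12.1, 12.8]

p_value = permutation_p_value(xs, ys)
print("estimated p-value:", p_value)
```

A small result here means that shuffled, unrelated data almost never looks as strongly correlated as the real data does.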
What are the limits of these statistical tests?
First, it’s usually impossible to know for certain whether or not the results are due to chance. The tests only tell you how likely results like yours would be if chance alone were at work. It is common to say the results are statistically significant if p < 0.05 (results at least this extreme would turn up less than 5 percent of the time by chance alone). Depending on the application, sometimes you will use the stricter p < 0.01, meaning that such results would turn up less than 1 percent of the time by chance alone.
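One way to get a feel for what these cutoffs mean is to run many experiments in which there is truly nothing to find and see how often a test calls the result significant anyway. The sketch below does this with a simple z-test on random numbers. It assumes normally distributed data with a known standard deviation of 1, and the function names, trial count, and sample size are illustrative choices rather than anything standard.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def z_test_p_value(sample, null_mean=0.0, sigma=1.0):
    """Two-sided p-value for a z-test with a known standard deviation."""
    z = (mean(sample) - null_mean) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(alpha, trials=2000, n=30, seed=1):
    """Fraction of pure-chance experiments flagged as significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The Null Hypothesis is true by construction: the real mean is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            hits += 1
    return hits / trials


rate_05 = false_positive_rate(alpha=0.05)
rate_01 = false_positive_rate(alpha=0.01)
print("flagged at p < 0.05:", rate_05)
print("flagged at p < 0.01:", rate_01)
```

The two rates should land close to 0.05 and 0.01. That is what the thresholds buy you: a known, small rate of false alarms, not a guarantee.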
Tests for Statistical Significance also only tell us whether two things appear to be related. By themselves they do not tell us anything about how strong the relationships are. Measures of effect size, such as the correlation coefficient, do that. Weaker correlations require a larger sample size before a statistical test can detect them.
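The link between correlation strength and sample size can be sketched with the Fisher z-transformation, a standard approximation for testing whether a correlation differs from zero. The code below is only a rough illustration: the approximation assumes roughly normal data, the function name is made up, and the example correlation of 0.2 is arbitrary.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_correlation_p_value(r, n):
    """Approximate two-sided p-value for correlation r from n paired samples.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated as
    a standard normal value when the true correlation is zero.
    """
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same weak correlation, measured on a small and a large sample.
p_small_sample = approx_correlation_p_value(0.2, 30)
p_large_sample = approx_correlation_p_value(0.2, 200)
# A stronger correlation on the small sample, for comparison.
p_strong_small = approx_correlation_p_value(0.6, 30)

print("r = 0.2, n = 30: ", round(p_small_sample, 4))
print("r = 0.2, n = 200:", round(p_large_sample, 4))
print("r = 0.6, n = 30: ", round(p_strong_small, 4))
```

The weak correlation fails the p < 0.05 cutoff with 30 samples but passes it with 200, while the stronger correlation passes even on the small sample. The correlation itself is no stronger in the large sample; there is simply more evidence that it is real.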
The standard tests for significance do not test for fraud, collusion, or errors in data collection.
There are more advanced techniques that statisticians can use to test for these things, including utilizing artificial intelligence in the form of artificial neural nets to detect fraud.
What tests are used to compute the p-values?
The type of test you do depends on what information you have about the population as a whole, how large the sample is, and what type of sampling you have done.
Two of the main tests used to determine statistical significance are the Chi-Square test and the t-test.
The Chi-Square test is used to determine how significant the difference between the observed distribution and the expected distribution is. One way this can be used is to see if there are any biases in patterns of random numbers.
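As a rough illustration, the Chi-Square statistic itself is simple to compute: for each category, square the gap between the observed and expected counts, divide by the expected count, and add up the results. The sketch below uses made-up counts from 120 rolls of a die. The function name and the data are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical tallies for faces 1 through 6 after 120 rolls.
observed_counts = [18, 22, 16, 25, 19, 20]
# A fair die should show each face about 120 / 6 = 20 times.
expected_counts = [20] * 6

chi_square = chi_square_statistic(observed_counts, expected_counts)
print("chi-square statistic:", chi_square)
```

The larger this number, the further the observed counts are from what was expected. Turning it into a p-value requires the Chi-Square distribution, either from a printed table or from a statistics library.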
The t-test is used to compare averages. It can be used to test whether the means of two distributions are equal, or whether a single population (that can only be sampled, not examined in full) has a particular mean.
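Here is a minimal sketch of the two-sample case, using the Welch version of the t-test (which does not assume the two groups have equal variance). It computes only the t statistic and its approximate degrees of freedom. Converting these to a p-value needs the t distribution, which is not in the Python standard library, so that step is left to a table or a statistics package. The function name and the two samples are invented for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / size).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical measurements from two groups.
group_a = [5.1, 4.9, 5.0, 5.2, 4.8]
group_b = [5.6, 5.4, 5.5, 5.7, 5.3]

t_stat, df = welch_t(group_a, group_b)
print("t statistic:", round(t_stat, 3))
print("degrees of freedom:", round(df, 3))
```

A t statistic far from zero means the gap between the two averages is large compared with the scatter inside each group.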
Degrees of Freedom
In statistics, “degrees of freedom” are the number of values that are free to vary. Here is an example. Suppose we want to determine if a random number generator is really picking each of the numbers from 0 to 9 in a uniform manner. We will use a Chi-Square test on the 10 random variables representing the number of occurrences of each digit. At first glance it may seem there are 10 degrees of freedom, one for each random variable. There are actually 9 degrees of freedom because the value of the 10th random variable is determined exactly by the results of the first 9 (it equals the sample size minus the sum of the first 9).
Degrees of Freedom are an important part of the computations and a difficult concept to master, especially in complicated situations.
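The digit example can be sketched in a few lines of Python. The code below draws digits from Python’s own random number generator, shows that the tenth count is fixed once the first nine are known, and compares the Chi-Square statistic with 16.919, the usual table value for 9 degrees of freedom at p = 0.05. The sample size, seed, and variable names are arbitrary choices for illustration.

```python
import random
from collections import Counter

n = 10_000
rng = random.Random(42)
digits = [rng.randrange(10) for _ in range(n)]

tally = Counter(digits)
counts = [tally[d] for d in range(10)]

# Once the first 9 counts are known, the 10th has no freedom left.
implied_last = n - sum(counts[:9])
print("count of 9s:", counts[9], "implied by the others:", implied_last)

# So the test uses 10 - 1 = 9 degrees of freedom.
degrees_of_freedom = len(counts) - 1

expected = n / 10  # a uniform generator gives each digit equal weight
chi_square = sum((c - expected) ** 2 / expected for c in counts)

# Table value for 9 degrees of freedom at the 0.05 level.
CRITICAL_05_DF9 = 16.919
looks_uniform = chi_square < CRITICAL_05_DF9

print("degrees of freedom:", degrees_of_freedom)
print("chi-square statistic:", round(chi_square, 3))
print("consistent with uniform at p = 0.05:", looks_uniform)
```

Using 10 degrees of freedom here by mistake would mean comparing against the wrong table value, which is why getting the count right matters.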
Computing Statistical Significance
It is beyond the scope of this article to give a full explanation of how to compute statistical significance. If you are interested in the technical details, here are a few very readable resources: one for the t-test and one for the Chi-Square test.
And if you want a REALLY technical discussion, check out the Wikipedia pages for the various statistical tests. Just keep in mind they were written for technical precision, not readability.