Statistical hypothesis – using the t-test

T. Dhasaratharaman*

Statistician, Kauvery Hospitals, India

*Correspondence: Tel.: +91 90037 84310; email: dhasa.cst@kauveryhospital.com

The t-test is a statistical hypothesis test in which the test statistic follows a Student’s t-distribution under the null hypothesis.

A t-test is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student’s t-distribution.
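To make the "scaling term" concrete, the one-sample case can be written out as below. This is only an illustrative sketch: the symbols are the conventional textbook ones rather than notation used in this article, and the data are assumed to be normally distributed. With a known population standard deviation σ the statistic is normal; replacing σ by the sample estimate s gives a t statistic with n − 1 degrees of freedom.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \sim N(0, 1)
\qquad \text{versus} \qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1},
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```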

The t-test can be used, for example, to determine if the means of two sets of data are significantly different from each other.

Among the most frequently used t-tests are:

  • one-sample location test of whether the mean of a population has a value specified in a null hypothesis.
  • two-sample location test of the null hypothesis that the means of two populations are equal. All such tests are usually called Student’s t-tests, though strictly speaking that name should only be used if the variances of the two populations are also assumed to be equal; the form of the test used when this assumption is dropped is sometimes called Welch’s t-test. These tests are often referred to as unpaired or independent samples t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping.
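The difference between the equal-variance (Student's) and unequal-variance (Welch's) forms of the two-sample test can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the two small data sets are made up for the example and do not come from any study.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    se = math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return (mean(a) - mean(b)) / se, df


# Made-up illustrative data, not taken from the article.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.6]

print(student_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

When the two groups are the same size, as here, both forms give the same t statistic and differ only in the degrees of freedom, which Welch's form reduces when the sample variances are unequal.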

Parametric statistics are used to compare samples of “normally distributed” data. If the data do not follow a normal distribution, these tests should not be used.

Interpretation

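A parametric test is any test which requires the data to follow a specific distribution, usually a normal distribution. Common parametric tests are the t-test and analysis of variance; the χ² test is often grouped with them, although it is applied to counts and does not require normally distributed data.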

Analysis of variance (ANOVA)

This is a group of statistical techniques used to compare the means of two or more samples to see whether they come from the same population – the “null hypothesis”. These techniques can also allow for independent variables which may have an effect on the outcome.

t-test (also known as Student’s t)

t-tests are typically used to compare just two samples. They assess how probable the observed difference in sample means would be if both samples came from populations with the same mean value.

χ2 test

Another frequently used test is the χ² test, which is applied to counts in categories rather than to means. It is not discussed further here.

Example

Two hundred adults seeing an asthma nurse specialist were randomly assigned to either a new type of bronchodilator or placebo.

After 3 months the peak flow rates in the treatment group had increased by a mean of 96 l/min (SD 58), and in the placebo group by 70 l/min (SD 52). The null hypothesis is that there is no difference between the bronchodilator and the placebo.

Assuming 100 patients in each group, the t statistic is about 3.34, giving a P value of about 0.001. A difference this large would therefore be expected only about 1 time in 1000 if the null hypothesis were true, so we reject the null hypothesis and conclude that the new bronchodilator is significantly better than the placebo.

(How these values are calculated is beyond the scope of this article.)
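For readers who do want to check the figures, the sketch below recomputes the test from the summary statistics quoted in the example. It rests on two assumptions that the example does not state: that the 200 adults were split 100 and 100, and that an equal-variance (pooled) two-sample t-test is intended. The P value is obtained by numerically integrating the t density with Simpson's rule, so it is an approximation; the helper names are made up for this illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1.0 - 2.0 * central


# Summary figures from the example; the 100/100 split is an assumption.
n1 = n2 = 100
mean1, sd1 = 96.0, 58.0  # bronchodilator group
mean2, sd2 = 70.0, 52.0  # placebo group

pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (mean1 - mean2) / se
df = n1 + n2 - 2

print(f"t = {t:.2f}, df = {df}, P = {two_sided_p(t, df):.4f}")
```

Under these assumptions t comes out at about 3.34 on 198 degrees of freedom, with a two-sided P value of roughly 0.001. The square of this t, about 11.14, is the F statistic that a one-way analysis of variance would report for the same two groups.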

Caution

Parametric tests should only be used when the data follow a “normal” distribution. You may find reference to the Kolmogorov–Smirnov test. This tests the hypothesis that the collected data are from a normal distribution and therefore helps to assess whether parametric statistics can be used.
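The idea behind this normality check can be sketched as follows: the test statistic D is the largest vertical gap between the empirical cumulative distribution of the data and the cumulative distribution of a normal curve. The code below computes D only, using the Python standard library; the function name and the sample values are made up for illustration. Turning D into a P value needs tables or a statistics package, and when the mean and SD of the reference normal curve are estimated from the same data (as here) the standard critical values are too lenient, so a corrected version such as the Lilliefors test is normally used.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(data, dist):
    """Largest gap between the empirical CDF of data and dist.cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x,
        # so both sides of the jump are compared with the fitted curve.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Made-up peak-flow changes (l/min), for illustration only.
sample = [62, 71, 75, 80, 84, 88, 91, 95, 103, 118]
fitted = NormalDist(mean(sample), stdev(sample))

print(round(ks_statistic(sample, fitted), 3))
```

A small D means the data sit close to the fitted normal curve; a large D suggests a departure from normality.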

Sometimes authors will say that they have “transformed” data and then analyzed it with a parametric test. This is quite legitimate – it is not cheating! For example, a skewed distribution might become normally distributed if the logarithm of the values is used.
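The effect of a log transformation on a skewed distribution can be seen in a small sketch. The values below are made up so that each is double the previous one, which gives a long right tail; the `skewness` helper is a simple population skewness written for this illustration, not a function from a statistics package.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for symmetric data)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


# Made-up right-skewed values (each is double the previous one).
raw = [1, 2, 4, 8, 16, 32, 64]
logged = [math.log(v) for v in raw]

print(round(skewness(raw), 2))     # clearly positive: long right tail
print(round(skewness(logged), 2))  # essentially zero: symmetric after the log
```

Real data will rarely become perfectly symmetric, so the transformed values should still be checked for normality before a parametric test is applied.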

