ANOVA

25 May 2015


Analysis of Variance

So far, we’ve used t-tests to compare the mean of a sample with the mean of a population. Then we saw the more practical application of using t-tests to compare the means of two samples. In both cases, the outcomes we compared were continuous values.

We’ve also seen how to use a chi-squared test to compare the proportions of multiple groups. Those groups’ data were structured in a contingency table, which is a simple way to express outcomes that fall into discrete categories. In the contingency table example, we looked at “sick” vs. “not sick” categories.

Now we’ll move on to Analysis of Variance, which is like a t-test in that the samples will have continuous outcomes, but like a chi-squared test in that we’re comparing multiple groups simultaneously.

The following images, taken from a Stanford lecture’s slide deck, show how t-tests, chi-squared tests, and ANOVA relate to each other and to many other statistical methods that we may get to later.

Stat Methods Chart (Continuous): ANOVA falls into the category where outcome variables are continuous and observations are independent. T-tests fall into the same category, but can only be used to compare two groups.

Stat Methods Chart (Discrete): Chi-squared tests fall into the category where outcome variables are categorical and observations are independent. Chi-squared tests can be used to compare two or more groups.

ANOVA is most easily described by way of example, so we will use the one from Khan Academy’s Inferential Stats video on ANOVA.

ANOVA Example

The data we will use is an arbitrary 3×3 grid of outcomes. A data set this small would normally call for one of the other methods from the charts above, but we’ll use ANOVA anyway to keep the example easy to follow.

| Group 1 | Group 2 | Group 3 |
|---------|---------|---------|
| 3       | 5       | 5       |
| 2       | 3       | 6       |
| 1       | 4       | 7       |

ANOVA is calculated by (surprise) analyzing the variance of the data. This analysis is done within each group and then between the groups, and the two variances are compared to judge whether any group’s outcomes differ statistically. To calculate the variances, we’ll need the grand mean (\(\bar{\bar x}\)) and each group’s mean (\(\bar x_i\)).

\[\bar{\bar x} = 4\]

\[\bar x_1 = 2\] \[\bar x_2 = 4\] \[\bar x_3 = 6\]
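As a quick sanity check, here’s a minimal NumPy sketch that reproduces these means (the `groups` array and variable names are just for illustration, not from the original post):

```python
import numpy as np

# Each row holds one group's outcomes: Group 1, Group 2, Group 3.
groups = np.array([[3, 2, 1],
                   [5, 3, 4],
                   [5, 6, 7]])

grand_mean = groups.mean()          # 4.0
group_means = groups.mean(axis=1)   # array([2., 4., 6.])
```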

The first variance we will calculate is the variance within all the groups. The idea is to add up all the variances within each group into a single value. The final value will be the Sum of Squares Within (SSW).

\[ \begin{eqnarray} SSW &=& \sum_i \sum_j (x_{ij} - \bar x_i)^2 \cr &=& (3-2)^2 + (2-2)^2 + (1-2)^2 + (5-4)^2 + (3-4)^2 + (4-4)^2 + (5-6)^2 + (6-6)^2 + (7-6)^2 \cr &=& 1 + 0 + 1 + 1 + 1 + 0 + 1 + 0 + 1 \cr &=& 6 \end{eqnarray} \]
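In code, SSW is just the squared deviation of every value from its own group’s mean, summed up. A minimal sketch, repeating the hypothetical `groups` array so the snippet stands alone:

```python
import numpy as np

groups = np.array([[3, 2, 1],   # Group 1
                   [5, 3, 4],   # Group 2
                   [5, 6, 7]])  # Group 3

# Each group's mean, kept as a column so it broadcasts across its row.
group_means = groups.mean(axis=1, keepdims=True)

# Squared deviation of every value from its own group's mean, summed.
ssw = ((groups - group_means) ** 2).sum()
print(ssw)  # 6.0
```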

The second variance to calculate is the variance between the groups: how does each group’s mean differ from the grand mean? This value is called the Sum of Squares Between (SSB).

\[ \begin{eqnarray} SSB &=& \sum_i n_i (\bar x_i - \bar{\bar x})^2, \text{where } n_i \text{ is the size of group i} \cr &=& 3 \times (2-4)^2 + 3 \times (4-4)^2 + 3 \times (6-4)^2 \cr &=& 12 + 0 + 12 \cr &=& 24 \end{eqnarray} \]
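SSB follows the same pattern, with each group’s mean compared against the grand mean and weighted by the group’s size. Another standalone sketch with the same illustrative data:

```python
import numpy as np

groups = np.array([[3, 2, 1],
                   [5, 3, 4],
                   [5, 6, 7]])

grand_mean = groups.mean()
group_means = groups.mean(axis=1)
n_i = groups.shape[1]  # every group has 3 observations here

# Each group mean's squared deviation from the grand mean,
# weighted by the group's size, summed.
ssb = (n_i * (group_means - grand_mean) ** 2).sum()
print(ssb)  # 24.0
```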

We also need the degrees of freedom (d.f.) for each of these variances.

The d.f. for SSW is m * (n - 1), where m is the number of groups and n is the size of each group. The reason is that once a group’s mean is fixed, only n - 1 of its values are free to vary (the last one can be derived from the mean and the others), and this holds for each of the m groups. So for us, the d.f. of SSW is 3 * (3 - 1) = 6.

The d.f. for SSB is m - 1. The reason is that if the grand mean is fixed, then only m - 1 of the group means can vary. So in this case, the d.f. of SSB is 2.
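Expressed in code (the variable names are illustrative), the two degree-of-freedom counts are:

```python
m, n = 3, 3                # m groups, n observations per group
df_within = m * (n - 1)    # 6: in each group, the last value is pinned by the group mean
df_between = m - 1         # 2: the last group mean is pinned by the grand mean
```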

Now we need to introduce a new distribution to test the statistics we’ve calculated: the F-distribution. We don’t need to know too much about the ins and outs of the F-distribution, but much like we used z-tables and t-tables for hypothesis testing, we can use an F-table when we’re dealing with ANOVA.

We will use an F-test, comparing our test statistic against the F-table. The F-test allows us to examine what is known as the “omnibus” null hypothesis, which simply states that the true means of all our groups are equal.

\[ \begin{eqnarray} &H_0&: \mu_1 = \mu_2 = \mu_3 \cr &H_1&: \text{At least one mean differs from the others} \end{eqnarray} \]

Let’s use \(\alpha = 0.10\) for this example.

The definition of the F-test statistic is the following, where \(df_1\) is the d.f. of SSB and \(df_2\) is the d.f. of SSW.

\[F = \frac{\frac{SSB}{df_1}}{\frac{SSW}{df_2}} = \frac{\frac{24}{2}}{\frac{6}{6}} = 12\]
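Plugging in the values from above is a one-liner; here’s a sketch reusing the hypothetical names from the earlier snippets:

```python
ssb, ssw = 24.0, 6.0   # Sum of Squares Between / Within, from above
df1, df2 = 2, 6        # d.f. of SSB and SSW

# Ratio of the mean square between groups to the mean square within groups.
f_stat = (ssb / df1) / (ssw / df2)
print(f_stat)  # 12.0
```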

Next, we can find the critical F-statistic for our \(\alpha\) using the F-table. Each table is for a different \(\alpha\) value. The first one is for \(\alpha = 0.10\), so that’s the one we’ll use.

So our critical F-statistic is 3.46330, which is much less than our calculated F value of 12, so we can safely reject the null hypothesis. Note that a different statistical method is required to determine which specific groups deviate from the others. For example, you could perform a t-test between groups 1 and 2 to see if their means differ.
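If you have SciPy handy, you can cross-check the whole calculation (and skip the table lookup entirely), since `scipy.stats.f_oneway` runs the same one-way ANOVA and `scipy.stats.f.ppf` returns the critical value directly:

```python
from scipy import stats

group1, group2, group3 = [3, 2, 1], [5, 3, 4], [5, 6, 7]

# One-way ANOVA across the three groups.
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f_stat, p_value)  # 12.0, 0.008 (well below alpha = 0.10)

# Critical F value for alpha = 0.10 with (2, 6) degrees of freedom.
print(stats.f.ppf(1 - 0.10, dfn=2, dfd=6))  # ~3.4633
```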

Moving Forward

Now that we have some of the most common inferential statistical methods in our toolbelt, we can look a lot more critically at scientific studies. Next week we will revisit the study talked about in my post on the illusion of causality to see how well their data actually supports the paper’s claims.
