The requirements for the fourth written response for my Statistics II course at UMass Amherst were: “Explain in your own terms the test for analysis of variance. What does an ANOVA test? Why is it useful? On what type of data would you apply it? What is the logic behind the test? Why does it work?” I have yet to receive a grade for this assignment but will update this post once it is issued. Here is my answer:
An ANOVA test is, in essence, a method to determine whether three or more population means are equal. It is useful because there are many scenarios in which we do not need to know the exact difference between population means, only whether they are equal. It is also a relatively simple way to judge whether the population means are roughly equal or clearly different. The ANOVA test requires that the populations being tested are normally distributed and have equal variances, and that the samples are independent of each other.
The ANOVA test uses two kinds of variation: the variation among the observations within each category and the variation between the categories. For instance, if we’re testing whether golf balls A, B, and C travel the same distance, the ANOVA test uses the differences between the sample means of A, B, and C and also the spread of the individual shots around their own sample mean within A, B, and C. The ANOVA test compares these two variations as a ratio (the F statistic, which is the between-category variance divided by the within-category variance). If the variation between categories is high relative to the variation within categories, the ANOVA test will lead us to reject the null hypothesis (that all population means are equal). This works because if the observations within each category are tightly clustered while the category means are spread far apart, then the differences between categories are unlikely to be due to chance alone, as the small within-category variation shows that each sample is following some consistent pattern.
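To make the comparison of the two variations concrete, here is a minimal sketch of the F statistic for a one-way ANOVA in plain Python. The distances for balls A, B, and C are made up for illustration, and the function name `one_way_anova_f` is my own, not something from the course.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(groups)                          # number of categories
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-category variation: how far each category mean
    # sits from the grand mean, weighted by sample size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-category variation: how far each observation
    # sits from its own category mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)        # between-category variance
    ms_within = ss_within / (n_total - k)    # within-category variance
    return ms_between / ms_within

# Hypothetical driving distances (yards) for three golf balls.
ball_a = [250, 252, 248, 251, 249]
ball_b = [260, 262, 258, 261, 259]
ball_c = [270, 272, 268, 271, 269]

f_stat = one_way_anova_f([ball_a, ball_b, ball_c])
print(f_stat)
```

A large F value like the one these tightly clustered, widely separated samples produce is what leads to rejecting the null hypothesis. In practice the F statistic is compared against the F distribution to get a p-value; SciPy's `scipy.stats.f_oneway` does both steps if that library is available.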
Furthermore, the ANOVA test is more reliable than running three separate pairwise hypothesis tests, because the chance of a false positive compounds across the tests, which lowers the overall confidence level below that of any single test.
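A rough sketch of that compounding, assuming each of the three tests is run at a 95% confidence level and treating the tests as independent (a simplification, since pairwise tests on the same samples overlap). The function name `overall_confidence` is hypothetical.

```python
def overall_confidence(confidence_per_test, num_tests):
    """Probability that none of the tests gives a false positive,
    assuming the tests are independent (a simplifying assumption)."""
    return confidence_per_test ** num_tests

# Three pairwise tests (A vs B, A vs C, B vs C), each at 95% confidence.
overall = overall_confidence(0.95, 3)
print(round(overall, 4))  # 0.8574
```

Under these assumptions the combined confidence falls to roughly 86%, whereas a single ANOVA test keeps it at the 95% we chose.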