Choosing Hypothesis Tests

November 11th, 2010

Tags: hypothesis testing, anova, t test, paired t test, chi square test

A question my students often ask is, “Which hypothesis test should I use and when?” This article offers some guidelines for answering that question. The available hypothesis tests are:

  • Continuous Variable Outcomes
    • T Test
    • Paired T Test
    • ANOVA (Analysis of Variance)
    • Test for Equal Variances
  • Discrete Variable Outcomes
    • Chi-Square

The following examples will address which test to use given a certain set of circumstances.  In hypothesis testing we are faced with answering the question, “Do the variables in my process make a difference, or not, if they are changed?” 

Continuous Variable Outcomes

The output, or outcome, in the process is measured on a continuous scale.  We will refer to the outcomes as the “Y”.  The input variables, or the things we will be changing, are varied between discrete settings, or levels.  The variable could be continuous, but the settings are specific and can be considered discrete.

Case 1: T Test

The T Test compares exactly two items, or two level settings. Let’s say we want to improve our gas mileage. The output Y is miles per gallon. The input for the T Test is a gasoline additive, with level settings of Yes (use the additive) and No (plain gasoline without additives). The sample size can be small using the T Test: run 5 tanks of fuel under each condition and measure the miles per gallon. The null hypothesis is that the average gas mileage is the same whether or not we use the additive. The alternative hypothesis is that the average gas mileage differs between Yes and No. A p value less than or equal to 0.05 leads us to reject the null hypothesis and conclude that there is a difference. A p value greater than 0.05 means the data do not show a significant difference, so the null hypothesis stands.
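The gas mileage comparison can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the mileage readings are made-up numbers, the function name `pooled_t` is hypothetical, and the sketch assumes the two conditions have equal variances (the pooled, or Student's, form of the test). Rather than computing an exact p value, it compares the t statistic against the tabulated two-sided critical value for 8 degrees of freedom, which is equivalent to checking whether the p value falls below 0.05.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical miles-per-gallon readings, 5 tanks per condition (made-up data).
no_additive = [24.1, 23.8, 24.5, 24.0, 23.6]
with_additive = [25.2, 24.9, 25.6, 25.0, 24.8]


def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(b) - mean(a)) / std_err, df


t_stat, df = pooled_t(no_additive, with_additive)

# Two-sided critical value of t for alpha = 0.05 and df = 8 (standard t table).
T_CRITICAL = 2.306
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.2f}, df = {df}, reject null: {reject_null}")
```

With these illustrative readings the t statistic is well beyond the critical value, so the null hypothesis of equal mileage would be rejected.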

Case 2: Paired T Test

In the Paired T Test only two items can be tested, but the tests are run concurrently, in pairs of both items. We use the pairing technique when environmental factors may influence the outcomes. We want that “noise” to have an equal chance to affect both test subjects, and running the tests concurrently assures this equality of noise distribution. In this case, we will test two hull designs for nautical speed. Testing will be carried out over several days, so ocean conditions such as wind speed, wind direction, wave height, and currents will definitely be changing. Both hull designs will be subjected to the same conditions because we conduct the tests simultaneously in pairs. The plan is to conduct 5 races over the course of one week. If the p value in the Paired T Test is less than or equal to 0.05, then the hull design with the greater nautical speed can be declared the winner because the test shows a significant difference. If the p value is greater than 0.05, then we need to go back to the drawing board because the test did not detect a significant difference between the hull designs.
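As a rough sketch of the hull comparison, the paired test works on the race-by-race differences rather than on the two sets of speeds separately, which is how the day-to-day ocean "noise" cancels out. The speeds below are made-up numbers (in knots) and the variable names are hypothetical. The t statistic is compared with the tabulated two-sided critical value for 4 degrees of freedom (5 pairs minus 1).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical speeds in knots for 5 races; each race runs both hulls together.
hull_a = [12.1, 10.4, 13.0, 11.2, 9.8]
hull_b = [12.6, 10.8, 13.6, 11.5, 10.4]

# The paired test analyzes the per-race differences, so conditions that
# affect both hulls equally on a given day drop out of the comparison.
diffs = [b - a for a, b in zip(hull_a, hull_b)]
n_pairs = len(diffs)
df = n_pairs - 1

# t = mean difference divided by its standard error.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n_pairs))

# Two-sided critical value of t for alpha = 0.05 and df = 4 (standard t table).
T_CRITICAL = 2.776
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.2f}, df = {df}, reject null: {reject_null}")
```

Note that the speeds vary by several knots from race to race while the differences between hulls stay within a few tenths of a knot. Pairing is what lets such a small, consistent difference stand out.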

Case 3: ANOVA

Analysis of Variance, or ANOVA, is very powerful because there is essentially no limit to the number of items, or level settings, that can be evaluated during the testing. We are limited only by practicality. In this case we want to determine if there is a difference in the distance a golf ball can travel. The outcome Y is the distance in yards. We will test Pinnacle, Nike, Titleist, Srixon, Bridgestone, and Callaway. A robot with one type of golf club will be used to launch the golf balls. Swing speed and force will be the same for each test subject. Twenty of each ball will be launched and the driving distance will be measured. As in all of these hypothesis tests, the p value is the measuring stick for declaring whether a difference exists or not. When the p value is less than or equal to 0.05, we declare at the 95% confidence level that a significant difference exists, meaning at least one ball differs from the others in average distance. When the p value is greater than 0.05, we declare that no significant difference exists between the test subjects.
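The ANOVA calculation can be illustrated on a scaled-down version of the golf ball test. This sketch assumes only three of the brands and four balls each, with made-up distances, and the function name `one_way_anova` is hypothetical. The F statistic is the ratio of between-group to within-group variation. The p value line uses a closed form that is valid only when there are exactly three groups (numerator degrees of freedom equal to 2); for the full six-brand test a statistics package would be the practical choice.

```python
from statistics import mean

# Hypothetical driving distances in yards (made-up data, 4 balls per brand).
distances = {
    "Pinnacle": [248, 252, 250, 246],
    "Titleist": [258, 262, 259, 261],
    "Callaway": [251, 255, 249, 253],
}


def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group averages around the overall average.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual shots around their own group average.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


f_stat, df1, df2 = one_way_anova(list(distances.values()))

# Closed-form upper tail of the F distribution, valid ONLY when df1 == 2.
assert df1 == 2
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(f"F = {f_stat:.2f}, df = ({df1}, {df2}), p = {p_value:.5f}")
```

A significant result here says only that the brands are not all alike in average distance. It does not say which brand is responsible.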

Case 4: Test for Equal Variances

In the three previous cases the concern was a difference in the average value of the outcome based upon the level setting of the input variable. With the Test for Equal Variances, the evaluation is of the variability of the outcomes about the average. The standard deviations are evaluated to test for differences in variation. In this case we will use the data from Case 3, the driving distance of the golf balls. Which golf ball is most consistent in driving distance? If I buy a dozen of these golf balls, can I expect the same results? The Test for Equal Variances provides the answer. If the p value is low, then the null must go, but if the p value is high, the null applies. The null hypothesis is always “There is no difference.” Two tests are used: Bartlett’s Test, which requires the data to be normally distributed, and Levene’s Test, which requires only that the data be continuous.
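One way to see how Levene's Test works is that it runs a one-way ANOVA on how far each observation sits from its own group average, instead of on the observations themselves. The sketch below assumes the mean-centered form of the test, three brands, and made-up distances in which one brand is deliberately erratic. The function names are hypothetical. As with any F test on three groups, the closed-form p value used here is valid only because the numerator degrees of freedom equal 2.

```python
from statistics import mean

# Hypothetical driving distances in yards (made-up data); Callaway is
# deliberately given a wider spread for illustration.
distances = {
    "Pinnacle": [248, 252, 250, 246],
    "Titleist": [258, 262, 259, 261],
    "Callaway": [240, 262, 247, 259],
}


def one_way_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def levene(groups):
    """Levene's test (mean-centered): ANOVA on absolute deviations."""
    abs_dev = [[abs(x - mean(g)) for x in g] for g in groups]
    return one_way_f(abs_dev)


w_stat, df1, df2 = levene(list(distances.values()))

# Closed-form upper tail of the F distribution, valid ONLY when df1 == 2.
assert df1 == 2
p_value = (1 + 2 * w_stat / df2) ** (-df2 / 2)

print(f"W = {w_stat:.2f}, df = ({df1}, {df2}), p = {p_value:.5f}")
```

With this illustrative data the p value is low, so the null hypothesis of equal variation would be rejected: the brands are not equally consistent.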

Discrete Variable Outcomes

The output, or outcome, in the process is measured by counting occurrences which is a discrete variable.  We will refer to the outcomes as the “Y”.  The input variables, or the things we will be changing, are varied between discrete settings, or levels.

Case 5: Chi Square

Chi Square testing compares discrete Y’s and discrete X’s. In this type of analysis, categories, or groups, are compared to other categories, or groups. For example, “Which cruise line had the highest customer satisfaction?” The discrete X variable is the cruise line (RCI, Carnival, or Princess Cruise Lines). The discrete Y variable is the frequency of responses from passengers on their satisfaction surveys, by category (poor, fair, good, very good, and excellent), relating to their vacation experience. Conduct a cross tab table analysis, or Chi Square analysis, to evaluate whether levels of satisfaction differed based upon the cruise line the passengers vacationed on. Percentages are used for the evaluation, and the Chi Square analysis provides a p value to quantify whether or not the differences are significant. The overall p value associated with the Chi Square analysis must be 0.05 or less for the differences to be declared significant. The cells of the table that make the largest contribution to the Chi Square statistic drive the observed differences.
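The cross tab analysis can be sketched as follows. The survey counts are made-up numbers, and the helper name `chi2_upper_tail_even_df` is hypothetical. Each cell's expected count is its row total times its column total divided by the grand total, and each cell contributes (observed minus expected) squared, divided by expected, to the Chi Square statistic. The p value uses a closed form that holds only when the degrees of freedom are even, which is the case here: (3 - 1) x (5 - 1) = 8.

```python
from math import exp, factorial

categories = ["poor", "fair", "good", "very good", "excellent"]

# Hypothetical survey response counts per cruise line (made-up data).
observed = {
    "RCI": [5, 10, 30, 35, 20],
    "Carnival": [10, 20, 35, 25, 10],
    "Princess": [5, 10, 25, 35, 25],
}

row_totals = {line: sum(counts) for line, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = {}
for line, counts in observed.items():
    for cat, obs, col_total in zip(categories, counts, col_totals):
        expected = row_totals[line] * col_total / grand_total
        contributions[(line, cat)] = (obs - expected) ** 2 / expected

chi2 = sum(contributions.values())
dof = (len(observed) - 1) * (len(categories) - 1)


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of the chi-square distribution; even df ONLY."""
    assert df % 2 == 0
    half = x / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


p_value = chi2_upper_tail_even_df(chi2, dof)
largest = max(contributions, key=contributions.get)

print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
print("largest contribution:", largest)
```

In this illustrative table the largest single contribution comes from Carnival's low count of "excellent" responses, which is the kind of cell that drives an overall significant result.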

Now you should have a good understanding of which hypothesis test to use and when it is most appropriate. Remember that it is just as important to determine that there is no difference as it is to determine that there is a difference. Sound business decisions depend on making choices based on significance.
