
**Hypothesis testing** is a statistical procedure used to provide evidence in favor of some statement (called a *hypothesis*). For instance, hypothesis testing might be used to assess whether a population parameter, such as a population mean, differs from a specified standard or previous value. In this chapter we discuss testing hypotheses about population means, proportions, and variances.

In order to illustrate how hypothesis testing works, we revisit several cases introduced in previous chapters and also introduce some new cases:

**The Payment Time Case:** The consulting firm uses hypothesis testing to provide strong evidence that the new electronic billing system has reduced the mean payment time by more than 50 percent.

**The Cheese Spread Case:** The cheese spread producer uses hypothesis testing to supply extremely strong evidence that fewer than 10 percent of all current purchasers would stop buying the cheese spread if the new spout were used.

**The Electronic Article Surveillance Case:** A company that sells and installs EAS systems claims that at most 5 percent of all consumers would never shop in a store again if the store subjected them to a false EAS alarm. A store considering the purchase of such a system uses hypothesis testing to provide extremely strong evidence that this claim is not true.

**The Trash Bag Case:** A marketer of trash bags uses hypothesis testing to support its claim that the mean breaking strength of its new trash bag is greater than 50 pounds. As a result, a television network approves use of this claim in a commercial.

**The Valentine’s Day Chocolate Case:** A candy company projects that this year’s sales of its special valentine box of assorted chocolates will be 10 percent higher than last year. The candy company uses hypothesis testing to assess whether it is reasonable to plan for a 10 percent increase in sales of the valentine box.

9.1: The Null and Alternative Hypotheses and Errors in Hypothesis Testing

One of the authors’ former students is employed by a major television network in the standards and practices division. One of the division’s responsibilities is to reduce the chances that advertisers will make false claims in commercials run on the network. Our former student reports that the network uses a statistical methodology called **hypothesis testing** to do this.


To see how this might be done, suppose that a company wishes to advertise a claim, and suppose that the network has reason to doubt that this claim is true. The network assumes for the sake of argument that **the claim is not valid.** This assumption is called the **null hypothesis.** The statement that **the claim is valid** is called the **alternative,** or **research, hypothesis.** The network will run the commercial only if the company making the claim provides **sufficient sample evidence** to reject the null hypothesis that the claim is not valid in favor of the alternative hypothesis that the claim is valid. Explaining the exact meaning of *sufficient sample evidence* is quite involved and will be discussed in the next section.

The Null Hypothesis and the Alternative Hypothesis

In hypothesis testing:

**1** The **null hypothesis,** denoted *H*0, is the statement being tested. Usually this statement represents the *status quo* and is not rejected unless there is convincing sample evidence that it is false.

**2** The **alternative,** or **research, hypothesis,** denoted *Ha*, is a statement that will be accepted only if there is convincing sample evidence that it is true.

Setting up the null and alternative hypotheses in a practical situation can be tricky. In some situations there is a condition for which we need to attempt to find supportive evidence. We then formulate (1) the alternative hypothesis to be the statement that this condition exists and (2) the null hypothesis to be the statement that this condition does not exist. To illustrate this, we consider the following case studies.

EXAMPLE 9.1: The Trash Bag Case

A leading manufacturer of trash bags produces the strongest trash bags on the market. The company has developed a new 30-gallon bag using a specially formulated plastic that is stronger and more biodegradable than other plastics. This plastic’s increased strength allows the bag’s thickness to be reduced, and the resulting cost savings will enable the company to lower its bag price by 25 percent. The company also believes the new bag is stronger than its current 30-gallon bag.

The manufacturer wants to advertise the new bag on a major television network. In addition to promoting its price reduction, the company also wants to claim the new bag is better for the environment and stronger than its current bag. The network is convinced of the bag’s environmental advantages on scientific grounds. However, the network questions the company’s claim of increased strength and requires statistical evidence to justify this claim. Although there are various measures of bag strength, the manufacturer and the network agree to employ “breaking strength.” A bag’s breaking strength is the amount of a representative trash mix (in pounds) that, when loaded into a bag suspended in the air, will cause the bag to rip or tear. Tests show that the current bag has a mean breaking strength that is very close to (but does not exceed) 50 pounds. The new bag’s mean breaking strength *μ* is unknown and in question. The alternative hypothesis *Ha* is the statement for which we wish to find supportive evidence. Because we hope the new bags are stronger than the current bags, *Ha* says that *μ* is greater than 50. The null hypothesis states that *Ha* is false. Therefore, *H*0 says that *μ* is less than or equal to 50. We summarize these hypotheses by stating that we are testing

*H*0: *μ* ≤ 50 versus *Ha*: *μ* > 50

The network will run the manufacturer’s commercial if a random sample of *n* new bags provides sufficient evidence to reject *H*0: *μ* ≤ 50 in favor of *Ha*: *μ* > 50.

EXAMPLE 9.2: The Payment Time Case

Recall that a management consulting firm has installed a new computer-based, electronic billing system for a Hamilton, Ohio, trucking company. Because of the system’s advantages, and because the trucking company’s clients are receptive to using this system, the management consulting firm believes that the new system will reduce the mean bill payment time by more than 50 percent. The mean payment time using the old billing system was approximately equal to, but no less than, 39 days. Therefore, if *μ* denotes the mean payment time using the new system, the consulting firm believes that *μ* will be less than 19.5 days. Because it is hoped that the new billing system *reduces* mean payment time, we formulate the alternative hypothesis as *Ha*: *μ* < 19.5 and the null hypothesis as *H*0: *μ* ≥ 19.5. The consulting firm will randomly select a sample of *n* invoices and determine if their payment times provide sufficient evidence to reject *H*0: *μ* ≥ 19.5 in favor of *Ha*: *μ* < 19.5. If such evidence exists, the consulting firm will conclude that the new electronic billing system has reduced the Hamilton trucking company’s mean bill payment time by more than 50 percent. This conclusion will be used to help demonstrate the benefits of the new billing system both to the Hamilton company and to other trucking companies that are considering using such a system.

EXAMPLE 9.3: The Valentine’s Day Chocolate Case

A candy company annually markets a special 18-ounce box of assorted chocolates to large retail stores for Valentine’s Day. This year the candy company has designed an extremely attractive new valentine box and will fill the box with an especially appealing assortment of chocolates. For this reason, the candy company subjectively projects, based on past experience and knowledge of the candy market, that sales of its valentine box will be 10 percent higher than last year. However, since the candy company must decide how many valentine boxes to produce, the company needs to assess whether it is reasonable to plan for a 10 percent increase in sales.

Before the beginning of each Valentine’s Day sales season, the candy company sends large retail stores information about its newest valentine box of assorted chocolates. This information includes a description of the box of chocolates, as well as a preview of advertising displays that the candy company will provide to help retail stores sell the chocolates. Each retail store then places a single (nonreturnable) order of valentine boxes to satisfy its anticipated customer demand for the Valentine’s Day sales season. Last year the mean order quantity of large retail stores was 300 boxes per store. If the projected 10 percent sales increase occurs, the mean order quantity, *μ*, of large retail stores this year will be 330 boxes per store. Therefore, the candy company wishes to test the null hypothesis *H*0: *μ* = 330 versus the alternative hypothesis *Ha*: *μ* ≠ 330.

To perform the hypothesis test, the candy company will randomly select a sample of *n* large retail stores and will make an early mailing to these stores promoting this year’s valentine box. The candy company will then ask each retail store to report how many valentine boxes it anticipates ordering. If the sample data do not provide sufficient evidence to reject *H*0: *μ* = 330 in favor of *Ha*: *μ* ≠ 330, the candy company will base its production on the projected 10 percent sales increase. On the other hand, if there is sufficient evidence to reject *H*0: *μ* = 330, the candy company will change its production plans.

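We next summarize the sets of null and alternative hypotheses that we have thus far considered:

*H*0: *μ* ≤ 50 versus *Ha*: *μ* > 50

*H*0: *μ* ≥ 19.5 versus *Ha*: *μ* < 19.5

*H*0: *μ* = 330 versus *Ha*: *μ* ≠ 330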

The alternative hypothesis *Ha*: *μ* > 50 is called a **one-sided, greater than alternative** hypothesis, whereas *Ha*: *μ* < 19.5 is called a **one-sided, less than alternative** hypothesis, and *Ha*: *μ* ≠ 330 is called a **two-sided, not equal to alternative** hypothesis. Many of the alternative hypotheses we consider in this book are one of these three types. Also, note that each null hypothesis we have considered involves an **equality.** For example, the null hypothesis *H*0: *μ* ≤ 50 says that *μ* is either less than or **equal to** 50. We will see that, in general, the approach we use to test a null hypothesis versus an alternative hypothesis requires that the null hypothesis involve an equality.
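The three kinds of alternative hypothesis can also be written down as a small data structure. The following is only an illustrative sketch: the class name `HypothesisPair`, its fields, and the labels "greater", "less", and "two-sided" are assumptions made for this example and do not come from the cases themselves. The boundary values 19.5 and 330 are recomputed from the figures given in the payment time and chocolate cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HypothesisPair:
    """Hypothetical container for one null/alternative pair about a mean."""
    case: str
    mu0: float        # boundary value that appears in both hypotheses
    alternative: str  # "greater", "less", or "two-sided" (labels assumed here)

    def null(self) -> str:
        # The null hypothesis always includes the equality.
        symbol = {"greater": "<=", "less": ">=", "two-sided": "="}[self.alternative]
        return f"H0: mu {symbol} {self.mu0:g}"

    def alt(self) -> str:
        # The alternative is the statement we hope to support.
        symbol = {"greater": ">", "less": "<", "two-sided": "!="}[self.alternative]
        return f"Ha: mu {symbol} {self.mu0:g}"


# Boundary values derived from figures stated in the cases.
payment_mu0 = 39 * (100 - 50) / 100     # 50 percent reduction from 39 days
chocolate_mu0 = 300 * (100 + 10) / 100  # 10 percent increase over 300 boxes

CASES = [
    HypothesisPair("trash bag", 50, "greater"),
    HypothesisPair("payment time", payment_mu0, "less"),
    HypothesisPair("valentine chocolate", chocolate_mu0, "two-sided"),
]

for pair in CASES:
    print(f"{pair.case}: {pair.null()} versus {pair.alt()}")
```

Note that in every pair the null hypothesis carries the equality, which matches the requirement stated above.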

The idea of a test statistic

Suppose that in the trash bag case the manufacturer randomly selects a sample of *n* = 40 new trash bags. Each of these bags is tested for breaking strength, and the sample mean *x̄* of the 40 breaking strengths is calculated. In order to test *H*0: *μ* ≤ 50 versus *Ha*: *μ* > 50, we utilize the **test statistic**

*z* = (*x̄* − 50) / (*σ* / √*n*)

where *σ* is the standard deviation of the population of breaking strengths. The test statistic *z* measures the distance between *x̄* and 50. The division by *σ* / √*n* says that this distance is measured in units of the standard deviation of all possible sample means. For example, a value of *z* equal to, say, 2.4 would tell us that *x̄* is 2.4 such standard deviations above 50. In general, a value of the test statistic that is less than or equal to zero results when *x̄* is less than or equal to 50. This provides no evidence to support rejecting *H*0 in favor of *Ha* because the point estimate *x̄* indicates that *μ* is probably less than or equal to 50. However, a value of the test statistic that is greater than zero results when *x̄* is greater than 50. This provides evidence to support rejecting *H*0 in favor of *Ha* because the point estimate *x̄* indicates that *μ* might be greater than 50. Furthermore, the farther the value of the test statistic is above 0 (the farther *x̄* is above 50), the stronger is the evidence to support rejecting *H*0 in favor of *Ha*.
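As a rough illustration of the calculation just described, the sketch below computes the distance between a sample mean and 50 in units of the standard deviation of all possible sample means, which is the population standard deviation divided by the square root of *n*. The function name `z_statistic` is made up for this example, the sketch assumes the population standard deviation is known, and the sample means (50.6 and 49.8) and the standard deviation (1.6) are hypothetical numbers, not data from the trash bag case.

```python
import math


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Distance from sample_mean to mu0, in units of sigma / sqrt(n).

    Assumes sigma (the population standard deviation) is known.
    """
    if n <= 0 or sigma <= 0:
        raise ValueError("n and sigma must be positive")
    standard_error = sigma / math.sqrt(n)  # std. dev. of all possible sample means
    return (sample_mean - mu0) / standard_error


# Hypothetical inputs, chosen only to show the sign of z in each situation.
z_above = z_statistic(sample_mean=50.6, mu0=50, sigma=1.6, n=40)
z_below = z_statistic(sample_mean=49.8, mu0=50, sigma=1.6, n=40)

print(f"sample mean above 50: z = {z_above:.2f}")  # positive: some evidence for Ha
print(f"sample mean below 50: z = {z_below:.2f}")  # negative: no evidence for Ha
```

A positive result corresponds to a sample mean above 50, and a larger positive result corresponds to stronger evidence against *H*0. How large the value must be before *H*0 is rejected is a separate question that this sketch does not address.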

Hypothesis testing and the legal system

If the value of the test statistic *z* is far enough above 0, we reject *H*0 in favor of *Ha*. To see how large *z* must be in order to reject *H*0, we must understand that **a hypothesis test rejects a null hypothesis H0 only if there is strong statistical evidence against H0.** This is similar to our legal system, which rejects the innocence of the accused only if evidence of guilt is beyond a reasonable doubt. For instance, the network will reject
