**Chapter Summary**

When differences are found between groups in a sample, it is possible that these differences are due to sampling error rather than reflecting genuine differences between groups in the broader population. To determine whether these differences are genuine or the result of error, psychologists use **significance testing**. This allows researchers to determine the **probability** of finding a result at least as extreme as the one observed if there is no real effect in the population. Whenever a researcher tests for an effect, he or she will set an acceptable error rate, that is, the probability of wrongly claiming an effect that he or she is willing to tolerate. This value is known as the **significance level**. In most psychological research, a significance level of .05, or a 5 per cent chance of declaring an effect significant when there is no real effect in the population, is standard.
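
The comparison between a probability and the significance level can be sketched in a few lines of Python. This is a minimal illustration, assuming a two-tailed z test; the function name `two_tailed_p_value` and the observed statistic of 2.10 are hypothetical and chosen only for the example.

```python
from statistics import NormalDist

def two_tailed_p_value(z: float) -> float:
    """Probability of a z statistic at least this extreme (in either
    direction) if there is no real effect in the population."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

ALPHA = 0.05  # significance level, chosen before looking at the data

z_observed = 2.10  # hypothetical test statistic, for illustration only
p = two_tailed_p_value(z_observed)
is_significant = p < ALPHA

print(f"p = {p:.4f}; significant at {ALPHA}: {is_significant}")
```

Because the resulting probability is below .05, this hypothetical result would be declared significant.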

After a significance level has been determined, a researcher can graph a **probability distribution** and determine a region of significance. With a significance level of .05, 5 per cent of the distribution will fall in the region of significance. These regions of significance are bounded by **critical values**, the set values that a statistic must reach in order to reject the null hypothesis. The specific critical value of a given test depends upon the significance level and the type of test being conducted.
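
To see how the critical value moves with the significance level, the sketch below looks up critical values on the standard normal distribution. It assumes a two-tailed z test; other tests (such as t or chi-square) use different distributions and give different values. The helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha: float) -> float:
    """Critical value for a two-tailed z test (an assumption of this
    sketch): alpha is split evenly between the two tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"significance level {alpha:.2f} -> critical value {z_critical(alpha):.3f}")
```

A stricter significance level shrinks the region of significance, so the statistic has to reach a larger critical value before the null hypothesis is rejected.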

One continual concern of researchers is minimizing the likelihood that they will make improper decisions based on the results of their statistical tests. These mistakes can happen in two ways. First, researchers could reject a null hypothesis that is actually true, claiming a difference between groups that does not exist in the population. This is known as a **Type I error**. Researchers often use replication to avoid making Type I errors.
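
A small simulation can make the Type I error rate concrete. The sketch below repeatedly draws two groups from the same population, so the null hypothesis is true by construction, and counts how often a two-tailed z test still rejects it. The function name, group size, number of experiments, and seed are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean

def simulate_type_i_rate(n_experiments=2000, n_per_group=30, alpha=0.05, seed=1):
    """Share of simulated experiments that reject a null hypothesis
    that is actually true (both groups come from the same population)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means, population sd known to be 1.
    standard_error = (2 / n_per_group) ** 0.5
    false_positives = 0
    for _ in range(n_experiments):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / standard_error
        if abs(z) > z_crit:
            false_positives += 1
    return false_positives / n_experiments

rate = simulate_type_i_rate()
print(f"Proportion of Type I errors: {rate:.3f}")
```

The proportion should land close to the significance level of .05, which is why a single significant result is more convincing once it has been replicated.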

In the opposite direction, researchers might also fail to reject a null hypothesis that is not true, concluding that there is no significant effect although a difference does exist in the population. This is known as a **Type II error**. Much of the time, Type II errors are the result of insufficient power, that is, too low a probability of detecting an effect that really exists. One way to increase power is to increase sample size.
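
The link between sample size and power can be sketched with a short calculation. This example assumes a two-tailed, two-sample z test with equal group sizes and a standardized effect size (mean difference divided by the population standard deviation); the function name `power_two_sample_z` and the effect size of 0.5 are hypothetical.

```python
from statistics import NormalDist

def power_two_sample_z(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Probability of rejecting the null hypothesis when the true
    standardized difference between groups equals effect_size."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic under the true effect.
    shift = effect_size * (n_per_group / 2) ** 0.5
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

for n in (20, 64, 100):
    power = power_two_sample_z(effect_size=0.5, n_per_group=n)
    print(f"n per group = {n:3d}: power = {power:.2f}, Type II error rate = {1 - power:.2f}")
```

With the effect size held constant, larger groups give higher power and therefore a lower chance of a Type II error.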

Another factor that influences the critical value is the direction of a hypothesis. With a **directional** (**one-tailed**) **hypothesis**, researchers hypothesize not only a difference but also the direction of that difference, for example, that gifted students will have better working memory than non-gifted students. With a **non-directional** (**two-tailed**) **hypothesis**, researchers anticipate a difference but do not specify a direction. For example, they may expect that gifted students will differ from non-gifted students in terms of creativity, but be unsure which group will score higher. With a one-tailed hypothesis, the entire region of significance is situated at one end of the distribution. With a two-tailed hypothesis, the region of significance is divided between the two ends of the distribution. Because each tail then holds only half of the region, the critical values for a two-tailed test will be more extreme (larger in absolute value) than the critical value for a comparable one-tailed test.
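
The effect of direction on the critical value can be checked numerically. The sketch below assumes a z test at a significance level of .05 and a one-tailed hypothesis that predicts a positive difference; the observed statistic of 1.80 is hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()

# One-tailed: the whole 5 per cent sits in the upper tail.
one_tailed_critical = std_normal.inv_cdf(1 - ALPHA)
# Two-tailed: 2.5 per cent sits in each tail.
two_tailed_critical = std_normal.inv_cdf(1 - ALPHA / 2)

z_observed = 1.80  # hypothetical test statistic, for illustration only
rejects_one_tailed = z_observed > one_tailed_critical
rejects_two_tailed = abs(z_observed) > two_tailed_critical

print(f"one-tailed critical value: {one_tailed_critical:.3f}")
print(f"two-tailed critical value: {two_tailed_critical:.3f}")
print(f"z = {z_observed}: one-tailed reject = {rejects_one_tailed}, "
      f"two-tailed reject = {rejects_two_tailed}")
```

The same statistic is significant under the one-tailed hypothesis but not under the two-tailed one, which is why the direction of a hypothesis should be settled before the data are examined.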

**Additional Online Resources**

Online module about hypothesis testing: http://peoplelearn.homestead.com/MULTIVARIATE/STATSvlp/Module3b.html

Activity demonstrating effect size: http://onlinestatbook.com/stat_sim/group_diff.html

Short song about Type I error: https://www.youtube.com/watch?v=_mnEuMkdksI

Articles are rarely published when researchers fail to reject the null hypothesis. Here’s a journal that exclusively publishes articles whose results support the null hypothesis: http://www.jasnh.com/

## Interactive Quiz for Chapter 11

**Instructions:** For each question, click on the radio button beside your answer. When you have completed the entire quiz, click the “Submit my answers” button at the bottom of the page to receive your results.