Chapter Summary

When differences are found between groups in a sample, it is possible that these differences are due to error rather than reflecting genuine differences between groups in the broader population. To determine whether these differences are genuine or the result of error, psychologists use significance testing. This allows researchers to determine the probability of finding a result at least as extreme as the one observed if there is no real effect in the population. Whenever researchers test for an effect, they decide on an acceptable error rate, that is, how large a probability of wrongly claiming an effect they are willing to tolerate. This value is known as the significance level. In most psychological research, a significance level of .05, or a 5 per cent chance of rejecting the null hypothesis when there is no real effect in the population, is standard.
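As a rough illustration of this logic, the sketch below computes a two-tailed p value for a one-sample z test and compares it with a significance level of .05. The scores (a sample mean of 104 from 36 people, against a population mean of 100 with a standard deviation of 15) are invented for the example, and the function name is hypothetical; the chapter does not prescribe this particular test.

```python
from statistics import NormalDist

def two_tailed_p_value(sample_mean, null_mean, sigma, n):
    """Probability of a sample mean at least this far from null_mean
    (in either direction) if the null hypothesis is true."""
    standard_error = sigma / n ** 0.5
    z = (sample_mean - null_mean) / standard_error
    # Area in both tails beyond |z| under the standard normal curve.
    return 2 * (1 - NormalDist().cdf(abs(z)))

ALPHA = 0.05  # significance level, chosen before looking at the data

# Invented numbers, for illustration only.
p = two_tailed_p_value(sample_mean=104, null_mean=100, sigma=15, n=36)
print(f"p = {p:.4f}, reject the null hypothesis: {p < ALPHA}")
```

Here the p value comes out at roughly .11, which is larger than .05, so the null hypothesis would not be rejected.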

After a significance level has been determined, a researcher can graph a probability distribution and determine a region of significance. With a significance level of .05, 5 per cent of the distribution will fall in the region of significance. These regions of significance are bounded by critical values, the set values that a statistic must reach in order for the null hypothesis to be rejected. The specific critical value of a given test depends upon one’s significance level and the type of test being conducted.
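To make the idea of a critical value concrete, here is a minimal sketch that assumes the test statistic is a z score following the standard normal distribution, with the whole region of significance placed in the upper end of the distribution. Both assumptions, and the function names, are for illustration only; other tests use other distributions and therefore other critical values.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def upper_critical_value(alpha):
    """z value that cuts off the top `alpha` share of the distribution."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha)

def in_region_of_significance(z, alpha=0.05):
    """True when the statistic reaches or exceeds the critical value."""
    return z >= upper_critical_value(alpha)

# A stricter significance level pushes the critical value further out.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: critical z = {upper_critical_value(alpha):.3f}")

# The same statistic can be significant at .05 but not at .01.
print(in_region_of_significance(1.80))              # cutoff is about 1.645
print(in_region_of_significance(1.80, alpha=0.01))  # cutoff is about 2.326
```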

One continual concern of researchers is minimizing the likelihood that they will make improper decisions based on the results of their statistical tests. These mistakes can happen in two ways. First, researchers could reject a null hypothesis that is actually true, claiming a difference between groups that does not exist in the population. This is known as a Type I error. Researchers often use replication to avoid making Type I errors.
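A small simulation can show where Type I errors come from. The sketch below repeatedly draws samples from a population in which the null hypothesis is true (mean 0, standard deviation 1), runs a two-tailed z test on each sample at the .05 level, and counts how often the null hypothesis is rejected anyway. The number of simulated studies, the sample size, and the random seed are arbitrary choices made for the example.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(n_studies=5000, n=30, alpha=0.05, seed=1):
    """Share of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    false_alarms = 0
    for _ in range(n_studies):
        # The null hypothesis is true here: the population mean really is 0.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) / (1 / n ** 0.5)  # known population sd of 1
        if abs(z) > critical_z:
            false_alarms += 1  # a "significant" result with no real effect
    return false_alarms / n_studies

print(f"Share of false alarms: {type_i_error_rate():.3f}")
```

The share of false alarms should land close to the significance level. This is also why replication helps: a single significant result may be one of these false alarms, but the same false alarm is unlikely to recur across independent studies.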

In the opposite direction, researchers might also fail to reject a null hypothesis that is not true, concluding that there is no significant effect although there is a difference in the population. This is known as a Type II error. Much of the time, Type II errors are the result of insufficient power to detect an effect. One way to increase power is to increase sample size.
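The link between sample size and power can be sketched with a short calculation. The example assumes a two-tailed one-sample z test and a hypothetical true effect of half a standard deviation; the function name and the effect size are illustrative assumptions, not values taken from the chapter.

```python
from statistics import NormalDist

def power_two_tailed_z(effect_size, n, alpha=0.05):
    """Probability of rejecting the null hypothesis when the true effect
    is `effect_size` standard deviations and the sample holds n people."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5            # how far the true effect moves z
    upper = 1 - z.cdf(critical - shift)       # rejections in the upper tail
    lower = z.cdf(-critical - shift)          # rejections in the lower tail
    return upper + lower

# Hypothetical effect of 0.5 standard deviations at several sample sizes.
for n in (10, 20, 40, 80):
    power = power_two_tailed_z(0.5, n)
    print(f"n = {n:>2}: power = {power:.2f}, Type II error rate = {1 - power:.2f}")
```

Under these assumptions, power rises steadily as the sample grows, and the chance of a Type II error shrinks accordingly.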

Another factor that influences the region of significance is the direction of a hypothesis. With a one-tailed hypothesis, researchers not only hypothesize a specific difference, but they also hypothesize the direction of that difference. For example, they may predict that gifted students will have better working memory than non-gifted students. Sometimes researchers anticipate a difference but do not have a specific direction in mind. For example, they may expect that gifted students will differ from non-gifted students in terms of creativity, but be unsure which group will score higher. With a directional (one-tailed) hypothesis, the entire region of significance is situated at one end of a distribution. With a non-directional (two-tailed) hypothesis, the region of significance is divided between the two ends of a distribution. Because of this, at the same significance level the critical values for a two-tailed test will be more extreme (larger in absolute value) than the critical value for a comparable one-tailed test.
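The comparison between one-tailed and two-tailed critical values can be checked directly. The sketch below assumes a z statistic and a significance level of .05; the function name is hypothetical, and the one-tailed case is taken to predict a difference in the positive direction.

```python
from statistics import NormalDist

def critical_values(alpha=0.05):
    """Return the (one-tailed, two-tailed) critical z values for alpha."""
    z = NormalDist()
    one_tailed = z.inv_cdf(1 - alpha)      # all of alpha in the upper end
    two_tailed = z.inv_cdf(1 - alpha / 2)  # alpha split, half at each end
    return one_tailed, two_tailed

one_tailed, two_tailed = critical_values(0.05)
print(f"one-tailed: reject if z > {one_tailed:.3f}")
print(f"two-tailed: reject if z < {-two_tailed:.3f} or z > {two_tailed:.3f}")
```

At the .05 level this gives about 1.645 for the one-tailed test and about 1.960 in either direction for the two-tailed test, because each end of the distribution holds only 2.5 per cent of the region of significance.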

Additional Online Resources

Online module about hypothesis testing:

Activity demonstrating effect size:

Short song about Type I error:

Articles are rarely published when researchers fail to reject the null hypothesis. Here’s a journal that exclusively publishes articles in which the null hypothesis was not rejected:


Test your knowledge of the keywords and definitions in the chapter.


Interactive Quiz for Chapter 11


Question 1:

a) a training intervention will not affect problem solving skills of an experimental group
b) depression rates will not differ between part-time and full-time workers
c) empathy will be greater among pet owners versus non-pet owners
d) none of the above

Question 2:

a) empirical probability
b) logical probability
c) scientific probability
d) evidence-based probability

Question 3:

a) 0.001
b) 0.01
c) 0.05
d) 0.5

Question 4:

a) directional error
b) non-directional error
c) Type I error
d) Type II error

Question 5:

a) directional hypothesis
b) non-directional hypothesis
c) two-tailed hypothesis
d) null hypothesis

Question 6:

a) p < 0.01
b) p < 0.05
c) p < 0.10
d) None of the above; all these significance levels warrant rejection of the null hypothesis

Question 7:

a) Type I error
b) Type II error
c) power adjustment error
d) non-random sampling error

Question 8:

a) reject the null hypothesis
b) fail to reject the null hypothesis
c) collect more data
d) scrap the study and start over

Question 9:

a) conduct a two-tailed test
b) increase her significance level
c) have another researcher replicate her project
d) decrease her significance level

Question 10:

a) the critical value for a one-tailed test will be smaller than the critical values for a two-tailed test
b) the critical values for a two-tailed test will be smaller than the critical value for a one-tailed test
c) the critical values will not differ based on the type of test, but the conclusions drawn will
d) none of the above