Higher Education

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide.

Statistics Case Study and Dataset Resources

The philosophies of transparency and open access are becoming more widespread, more popular, and, with the ever-increasing expansion of the Internet, more attainable. Governments and institutions around the world are working to make more and more of their accumulated data available online for free. The datasets below are just a small sample of what is available. If you have a particular interest, do not hesitate to search for datasets on that topic. The table below gives a quick visual overview of what each resource offers, while the annotated links below the table describe in more detail what can be found at each link. The annotated list also includes links to lists of datasets that were too numerous to include individually in this resource.

| Source name and URL | Case study provided | Dataset provided | Dataset downloadable | Canadian content | Topics covered (in general) |
| --- | --- | --- | --- | --- | --- |
| Statistical Society of Canada | X | X | X | X | Biology, Climate, Environment, Geography, Health, Medicine, Methodology, Physics, Population, Sociology |
| Journal of Statistics Education | X | X | X | | Biology, Culture, Economics, Education, Geography, Health, History, Inequality, Medicine, Methodology, Physics, Sports, Sociology |
| Economic and Social Data Services | X | X | X | | Economics, Education, Environment, Health, History, Labour, Law, Media, Politics, Population, Psychology, Sociology, Technology, Travel |
| Rice Virtual Lab in Statistics | X | X | X | | Computer Science, Crime, Economics, Health, Human Resources, Medicine, Psychology |
| United Nations | | X | X | X | Agriculture, Crime, Development, Economics, Education, Energy, Environment, Food, Health, Labour, Population, Sociology, Technology |
| Government of Canada | | X | X | X | Agriculture, Culture, Crime, Development, Economics, Education, Energy, Environment, Food, Geography, Government, Health, History, Labour, Law, Military, Population, Sociology, Technology |
| CANSIM | | X | X | X | Economics, Population, Sociology |
| National Climate Data and Information Archive | | X | X | X | Climate, Geography, Weather |
| Natural Resources Canada | | X | X | X | Geography, GIS, Topography |
| World Bank | | X | X | | Economics, Population, Poverty, Sociology |
| UCLA | X | | | | Biology, Climate, Economics, Health, History Education, Law, Media, Medicine, Politics, Sociology, Transportation |
| National Center for Case Study Teaching in Science | X | | | | Economics, Environment, Health, Physics, Science, Sociology |
| Mathematics in Industry | X | | | | Biology, Physics, Transportation |



More Information on the Links Above and Further Sources of Data

Dataset resources below have been marked as Canadian-specific by a maple leaf where appropriate. The recommendations start general and become more specific to a single topic as the list goes on. Resources created specifically with postsecondary statistics students in mind appear at the top of the list.


Case Studies from the Statistical Society of Canada

Approximately two case studies per year have been featured at the Statistical Society of Canada Annual Meetings. This website includes all case studies since 1996. Case studies vary widely in subject matter, from the cod fishery in Newfoundland, to the gender gap in earnings among young people, to the effect of genetic variation on the relationship between diet and cardiovascular disease risk. Each data set is given a context, is provided for download in multiple formats, and comes with suggested questions to consider as well as references. The case studies for the current year can be found by clicking on the current year under the “Meetings” tab in the navigation sidebar.


Journal of Statistics Education

This international journal is published and accessible online for free, and includes at least two data sets with each volume. All volumes back to 1993 are archived and available online. Each data set includes the context, methodology, questions asked, analysis, and references. The data for these data sets is included in the journal’s data archive, which is linked both in the webpage sidebar and at the end of each data set.


Case Studies, The Rice Virtual Lab in Statistics

The Rice Virtual Lab in Statistics is an initiative by the National Science Foundation in the United States created to provide free online statistics help and practice. The online case studies are fantastic not only because they provide context, datasets, and downloadable raw data where appropriate, but also because they let you search by the type of statistical analysis required, so you can focus on t-tests, histograms, regression, ANOVA, or whatever you need the most practice with. There are a limited number of case studies on this site.


UNdata

The United Nations (UN) Statistics Division of the Department of Economic and Social Affairs has pooled major UN databases from its various divisions, accumulated over the past sixty or more years, to allow users to access information from multiple UN sources simultaneously. This database of datasets includes over 60 million data points. You can search and filter the datasets, change which columns are displayed, and download the results for ease of use.


Open Data

Open Data is an initiative by the Government of Canada to provide free, easily navigable access to data collected by the Canadian Government in areas such as health, environment, agriculture, and natural resources. You can browse the datasets by subject, file format, or department, or use an advanced search to filter using all of the above as well as keywords. The site also includes links to Provincial and Municipal-level open data sites available across Canada (accessible in the “Links” section of the left-hand sidebar).


Finding Canadian Statistics

The University of Toronto Library has prepared this excellent and exhaustive list of sources for Canadian Statistics on a wide variety of topics, organized by topic. Some have restricted access; you may or may not be able to access these through your university library, depending on which online databases your institution is subscribed to. The restricted links are all clearly labelled in red. This resource also has an international section, accessible through the horizontal toolbar at the top of the page.


StatLib Datasets Archive

StatLib is a free collection of software, articles, and datasets provided by user submission and hosted by Carnegie Mellon University (in the United States). The datasets archive (accessible by the “Get Data” tab in the left-hand sidebar) has a large list of datasets, though many of them are from 2003 and earlier. The best way to access current data is to look at the “Recently Added or Updated Submissions” and “Popular Downloads” modules on the front page (linked above).


CANSIM: Canadian Socioeconomic Database from Statistics Canada

CANSIM is Statistics Canada’s key socioeconomic database, providing fast and easy access to a large range of the latest statistics available in Canada. The data is sorted both by category and by the survey in which it was collected. The site not only allows you to access tables of data, but also lets you customize your own table based on what information you would like CANSIM to display. You can add or remove content, change the way in which the information is summarized, and download your personalized data table.


Climate Data Online: Canada’s National Climate Archive

The National Climate Data and Information Archive provides historical climate data for major cities across Canada, both online and available for download, as collected by the Government of Canada Weather Office. The data can be displayed hourly for each day, or daily for each month. Other weather statistics are also available on the Products and Services page.


GeoGratis

GeoGratis is a portal from Natural Resources Canada that provides a single point of access to a broad collection of geospatial data, topographic and geoscience maps, images, and scientific publications that cover all of Canada at no cost and with no restrictions. Most of this data is in GIS format. You can use the advanced search function on the Government of Canada’s GeoConnections website to show only results that include datasets available for download. Not all of the data that comes up on GeoConnections is available online for free, which is why we have linked to GeoGratis in this guide.


CARL Statistics, Statistical Survey of Canadian University Libraries

This website allows users to download datasets collected by the Canadian Association of Research Libraries (CARL) on collection size, emerging services, and salaries, by year, in Excel format.


Online Sources of International Statistics Guide, University of Maryland

This online resource, provided on the University of Maryland Libraries website, has an impressive list of links to datasets organized by country and region, as well as by category (Economic, Environmental, Political, Social, and Population). Some of the datasets are only available through subscriptions to sites such as ProQuest. Check with your institution’s library to see if you can access these resources.


Organization for Economic Co-Operation and Development (OECD) Better Life Index

The OECD’s mission is to promote policies that will improve the economic and social well-being of people around the world. Governments work together, using the OECD as a forum to share experiences and seek solutions to common problems. In service to this mission, the OECD created the Better Life Index, which uses United Nations statistics as well as national statistics to represent all 34 member countries of the OECD in a relational survey of life satisfaction. The index is interactive: you set your own levels of importance, and the website organizes the data to show how each country does according to your rankings. The raw index data is also available for download on the website (see the link on the left-hand sidebar).
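The idea of setting your own levels of importance can be illustrated with a small weighted-average sketch. This is only an illustration, not the OECD's actual method, which is not described here: the function names, topics, scores, and weights below are all hypothetical.

```python
def weighted_score(scores, weights):
    """Weighted average of topic scores; the weights need not sum to 1."""
    total_weight = sum(weights[topic] for topic in scores)
    if total_weight <= 0:
        raise ValueError("at least one topic needs a positive weight")
    return sum(scores[topic] * weights[topic] for topic in scores) / total_weight


def rank_countries(countries, weights):
    """Return country names ordered from highest to lowest weighted score."""
    return sorted(
        countries,
        key=lambda name: weighted_score(countries[name], weights),
        reverse=True,
    )


# Hypothetical scores on a 0-10 scale; these are NOT real OECD figures.
countries = {
    "Country A": {"housing": 6.0, "income": 4.0, "health": 8.0},
    "Country B": {"housing": 7.0, "income": 7.0, "health": 5.0},
}

equal_weights = {"housing": 1, "income": 1, "health": 1}
health_first = {"housing": 1, "income": 1, "health": 3}

print(rank_countries(countries, equal_weights))  # Country B ranks first
print(rank_countries(countries, health_first))   # Country A ranks first
```

With equal weights Country B comes out ahead, but tripling the weight on health moves Country A to the top, which is the kind of re-ranking an interactive index lets you explore.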


Human Development Index

The HDI, run by the United Nations Development Programme, combines indicators of life expectancy, educational attainment, and income into a composite index, providing a single statistic to serve as a frame of reference for both social and economic development. Under the “Getting and Using Data” tab in the left-hand sidebar, the HDI website provides downloads of the raw data sorted in various ways (including an option to build your own data table), as well as the statistical tables underlying the HDI report. In the “Tools and Rankings” section (also in the left-hand sidebar) you can see various visualizations of the data and tools for readjusting the HDI.
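To see how three indicators can be combined into one composite statistic, the sketch below rescales each indicator to a 0 to 1 range and takes the geometric mean, which is the aggregation used in UNDP reports since 2010. The goalposts (minimum and maximum values) and the input figures are hypothetical placeholders, and the education dimension is simplified to a single pre-computed value; consult the technical notes of the relevant report for the official definitions.

```python
import math


def dimension_index(value, minimum, maximum):
    """Rescale an indicator to the 0-1 range using chosen goalposts."""
    return (value - minimum) / (maximum - minimum)


def composite_index(health, education, income):
    """Geometric mean of three dimension indices, each between 0 and 1."""
    return (health * education * income) ** (1 / 3)


# Hypothetical goalposts and values, for illustration only.
health = dimension_index(70.0, 20.0, 85.0)   # life expectancy in years
education = 0.6                              # assumed pre-computed index
# Income is rescaled on a log scale, so each extra dollar counts for less
# as income rises.
income = dimension_index(math.log(10_000), math.log(100), math.log(75_000))

print(round(composite_index(health, education, income), 3))
```

Because the geometric mean multiplies the dimensions together, a low score in one dimension cannot be fully offset by a high score in another, unlike a simple arithmetic average.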


The World Bank DataBank

The World Bank is an international financial institution that provides loans to developing countries towards the goal of worldwide reduction of poverty. DataBank is an analysis and visualization tool that allows you to generate charts, tables, and maps based on the data available in several databases. You can also access the raw data by country, topic, or by source on their Data page (also linked above).


Commission for Environmental Cooperation (CEC): North American Environmental Atlas

The CEC is a collaborative effort between Canada, the United States, and Mexico to address environmental issues of continental concern. The North American Environmental Atlas (first link above) is an interactive mapping tool to research, analyze, and manage environmental issues across the continent. You can also download the individual map files and data sets that make up the interactive atlas on the CEC website. Most of the map layers are available as several mapping files, and they also link to the source datasets that they use, which are largely available for download.


Population Reference Bureau DataFinder

The Population Reference Bureau informs people about population, health, and the environment, and empowers them to use that information to advance the well-being of current and future generations. It is based in the United States but has international data. The DataFinder website combines US Census Bureau data with international data from national surveys. It allows you to search and create custom tables comparing countries and variables of your choice.


Mathematics-in-Industry Case Studies Journal

This international online journal (run by the Fields Institute for Research in Mathematical Sciences, Toronto) is dedicated to stimulating innovative mathematics by the modelling and analysis of problems across the physical, biological, and social sciences. The journal focuses on the process of modelling various industry-related issues, so it does not explicitly provide case study data sets for students to explore on their own. It does, however, provide examples of problems worked on by mathematicians in industry, and can give you an understanding of the myriad ways in which statistics and modelling can be applied in a variety of industries.


UCLA Department of Statistics Case Studies

The University of California, Los Angeles offers HTML-based case studies for student perusal. Many of these include small datasets, a problem, and a worked solution. They are short and easy to use, but are not formatted to allow students to try their hand before seeing the answer. This website has not been updated since 2001.


National Center for Case Study Teaching in Science

This website, maintained by the National Center for Case Study Teaching in Science at the University at Buffalo, is a collection of over 450 peer-reviewed cases at the high school, undergraduate, and graduate school levels. The cases can be filtered by subject, and several are listed under “statistics.” In order to access the answer keys, you must be an instructor affiliated with an educational institution. If you would like the answer to a particular case study, you can ask your professor to register for access to the answer key, provided he or she will not be marking your case study personally.