Data Sources for Data Science
Non-GIS and non-statistical data sources, primarily to support the Data Science Certificate
Sciences
- DataONE (Data Observation Network for Earth) Earth and environmental data from many member repositories
- DOE Data Explorer "...science, technology, and engineering data from the US Department of Energy." Click the "Datasets" tab to filter results to data.
- Dryad Data underlying scientific and medical research publications. Historically, much of the data focuses on life sciences.
- European Bioinformatics Institute data resources See the "data resources" on this site. Some will allow downloads of their whole database.
- Global Change Master Directory (GCMD) datasets Earth science data
- National Center for Biotechnology Information (NCBI) download Download via FTP, or download custom datasets using their download tools. Click the "Download" tab. Alternative site: https://www.ncbi.nlm.nih.gov/home/download/
- National Center for Health Statistics (NCHS, CDC) Resources for Researchers "data for researchers, teachers, and students who want to perform data analysis. This page compiles key sources of information found on the NCHS website for those who are interested in analysis of NCHS data as well as documentation and methodology of NCHS data systems."
- ProPublica Data Store ProPublica is an independent, non-profit newsroom. Their data store houses the data they have used in their stories. Some, but not all, of the data is free. Categories include health, environment, criminal justice, education, politics, business, transportation and military.
- Zenodo Researcher-deposited science research data.
Social Sciences
- ICPSR Restricted to current UCalgary students, faculty, and staff only. ICPSR maintains a data archive of more than 500,000 files of research in the social sciences. It hosts 16 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields. The site also provides information about ICPSR and its services and programs, as well as other related resources and instructional documentation.
- IMF eLibrary data Click on "Download IMF Data" for bulk data downloads. Provides comprehensive access to International Financial Statistics (IFS). View and download predefined data reports or create and share your own views of IMF data.
- OECD iLibrary Click on "Statistics > Extract data ..." to access the data warehouse. The OECD iLibrary is a platform giving seamless and comprehensive access to statistical data, books, journals and working papers. It replaces SourceOECD... OECD iLibrary contains all the publications and datasets released by OECD (Organisation for Economic Co-operation and Development), International Energy Agency (IEA), Nuclear Energy Agency (NEA), OECD Development Centre, PISA (Programme for International Student Assessment), and International Transport Forum (ITF) since 1998.
- Trade analyser Restricted to current UCalgary students, faculty, and staff only. This database provides web interfaces for the Canadian International Merchandise Trade (CIMT) database and the World trade database produced by Statistics Canada, International Trade Division. CIMT data contains monthly trade statistics, 1988 to latest available, showing quantity and value (CDN $), HS10 (imports) / HS8 (exports) (Harmonized Commodity Description and Coding System) classification, province or state of the U.S., country of origin/destination, and dutiable trade indicator and amount. World trade data contains annual trade statistics.
- Canadian Hansard Dataset (1901-present) Click the "Data" tab.
- Gapminder Data visualization tools that let people explore global statistics.
- LOBSTER sample files (Limit Order Book System - The Efficient Reconstructor) "access to reconstructed limit order book data for the entire universe of NASDAQ traded stocks."
- Pew Research Center datasets "Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research." An account (free) is required to download their data.
- ProPublica Data Store ProPublica is an independent, non-profit newsroom. Their data store houses the data they have used in their stories. Some, but not all, of the data is free. Categories include health, environment, criminal justice, education, politics, business, transportation and military.
- SNAP (Stanford Network Analysis Project) Large Network Dataset Collection "...more than 50 large network datasets from tens of thousands of nodes and edges to tens of millions of nodes and edges. I[t] includes social networks, web graphs, road networks, internet networks, citation networks, collaboration networks, and communication networks."
- World Input-Output Database From the Faculty of Economics and Business, University of Groningen, the WIOD "provides time series of world input-output tables. Data is included for more than 40 separate countries covering +/- 85% of world GDP." The site also links to other I/O databases.
- YYC Data Collective "The goal of YDC is to empower citizens of Calgary to collect and share data, contributing to the city’s existing open data ecosystem. YYC Data Collective complements existing open data portals, such as Open Calgary (also known as Open Data Catalogue), and Calgary Region Open Data. It focuses on the data that City does not disseminate."
Industry
- AESO (Alberta Electric System Operator) Market and System Reporting Data on electricity in Alberta, including historical and current supply and demand, and forecasting.
- IBM Watson Community Use the filter to limit to data sets. Includes data from Airbnb, World Bank World Development Indicators, and fictional and image data.
- Lending Club Statistics Loans and loan application data
- Quandl sample data An account is required to view the sample data. "Quandl is a platform for data that delivers a suite of unique, alpha-generating alternative datasets atop a strong base of core financial and economic data."
- Walt Disney Animation Studios Volumetric cloud data set and an island scene from the movie Moana.
- Last Updated: Oct 2, 2024 8:13 AM
- URL: https://libguides.ucalgary.ca/datasources