Ask Capitol Knowledge Base


How to Find and Access Datasets for Research

Overview

Description: This article provides guidance on locating and accessing datasets for research. It covers different types of datasets, including open data repositories, government sources, and library databases. Users will learn how to search for datasets by topic, organization, or file type and how to access datasets through academic databases such as ACM Digital Library and IEEE Xplore. 

Estimated Time of Completion (ETOC): 10–15 minutes. 

Datasets

A dataset (data set) is a collection of raw statistics and information. Datasets produced by government agencies or non-profit organizations can usually be downloaded free of charge. However, datasets developed by for-profit companies may be available for a fee. 

You can locate datasets by identifying the agency or organization that focuses on your research area. For example, if you are interested in public opinion on social issues, Pew Research Center is a good place to look. For population data, the U.S. Census Bureau’s Population Estimates Program is an authoritative source; its data is available through data.census.gov, which replaced American FactFinder in 2020.

Open Data Resources 

| Site | Structure | Source Type | Topics |
|---|---|---|---|
| Data.gov | Repository | Public | U.S. Environment, Climate, Health, Government |
| DataPlanet | Repository | Public | Multidisciplinary |
| Dept. of Education | Website | Public | Education, Educational Institutions |
| Dryad | Repository | Public | Health, Biology |
| GitHub | Repository | Public | Computer Science, Tech |
| Google Dataset Search | Search Engine | Public | Multidisciplinary |
| Harvard Dataverse | Repository | Public | Multidisciplinary, Social Sciences |
| Healthdata.gov | Repository | Public | Health, Healthcare |
| ICPSR | Repository | 3rd Party | Multidisciplinary, Social Sciences |
| Kaggle | Repository | Public | Multidisciplinary |
| Mendeley Data | Search Engine | Public | Multidisciplinary |
| National Artificial Intelligence Research Resource Pilot (NAIRR Pilot) | Repository | Public | AI, Computer Science, Multidisciplinary |
| NCES | Repository | Public | Education, Educational Institutions |
| Pew Research Center | Website | Public | Social Science Demographics, Trends |
| Quandl | Repository | Mixed | Financial, Business |
| Re3 | Registry of Repositories | Public | n/a |
| Registry of Open Data on AWS | Search Engine | Mixed | Multidisciplinary |
| Zenodo | Repository | Public | Multidisciplinary |
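If you prefer to shortlist sites programmatically, the table above can be treated as a small lookup list. The following is only a minimal sketch: the rows are copied from the table, while the names RESOURCES and find_resources are hypothetical helpers invented for this illustration, and the matching is a simple case-insensitive substring check on the Topics column.

```python
# Rows copied from the Open Data Resources table:
# (site, structure, source type, topics).
RESOURCES = [
    ("Data.gov", "Repository", "Public", "U.S. Environment, Climate, Health, Government"),
    ("DataPlanet", "Repository", "Public", "Multidisciplinary"),
    ("Dept. of Education", "Website", "Public", "Education, Educational Institutions"),
    ("Dryad", "Repository", "Public", "Health, Biology"),
    ("GitHub", "Repository", "Public", "Computer Science, Tech"),
    ("Google Dataset Search", "Search Engine", "Public", "Multidisciplinary"),
    ("Harvard Dataverse", "Repository", "Public", "Multidisciplinary, Social Sciences"),
    ("Healthdata.gov", "Repository", "Public", "Health, Healthcare"),
    ("ICPSR", "Repository", "3rd Party", "Multidisciplinary, Social Sciences"),
    ("Kaggle", "Repository", "Public", "Multidisciplinary"),
    ("Mendeley Data", "Search Engine", "Public", "Multidisciplinary"),
    ("NAIRR Pilot", "Repository", "Public", "AI, Computer Science, Multidisciplinary"),
    ("NCES", "Repository", "Public", "Education, Educational Institutions"),
    ("Pew Research Center", "Website", "Public", "Social Science Demographics, Trends"),
    ("Quandl", "Repository", "Mixed", "Financial, Business"),
    ("Re3", "Registry of Repositories", "Public", "n/a"),
    ("Registry of Open Data on AWS", "Search Engine", "Mixed", "Multidisciplinary"),
    ("Zenodo", "Repository", "Public", "Multidisciplinary"),
]


def find_resources(topic, source_type=None):
    """Return site names whose Topics column mentions `topic`.

    Hypothetical helper: matching is a case-insensitive substring test,
    optionally narrowed to one Source Type value (e.g. "Public").
    """
    needle = topic.lower()
    matches = []
    for site, _structure, src_type, topics in RESOURCES:
        if needle not in topics.lower():
            continue
        if source_type is not None and src_type != source_type:
            continue
        matches.append(site)
    return matches


print(find_resources("health"))
print(find_resources("multidisciplinary", source_type="Mixed"))
```

Because the match is a plain substring test, a broad term such as "multidisciplinary" returns many sites, so it may help to start with a specific subject and widen the search only if nothing turns up.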

Datasets by State or Country

To find open data for a particular U.S. state or country, search the web using the keywords: open data [name of state or country]


  • Search for your topic keywords followed by the word ‘dataset.’ Example: cybersecurity threats dataset
  • Search for your topic keywords together with the operator filetype:xls, which locates Excel files that might contain raw data. Example: “artificial intelligence” filetype:xls [Note: There is no space between filetype: and xls]
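The search patterns above can also be assembled in a short script, which may be handy when you need to run the same searches for several topics. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the search_url helper assumes Google's standard search URL, which is only one of the search engines you could use.

```python
from urllib.parse import quote_plus


def open_data_query(place):
    """Keywords for open data from a U.S. state or a country."""
    return f"open data {place}"


def dataset_query(topic):
    """Topic keywords followed by the word 'dataset'."""
    return f"{topic} dataset"


def filetype_query(phrase, extension="xls"):
    """Exact phrase plus a file-type restriction.

    There is no space between 'filetype:' and the extension.
    """
    return f'"{phrase}" filetype:{extension}'


def search_url(query, base="https://www.google.com/search"):
    """URL-encode a query (assumption: a Google-style '?q=' parameter)."""
    return f"{base}?q={quote_plus(query)}"


for query in (
    open_data_query("Maryland"),
    dataset_query("cybersecurity threats"),
    filetype_query("artificial intelligence"),
):
    print(query, "->", search_url(query))
```

Passing a different extension, for example filetype_query("artificial intelligence", "csv"), targets other file types that commonly hold raw data.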

Library Databases

ACM Digital Library 

Datasets from a research article may be included as Zip files or Txt files. 

  1. Access the ACM Digital Library database from the Virtual Library Database Menu page. 
  2. Use the search box to locate relevant research articles on your topic. Include the word ‘dataset’ with your keywords. Example: llm dataset 
  3. Expand Media Formats on the search results page under Publications.
  4. Select Archive/Zip or Txt.
  5. Select an article from your results page to review the Zip or Txt file under “Supplementary Material.”
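Once you have downloaded a Zip file from “Supplementary Material,” you may want to check what it contains before extracting it. The sketch below uses Python's standard zipfile module; the function names and the file names in the usage comments (supplementary.zip, data/readme.txt) are hypothetical placeholders, not names used by ACM.

```python
import zipfile


def list_archive(archive_path):
    """Return (file name, size in bytes) for each file in the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


def preview_text_member(archive_path, member, max_lines=5):
    """Return the first few lines of a text file stored in the archive.

    Assumes the file is UTF-8 text; undecodable bytes are replaced.
    """
    lines = []
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as handle:
            for raw in handle:
                if len(lines) >= max_lines:
                    break
                lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
    return lines


# Example usage (hypothetical file names):
#   for name, size in list_archive("supplementary.zip"):
#       print(f"{size:>10}  {name}")
#   print(preview_text_member("supplementary.zip", "data/readme.txt"))
```

Previewing a few lines first is a quick way to confirm that a Txt file really holds data (and to see its delimiter or header row) before loading it into an analysis tool.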

IEEE Xplore 

Use the following steps to locate a dataset used in a research article within the IEEE Xplore Digital Library database. 

  1. Access the IEEE Xplore Digital Library database from the Virtual Library Database Menu page. 
  2. Use the search box to locate relevant research articles on your topic. Include the word ‘dataset’ with your keywords. Example: llm dataset
  3. Expand Supplemental Items on the search results page.
  4. Select Datasets and select Apply.
  5. Your filtered results have an icon that indicates Dataset Available. Select an article title to view the dataset.
  6. On the article’s page, locate the “Code & Datasets” section.
  7. Select the link to view the dataset in the data repository. 

Still Need Help?

If you have issues accessing these resources, contact Ask a Librarian for assistance.
