24 Free Public Datasets Sites Every Data Analyst Must Know

The Data Deluge: Why Reliable Public Datasets Matter for Data Analysts

Data is everywhere – but can you trust it? In today’s data-driven world, data analysts are like modern-day prospectors, sifting through mountains of data to unearth valuable insights. But just like prospectors need a reliable source of ore, data analysts need access to high-quality, trustworthy public datasets or open datasets.

Inaccurate or incomplete data can lead to misleading results and faulty conclusions. This is why knowing where to find reliable and well-maintained open datasets is a crucial skill for any data analyst. This article will equip you with the knowledge you need to navigate the vast landscape of public datasets, introducing you to some of the best free public datasets sites. So, grab your data analysis tools and get ready to dive into a world of trustworthy information!

Public Datasets Use Cases

Numerous websites offer a treasure trove of free open datasets across various domains, ready to fuel exploration and experimentation in your data analysis and data science projects. Use cases for public datasets include:

Improve your Analytical Skills: Public datasets are a goldmine for aspiring data analysts. Experimenting with readily available data allows you to hone your skills in data wrangling, cleaning, and analysis. From exploring statistical techniques to practicing machine learning algorithms, public datasets provide a safe and cost-effective environment to learn by doing.

Dataset Enrichment: Even seasoned data analysts can benefit from the power of public data. Supplementing your private datasets with relevant public data can add depth and context to your analysis. Imagine analyzing customer demographics – enriching your data with public census data can paint a more comprehensive picture of your target audience.
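As a rough illustration of enrichment, the sketch below joins a few made-up customer records to a made-up table of median household income by ZIP code. The field names (zip, median_income) and all values are hypothetical stand-ins for a real private dataset and a real census extract.

```python
# Hypothetical private data: one row per customer.
customers = [
    {"customer_id": 1, "zip": "10001", "spend": 120.0},
    {"customer_id": 2, "zip": "94105", "spend": 80.5},
    {"customer_id": 3, "zip": "00000", "spend": 42.0},  # ZIP absent from public data
]

# Hypothetical public extract keyed by ZIP code (values are invented).
census_by_zip = {
    "10001": {"median_income": 96000},
    "94105": {"median_income": 180000},
}


def enrich(rows, lookup, key="zip"):
    """Left-join rows to a lookup table; unmatched rows get None for public fields."""
    public_fields = sorted({field for rec in lookup.values() for field in rec})
    enriched = []
    for row in rows:
        match = lookup.get(row[key], {})
        extra = {field: match.get(field) for field in public_fields}
        enriched.append({**row, **extra})
    return enriched


enriched_customers = enrich(customers, census_by_zip)
for row in enriched_customers:
    print(row)
```

Keeping unmatched rows (a left join) makes gaps in the public data visible instead of silently dropping customers. In practice the same join is often done with a dataframe library.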

Inspiration Ignition: Public datasets can spark creativity and fuel innovative projects. Imagine analyzing historical weather patterns alongside social media sentiment to understand public perception of climate change. The possibilities are endless! Public data can be the springboard for groundbreaking research and insightful discoveries.

Free Public Datasets Sites

The following free public dataset sites are listed in no particular order. We’ve curated a diverse list spanning a wide array of domains, so you can find data for most analytical challenges.

  1. Kaggle

Kaggle is a popular platform for data science competitions, courses, and collaboration among data scientists and data analysts. It also offers a vast collection of public datasets covering a wide range of domains, including finance, healthcare, marketing, social media, natural language processing, computer vision, and more.

  2. Google Cloud Public Datasets

Google Cloud Public Datasets offers convenient access to a wide range of public datasets hosted in BigQuery. Analyze weather data, explore economic trends, or delve into social media information, all readily available for your data analysis projects. Google covers the storage cost of these datasets, and the first 1 TB of query data processed each month is free, making it a budget-friendly option for data enthusiasts.
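Because the free monthly allowance is measured in data processed, it can help to estimate a query's size before running it. The sketch below assumes the google-cloud-bigquery Python client is installed and credentials are configured. The table is only an example of a public dataset table, and the free-tier constant conservatively treats 1 TB as 10**12 bytes.

```python
FREE_TIER_BYTES = 10**12  # assumption: 1 TB counted conservatively as 10**12 bytes

# Example query against a public table; swap in the dataset you actually need.
SQL = """
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name
ORDER BY total DESC
LIMIT 10
"""


def share_of_free_tier(bytes_processed, free_tier_bytes=FREE_TIER_BYTES):
    """Return the fraction of the monthly free allowance one query would use."""
    return bytes_processed / free_tier_bytes


def estimate_query_bytes(sql):
    """Dry-run a query so BigQuery reports bytes scanned without executing it."""
    # Imported here so share_of_free_tier works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


# Example (requires credentials):
#   scanned = estimate_query_bytes(SQL)
#   print(f"{share_of_free_tier(scanned):.2%} of the monthly free tier")
```

A dry run does not execute the query, so it is a cheap way to catch an accidental full-table scan before it counts against the allowance.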

  3. IBM Datasets

Explore a rich collection of datasets offered by IBM, encompassing various domains like weather, finance, healthcare, and social media. These datasets are ready to use in enterprise AI applications and often come with relevant tutorials and notebooks to jumpstart your analysis. Be sure to check the license terms associated with each dataset before using it in your projects.

  4. Azure Open Datasets

Explore a comprehensive collection of curated and preprocessed open datasets offered by Microsoft Azure. These datasets span various domains like weather, economics, social media, and machine learning, saving you time on data preparation and allowing you to focus on analysis. Azure Open Datasets integrates seamlessly with other Azure services for a smooth data science workflow.

  5. World Health Organization (WHO) Data

The World Health Organization (WHO) is a leading source of global health data. Explore datasets on causes of death, disease outbreaks, immunization coverage, and health workforce statistics. Gain valuable insights into global health trends and inform public health initiatives around the world.

  6. County Health Rankings & Roadmaps (CHR&R)

Explore public health data like health outcomes, social determinants, and clinical care across U.S. counties. Easy downloads, interactive maps, and insightful reports empower public health professionals and researchers.

  7. UCI Machine Learning Repository

The UCI Machine Learning Repository is a longstanding resource for machine learning datasets. While the interface might seem dated, it offers a curated collection of high-quality datasets for a variety of machine learning tasks, such as classification, regression, clustering, image, speech, text, and more.

  8. Data.gov

Data.gov is a U.S. government website that serves as a central hub for open datasets from various government agencies. You can find datasets on topics like economics, demographics, weather, healthcare, the environment, and more.

  9. U.S. Census Bureau

The U.S. Census Bureau is a treasure trove of demographic and economic data for the United States. Explore population statistics, income levels, housing data, and more, providing valuable insights into various social and economic trends.
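Census data can also be pulled programmatically. The sketch below assumes the Census Bureau's public data API at api.census.gov, the 2020 redistricting endpoint, and the variable P1_001N for total population. Treat all three as assumptions to verify against the API documentation. The sample table only mimics the response shape (a header row followed by data rows) and its values are invented.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.census.gov/data/2020/dec/pl"  # assumed endpoint


def build_url(variables, geography="state:*"):
    """Build a request URL for the given variables and geography filter."""
    query = urlencode({"get": ",".join(variables), "for": geography}, safe=",:*")
    return f"{BASE_URL}?{query}"


def rows_to_dicts(table):
    """Turn [header, row, row, ...] into a list of dicts keyed by header name."""
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


def fetch(variables, geography="state:*"):
    """Download and parse a table (needs network access)."""
    with urlopen(build_url(variables, geography), timeout=30) as response:
        return rows_to_dicts(json.load(response))


# Invented sample in the assumed response shape; values arrive as strings.
sample_table = [
    ["NAME", "P1_001N", "state"],
    ["Examplestate", "1000", "99"],
]
records = rows_to_dicts(sample_table)
print(records)
```

Converting the header-plus-rows layout into dictionaries early makes the later cleaning steps (casting strings to numbers, filtering by state) easier to read.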

  10. U.S. Bureau of Labor Statistics (BLS)

The BLS is a leading source of economic data in the United States. Explore datasets on employment, unemployment, wages, prices, productivity, and more. Find the latest economic news releases, access charts and tables, and utilize data retrieval tools for in-depth analysis.

  11. Food and Nutrition Service (FNS)

Explore data on food assistance programs administered by the USDA’s Food and Nutrition Service (FNS). This includes information on program participation, food expenditures, and other metrics related to the Supplemental Nutrition Assistance Program (SNAP), Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), and various child nutrition programs.

  12. NASA Earthdata

Dive into a treasure trove of Earth science data from NASA! Explore atmospheric, oceanic, land, and cryospheric measurements to gain insights into our planet’s climate, ecosystems, and environmental changes. NASA Earthdata also offers data on human-environment interactions and solar radiation.

  13. World Bank Open Data

The World Bank Open Data platform provides access to a massive collection of development-related data from around the world. This includes data on economics, poverty, education, health, environment, demographics, and more.
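World Bank indicators can be fetched over HTTP as JSON. The sketch below assumes the v2 indicators API, the indicator code SP.POP.TOTL (total population), and annual data whose "date" field is a plain year. The sample payload only imitates the two-element response shape (metadata, then records) and its numbers are invented.

```python
import json
from urllib.request import urlopen

API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, per_page=100):
    """Build the URL for one country/indicator pair, asking for JSON."""
    return (
        f"{API_ROOT}/country/{country}/indicator/{indicator}"
        f"?format=json&per_page={per_page}"
    )


def parse_indicator(payload):
    """Return {year: value} from a [metadata, records] response, skipping gaps."""
    if len(payload) < 2 or not payload[1]:
        return {}
    return {
        int(record["date"]): record["value"]
        for record in payload[1]
        if record["value"] is not None
    }


def fetch_indicator(country, indicator):
    """Download and parse one indicator series (needs network access)."""
    with urlopen(indicator_url(country, indicator), timeout=30) as response:
        return parse_indicator(json.load(response))


# Invented sample in the assumed response shape.
sample_payload = [
    {"page": 1, "pages": 1, "per_page": 100, "total": 3},
    [
        {"date": "2021", "value": 110},
        {"date": "2020", "value": None},  # missing observation
        {"date": "2019", "value": 100},
    ],
]
series = parse_indicator(sample_payload)
print(series)
```

Dropping missing observations at parse time is a choice made for this sketch. For time-series work you may prefer to keep the gaps and handle them explicitly.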

  14. UNdata

Explore a vast collection of international data maintained by the United Nations. UNdata offers statistics and indicators on population, education, health, environment, economy, and more. You can search and download data by country or region.

  15. UK Open Datasets

Since 2010, data.gov.uk has been a central hub for finding and using open government data in the UK. Search and browse published datasets across various domains like business, crime, education, environment, and government spending.

  16. Open Government – Canada.ca

Delve into Canada’s Open Government portal to discover a wealth of open datasets published by the Government of Canada. Find datasets on various topics like benefits, spending, environment, and national security. Explore resources for searching and accessing data, and learn more about Canada’s commitment to open data transparency.

  17. Datahub.io

Datahub provides public datasets on a variety of topics, including climate change, entertainment, stock market data, property prices, inflation, and logistics. Most of the data is updated daily or monthly.

  18. Quandl

Quandl, now part of Nasdaq Data Link, offers financial and economic datasets, including stock prices, exchange rates, and economic indicators. It provides its own API for searching and downloading data.

  19. OpenWeatherMap

OpenWeatherMap is a popular platform that provides free weather data through an API. You can access current and historical weather data for locations worldwide, including temperature, precipitation, wind speed, and more. The API requires a key, but obtaining a free-tier key is relatively straightforward.
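A minimal Python sketch of a current-weather request follows. It assumes the 2.5 "weather" endpoint and reads the key from an environment variable named OPENWEATHER_API_KEY, which is a hypothetical name chosen for this example. The sample payload imitates only the fields used here and its values are invented.

```python
import json
import os
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://api.openweathermap.org/data/2.5/weather"


def current_weather_url(city, api_key, units="metric"):
    """Build the request URL; urlencode escapes spaces in city names."""
    query = urlencode({"q": city, "appid": api_key, "units": units})
    return f"{ENDPOINT}?{query}"


def summarize(payload):
    """Pick out a few fields, tolerating missing keys."""
    return {
        "city": payload.get("name"),
        "temp": payload.get("main", {}).get("temp"),
        "wind_speed": payload.get("wind", {}).get("speed"),
    }


def fetch_current(city):
    """Call the API (needs network access and a valid key)."""
    api_key = os.environ["OPENWEATHER_API_KEY"]  # hypothetical variable name
    with urlopen(current_weather_url(city, api_key), timeout=30) as response:
        return summarize(json.load(response))


# Invented sample containing only the fields this sketch reads.
sample_payload = {"name": "Exampletown", "main": {"temp": 18.5}, "wind": {"speed": 3.2}}
summary = summarize(sample_payload)
print(summary)
```

Reading the key from the environment keeps it out of notebooks and version control.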

Learn how to connect to OpenWeather API in R.

  20. National Centers for Environmental Information (NCEI)

Access global and U.S. climate and weather data from NCEI, part of the National Oceanic and Atmospheric Administration (NOAA). NCEI offers information on precipitation, temperature, snow, and severe weather events.

  21. FiveThirtyEight Data

Dive into FiveThirtyEight’s data universe! Explore datasets and code used to create forecasts and graphics across various domains like sports, elections, economics, and culture. You’ll also find their pollster ratings here.

  22. Google Dataset Search

Google Dataset Search doesn’t host open datasets itself, but it is a powerful tool for finding datasets across the web. It aggregates datasets from various sources and provides information about each dataset’s format, size, and license. This dataset search engine lets you explore a vast range of domains based on your specific needs.

  23. Open Data Network

Open Data Network is a platform that connects users to a network of data providers offering various public datasets. You can find data on topics like finance, public safety, infrastructure, and housing and development.

  24. GeoNames

GeoNames is a free and open resource for geographical information. It contains over 25 million geographical names, including cities, towns, mountains, rivers, and more. GeoNames data is sourced from various national and international geographic databases, gazetteers, and user contributions, a collaborative approach that helps keep the database comprehensive and accurate.

With its rich dataset and user-friendly interface, GeoNames has become a valuable resource for developers, researchers, and anyone working with geographical information.
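GeoNames distributes its gazetteer as tab-delimited text files. The sketch below reads that layout with Python's standard library, assuming the 19-column order described in the GeoNames download readme. The column names are this example's own labels, and the sample line is invented.

```python
import csv
import io

# Assumed column order of the GeoNames dump files (labels chosen for this sketch).
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude", "longitude",
    "feature_class", "feature_code", "country_code", "cc2", "admin1_code",
    "admin2_code", "admin3_code", "admin4_code", "population", "elevation",
    "dem", "timezone", "modification_date",
]


def read_geonames(handle):
    """Yield one dict per well-formed line, with numeric fields converted."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip malformed lines
        row = dict(zip(COLUMNS, fields))
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
        row["population"] = int(row["population"] or 0)
        yield row


# Invented sample line in the assumed layout.
sample = io.StringIO(
    "\t".join([
        "1", "Exampleville", "Exampleville", "", "12.5", "-45.25",
        "P", "PPL", "XX", "", "", "", "", "", "1234", "", "10",
        "Etc/UTC", "2024-01-01",
    ]) + "\n"
)
places = list(read_geonames(sample))
print(places[0]["name"], places[0]["population"])
```

For a real dump, pass an open file handle (for example, open(path, encoding="utf-8", newline="")) instead of the in-memory sample. Quoting is disabled because place names can contain quote characters.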

Extra Bonus (Sample Datasets)

While not real-world data, Microsoft Power BI Samples provide a valuable resource for learning the functionalities of Power BI. Explore pre-built reports and datasets covering various scenarios to practice your data analysis and visualization skills within the Power BI interface.

Similar to Power BI samples, Tableau Public Sample Datasets offer pre-built datasets designed to help you learn the ropes of Tableau Public. Experiment with creating visualizations and dashboards using these sample datasets to get comfortable with the software’s features.

The Bottom Line

With these resources at your fingertips, you’ll have the data you need to fuel groundbreaking research, insightful visualizations, and powerful data-driven projects. So, happy exploring!
