

Free Dataset Repositories for Data Mining and Visualization


People working in databases, data mining, data visualization and business intelligence need datasets to implement, run and test their algorithms. There are many resources on the internet where you can get synthetic and real datasets for free. Some of them are benchmark datasets, which let you test an algorithm's performance against industry standards. Here I am documenting some of the free resources that will help you in your data search for academic and industry needs. I will try to keep this list updated over time.

P.S.: Before using any of the datasets mentioned below, please read their respective usage policies.

  • KDD Cup Datasets – KDD is a well-known knowledge discovery conference; its KDD Cup competition releases data for researchers and academia.
  • LETOR – This benchmark dataset from Microsoft is used for training, validating and testing Learning to Rank algorithms (used in search engines).
  • Yahoo Webscope – Yahoo provides data here for different needs. Examples include language data, graph and social data, ratings data, advertising and market data, and competition data.
  • InfoChimps – A search engine for all your data needs. It offers a large variety of free and paid datasets.
  • Reddit Opendata – Gives you news about open datasets.
  • Google Public Data Explorer – Gives you access to different governmental and public datasets and lets you visualize them in different ways. The visualization page links to the official source page, where you can download the respective data.
  • FIMI repository – The Frequent Itemset Mining Implementations repository. Most of the datasets here can be used for frequent pattern mining.
  • UCI Machine Learning Repository – Contains databases and database generators contributed by many people over time. As of this writing it consists of 199 datasets.
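As a rough illustration of how a downloaded dataset can be used to test an algorithm, here is a minimal Python sketch that counts single-item support, the first step of frequent pattern mining. It assumes a transaction file with one transaction per line and whitespace-separated item IDs, a layout commonly used for frequent itemset data; check the format of the dataset you actually download. The function names and the sample data are made up for this example.

```python
from collections import Counter
from io import StringIO


def read_transactions(lines):
    """Parse one transaction per line; items are whitespace-separated tokens."""
    transactions = []
    for line in lines:
        items = line.split()
        if items:  # skip blank lines
            transactions.append(frozenset(items))
    return transactions


def frequent_items(transactions, min_support):
    """Return the items that occur in at least min_support transactions."""
    counts = Counter(item for t in transactions for item in t)
    return {item: n for item, n in counts.items() if n >= min_support}


# Made-up sample standing in for a downloaded file (assumption, not real data).
sample = StringIO("1 2 3\n2 3\n1 3 4\n\n2 3 4\n")
transactions = read_transactions(sample)
result = frequent_items(transactions, min_support=3)
print(result)
```

To run it on a real file, replace the StringIO sample with open("your_file.dat"), where the file name is whatever you saved the dataset as.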

You can also check out KDNuggets for their collection of datasets. Note: not all datasets there are free.

Please help me keep this list current by letting me know about more free datasets.

Joshua H.

ABOUT THE AUTHOR

Hey, I am Josh and I've been into journalism from an early age. It's my passion to create reviews and provide content that people enjoy. I have a big vision for Tech2View, and my co-writers and I are constantly working on giving you the latest and most desired content. Thanks for reading - Josh


1 Comment

  1. Malaya

    very useful


Tech2View.com

Tech2View - Simplifying Tech Since 2007
