patriciahoffmanphd

Drew Conway gave this list: http://www.quora.com/Programming-Challenges-1/What-are-some-good-toy-problems-in-data-science



  • World Bank - http://data.worldbank.org/ there are literally too many data sets here to count, but given the WB's mission, most of them focus on growth and development. A small project doing some basic time-series or correlation analysis of this data could be interesting. Comparing post-earthquake metrics for the relief efforts in Haiti vs. Pakistan might be a cool place to start.
  • U.S. Census - http://www.census.gov/main/www/a... also, Infochimps has a great set of APIs focused on census data (http://api.infochimps.com/), and if you are an R hacker you could use my wrapper to access it (http://cran.r-project.org/web/pa...). Census data is great for doing spatial analysis, e.g., compare the average level of education to mean household income for all US ZIP codes and stick it on a map.
  • ICPSR - http://www.icpsr.umich.edu/icpsr... the Inter-university Consortium for Political and Social Research is a treasure trove of socially relevant data, and includes current and past waves of the American National Election Study. This would be a good place to consider doing a mash-up, perhaps voting patterns in a given census tract, controlling for income and education.
  • Yelp - http://www.yelp.com/developers/d... people love to eat and be entertained, and the Yelp API has a decent set of tools for extracting these preferences. Recently, I tried to play around with this API as part of a project involving health code violation data from NYC (http://www.nyc.gov/html/datamine...) and found it a bit unruly to work with. But if you have a smaller project in mind, it certainly fits your description.
  • Local data - speaking of NYC Data Mine, some of the most useful toy data apps I have seen involve local open data. Check to see if your city, or one nearby, maintains an open data repository and start hacking. Hint: people love to know where buses and taxis are.
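Several of the ideas in the list boil down to a basic correlation between two indicators (for example, education level vs. household income). A minimal sketch of that step in plain Python might look like the following. The function name `pearson` and both lists are hypothetical, and the numbers are made-up placeholders rather than real census or World Bank values; in practice you would load the two columns from whichever data set you picked.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up placeholder values, NOT real data: one entry per hypothetical region.
education_years = [10.5, 12.0, 13.2, 14.8, 16.1]
household_income = [31000, 42000, 47500, 61000, 78000]

print(round(pearson(education_years, household_income), 3))
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests little or none; it says nothing about causation, which is why the ICPSR idea above talks about controlling for income and education.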