Heritage Health Prize: $3 million

If you are really good at data mining, there is big money to be made.

As you may know, Kaggle is a company that matches companies in serious need of data mining with data scientists. It does this by hosting data mining competitions on behalf of those companies and letting the rest of us take part.

Since its inception Kaggle has grown enormously, and last year marked a milestone for the company when, together with Heritage Provider Network, it launched the Heritage Health Prize.


The Heritage Health Prize is a single competition running until April 2013, with a first prize of $3 million for the best prediction. The competition has attracted a lot of interest: there are currently more than 1200 teams taking part.

The Heritage Provider Network has provided more than 70,000 rows of data on real people and their history with the health-care system (in the USA) over a two-year period (giving more than 140,000 rows in total). The variable we want to predict is the total number of days each patient spends in hospital.
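To make the prediction task concrete, here is a minimal sketch of how a predicted number of hospital days could be scored against the actual number. It assumes the root mean squared logarithmic error (RMSLE) as the error measure, which this post does not state, and it uses made-up toy numbers rather than the real competition data. The names `rmsle`, `train_days` and `test_days` are purely illustrative.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    squared = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: days in hospital per patient (NOT the real dataset).
train_days = [0, 0, 1, 0, 3, 0, 0, 2]
test_days = [0, 1, 0, 0, 4]

# Constant baseline: the single value that minimises RMSLE on the training
# data is exp(mean(log(1 + y))) - 1.
baseline = math.expm1(
    sum(math.log1p(d) for d in train_days) / len(train_days)
)

# Predict the same value for every patient and score it on the held-out data.
predictions = [baseline] * len(test_days)
score = rmsle(predictions, test_days)
print(f"baseline prediction: {baseline:.3f} days, RMSLE: {score:.3f}")
```

A constant guess like this is only a reference point; any real model built on the patient histories should score lower than it.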

The reason for the large prize is that the ability to predict this number would let hospitals optimize the use of their capacity, which is one of the holy grails of cost reduction in health care.

The competition has several milestones, at each of which the best prediction so far receives a smaller cash prize. If this has sparked your interest, there is an excellent blog post by the winners of last year's milestone: http://anotherdataminingblog.blogspot.dk/2011/10/code-for-respectable-hhp-model.html

You can use the blog post as a starting point to get in the game for the prize money, or as a good read on how to do data mining on complex datasets.


