Netflix Prize

On September 21, 2009, the grand prize of US$1,000,000 was awarded to the team BellKor's Pragmatic Chaos, which bested Netflix's own algorithm for predicting ratings by 10.06%.

Netflix provided a training data set of 100,480,507 ratings that 480,189 users gave to 17,770 movies.

The other half was the test set of 1,408,789 ratings, and performance on this set was used by the jury to determine potential prize winners.

The probe, quiz, and test data sets were chosen to have similar statistical properties.

For each movie, the title and year of release were provided in a separate dataset.

It has been claimed that even an improvement as small as 1% in RMSE results in a significant difference in the ranking of the "top-10" most recommended movies for a user.

A trivial algorithm that predicts for each movie in the quiz set its average grade from the training data produces an RMSE of 1.0540.
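This baseline is straightforward to reproduce. The sketch below, on hypothetical stand-in data rather than the actual Netflix Prize files, predicts each movie's mean training rating (falling back to the global mean for unseen movies) and scores the result by RMSE:

```python
import math
from collections import defaultdict

def movie_mean_baseline(train, quiz):
    """RMSE of a per-movie-mean predictor.

    `train` and `quiz` are lists of (user_id, movie_id, rating) triples --
    a stand-in format; the real Prize data shipped as per-movie text files.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for _, movie, rating in train:
        totals[movie][0] += rating
        totals[movie][1] += 1
    global_mean = sum(r for _, _, r in train) / len(train)
    means = {m: s / n for m, (s, n) in totals.items()}

    # RMSE: square root of the mean squared prediction error on the quiz set.
    sq_err = 0.0
    for _, movie, rating in quiz:
        pred = means.get(movie, global_mean)
        sq_err += (pred - rating) ** 2
    return math.sqrt(sq_err / len(quiz))
```

Run on the full training and quiz sets, this per-movie-mean predictor is what yields the 1.0540 figure quoted above.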

In order to win the grand prize of $1,000,000, a participating team had to improve on Netflix's own Cinematch algorithm (an RMSE of 0.9525 on the test set) by a further 10%, achieving 0.8572 or better on the test set.
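The arithmetic behind the threshold, and the winning margin, can be checked directly. The figures used here are Cinematch's widely reported test-set RMSE of 0.9525 and the winning team's final test-set RMSE of 0.8567 (consistent with the 10.06% improvement cited above):

```python
# Reported contest figures (assumed here): Cinematch's test-set RMSE
# and BellKor's Pragmatic Chaos's final test-set RMSE.
cinematch = 0.9525
winning = 0.8567

threshold = 0.90 * cinematch               # 10% relative improvement
improvement = (cinematch - winning) / cinematch

print(f"grand-prize threshold: {threshold:.5f}")   # ~0.85725, i.e. 0.8572
print(f"winning improvement:  {improvement:.2%}")  # 10.06%
```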

To win a progress or grand prize a participant had to provide source code and a description of the algorithm to the jury within one week after being contacted by them.

Only then was the team with the best submission asked for the algorithm description, source code, and a non-exclusive license, and, after successful verification, declared a grand prize winner.

On August 12, 2007, many contestants gathered at the KDD Cup and Workshop 2007, held in San Jose, California.

On November 13, 2007, team KorBell (formerly BellKor) was declared the winner of the $50,000 Progress Prize with an RMSE of 0.8712 (8.43% improvement).

The team consisted of three researchers from AT&T Labs, Yehuda Koren, Robert Bell, and Chris Volinsky.

Their submission, combined with that of a different team, BigChaos, achieved an RMSE of 0.8616 with 207 predictor sets.

In accordance with the Rules, teams had thirty days, until July 26, 2009, 18:42:37 UTC, to make submissions that would be considered for this Prize.

The final standing of the Leaderboard at that time showed that two teams met the minimum requirements for the Grand Prize.

In 2007 two researchers from The University of Texas at Austin (Vitaly Shmatikov and Arvind Narayanan) were able to identify individual users by matching the data sets with film ratings on the Internet Movie Database.
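The linkage idea can be illustrated with a toy matcher (the names, data, and scoring rule below are hypothetical; the actual Narayanan–Shmatikov attack used a weighted similarity over both ratings and dates, tolerant of small errors):

```python
def match_user(anon_ratings, public_profiles, min_overlap=2):
    """Link an anonymized rating set to the most similar public profile.

    Toy record linkage: `anon_ratings` maps movie -> rating, and
    `public_profiles` maps a name to such a dict. A profile scores a point
    for every shared movie rated within one star of the anonymized rating.
    """
    best_name, best_score = None, 0
    for name, profile in public_profiles.items():
        overlap = [m for m in anon_ratings if m in profile]
        score = sum(1 for m in overlap
                    if abs(profile[m] - anon_ratings[m]) <= 1)
        if len(overlap) >= min_overlap and score > best_score:
            best_name, best_score = name, score
    return best_name
```

The researchers' key observation was that outside a user's most popular movies, very few (movie, rating, approximate-date) triples are needed to make a match nearly unique.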

On March 19, 2010, Netflix reached a settlement with the plaintiffs, after which they voluntarily dismissed the lawsuit.