Good stuff for programming geeks

Start/2004-07-23/1

Created by tmoertel. Last edited by tmoertel 1874 days ago. Viewed 3278 times. #6

IMDB movie-rating decoder ring

[Update: If you like this entry, be sure to see the more powerful Grand Unified Decoder Ring in the IMDB Movie-Rating Decoder Ring section of the site.]

Which piece of information is more useful?

  • Spider-Man 2 has an average rating of 8.0 on IMDB.
  • Spider-Man 2 is in the top 5 percent of movies ever made.

If you keep reading, I'll show you how to turn the first into the second.

The Internet Movie Database (IMDB) is my favorite source of movie information, but it has a failing: The ratings aren't particularly useful for finding the best movies.

For example, if you look up a movie on IMDB and find that it has an average rating of 5.0, what does that mean? Intuition suggests that because IMDB rates on a 10-point scale, the movie should be near the middle of the pack – not the greatest movie in the world, but not an outright stinker, either.

Intuition, however, would be wrong. In reality, the movie is a stinker. It is, in fact, in the worst one-fourth of movies ever made.

How did our intuition lead us so far astray? The problem is that IMDB movie ratings don't reliably indicate a movie's "goodness" with respect to other movies. A 5.0 doesn't really have any particular meaning – other than being about halfway between awful and excellent, the two extremes on IMDB's rating scale. Yes, we know that a 5.0-rated movie is probably "better" than a 4.8-rated movie, but how much better? 0.2 better? What on earth does that mean?

If we want to ascribe a more useful meaning to that 5.0, we'll need to turn to descriptive statistics. And one of the most useful things to look at first is the distribution of ratings:

[Figure: Histogram of IMDB movie ratings]

From the histogram we can see that almost all movies are rated between 4 and 8. If a movie is rated lower than 4, it's one of the worst movies ever made; avoid it. If a movie is rated higher than 8, it's one of the best ever made – almost certainly worth viewing. Of that much, we can be fairly confident just by looking at the histogram.
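For readers who want to build a similar histogram from their own data, here is a minimal Python sketch. The `sample_ratings` list and the `histogram` helper are hypothetical names invented for illustration; the actual graph was made from real IMDB ratings, not from these numbers.

```python
from collections import Counter

# Hypothetical sample, made up for illustration; a real analysis would
# use the complete list of IMDB ratings.
sample_ratings = [3.1, 4.8, 5.0, 5.6, 6.2, 6.4, 6.9, 7.1, 7.3, 8.2]


def histogram(ratings, bin_width=1.0):
    """Count ratings per bin; a bin labelled b covers [b, b + bin_width)."""
    counts = Counter(int(r // bin_width) * bin_width for r in ratings)
    return dict(sorted(counts.items()))


# Print a crude text histogram, one row per bin.
for bin_start, count in histogram(sample_ratings).items():
    print(f"{bin_start:4.1f} {'#' * count}")
```

Even on this tiny sample, most of the counts pile up in the middle bins, which is the same lump the real histogram shows.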

But what about the ratings in between, the ratings in that big lump in the middle? How does our hypothetical 5.0-rated movie really stack up? To answer those questions, we must turn to the cumulative distribution function for the ratings:

[Figure: Cumulative distribution of IMDB movie ratings]

Pinpoint a movie's rating on the "Rating" axis, and then trace a line straight up from that point until it intersects the stair-step CDF curve in the middle of the graph. From there, go straight left until you hit the "Proportion of movies ..." axis. Where you land on that axis gives you the magic number that tells you how your movie stacks up against all other movies.

For example, for a 6.0-rated movie, we trace up from the 6 on the Rating axis to the CDF curve and then straight left until we hit about 0.4 on the Proportion axis. That means that the movie is better than about 40% of all other movies, or to look at it another way, 60% of movies are better than our 6.0-rated movie. Repeating the process for our hypothetical 5.0-rated movie shows that it's at the 20% mark – pretty bad.
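The graph-reading procedure is just an evaluation of the empirical cumulative distribution function (ECDF). The following Python sketch shows one way to compute it, assuming you have a list of ratings on hand. The `make_ecdf` helper and the sample data are hypothetical and invented for illustration; the numbers in this article come from the real IMDB data, not from this sample.

```python
from bisect import bisect_right


def make_ecdf(ratings):
    """Return a function F where F(x) is the proportion of ratings <= x."""
    ordered = sorted(ratings)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one rating")

    def ecdf(x):
        # bisect_right counts how many sorted ratings are <= x.
        return bisect_right(ordered, x) / n

    return ecdf


# Hypothetical sample, made up for illustration only.
sample_ratings = [3.1, 4.8, 5.0, 5.6, 6.2, 6.4, 6.9, 7.1, 7.3, 8.2]
ecdf = make_ecdf(sample_ratings)

# "Trace up from 6.0, then go left": the proportion rated 6.0 or lower.
print(f"{ecdf(6.0):.0%}")
```

Note that this sketch uses the standard "less than or equal" definition, so a movie counts toward its own rating's proportion; with thousands of movies, that difference is negligible.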

Since it's a pain in the neck to read the graph, I have made a small decoder ring that is more useful:

IMDB MOVIE RATING
DECODER RING

Movie's    % of movies
rating     it beats
-------    ------------
 4.00-           9

 5.00           21
 5.25           24
 5.50           30
 5.75           35

 6.00           42
 6.25           48
 6.50           57
 6.75           63

 7.00           72
 7.25           78
 7.50           87
 7.75           91

 8.00           95
 8.25           97
 8.50           98
 8.75           99

 9.00+         100

With the decoder ring, we can turn a movie's nearly meaningless IMDB rating into genuinely useful information – a single percentage that tells us where that movie stands within the world of movies.

All you do is look up your movie's IMDB rating in the left-hand column and take the corresponding percentile rank from the right-hand column. For example, Spider-Man 2 currently has a rating of 8.0, which corresponds to 95% on the decoder ring. That's how I knew earlier it's in the top 5% of movies ever made.
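If you would rather not do the lookup by eye, the decoder ring is easy to turn into code. Here is a minimal Python sketch that uses the table's values as given. The `percent_beaten` and `top_percent` helpers are hypothetical names, and rounding in-between ratings down to the nearest listed row is my assumption, since the table does not say how to handle them.

```python
from bisect import bisect_right

# (rating, % of movies it beats), copied from the decoder ring.
# The first row stands for "4.00-" and the last for "9.00+".
DECODER_RING = [
    (4.00, 9),
    (5.00, 21), (5.25, 24), (5.50, 30), (5.75, 35),
    (6.00, 42), (6.25, 48), (6.50, 57), (6.75, 63),
    (7.00, 72), (7.25, 78), (7.50, 87), (7.75, 91),
    (8.00, 95), (8.25, 97), (8.50, 98), (8.75, 99),
    (9.00, 100),
]
LISTED_RATINGS = [rating for rating, _ in DECODER_RING]


def percent_beaten(rating):
    """Return the decoder-ring percentage for an IMDB rating.

    Assumption: a rating between two listed rows rounds down to the
    lower row; anything below 4.00 uses the "4.00-" row.
    """
    i = bisect_right(LISTED_RATINGS, rating) - 1
    return DECODER_RING[max(i, 0)][1]


def top_percent(rating):
    """Express the same lookup as "top N percent of movies"."""
    return 100 - percent_beaten(rating)


print(percent_beaten(8.0), top_percent(8.0))
```

For a rating of 8.0 this gives 95 and 5, matching the Spider-Man 2 example. Rounding down is the conservative choice: it never credits a movie with beating more movies than the nearest listed row supports.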

I use the decoder ring all the time, and it has made it much easier to select movies that truly are worth watching. It's a great tool. I hope that you find it as useful as I have.

community.moertel.com | Copyright © 2003–07 Moertel Consulting