Last weekend I attended the #filmdatahack event at the Abandon Normal Devices festival. AND describes itself as a “festival of new cinema, digital culture and art” and is based in the north west of England. It’s been running since 2009 and commissions or stages a large number of exhibitions and events relating to digital/cinema/arts.
Anyway, the event I went to was a hackathon, where people turn up and create stuff with data. I’d never been to one of these events before, so I was a bit apprehensive about what actually happens.
There was a social event on the Friday evening, and then the Saturday consisted of a collection of ‘lightning talks’ by various luminaries on the broad subject of film data, followed by a day of bashing away at the laptop, trying to make something.
Because of my borderline obsession with filming locations, I had an idea to do something with the IMDb’s film location data. I thought it might be cool to find out if the locations where a film is shot have any bearing on how well received the film is by the audience. Do people like films shot in the country or the city? Do people prefer films shot in the east or the west?
Anyway, I found a nice map of English counties in SVG format. SVG is a vector graphics format which stores the data as plain-text, which makes it easy to hack. So I spent a while massaging the source code into a usable format.
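For anyone wondering what hacking an SVG looks like, here is a minimal sketch in Python. It is not the code I wrote on the day. It assumes each county is a `<path>` element whose `id` is the county name, which depends entirely on the map you find; the tiny `SAMPLE_SVG` is made up for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Made-up two-county map. Assumption: one <path> per county, id = county name.
SAMPLE_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path id="Merseyside" d="M0 0 L10 0 L10 10 Z"/>'
    '<path id="Cumbria" d="M20 0 L30 0 L30 10 Z"/>'
    '</svg>'
)


def colour_counties(svg_text, fills):
    """Return the SVG text with a fill colour set on each county listed in fills."""
    # Keep the default SVG namespace so the output has no "ns0:" prefixes.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    for path in root.iter(f"{{{SVG_NS}}}path"):
        county = path.get("id")
        if county in fills:
            path.set("fill", fills[county])
    return ET.tostring(root, encoding="unicode")


coloured = colour_counties(SAMPLE_SVG, {"Merseyside": "#ff0000"})
print(coloured)
```

If the map sets colours through a `style` attribute or a stylesheet rather than `fill`, that would need changing instead.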
With the IMDb data, I tried to ignore TV shows, TV movies, stuff shot for DVD extras, etc., and just use actual movie data. I also tried to ignore data about film studios, because I was only interested in locations. I probably wasn’t completely successful in this, because it was a bit of a hack due to time constraints.
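The filtering can be sketched roughly like this. It assumes the plain-text layout I remember from the IMDb locations file: a title, one or more tabs, then the location, with TV series titles in double quotes and `(TV)`, `(V)` or `(VG)` suffixes marking TV movies, video releases and video games. Treat that layout, the crude "studio" check and the made-up sample lines as assumptions rather than a description of the real file.

```python
import re

# Suffixes assumed to mark TV movies, straight-to-video titles and video games.
NON_CINEMA_SUFFIX = re.compile(r"\((TV|V|VG)\)")


def parse_line(line):
    """Split 'title<TAB(s)>location[<TAB>note]' into (title, location), or None."""
    parts = [p for p in line.rstrip("\n").split("\t") if p]
    if len(parts) < 2:
        return None
    return parts[0], parts[1]


def is_cinema_film(title):
    """Reject TV series (quoted titles) and anything with a TV/video suffix."""
    if title.startswith('"'):
        return False
    return NON_CINEMA_SUFFIX.search(title) is None


def is_studio(location):
    """Crude check: treat any location mentioning 'studio' as a film studio."""
    return "studio" in location.lower()


# Made-up sample lines, for illustration only.
sample_lines = [
    'Example Film (2001)\t\tAlbert Dock, Liverpool, Merseyside, England, UK',
    '"Example Series" (1999)\tManchester, Greater Manchester, England, UK',
    'Example Extra (2005) (V)\tLondon, England, UK',
    'Example Telefilm (2003) (TV)\tYork, North Yorkshire, England, UK',
    'Example Film (2001)\tPinewood Studios, Iver Heath, Buckinghamshire, England, UK',
]

kept = []
for line in sample_lines:
    parsed = parse_line(line)
    if parsed is None:
        continue
    title, location = parsed
    if is_cinema_film(title) and not is_studio(location):
        kept.append((title, location))

print(kept)
```

A substring check for "studio" will miss studios that don't have the word in their name, which is exactly the kind of hole I suspect my own version had.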
This may look like hardly anything is shot in the UK, but these are relative values and just go to show that movie-making in the UK is dominated by London.
Blue counties have the fewest locations, increasing through green into red with the most locations.
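One simple way to get a blue-green-red ramp like this is to sweep the hue from 240° (blue) down to 0° (red) as the count rises. This is a sketch of that idea, not necessarily how my map did it; `heat_colour` is a name I've made up for the example.

```python
import colorsys


def heat_colour(count, max_count):
    """Map a count in [0, max_count] to a hex colour from blue through green to red."""
    if max_count <= 0:
        return "#0000ff"
    # Fraction of the maximum, clamped to the range 0..1.
    t = min(max(count / max_count, 0.0), 1.0)
    # Hue 240 degrees is blue, 120 is green, 0 is red.
    hue = (1.0 - t) * (240.0 / 360.0)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


for count in (0, 50, 100):
    print(count, heat_colour(count, 100))
```

Because the scale is relative to the biggest value, one dominant county pushes everything else towards blue. Taking the logarithm of the counts first would be one way to spread the colours out more.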
This is pretty much what you’d expect, although I was surprised that Merseyside wasn’t more prominent, as I’d read that Liverpool was the most filmed city outside London. The discrepancy is probably because the data wasn’t prepared particularly well.
So I was going to continue on and produce a map of counties by average user movie rating, but I ran into data scaling problems. The original maps were generated live, and took about five seconds to render. To process any more data would have taken too long (and I would have had to process a lot more data!). I decided to leave this idea and do something else instead, which I’ll probably reveal at a later date.
It would have been nice to include Wales, Scotland, and Northern Ireland. The only reason I didn’t was that the most usable map I could find didn’t include those places. I’ve since found one which does.
Anyway, I won the award for “Best use of data”. The other projects included a randomly generated infinitely long action movie trailer, and a video player which detects when you yawn and increases the volume/skips the video.
At some point I’d like to revisit this project, using other IMDb metrics to sanitise the data properly (e.g. removing movies shorter than 60 minutes, or with fewer than 1000 votes, etc.). Plus, I’d like to actually go on to discover how locations affect audience appreciation, or box-office returns, etc.
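As a rough sketch of what that clean-up might look like, the snippet below drops short or rarely rated films and then averages the user rating per county. The `Film` record, its field names and the sample films are all made up for illustration; joining the real IMDb runtime, vote and location data together is the hard part and isn't shown.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Film:
    # Hypothetical record; the real data would have to be joined from several sources.
    title: str
    runtime_minutes: int
    votes: int
    rating: float
    counties: tuple


def is_reliable(film, min_runtime=60, min_votes=1000):
    """Keep feature-length films with enough votes for the rating to mean something."""
    return film.runtime_minutes >= min_runtime and film.votes >= min_votes


def mean_rating_by_county(films):
    """Average the rating of the reliable films shot in each county."""
    totals = defaultdict(lambda: [0.0, 0])  # county -> [sum of ratings, film count]
    for film in films:
        if not is_reliable(film):
            continue
        # set() so a film with several locations in one county counts once.
        for county in set(film.counties):
            totals[county][0] += film.rating
            totals[county][1] += 1
    return {county: total / n for county, (total, n) in totals.items()}


# Made-up films, for illustration only.
films = [
    Film("Example A", 95, 5000, 7.0, ("Merseyside", "Cumbria")),
    Film("Example B", 110, 2000, 5.0, ("Merseyside",)),
    Film("Example Short", 45, 9000, 9.0, ("Cumbria",)),
    Film("Example Obscure", 100, 200, 2.0, ("Merseyside",)),
]

averages = mean_rating_by_county(films)
print(averages)
```

Working these averages out once in advance, rather than every time the map is drawn, might also get around the rendering time problem I hit.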