The story of Pandora and her box goes a little something like this. Zeus gets pretty pissed off at this guy Prometheus for throwing fire in a game of RoShamBo, and in retaliation creates this woman Pandora to punish all of mankind (OK, I’m speculating about the game of Rock-Paper-Scissors, but I’d get pretty upset if I got beaten by someone using their once-in-a-lifetime throw of fire). Pandora is given many seductive gifts by the gods, and one in particular, the gift of curiosity, leads her to open a box, releasing all the awfulness into the world (including the credit crisis). Realizing what she has just done, Pandora quickly slams the box shut, trapping only Hope inside … or maybe not. After using Pandora.com, it is easy to see why President Obama sees so much hope in the world.
Pandora is an online, streaming music player where users can “customize” their own radio stations. It’s absolutely brilliant, since the only thing you really have to do is enter a song or artist, and Pandora will automatically stream music that is similar to that song or artist. Now, instead of spending hours customizing the perfect 80s party playlist (or mix tape/CD for the romantics out there), we can just tell Pandora what song fits our mood at the time and get hours of music delivered to us. Best of all, if a song comes on and we’re not sure why it is there in the first place (Blame it on the Rain made it onto all my mix tapes for some reason), we can easily skip it, and Pandora will exclude songs like it.
So how does Pandora do this? Well, they’ve hired a team of music analysts who essentially measure each song on 100+ musical characteristics, an idea inspired by the Music Genome Project. These characteristics, or metrics, make up the “genes” of a song, and their measurements are used to construct a song vector, a mathematical attempt to capture the essence of a song. The similarity of two songs is figured out by measuring the differences between all the musical characteristics of the two songs. To do this well, Pandora uses a complex distance function, which is essentially asking “how far apart, or different, are these two songs?” The shorter the distance, the more similar the songs are, and the more likely it is that the song will be played next in your Pandora station.
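To make the song vector idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three made-up characteristics, the 0-to-1 scores, the song names, and the plain Euclidean distance standing in for Pandora's actual (and far more complex) distance function.

```python
import math

# Hypothetical song vectors: each position is one musical characteristic
# (say tempo, vocal intensity, distortion) scored from 0 to 1.
# Real vectors would have 100+ entries; these numbers are invented.
songs = {
    "song_a": [0.9, 0.2, 0.7],
    "song_b": [0.8, 0.3, 0.6],
    "song_c": [0.1, 0.9, 0.2],
}

def distance(u, v):
    """Plain Euclidean distance between two song vectors (an assumed stand-in)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_by_similarity(seed, catalog):
    """Return (name, distance) pairs for every other song, closest first."""
    seed_vec = catalog[seed]
    others = [
        (name, distance(seed_vec, vec))
        for name, vec in catalog.items()
        if name != seed
    ]
    return sorted(others, key=lambda pair: pair[1])

if __name__ == "__main__":
    for name, dist in rank_by_similarity("song_a", songs):
        print(f"{name}: {dist:.3f}")
```

Seeding the station with `song_a` puts `song_b` at the top of the list, since its scores differ only slightly on every characteristic, while `song_c` lands much farther away.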
This is a very powerful framework, but there is one important assumption that shouldn’t be overlooked, and it could be a major drawback to implementing this particular recommendation engine. That critical assumption is that we have identified every factor that captures the je ne sais quoi of a song, which for the non-French speakers means an intangible quality that makes something distinctive. Do you smell the conundrum brewing? How does one measure the intangible? Can you find all the right factors to accurately describe Kris Allen’s performance of Kanye West’s Heartless? Now, while it might be next to impossible to figure out everything that makes a song click, it is very important that you catch the most influential factors in your recommendation model. Failure to do this could get you voted off.
Pandora is doing a pretty damn good job recommending songs using this framework, and they understand that there are a lot of factors that make a song a unique piece of work. They have identified a lot of the measurable, tangible metrics and have used them to effectively relate songs to one another. The next big step in recommendation models would be to understand how each individual values a song and which aspects matter more on a case-by-case basis, and eventually to deliver a personalized Rathan and Rathan’s Infinite Playlist just for me.
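One way to picture that next step is a distance function that weights each characteristic by how much a particular listener cares about it. The sketch below is purely hypothetical: the two characteristics, the candidate songs, and the per-listener weights are all invented, and nothing here is claimed to be how Pandora works.

```python
import math

def weighted_distance(u, v, weights):
    """Euclidean distance where each characteristic counts in proportion to its weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))

def closest(seed, candidates, weights):
    """Name of the candidate song nearest to the seed under this listener's weights."""
    return min(
        candidates,
        key=lambda name: weighted_distance(seed, candidates[name], weights),
    )

# Invented two-characteristic vectors: [tempo, vocal style]
seed = [0.9, 0.2]
candidates = {
    "same_tempo_new_vocals": [0.9, 0.8],
    "same_vocals_new_tempo": [0.3, 0.2],
}

# Hypothetical listeners who care about different things
tempo_fan = [1.0, 0.1]   # tempo matters most
vocal_fan = [0.1, 1.0]   # vocal style matters most

if __name__ == "__main__":
    print(closest(seed, candidates, tempo_fan))
    print(closest(seed, candidates, vocal_fan))
```

The same seed song leads to a different next track for each listener: the tempo fan gets the song that keeps the tempo, and the vocal fan gets the one that keeps the vocals.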