The Hit Equation

So here it is! This page is a high-level description of our Hit Equation, and some of the fun we can have with it. If you're interested in a more technical description, see The Science Behind ScoreAHit. Here, we aim to provide some insight into these questions:

  • What are the features of a perfect hit song?
  • Is there an equation describing a hit?
  • Are there any features whose importance has changed through the ages?



Above you can see some of the features we use to classify songs into 'hits' or 'flops'. You can see that we use the energy, tempo, danceability, loudness and other higher-level features such as harmonic simplicity (how simple the chord sequence is) and non-harmonicity (how 'noisy' the song is).

Note that we do not take into account factors external to the audio, such as the marketing budget used to promote the song, the music video, the prior popularity of the artist or band, social factors, lyrics, etc. Of course these factors are extremely important, so not including them will inevitably limit the accuracy of our equation. However, our goal was to find out to what extent the audio itself is responsible for the hit potential of a song, so explicitly excluding these factors from the analysis was a conscious choice.

So, can we come up with a simple equation using these audio features which perfectly describes all hit songs? Well, not quite. It turns out that the complexity of music and the human mind don't let themselves be squeezed into an equation that easily!

Hit Potential Equation

Nevertheless, we can come up with a Hit Potential Equation that scores a song according to its audio features. How does this equation work? Well, we first looked at all the UK hits over a certain period (more accurately: a computer algorithm we designed did this for us!), and measured their audio features (loudness, tempo, etc.). From this we got a list of weights, telling us how important each of the 23 features is.

Once we have these weights (let's call them w = w1, w2, ... w23), we can multiply each of them by the corresponding feature of a new song and add up the results to work out its score. So, if the new song has features f = f1, f2, ... f23, then the score is:

    score = w1 x f1 + w2 x f2 + ... + w23 x f23
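As a rough sketch, the score is a weighted sum: each weight is multiplied by the matching feature and the products are added together. The Python function below shows the idea. The name hit_score and the three-element example values are made up for illustration; the real equation uses 23 features, and this sketch assumes there is no extra constant term.

```python
def hit_score(weights, features):
    """Weighted sum: w1*f1 + w2*f2 + ... + wn*fn."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Made-up example with 3 features instead of 23 (not real weights).
example_weights = [2, -1, 3]
example_features = [1, 4, 2]
example_score = hit_score(example_weights, example_features)  # 2 - 4 + 6
```

A positive weight pushes the score up when its feature is large, and a negative weight pulls it down.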



Note that we did not try to find these weights by hand: that would be far too hard and time-consuming for a human being! Instead, we make use of techniques from Artificial Intelligence, and more specifically from a branch of AI called Machine Learning. This means that the equation is constructed entirely automatically by a computer, without any intervention by a music expert. See The Science Behind ScoreAHit if you are interested in the details.
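To give a flavour of how a computer can find weights from examples, here is a minimal sketch. This page does not say which learning algorithm we used, so the sketch uses a simple perceptron purely as a stand-in, not as a description of our actual method. The function name, the learning rate and the toy data are all assumptions made for illustration.

```python
def train_perceptron(songs, labels, epochs=20, learning_rate=0.1):
    """Learn one weight per feature from labelled examples.

    songs  -- list of feature lists (all the same length)
    labels -- list of +1 (hit) or -1 (non-hit)
    """
    weights = [0.0] * len(songs[0])
    for _ in range(epochs):
        for features, label in zip(songs, labels):
            score = sum(w * f for w, f in zip(weights, features))
            predicted = 1 if score > 0 else -1
            if predicted != label:
                # Wrong guess: nudge the weights towards the correct answer.
                weights = [w + learning_rate * label * f
                           for w, f in zip(weights, features)]
    return weights


# Toy data: two made-up features per song (not real measurements).
toy_songs = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 1.0]]
toy_labels = [1, 1, -1, -1]
toy_weights = train_perceptron(toy_songs, toy_labels)
```

On this toy data the first feature ends up with a positive weight and the second with a negative one, because the made-up hits are high in the first feature and the made-up non-hits are high in the second.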

We can then classify a song as a 'hit' or 'not hit' based on its score. How good is this equation? It turns out we can distinguish with an accuracy of 60% between songs that make it to the top 5 and those that don't reach above position 30 on the UK Top 40 Singles Chart. (See also the Results page.)
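The classification step and the accuracy figure can be sketched as follows. The cut-off of 0 is an assumption (this page does not state the threshold), and the scores and labels are made up, so the resulting number is only a demonstration of how accuracy is counted, not a reproduction of our results.

```python
def classify(score, threshold=0.0):
    """Label a song from its score; the threshold is an assumed cut-off."""
    return "hit" if score > threshold else "not hit"


def accuracy(scores, true_labels, threshold=0.0):
    """Fraction of songs whose predicted label matches the true label."""
    correct = sum(1 for score, truth in zip(scores, true_labels)
                  if classify(score, threshold) == truth)
    return correct / len(scores)


# Made-up scores and chart outcomes for five songs.
made_up_scores = [1.2, -0.4, 0.3, -0.9, 0.5]
made_up_labels = ["hit", "not hit", "not hit", "not hit", "hit"]
made_up_accuracy = accuracy(made_up_scores, made_up_labels)  # 4 of 5 correct
```

An accuracy of 50% would be no better than tossing a coin when both groups are the same size, which is the baseline to keep in mind when reading the 60% figure.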

Time Machine

Musical tastes evolve, which means our hit potential equation needs to evolve as well. Indeed, we have found the hit potential of a song depends on the era, biased in different ways towards various audio features, such as tempo, danceability and loudness. This may be due to the varying dominant music style, culture and environment - do you listen to music on an mp3 player, on your phone, or in a club?



That is why we worked out the hit potential equation as a function of time. This shows us what is important in each given era, and the evolving priorities of music listeners. Check out this video from our YouTube channel, showing the relative importance of each feature through time. You can really see the trends coming and going!
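One simple way to picture an equation that depends on time is to keep a separate set of weights for each era and pick the set that matches the year of interest. The sketch below does exactly that. Splitting time into fixed eras is an assumption made for illustration (this page does not describe how the time dependence was actually computed), and the weights are made up.

```python
def weights_for_year(weights_by_era, year):
    """Return the weights of the era (start, end) that contains `year`."""
    for (start, end), weights in weights_by_era.items():
        if start <= year <= end:
            return weights
    raise KeyError(f"no weights available for {year}")


def hit_score_in_year(weights_by_era, features, year):
    """Score a song with the weights that apply in the given year."""
    weights = weights_for_year(weights_by_era, year)
    return sum(w * f for w, f in zip(weights, features))


# Made-up weights for two features in two eras (not real values).
made_up_weights_by_era = {
    (1960, 1979): [1, 2],
    (1980, 1999): [3, 1],
}
same_song = [2, 1]
score_1970 = hit_score_in_year(made_up_weights_by_era, same_song, 1970)  # 2 + 2
score_1985 = hit_score_in_year(made_up_weights_by_era, same_song, 1985)  # 6 + 1
```

The same song gets a different score in each era, which is the point: what counts towards a hit changes over time.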




How do we read this video? The further towards the left a bar is, the more songs with this property appear in the top 5 at a given time. If a bar is towards the right, it means that this feature is detrimental to chart success. Bars in the middle mean this musical aspect has little influence on peak chart position. Interested in more evidence? See some of our expected hits for songs we correctly predicted to be hits using this equation.


Interesting Discussions

Looking at the UK chart from the last 50 years, we noticed some interesting patterns. Plotting feature values through time, we can look at hits and non-hits and see how they differ. In the plots below, hits are shown in blue, non-hits in red. Here are some of the things we noticed:


Around 1980 Seems to Have Been a Creative Period for Pop Music

The prediction accuracy of our hit potential equation varies over time. Around 1980 it was particularly difficult to predict hits. In the first half of the nineties and from the year 2000 the equation performed best. This suggests that the late seventies and early eighties were particularly creative and innovative periods of pop music. (See here for a graph illustrating the varying prediction accuracy.)




In Early Decades, Harmonically Simple Songs Were Hits, Then...

In early decades, hits tended to be harmonically simpler than non-hits. However, nowadays the opposite is true, shown by the red line dominating towards the right of the graph.



This increase in relative complexity could perhaps be explained by the rise of new genres and subgenres, increasing the diversity of popular music through the ages. Interestingly, it seems that the trend is starting to reverse again, shown by the crossing of the red and blue lines just at the end of 2010.




Dance Songs Were More Popular in the 1980s

The popularity of disco waned as the 1970s ended and the decade of 1980s dance began. Early electronic hits from Soft Cell (Tainted Love, 1981) and Yazoo (Don't Go, 1982) paved the way for mainstream artists such as David Bowie (Let's Dance, 1983), Cyndi Lauper (Girls Just Want to Have Fun, 1984), Madonna (Like a Virgin, 1985) and Michael Jackson (The Way You Make Me Feel, 1987) to score massively danceable single successes.



This trend can be seen in the rise of danceability during the 1980s in the graph to the left. The trend actually begins in the late 1970s, and the gap in danceability between hits and non-hits increases in general until the mid 1990s. The late 90s saw continued high danceability for hits, with non-hit danceability plummeting.




Music of all Qualities is Getting Louder

It's just as your parents have always been telling you: music really is getting louder, but not only this - the increasing width of the bars throughout the time machine shows that the loudness of a song is becoming more useful at distinguishing hits from flops. Interestingly, it seems from the dip towards the right-most extremity that the loudness has peaked and is beginning to decline. Will this trend continue, and will artists use this trend to their advantage? Only time will tell...