Do impactful changes in our society affect our musical expression?

A statistical perspective on the effect of (psycho)socially impactful eras on our musical expression.

Sem Janssen
4 min read · Aug 14, 2020

The art form known today as ‘music’ is often described as cultural expression through the medium of sound. Music has played an important part in the cultural evolution of our society. Changes in musical expression fueled by major societal shifts can be traced back as far as ancient Greece (12th–9th centuries BC) (Savage, 2019, p. 7). More recent examples of music as an expression of our society might sound more familiar:

  • Franz Schubert (1797–1828) -> Romantic era 1815–1910
  • The Animals (House of The Rising Sun) -> Vietnam War 1955–1975
  • Sex Pistols -> Rise of UK anti-establishment 1975
  • Queen -> Live Aid 1985

As our technological capabilities improved over the years, music consumption changed dramatically with the rise of streaming services like Spotify and Apple Music. While streaming turned the business model of artists and stakeholders upside down, it also paved the way towards new insights into music consumption behavior. Business intelligence and visualizations based on streaming data have become the new normal. Although visualizations have become a guideline in music consumption research, assumptions derived from visualization tools like Tableau and Looker cannot be classified as academically valid without statistical methods. In order to draw valid conclusions about the effect that (psycho)socially impactful eras might have on our musical expression, I’ve statistically analyzed public Spotify data.

Method: with the publicly available Spotify API I extracted a data-set of 2000 releases per year, including a measure of the spoken words in these songs (Spotify’s Speechiness feature). The time span of the data-set reaches from 1920 to 2020, making this a data-set of roughly 200K releases. After cleaning up the data, 169K songs (169,335 to be precise) remained for analysis (hit me up if you would like this data-set for statistical purposes). I looked at the variable Speechiness as a parameter for how much of a story a song wants to tell us. In order to draw valid conclusions about the data I used Pearson’s Correlation Coefficient (PCC), which quantifies how closely related two variables are to one another. For the “non-geeks”: PCC is a statistic that measures the linear correlation between two variables. It has a value between +1 and −1. A value of +1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation. In this case X = year and Y = Speechiness.
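For the geeks: the coefficient described above can be computed by hand in a few lines. This is only a minimal sketch of the textbook Pearson formula, not the SPSS procedure used for this article, and the toy values below are made up for illustration (they are not rows from the Spotify data-set).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data (assumption): X = year, Y = Speechiness.
years = [1920, 1940, 1960, 1980, 2000, 2020]
speechiness = [0.30, 0.22, 0.05, 0.06, 0.09, 0.12]

r = pearson_r(years, speechiness)
print("r =", round(r, 3))
```

A result close to +1 or −1 means the points lie near a straight line; a result near 0 means there is no linear relation, even though a non-linear one could still exist.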

Analysis: the data shows the following trend over time:

Year vs Speechiness scatter dot plot w/ trendline.

In order to get an idea of where in time to look for significant influences of era on speechiness, I’ve conducted a Bivariate Correlation analysis in SPSS (hit me up for the full print) with the different eras dummy coded in the data-set. Results show an interesting significant positive correlation in the 20’s (r(169333) = .151, p < .001) and 30’s (r(169333) = .116, p < .001), followed by a significant negative correlation in the 60’s (r(169333) = -.86, p < .001).

Twenties vs Speechiness scatter dot plot w/ trendline.
Thirties vs Speechiness scatter dot plot w/ trendline.
Sixties vs Speechiness scatter dot plot w/ trendline.
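To make the dummy coding concrete: each era becomes a 0/1 column (1 if the song falls in that decade, 0 otherwise), and that column is correlated with Speechiness. The sketch below illustrates the idea in plain Python under stated assumptions; it is not the SPSS output, the helper names are hypothetical, and the toy rows are invented. It also shows where the 169333 in r(169333) comes from: the degrees of freedom are n − 2, with n = 169,335 songs.

```python
import math

def dummy_code(years, start, end):
    """Return 1 for every year inside [start, end] and 0 otherwise."""
    return [1 if start <= year <= end else 0 for year in years]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical toy rows (assumption), not the real data-set.
years = [1921, 1925, 1929, 1934, 1948, 1963, 1967, 1985, 2004, 2019]
speechiness = [0.35, 0.28, 0.40, 0.25, 0.08, 0.04, 0.05, 0.06, 0.10, 0.13]

twenties = dummy_code(years, 1920, 1929)
r_twenties = pearson_r(twenties, speechiness)
print("toy r for the 20's dummy:", round(r_twenties, 3))

# Values reported in the article: n = 169,335 songs and r = .151 for the 20's.
n_songs = 169_335
df = n_songs - 2
t_twenties = t_statistic(0.151, n_songs)
print("df:", df, "t:", round(t_twenties, 1))
```

With a sample this large, even a small correlation produces a very large t value, which is why all three eras come out at p < .001. The p-value itself needs the t distribution and is left to a statistics package.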

1920’s era: in 1926 and 1929 a significant influence of year on the speechiness of music can be seen. When we dive into history, this period marks the peak of the worldwide unemployment rate, which resulted in major demonstrations around the world for better employment conditions. 1929 also marked the start of the Great Depression in the United States.

1930’s era: the effects of the Great Depression are felt all around the world, causing staggering poverty numbers, with public dissatisfaction peaking in 1935.

1960’s era: while impactful events like the Vietnam War are ongoing, demonstrations and social dissatisfaction are only seen later on, in the 70’s. Civil rights protests are ongoing in the US in 1963, but they don’t have a worldwide impact.

Conclusion: looking at our Bivariate Correlation analysis, we can state that there is a relationship between social era and the amount of words used in music for the 20’s, 30’s and 60’s. Although more factors contribute to the effect of social disruption on our musical expression, the analysis shows a statistically significant trend.

Future Research: for further research one could look at methods like regression, factor, and cluster analysis to draw more (valid) conclusions about the worldwide streaming data that is available for free. One could also assess the influence of genres using the same method described above.
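As a starting point for the regression idea, a simple least-squares line of Speechiness on year could look like the sketch below. This is an illustration only, under stated assumptions: the function name is hypothetical, the toy values are invented, and a real follow-up study would use a statistics package and check the model assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x must vary to fit a line")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical toy data (assumption): X = year, Y = Speechiness.
years = [1920, 1940, 1960, 1980, 2000, 2020]
speechiness = [0.30, 0.22, 0.05, 0.06, 0.09, 0.12]

slope, intercept = fit_line(years, speechiness)
print("change in Speechiness per year:", slope)
print("predicted Speechiness for 1950:", intercept + slope * 1950)
```

Where the correlation coefficient only says how tightly the points follow a line, the slope says how much Speechiness changes per year, which makes eras easier to compare.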

Limitations: although I’ve done my best to make this data-set as reliable as possible, through randomization among other measures, the limit of 2000 songs in the free Spotify API could influence the reliability.

Let this research be an inspiring story of the possibilities that statistical analysis and public music data offer for research into music consumption behavior.

Sem Janssen

25 | Passionate about music & analytics | Research in music & entertainment consumption