I have been listening: Analysing my Last.fm stats

Last.fm is a great website for anyone who listens to music (which I guess is everyone) and loves data. I have been using the site to track what I listen to since May 2020. So, after using the site for almost a year, I decided to decode how I have been listening. This article focuses mostly on the question of ‘how’ rather than ‘what’, because the website already tells you pretty much what any user listens to. If you are interested in extracting insights about your own listening habits, the link to my R script file is at the bottom. The only thing you have to do is download your data from this website. The timeframe considered for this study is June 2020 to May 2021.

Distribution of songs per day:

As can be seen, the distribution of the number of songs listened to per day is right-skewed. Right-skewed data has most of its values bunched at the lower end, with a long tail stretching towards the high values. In this case it translates into something like this: there have been many days when I listened to very few songs. ‘Very few’ here means a number well below the daily mean and median.
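One way to check the skew with numbers rather than by eye is to compare the mean and the median of the daily counts: in right-skewed data, the few heavy days usually pull the mean above the median. The snippet below is a small Python sketch of that check, not the R script linked at the bottom. The function name `songs_per_day` and the timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, median


def songs_per_day(timestamps):
    """Count scrobbles per calendar day from a list of datetime objects."""
    return Counter(ts.date() for ts in timestamps)


# Made-up timestamps, for illustration only (not real listening data).
sample = (
    [datetime(2020, 6, 1, 9, m) for m in range(2)]     # 2 songs
    + [datetime(2020, 6, 2, 9, m) for m in range(3)]   # 3 songs
    + [datetime(2020, 6, 3, 9, m) for m in range(4)]   # 4 songs
    + [datetime(2020, 6, 4, 9, m) for m in range(31)]  # 31 songs: one heavy day
)

daily = songs_per_day(sample)
counts = list(daily.values())

# A mean well above the median is a sign of right skew.
print("mean:", mean(counts), "median:", median(counts))
```

For this toy sample the mean is 10 and the median is 3.5. Note that days with no scrobbles at all never appear in the counter, so they would have to be added as zeros if they should count towards the distribution.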

Distribution of songs per month:

This graph is also right-skewed, but it has an added feature: it is multimodal. There is one peak that is larger than the other two, which are similar to each other. This suggests that the number of scrobbles per month falls into two or three groups: one where the number is high, one medium, and one low. These distinctions are, of course, relative to the whole dataset. It is easier to see if we look at the number of songs per month over the year:

As is evident, the number has varied hugely: from 2000 in June to fewer than 750 in February.

Listening habits based on the time of the day:

Next up, I have gone a bit deeper and tried to work out whether I have listened to more songs during the day or at night. For that, I have classified day and night in the following way:

  1. If the time is after 5 am and before 5 pm then it is day
  2. If it is after 5 pm and before 5 am then it is night
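The rule above can be sketched in a few lines of Python. This is an illustration, not the R script linked at the bottom. The names `day_or_night` and `day_share_by_month` are hypothetical. Two assumptions are baked in: exactly 5:00 am is treated as day and exactly 5:00 pm as night (the rule does not say which side the boundaries fall on), and the timestamps are already in local time (if an export is in UTC, it would need converting first).

```python
from collections import defaultdict
from datetime import datetime

DAY_START_HOUR = 5   # 5 am
DAY_END_HOUR = 17    # 5 pm


def day_or_night(ts):
    """Label a timestamp 'Day' or 'Night' using the 5 am / 5 pm rule.

    Assumption: 5:00 am itself is day and 5:00 pm itself is night.
    """
    return "Day" if DAY_START_HOUR <= ts.hour < DAY_END_HOUR else "Night"


def day_share_by_month(timestamps):
    """Return {(year, month): fraction of scrobbles labelled 'Day'}."""
    totals = defaultdict(lambda: {"Day": 0, "Night": 0})
    for ts in timestamps:
        totals[(ts.year, ts.month)][day_or_night(ts)] += 1
    return {
        month: c["Day"] / (c["Day"] + c["Night"])
        for month, c in totals.items()
    }


# Made-up timestamps, for illustration only.
sample = [
    datetime(2020, 6, 1, 4, 59),   # Night
    datetime(2020, 6, 1, 5, 0),    # Day (boundary assumption)
    datetime(2020, 6, 1, 16, 59),  # Day
    datetime(2020, 6, 1, 17, 0),   # Night (boundary assumption)
    datetime(2020, 7, 1, 23, 30),  # Night
]

print(day_share_by_month(sample))
```

Keeping the two boundary hours as named constants makes it easy to try a different split, say 6 am to 6 pm, and see how much the monthly proportions move.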

Based on this classification, this is how my listening splits between day and night for each month:

Nothing is very obvious from this plot, but a few points are worth noting:

  1. For most of the months, the proportion of songs scrobbled during the day is higher.
  2. The proportion of songs at night is higher for June and July. It makes sense, because during the lockdown in 2020 I used to stay up late!
  3. For February and March, the proportion at night is lower than in other months. During this time I started going to university, then had Covid, and then got the job. So that is expected too! I try to follow a routine, you see.

From the box plot for day and night, it can be seen that the median number of songs listened to is higher for the day than for the night. There are also more outliers for the night than for the day. This suggests there have been many nights when I listened to far more songs than I usually do at night. Don’t we all have those nights when we immerse ourselves in the melodies of our favourite artists and lose track of time?
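For anyone wondering what counts as an outlier here: assuming the box plot uses the common whisker rule (points more than 1.5 times the interquartile range above the third quartile are drawn as outliers), the check can be sketched as below. This is a Python illustration with made-up counts, not the R script linked below. The helper name `upper_outliers` is hypothetical, and the quartile method is an assumption that may differ slightly from what a given plotting library uses.

```python
from statistics import quantiles


def upper_outliers(counts):
    """Return the values above Q3 + 1.5 * IQR (the usual box-plot rule).

    Assumption: quartiles use linear interpolation over the data
    (the 'inclusive' method); other tools may use a slightly different one.
    """
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)
    return [c for c in counts if c > upper_fence]


# Made-up number of songs per night, for illustration only.
night_counts = [3, 4, 5, 5, 6, 7, 8, 40]

print(upper_outliers(night_counts))
```

In this toy sample only the night with 40 songs sits above the fence, which is the kind of night the box plot marks as an outlier.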

GitHub link for the file: https://github.com/shibaprasadb/LastFM