An obvious follow-on from my last post is to see what my top 25 albums of the year are. In the past I’ve mentally travelled back over the year’s releases to cook up a list. But this year I thought it would be fun to use the LastFM API to look at my listening history for 2020 and let the data do the talking, as it were.

The first problem is that while LastFM is a good source of my listening history, its metadata for albums seems quite sparse. The LastFM album.getInfo API call doesn’t seem to return the year the album was published. The LastFM docs indicate that a releasedate property is available, but I couldn’t find it in either the XML or the JSON responses. Maybe it was there once and now is gone? Maybe there’s some trick with the API that I was overlooking? Who knows.
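For anyone who wants to repeat the check, here is a minimal sketch of what looking for that property could involve. It is not the code from my notebook: the function names are made up for illustration, the date string in the usage comment is a hypothetical value, and the sketch assumes the JSON response wraps the album in a top-level `album` object.

```python
import urllib.parse

# Root of the LastFM web service API.
LASTFM_API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def album_info_url(artist, album, api_key):
    """Build an album.getInfo request URL that asks for a JSON response."""
    params = {
        "method": "album.getinfo",
        "artist": artist,
        "album": album,
        "api_key": api_key,
        "format": "json",
    }
    return LASTFM_API_ROOT + "?" + urllib.parse.urlencode(params)


def release_date(payload):
    """Return the releasedate value from a decoded response, or None.

    Assumption: the property, if present, sits under the "album" key.
    """
    value = payload.get("album", {}).get("releasedate")
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


# Usage with hand-written (hypothetical) payloads rather than live responses:
# release_date({"album": {"name": "Some Album"}}) gives None
# release_date({"album": {"releasedate": "6 Mar 2020"}}) gives "6 Mar 2020"
```

Returning None for both a missing and an empty value keeps the calling code simple, since either way there is no usable date.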

So to get around this I used LastFM to get my listening history, and then the Discogs API’s search endpoint to fetch metadata for each album. LastFM includes MusicBrainz identifiers for tracks and for most artists and albums, so I could have used those to look up the album using the MusicBrainz API. But I wasn’t sure I would find good release dates there either, as their focus seems to be on recognizing tracks and linking them to albums and artists. Discogs is a superb human-curated database, like a Wikipedia for music aficionados. Their API returns a good amount of information for each album, for example:

So I created a small function that looks up an artist/album combination using the Discogs search API. I applied the function to the Pandas DataFrame of my listening history, which was grouped by artist and album. When I ran this across the 1,312 distinct albums I listened to in 2020, I ran into 86 albums that didn’t turn up at Discogs. I had listened to some of these quite often, and wanted to see if they were from 2020. I figured that these were probably obscure things I had picked up on Bandcamp. Knowing the provenance of data is important.
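A lookup function along those lines could be sketched roughly as follows. This is an illustration rather than the notebook's code: the function names and the User-Agent string are hypothetical, and picking the earliest year among the search results is my assumption about how to approximate the original release when reissues are also listed.

```python
import json
import urllib.parse
import urllib.request

DISCOGS_SEARCH_URL = "https://api.discogs.com/database/search"


def build_search_url(artist, album, token):
    """Build a Discogs database search URL for one artist/album pair."""
    params = {
        "artist": artist,
        "release_title": album,
        "type": "release",
        "token": token,  # personal access token
    }
    return DISCOGS_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def earliest_year(results):
    """Return the smallest numeric year among search results, or None."""
    years = []
    for result in results:
        year = str(result.get("year", "")).strip()
        if year.isdigit():
            years.append(int(year))
    return min(years) if years else None


def lookup_year(artist, album, token):
    """Query Discogs and return a release year, or None if nothing matched."""
    request = urllib.request.Request(
        build_search_url(artist, album, token),
        headers={"User-Agent": "top-albums-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return earliest_year(payload.get("results", []))
```

With the history already grouped by artist and album, something like `lookup_year` could be applied row by row with `DataFrame.apply(..., axis=1)`. Discogs rate-limits requests, so a short pause between calls is sensible when working through more than a thousand albums.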

Bandcamp is another wonderful site for music lovers. It has an API too, but you have to write to them to request a key because it’s mostly designed for publishers that need to integrate their music catalogs with Bandcamp. I figured this little experiment wouldn’t qualify, so I wrote a quick little scraping function that does a search, finds a match, and extracts the release date from the album’s page on the Bandcamp website. This left just four things that I had listened to only a handful of times, and which have since disappeared from Bandcamp (I think).
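The date-extraction step of such a scraper might look something like the sketch below. It rests on an assumption that should be checked against a real page: that the album page's HTML contains a phrase of the form "released March 6, 2020". The function name is hypothetical, and the search and page-fetching steps are left out.

```python
import re
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
# Map month names to numbers without relying on the system locale.
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# Assumed page text: "released <Month> <day>, <year>"
RELEASED_RE = re.compile(r"released\s+([A-Z][a-z]+)\s+(\d{1,2}),\s+(\d{4})")


def parse_release_date(html):
    """Return the release date found in album page HTML, or None."""
    match = RELEASED_RE.search(html)
    if match is None:
        return None
    month = MONTHS.get(match.group(1))
    if month is None:
        return None
    return date(int(match.group(3)), month, int(match.group(2)))


# Usage with a hand-written snippet instead of a downloaded page:
# parse_release_date("<div>released March 6, 2020</div>").year gives 2020
```

A regular expression is brittle compared with a proper HTML parser, but for pulling one date out of a page it keeps the scraper short, and a None result flags the albums that need a manual look.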

What I thought would be an easy little exercise with the LastFM API actually turned out to require talking to the Discogs API as well, and then scraping the Bandcamp website. So it goes with data analysis, I suppose. If you want to see the details, they are in this Jupyter notebook. And so, without further ado, here are my top 25 albums of 2020.