How to download an artist’s lyrics from Genius.com using Python

Lyricsgenius is a Python package written by John W. that wraps the Genius.com API and makes it easy to download song lyrics.

In summary, there are three main request types:

  • Search requests let you search Genius.com for any given string, just like you would in the search box on the website.
  • Song requests let you directly access a song in the Genius database by providing its API ID.
  • Artist requests work the same way for artists. In each case the API returns a JSON object containing search results, song names, lyrics and so on.

Before we start, you will need Genius API credentials, which you can get by registering here.
Next, install the lyricsgenius module (pip install lyricsgenius).
Once all that is done you’re good to go.

#Assign your Genius.com credentials and select your artist
import lyricsgenius as genius
geniusCreds = "{Credentials}"
artist_name = "{Your Chosen Artist}"

Quickly test that your credentials work and that you’ve found the correct artist by viewing the first 5 songs.

#Connect your credentials and chosen artist to the genius object then test the first 5 songs
api = genius.Genius(geniusCreds)
artist = api.search_artist(artist_name, max_songs=5)

Then re-run the same api.search_artist call without the max_songs limit. This will take a while, depending on how many songs your artist has.

artist = api.search_artist(artist_name)

Once the search is complete, check your current directory where your search results will eventually be stored.

import os
os.getcwd()

This single line of code will store all your artist’s lyrics and Genius song info in a JSON file in your current directory.

artist.save_lyrics()

Check that the JSON file exists; it will probably be named Lyrics_{ArtistName}.json. Next you’ll use pandas to read it.

import pandas as pd
Artist = pd.read_json("Lyrics_{ArtistName}.json")

First, we will check that the file is structured the way we expect by looking at some of the data individually.

Artist['songs']
Artist['songs'][5]['lyrics']
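If you’d like to see what shape the parsed file should have before running the search yourself, here is a minimal mock of one entry in the ‘songs’ column. The field names match the ones used later in this tutorial, but every value here is a placeholder, not real Genius data:

```python
import pandas as pd

# A hypothetical, trimmed-down version of one entry in the 'songs' column.
# Field names match those used later in this tutorial; values are placeholders.
mock_song = {
    "title": "Example Song",
    "artist": "Example Artist",
    "year": "2020-01-01",
    "lyrics": "La la la...",
    "raw": {
        "url": "https://genius.com/example",
        "id": 12345,
        "annotation_count": 3,
        "description": "A placeholder description.",
    },
}

# pd.read_json returns a DataFrame, so Artist['songs'] is a Series of dicts
Artist = pd.DataFrame({"songs": [mock_song]})
print(Artist['songs'][0]['lyrics'])  # indexing works just like the real file
```

The same chained indexing shown above (column, then row, then dict key) is what the rest of the tutorial relies on.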

This function extracts the relevant data points from every song in our JSON file.

#Create an empty dictionary to store your songs and related data
artist_dict = {}

def collectSongData(adic):
    dps = list()
    title = adic['title'] #song title
    url = adic['raw']['url'] #Genius url
    artist = adic['artist'] #artist name(s)
    song_id = adic['raw']['id'] #Genius song id
    lyrics = adic['lyrics'] #song lyrics
    year = adic['year'] #release date
    upload_date = adic['raw']['description_annotation']['annotatable']['client_timestamps']['lyrics_updated_at'] #lyrics upload date
    annotations = adic['raw']['annotation_count'] #total no. of annotations
    descr = adic['raw']['description'] #song description
    dps.append((title, url, artist, song_id, lyrics, year, upload_date, annotations, descr)) #append all to one tuple
    artist_dict[title] = dps #assign list to dictionary entry named after song title

collectSongData(Artist['songs'][5]) #check function works
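If you want to sanity-check the function without downloading anything, you can feed it a mock song dict. This self-contained sketch repeats the function with placeholder data (the song title, URL and timestamp are all made up) so you can verify the tuple layout offline:

```python
# Self-contained check of collectSongData using a placeholder song dict,
# so you can verify the tuple layout without hitting the API.
artist_dict = {}

def collectSongData(adic):
    dps = list()
    title = adic['title']
    url = adic['raw']['url']
    artist = adic['artist']
    song_id = adic['raw']['id']
    lyrics = adic['lyrics']
    year = adic['year']
    upload_date = adic['raw']['description_annotation']['annotatable']['client_timestamps']['lyrics_updated_at']
    annotations = adic['raw']['annotation_count']
    descr = adic['raw']['description']
    dps.append((title, url, artist, song_id, lyrics, year, upload_date, annotations, descr))
    artist_dict[title] = dps

mock_song = {
    'title': 'Example Song',
    'artist': 'Example Artist',
    'year': '2020-01-01',
    'lyrics': 'La la la',
    'raw': {
        'url': 'https://genius.com/example',
        'id': 12345,
        'annotation_count': 3,
        'description': 'Placeholder description',
        'description_annotation': {
            'annotatable': {'client_timestamps': {'lyrics_updated_at': 1600000000}}
        },
    },
}

collectSongData(mock_song)
print(artist_dict['Example Song'][0][4])  # index 4 is the lyrics field
```

Because each dictionary entry is a one-element list holding a tuple, `[0][4]` means “first tuple, fifth field”, which is the lyrics.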

Pick a song and check that your dictionary has it. The line below should return your chosen song’s lyrics.

artist_dict['{chosen song}'][0][4]

Now store your dictionary in CSV format so you can perform further analysis, whether that’s in Excel or in pandas.

import csv

def updateCSV_file():
    upload_count = 0 #set upload counter
    location = "{suitable_file_location}" #pick file location
    print("Input a filename for the song file, including .csv")
    filename = input() #give your file a name
    path = location + filename
    with open(path, 'w', newline='', encoding='utf-8') as file: #open a new csv file
        a = csv.writer(file, delimiter=',') #split by comma
        #(title,url,artist,song_id,lyrics,year,upload_date,annotations,descr)
        headers = ["Title", "URL", "Artist", "Song ID", "Lyrics", "Year", "Upload Date", "Annotations", "Description"] #create header row
        a.writerow(headers) #add header row
        for song in artist_dict:
            a.writerow(artist_dict[song][0])
            upload_count += 1
    print(str(upload_count) + " songs have been uploaded")

updateCSV_file()

Now that your songs are in a CSV file, you can perform further analysis however you like.
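As a taste of that further analysis, here is a sketch that loads the export back into pandas and adds a naive word count per song. It uses a two-row stand-in CSV (with the same header row as the export above) rather than your real file, so the song titles and lyrics are placeholders:

```python
import csv
import pandas as pd

# Build a tiny stand-in CSV with the same header row as the export above.
headers = ["Title", "URL", "Artist", "Song ID", "Lyrics", "Year",
           "Upload Date", "Annotations", "Description"]
rows = [
    ("Song A", "https://genius.com/a", "Artist", 1, "la la la", "2020", 0, 2, ""),
    ("Song B", "https://genius.com/b", "Artist", 2, "oh oh", "2021", 0, 5, ""),
]

with open("sample_songs.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)

# Load the CSV and add a naive whitespace-based word count per song.
songs = pd.read_csv("sample_songs.csv")
songs["Word Count"] = songs["Lyrics"].str.split().str.len()
print(songs[["Title", "Word Count"]])
```

Swap in your own filename and the same two lines of pandas will work on the real export.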

I’m a complete amateur at Python so any feedback would be welcome.

Also, if you’re interested in learning Python and executing all the cool automated projects swirling in your head, then DataCamp’s platform is perfect for you. I think the best part isn’t the 355+ courses covering Python, SQL and Tableau, but the mobile app to practice on, a live community to hold you accountable, and skill assessments to sharpen your skills.

Thanks
