GitHub - ysaatchi/genre-classifier: Artist genre classifier using text embeddings


🎻 Genre Classifier

Train a classifier that accepts the name of a musician and predicts the most likely genre of their music out of the set:

{ jazz, opera, country, electronic, metal, rap, classical, reggae }

You will train this classifier to operate on top of a pretrained text encoder that converts artist names into embedding vectors. In support of this task, you will assemble a small dataset of artist names for each genre using GPT.
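As a rough illustration of the dataset described above, the sketch below collects artist names per genre and writes them to JSON. The layout (a mapping from genre to a list of names), the helper names `build_dataset` and `save_dataset`, and the `fetch_artists` callable are all assumptions made for this example; the repository does not specify them. In a real run `fetch_artists` would wrap a GPT request, which is deliberately left out here.

```python
from __future__ import annotations

import json
from typing import Callable

# The eight target genres listed in this README.
GENRES = ["jazz", "opera", "country", "electronic",
          "metal", "rap", "classical", "reggae"]


def build_dataset(fetch_artists: Callable[[str, int], list[str]],
                  count: int) -> dict[str, list[str]]:
    """Collect up to `count` unique, non-empty artist names per genre.

    `fetch_artists(genre, count)` is a hypothetical callable; in practice it
    would query a language model and return the names it suggests.
    """
    dataset: dict[str, list[str]] = {}
    for genre in GENRES:
        seen: set[str] = set()
        names: list[str] = []
        for raw in fetch_artists(genre, count):
            name = raw.strip()
            key = name.casefold()  # case-insensitive duplicate check
            if name and key not in seen:
                seen.add(key)
                names.append(name)
        dataset[genre] = names[:count]
    return dataset


def save_dataset(dataset: dict[str, list[str]], path: str) -> None:
    """Write the genre -> names mapping as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(dataset, f, indent=2, ensure_ascii=False)
```

Model output often contains repeated or blank entries, so the sketch filters those before saving. Keeping the model call behind a plain callable also makes the cleaning logic easy to test without an API key.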

Steps

  1. Create a dataset of artists by genre and save to JSON. We suggest using the OpenAI GPT API and will provide you an access key.
python create_dataset.py --count 20 --output data.json
  2. Compute an embedding vector from each artist name and save to disk. We suggest using the text encoder from open_clip and saving as a pandas dataframe.
python compute_embeddings.py --input data.json --output embeddings.pkl
  3. Visualize the embeddings in a 2D projection space using umap. We provide the code for this.
python visualize_embeddings.py --input embeddings.pkl
  4. Train a simple classifier to predict the genre from the embedding vector of the artist's name. It is up to you to pick an architecture, train, and evaluate the model.
python train_classifier.py --input embeddings.pkl
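
The final step leaves the architecture open. One possible baseline, sketched below, is a nearest-centroid classifier using cosine similarity: average the embeddings of each genre, then assign a new artist to the genre whose average is most similar. This is an assumption chosen for illustration, not the method the repository prescribes, and the function names are hypothetical. The code works on plain lists of floats; loading vectors from the saved dataframe is omitted.

```python
from __future__ import annotations

import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fit_centroids(embeddings: list[list[float]],
                  labels: list[str]) -> dict[str, list[float]]:
    """Average the unit-normalized embeddings of each genre."""
    sums: dict[str, list[float]] = {}
    counts: dict[str, int] = {}
    for vec, label in zip(embeddings, labels):
        unit = normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    return {label: normalize([s / counts[label] for s in total])
            for label, total in sums.items()}


def predict(centroids: dict[str, list[float]], vec: list[float]) -> str:
    """Return the genre whose centroid has the highest cosine similarity."""
    unit = normalize(vec)
    return max(centroids,
               key=lambda label: sum(c * u for c, u in zip(centroids[label], unit)))


def accuracy(centroids: dict[str, list[float]],
             embeddings: list[list[float]], labels: list[str]) -> float:
    """Fraction of examples whose predicted genre matches the label."""
    correct = sum(predict(centroids, v) == y for v, y in zip(embeddings, labels))
    return correct / len(labels)


# Toy 2-D vectors stand in for real text-encoder embeddings.
train_x = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
train_y = ["jazz", "jazz", "metal", "metal"]
centroids = fit_centroids(train_x, train_y)
```

With only about 20 names per genre, accuracy should be measured on artists held out from training, since training accuracy alone says little about generalization. A baseline like this also gives a reference point for judging whether a learned model such as logistic regression or a small neural network is worth the added complexity.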
