How Convolutional Neural Networks Learn Musical Similarity

https://towardsdatascience.com/how-convolutional-neural-networks-learn-musical-similarity/ (towardsdatascience.com)

Audio embeddings for music recommendation represent each song as a point in a high-dimensional space that captures rhythm, timbre, and texture. Raw audio files are first converted into mel-spectrograms, 2D time-frequency representations that a Convolutional Neural Network (CNN) can process much like images. The model is trained with a contrastive learning objective, specifically the InfoNCE loss, which requires no labeled data. The objective pulls representations derived from the same audio sample closer together and pushes representations of different audio samples further apart. In the resulting structured embedding space, musical similarity reduces to a cheap similarity computation between vectors, which makes the embeddings practical to deploy in a recommendation app.
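
The contrastive objective described above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the function names (`info_nce`, `most_similar`), the temperature of 0.1, the use of cosine similarity, and the choice of in-batch negatives are all assumptions, and the embeddings are toy 2D vectors rather than CNN outputs.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the embedding paired with anchors[i] (same audio sample);
    every other positives[j] serves as a negative (different audio sample).
    The temperature value is an illustrative assumption.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i in range(len(a)):
        # Cosine similarities between anchor i and every candidate, scaled.
        logits = [dot(a[i], p[j]) / temperature for j in range(len(p))]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching pair as the correct "class".
        total += log_denom - logits[i]
    return total / len(a)


def most_similar(query, catalog):
    # Return the catalog key whose embedding has the highest cosine similarity.
    q = l2_normalize(query)
    scores = {name: dot(q, l2_normalize(vec)) for name, vec in catalog.items()}
    return max(scores, key=scores.get)


# Toy embeddings: matched pairs point the same way, mismatched pairs do not.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each anchor is closest to its own pair (`aligned`) and large when it is closer to another sample's embedding (`shuffled`), which is the pull-together, push-apart behaviour the summary describes. Once embeddings are trained, a lookup such as `most_similar(query_vec, catalog)` is one plausible way a recommendation app could rank songs.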

0 points•by hdt•1 month ago
