Working with audio sounds easier than it is: a deep learning perspective

  • Lindholmen Conference Hall, 5 Lindholmspiren, Västra Götalands län, 417 56 Sweden

Abstract

Convolutional neural networks emerged as a natural model for solving computer vision problems and now account for many state-of-the-art results. Perhaps surprisingly, they have also been shown to be very effective in music applications. This talk describes some of the work we have done in this domain at Peltarion during the past year in collaboration with our partner Epidemic Sound. We also dive deeper into the representations commonly used for audio, both waveform and spectral. We compare audio with natural images, describing desirable properties, such as translational equivariance, that no longer hold in audio representations. Along the way, we look at examples of how researchers have adapted models and representations to introduce similar properties in other applications. Music information retrieval is a large domain that is very well suited to deep learning, and I hope to inspire more work in it from both academia and industry.
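To make the two audio representations named in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration only, not the method presented in the talk. The sample rate, frame length, hop size, test tones, and all function names are assumptions chosen so the arithmetic comes out exactly. The spectrogram uses a naive DFT with no window function, which is enough to show that raising the pitch of a tone moves its energy to a different row of the spectral representation.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, num_samples):
    """Waveform representation: one amplitude value per time step."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len, hop):
    """Spectral representation: one magnitude spectrum per (unwindowed) frame."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [magnitude_spectrum(signal[s:s + frame_len]) for s in starts]


def peak_bin(spectrum):
    """Index of the frequency bin holding the most energy."""
    return max(range(len(spectrum)), key=spectrum.__getitem__)


# Assumed illustrative settings: bin spacing is 8000 / 64 = 125 Hz.
SAMPLE_RATE = 8000
FRAME_LEN = 64
HOP = 32

low = sine_wave(500, SAMPLE_RATE, 256)    # 500 Hz falls exactly on bin 4
high = sine_wave(1000, SAMPLE_RATE, 256)  # 1000 Hz falls exactly on bin 8

low_spec = spectrogram(low, FRAME_LEN, HOP)
high_spec = spectrogram(high, FRAME_LEN, HOP)

print("frames:", len(low_spec), "bins per frame:", len(low_spec[0]))
print("peak bin, 500 Hz tone:", peak_bin(low_spec[0]))
print("peak bin, 1000 Hz tone:", peak_bin(high_spec[0]))
```

In the spectrogram, the row index carries the pitch of the tone, so moving a pattern along the frequency axis changes what is heard. This is one way to see why a property that is convenient for natural images does not carry over unchanged.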

Agrin Hilmkil

AI Research Engineer @ Peltarion

Perhaps no AI startup in Sweden is more talked about at the moment than Peltarion. By creating a platform that operationalises and simplifies the use of AI, Peltarion is trying to bring the highly technical field of neural networks to the masses, and is breaking new ground in doing so. Agrin Hilmkil, a Chalmers graduate, is part of an excellent team of AI research scientists pushing the boundaries of what is done with machine learning in Sweden. Currently, Agrin is exploring how to use state-of-the-art machine learning to understand sound.