In this post, I will outline an idea that I find interesting. It’s part of our process to research the problem areas in detail before building anything new. As of now, we’re not building yet.
What’s the idea?
It’s creating sheet music from YouTube videos using machine learning. If you want to learn how to play any musical instrument on YouTube, you will find an extensive collection of videos created using Synthesia. In the comment sections, the most common request is for a transcription. It’s time for this to get automated!
For this to work, the video’s audio will have to be broken down into spectrograms, and I think the best place to start is a video with just one instrument, such as piano or violin. An app called Harmonic Analyzer does this. There’s still plenty of space for different types of solutions catering for other use cases.
Harmonic Analyzer is not released yet, and its tagline says it helps jazz pianists. Meanwhile, I can imagine different apps to help with less popular instruments such as tuba, marimba, or sitar. Each instrument comes with its own physical constraints for the player: it’s not possible to play the cello like a piano, because only one hand is available to shape the harmony and melody. Similarly, it’s impossible to play the piano like a drum.
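To make the spectrogram step concrete, here is a minimal sketch of what it could look like for a single clean note. It is not how Harmonic Analyzer or any product mentioned in this post works; it assumes NumPy, a synthetic sine tone standing in for real audio, and made-up helper names (`spectrogram`, `dominant_frequencies`, `note_name`, `synth_tone`, `detect_note`). It picks the loudest frequency in each frame, so it would not cope with chords, silence, or instruments whose overtones are louder than the fundamental.

```python
import numpy as np

SAMPLE_RATE = 22050  # samples per second (an assumption for this sketch)
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def spectrogram(signal, frame_size=4096, hop=1024):
    """Return a magnitude spectrogram of shape (frames, frequency bins)."""
    window = np.hanning(frame_size)
    starts = range(0, len(signal) - frame_size + 1, hop)
    frames = np.array([signal[s:s + frame_size] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))


def dominant_frequencies(spec, frame_size=4096, sample_rate=SAMPLE_RATE):
    """Frequency (Hz) of the strongest bin in each frame."""
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    return freqs[np.argmax(spec, axis=1)]


def note_name(freq_hz):
    """Nearest equal-tempered note name, with A4 = 440 Hz = MIDI note 69."""
    midi = int(round(69 + 12 * np.log2(freq_hz / 440.0)))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def synth_tone(freq_hz, seconds=1.0, sample_rate=SAMPLE_RATE):
    """Pure sine tone standing in for a real recording."""
    t = np.arange(int(seconds * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)


def detect_note(signal):
    """Most common note across all frames of a single-note signal."""
    spec = spectrogram(signal)
    names = [note_name(f) for f in dominant_frequencies(spec)]
    return max(set(names), key=names.count)


if __name__ == "__main__":
    print(detect_note(synth_tone(440.0)))
```

Real recordings are much messier than a sine wave, which is presumably why the products below lean on neural networks rather than a simple peak search like this one.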
Similar ideas
It turns out that when I search for “Automatic Music Transcription (AMT)”, which is the term for this branch of AI, I find plenty of products already available to purchase:
- AnthemScore, which will “Convert mp3, wav, and other audio formats into sheet music/guitar tab using a neural network trained on millions of data samples.” The pro version costs 62 SGD.
- Melody Scanner, which will “automatically transcribe your favorite songs to sheet music.” It costs 43 USD per year.
AMT also reminds me of my Pixel phone’s feature that identifies what music is playing nearby.
Tech behind this
AMT has been an active research area in the last few years as AI becomes more and more capable of producing human-recognisable outputs. Large companies such as ByteDance and Google contribute extensively. A cursory look at the latest capabilities points me to Magenta, an open-source research project based on TensorFlow that explores creative projects using machine learning.
How sound is this idea personally?
I find this idea interesting because I play piano and ukulele. I often look for good (and usually free) sheet music on YouTube. It’s surprisingly hard to find sheets at the right difficulty level for me. If a sheet is too easy, it sounds horrible; if it’s too hard, it’s impossible to play and memorise.
Min’an does not play musical instruments, so the problem space probably holds less for him to explore. But I hope he’ll still find the machine learning side of AMT interesting.
In terms of collaborating with other people, I only know a few friends who are musicians, and none of them is a full-time musician. I don’t think this is a blocker, but it’s something I think about when choosing a project.