Generating media with deep learning is a double-edged sword, with both fascinating and frightening implications (e.g. deepfakes). One of the more uplifting applications, however, is music generation. In this tutorial, you'll learn how to build a CoreML model that generates a rhythm sequence of MIDI notes.
Creating this kind of model doesn't fall under any of the task-oriented models provided by either CreateML or TuriCreate, so you'll have to create and train the model using Keras, and then convert it to CoreML with a coremltools converter.
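To give a sense of that second step, here is a minimal, hypothetical sketch of the conversion using coremltools' unified converter and a throwaway tf.keras model. The architecture, input shape, and output file name are placeholders for illustration, not the model you'll actually build in this tutorial.

```python
# A minimal conversion sketch, assuming coremltools 4+ and its unified
# converter. The tiny Sequential model is a stand-in for the rhythm model
# built later in the tutorial; shapes and names are illustrative only.
import coremltools as ct
import tensorflow as tf

# Placeholder architecture: sequences of 16 timesteps with 3 features each.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(16, 3)),
    tf.keras.layers.Dense(3),
])

# Convert the tf.keras model to CoreML, pinning the batch size to 1.
mlmodel = ct.convert(model, inputs=[ct.TensorType(shape=(1, 16, 3))])
mlmodel.save("RhythmGenerator.mlmodel")
```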
This tutorial was inspired by Matthijs Hollemans' tutorial, in which he wrote an LSTM in Swift using the Accelerate framework. It's a very good tutorial, and many of its explanations won't be repeated here. Plenty of tutorials have covered MIDI note generation before, but one of the interesting aspects of Matthijs' approach is that he accounts for timing information during training.
Whereas Matthijs creates his model using TensorFlow APIs and implements his LSTM directly in Swift with the Accelerate framework, this tutorial focuses on creating a model with Keras and then converting it using coremltools.
In this tutorial, you'll learn how to:
- Create an LSTM model using Keras
- Add a custom activation layer to the model
- Use a custom loss function to train the model
- Convert the model to CoreML
- Write a Swift application to generate a rhythm
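To make the first three items above a little more concrete, here is a rough, hypothetical sketch of how a custom activation and a custom loss function can be wired into a Keras LSTM. The split into "note" and "timing" features, the layer sizes, and the particular loss terms are assumptions for illustration, not the exact design developed later in the tutorial.

```python
import numpy as np
import tensorflow as tf

# Hypothetical dimensions for illustration only; the real values depend on
# how the MIDI data is prepared later in the tutorial.
SEQ_LEN, NUM_FEATURES = 16, 3

def custom_activation(x):
    # Example only: squash the note features with a sigmoid and leave the
    # last (timing) feature linear. The tutorial's actual activation differs.
    notes, timing = x[..., :-1], x[..., -1:]
    return tf.concat([tf.sigmoid(notes), timing], axis=-1)

def custom_loss(y_true, y_pred):
    # Example only: binary cross-entropy on the note features plus mean
    # squared error on the timing feature.
    note_loss = tf.keras.losses.binary_crossentropy(
        y_true[..., :-1], y_pred[..., :-1])
    timing_loss = tf.keras.losses.mean_squared_error(
        y_true[..., -1:], y_pred[..., -1:])
    return note_loss + timing_loss

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, input_shape=(SEQ_LEN, NUM_FEATURES)),
    tf.keras.layers.Dense(NUM_FEATURES),
    tf.keras.layers.Activation(custom_activation),
])
model.compile(optimizer="adam", loss=custom_loss)

# Smoke test on random data just to confirm the pieces fit together.
x = np.random.rand(8, SEQ_LEN, NUM_FEATURES).astype("float32")
y = np.random.rand(8, NUM_FEATURES).astype("float32")
model.fit(x, y, epochs=1, verbose=0)
```

Because an activation like this is built from ordinary TensorFlow ops (slicing, sigmoid, concatenation), it should trace cleanly through the coremltools converter without requiring a custom CoreML layer.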
This tutorial makes use of Google Colab; however, feel free to use whatever development environment you prefer. You can always look at the completed notebook here.