AI Training - Tutorial - Train a model to recognize marine mammal sounds
Objective
This tutorial shows how to train a sound classification model with AI Training.
This is the next step after you have designed the model with AI Notebooks. The Notebook step is covered in the tutorial: Audio analysis and classification with AI.
It's strongly recommended to read the Notebook tutorial before reading this tutorial.
Requirements
- Access to the OVHcloud Control Panel
- A Public Cloud project created
- The ovhai CLI installed on your system (more information here)
- Docker installed and configured to build images.
- An OCI / Docker image registry. You can use a public registry (such as Docker Hub for example) or a private registry. Refer to the Creating a private registry documentation to create a private registry based on Harbor. To make your registry compatible with AI Solutions usage, follow the Use & manage your registries guide.
- Knowledge of building images with a Dockerfile
Instructions
Create object storage for data
To train the model, you'll need data and a place to save the trained model. You can reuse the object storage from the Notebook tutorial Audio analysis and classification with AI, or follow the step Uploading your dataset on Public Cloud Storage of that same tutorial.
Train your model
To train the model, we will use AI Training. This powerful tool will allow you to automate your pipelines and build fine-tuning phases easily.
AI Training allows you to train models directly from your own Docker images.
First, you need to create a Python script that performs the training.
You can copy and paste the following code into a file named train-audio-classification.py:
The TensorBoard step is optional. It is only a way to monitor your training.
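As a rough illustration of the shape such a training script can take, the sketch below encodes string labels as integers, builds a small dense Keras classifier, and trains it with an optional TensorBoard callback. It is not the tutorial's actual script: the function names (`encode_labels`, `build_model`, `train`), the layer sizes, and the default path `/workspace/saved_model/model.h5` are assumptions made for illustration, and feature extraction is assumed to have been done beforehand.

```python
"""Hypothetical skeleton of a sound-classification training script."""


def encode_labels(labels):
    """Map each distinct string label to an integer id, in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


def build_model(n_features, n_classes):
    """Build a small dense classifier (layer sizes are arbitrary assumptions)."""
    # Imported lazily so the pure-Python helper works without TensorFlow.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # labels are integer ids
        metrics=["accuracy"],
    )
    return model


def train(model, x_train, y_train, x_val, y_val, epochs=10,
          model_path="/workspace/saved_model/model.h5", log_dir=None):
    """Fit the model and save it; TensorBoard logging is optional."""
    import tensorflow as tf

    callbacks = []
    if log_dir is not None:
        # Optional monitoring: writes event files that TensorBoard can read.
        callbacks.append(tf.keras.callbacks.TensorBoard(log_dir=log_dir))
    history = model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        epochs=epochs,
        callbacks=callbacks,
    )
    model.save(model_path)  # hypothetical output location
    return history
```

Leaving `log_dir` unset skips the TensorBoard callback entirely, which reflects the fact that monitoring is optional. The features and labels are expected as NumPy arrays, with labels converted through `encode_labels` first.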
Then, create a requirements.txt file to declare the Python dependencies:
tensorflow
numpy==1.22.4
pandas
scikit-learn
keras
Then, create a Dockerfile compliant with AI Training.
You can copy and paste the following code into a file named Dockerfile:
Then, build the Docker image and push it to the registry:
The output should be similar to this:
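For reference, the build-and-push step can be sketched as a small Python helper that assembles the standard `docker build`, `docker tag` and `docker push` commands. The registry address and image name below are placeholders, not values from this tutorial, and the helper name `docker_build_and_push_commands` is hypothetical.

```python
import shlex


def docker_build_and_push_commands(registry, image, tag="latest"):
    """Return the docker commands that build, tag and push an image."""
    local_ref = f"{image}:{tag}"
    remote_ref = f"{registry}/{image}:{tag}"
    return [
        ["docker", "build", ".", "-t", local_ref],   # build from ./Dockerfile
        ["docker", "tag", local_ref, remote_ref],    # name it for the registry
        ["docker", "push", remote_ref],              # upload to the registry
    ]


# Placeholder values: replace them with your own registry and image name.
for command in docker_build_and_push_commands(
        "registry.example.com/my-project", "audio-classification-job"):
    print(shlex.join(command))
```

Each printed line can be run in a shell from the directory containing the Dockerfile, or each list can be passed to `subprocess.run(command, check=True)` to execute it from Python.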
Once your Docker image is built and pushed to the registry, you can use the ovhai command directly to launch your training job.
You can allocate more or fewer GPUs to the job depending on how fast you want the training to run.
If your images are stored in a private registry, please follow the documentation Registries - Use & manage your registries to add your registry.
The output should be similar to this:
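As a hedged sketch of the job launch, the helper below assembles an `ovhai job run` command with a GPU count and one `--volume` flag per object storage container. The helper name `ovhai_job_run_command` is hypothetical, every value in angle brackets is a placeholder, and the volume string format shown is an assumption that should be checked against the CLI Reference.

```python
import shlex


def ovhai_job_run_command(image, gpus=1, volumes=()):
    """Return the argument list for an `ovhai job run` call."""
    command = ["ovhai", "job", "run", "--gpu", str(gpus)]
    for volume in volumes:
        # One --volume flag per object storage container to mount.
        command += ["--volume", volume]
    command.append(image)
    return command


# Placeholder values, to be replaced with your own containers and image.
command = ovhai_job_run_command(
    "<registry>/<image>:latest",
    gpus=1,
    volumes=[
        "<data-container>@<region>/:/workspace/data:RO",
        "<model-container>@<region>/:/workspace/saved_model:RW",
    ],
)
print(shlex.join(command))
```

Raising `gpus` is the only change needed to request more GPUs for the same image and volumes.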
You can access the execution logs of your job with the CLI:
The output should be similar to this:
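Fetching the logs can also be scripted. The sketch below wraps `ovhai job logs <job-id>` with `subprocess`; the function names are hypothetical, and the job ID is the one reported when the job was submitted.

```python
import subprocess


def ovhai_job_logs_command(job_id):
    """Return the argument list for `ovhai job logs <job-id>`."""
    return ["ovhai", "job", "logs", job_id]


def fetch_job_logs(job_id):
    """Run the command and return the captured log text."""
    result = subprocess.run(
        ovhai_job_logs_command(job_id),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise if the CLI exits with an error
    )
    return result.stdout
```

Calling `fetch_job_logs` requires the ovhai CLI to be installed and authenticated on the machine running the script.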
For more details about the CLI commands for AI Training, please read this guide: CLI Reference.
Once your model is trained, you can deploy it with AI Deploy in order to use it.
Go further
All the source code is available on the OVHcloud GitHub organization.
To create the application using the trained model, you can follow this tutorial: Deploy an app for audio classification task using Streamlit.
Feedback
Please send us your questions, feedback and suggestions to improve the service:
- On the OVHcloud Discord server