Tuning and training machine learning models with Argo Workflows

by Nicolas Guinoiseau | 10.1.2024


Intro

This post is the second in a series on Argo Workflows and a direct follow-up to the blog post Getting started with Training Models using Argo Workflows, in which we introduced Argo Workflows and how we use it for our machine learning experiments. I highly recommend reading it, as we will build on much of its content here! In this post we'll focus on how easy it is to reuse WorkflowTemplates.

Why should you train and tune your machine learning models regularly?

  1. Improve model accuracy: Machine learning models often require updates to improve their accuracy over time. As new data becomes available, the model may need to be retrained to incorporate the new information and adjust its predictions.
  2. Avoid overfitting: Regularly tuning a machine learning model can help prevent overfitting, which occurs when a model is too complex and its predictions start to follow the noise in the data rather than the underlying patterns. Tuning can help simplify the model and improve its ability to generalize.
  3. Keep up with changing data: Regularly tuning and training machine learning models helps ensure that the model stays up to date with changes in the data and continues to provide accurate predictions.
  4. Adapt to changing business requirements: As business requirements change, the machine learning model may need to be updated to reflect them. Regular tuning and training help ensure that the model continues to meet the needs of the business.

Regularly tuning and training machine learning models is critical to ensure their continued accuracy and relevance. By keeping models up-to-date with changing data and business requirements, your organisation can make better decisions and achieve better results.

A little recap

A quick recap of the previous blog post, in which we went through:

  1. What a WorkflowTemplate is
  2. How to use WorkflowTemplates in a Workflow
  3. How we made a Workflow capable of evaluating several hyperparameter sets (see the sketch below)
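As a reminder, referencing a WorkflowTemplate from a Workflow boils down to a templateRef. The sketch below is illustrative rather than a copy of the previous post's manifest; the resource, template, and parameter names (evaluate, hyperparameter-sets, hyperparameters) are assumptions.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: evaluate-hyperparameters-
spec:
  entrypoint: main
  arguments:
    parameters:
      # JSON list of hyperparameter sets to evaluate (illustrative values)
      - name: hyperparameter-sets
        value: '[{"max_depth": 4}, {"max_depth": 6}]'
  templates:
    - name: main
      steps:
        - - name: evaluate
            # Reference a template defined in an existing WorkflowTemplate
            templateRef:
              name: evaluate        # WorkflowTemplate resource name (assumed)
              template: evaluate    # template inside that WorkflowTemplate (assumed)
            arguments:
              parameters:
                - name: hyperparameters
                  value: "{{item}}"
            # Fan out: one evaluation step per hyperparameter set
            withParam: "{{workflow.parameters.hyperparameter-sets}}"
```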

A close-to-real-life example of tuning and training

You strive to find the hyperparameter set that trains the best machine learning model for your case, for instance the set that maximizes the ROC-AUC. This is the process we referred to earlier as model tuning, and there are several possible techniques for finding good hyperparameter sets. For simplicity we will apply the "brute force" grid search technique, which consists of trying several hyperparameter sets and picking the best-performing one.

A UML diagram of how to evaluate hyperparameter sets for a machine learning model

As you can see, it is very similar to the Workflow described in the previous blog post. The last steps of the Workflow are different because, while the previous Workflow was meant for observing results, we now want to decide which hyperparameter set to use to train a model on all the available training data. The great news is that even though this new Workflow has more steps than the previous one, we only have a few manifests to add to our project. The steps from "get-data" to "evaluate" are already declared, so we only need to create two new WorkflowTemplates: "select-best-performing-hyperparameter-set" and "train-model".

The WorkflowTemplate manifest

Only the last two WorkflowTemplates had to be written to build this new Workflow!
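The full manifest is not reproduced here, but a minimal sketch of the new WorkflowTemplate could look like the following. The resource name tune-and-train and the internal template and parameter names are assumptions for illustration; only get-data, evaluate, select-best-performing-hyperparameter-set, and train-model are named in the text, and the output wiring shown is just one possible way to connect the steps.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: tune-and-train              # assumed name
spec:
  entrypoint: main
  arguments:
    parameters:
      - name: hyperparameter-sets   # JSON list of sets to evaluate
  templates:
    - name: main
      dag:
        tasks:
          # The steps from "get-data" to "evaluate" reuse the WorkflowTemplates
          # already declared for the previous Workflow (intermediate steps omitted).
          - name: get-data
            templateRef:
              name: get-data
              template: main
          - name: evaluate
            dependencies: [get-data]
            templateRef:
              name: evaluate
              template: main
            # Fan out over the hyperparameter sets, one evaluation per set
            withParam: "{{workflow.parameters.hyperparameter-sets}}"
            arguments:
              parameters:
                - name: hyperparameters
                  value: "{{item}}"
          # The two new WorkflowTemplates:
          - name: select-best-performing-hyperparameter-set
            dependencies: [evaluate]
            templateRef:
              name: select-best-performing-hyperparameter-set
              template: main
            arguments:
              parameters:
                # Aggregated outputs of the fanned-out evaluate tasks
                - name: evaluation-results
                  value: "{{tasks.evaluate.outputs.parameters}}"
          - name: train-model
            dependencies: [select-best-performing-hyperparameter-set]
            templateRef:
              name: train-model
              template: main
            arguments:
              parameters:
                - name: hyperparameters
                  value: "{{tasks.select-best-performing-hyperparameter-set.outputs.parameters.best-set}}"
```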

Your keen eye should have noticed that this time we are not describing a Workflow, but a WorkflowTemplate. Both syntaxes are almost the same, and having our Workflow as a WorkflowTemplate allows us to use it in various contexts. For example, we can refer to it in an Argo Events Sensor (we shall talk about Argo Events in a future blog post!) as well as in a CronWorkflow.

CronWorkflow

Just as Kubernetes offers CronJobs, Argo Workflows lets you run Workflows on a schedule with the CronWorkflow kind. Since we declared our workflow as a WorkflowTemplate, we can simply refer to it, as shown in the code block below.
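A minimal sketch of such a CronWorkflow, assuming the WorkflowTemplate is named tune-and-train as above; the schedule and parameter values are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: weekly-tune-and-train
spec:
  schedule: "0 3 * * 1"             # every Monday at 03:00 (placeholder)
  workflowSpec:
    # Reuse the whole WorkflowTemplate instead of redeclaring its templates
    workflowTemplateRef:
      name: tune-and-train
    arguments:
      parameters:
        - name: hyperparameter-sets
          value: '[{"max_depth": 4, "learning_rate": 0.1}, {"max_depth": 6, "learning_rate": 0.05}]'
```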

Multi-use

Now we have a workflow capable of tuning and training a machine learning model! 🚀

Model training often proves to be very energy-consuming. At the same time, machine learning models can become obsolete fairly fast, so we want to train new models as new training data becomes available. How can we reconcile the two?


In most cases, it is fair to assume that a small addition of training data has little chance of significantly changing the performance of a given hyperparameter set. Considering this, we suggest storing the best-performing hyperparameter sets and reusing them for periodic retraining, rather than evaluating thousands or tens of thousands of models during every tuning session.
Whichever option best suits your case, from the Workflow's point of view only the number of evaluated hyperparameter sets changes: the same Workflow can be used either way! We simply pass different input arguments, as in the example below.
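For example, the difference between a full tuning run and a lightweight retraining run can come down to the value of a single parameter. The names and hyperparameter values below are illustrative only:

```yaml
# Full tuning run: evaluate the whole grid of hyperparameter sets
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: full-tuning-
spec:
  workflowTemplateRef:
    name: tune-and-train
  arguments:
    parameters:
      - name: hyperparameter-sets
        value: '[{"max_depth": 2}, {"max_depth": 4}, {"max_depth": 6}, {"max_depth": 8}]'
---
# Regular retraining run: only re-evaluate the stored best-performing sets
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: retraining-
spec:
  workflowTemplateRef:
    name: tune-and-train
  arguments:
    parameters:
      - name: hyperparameter-sets
        value: '[{"max_depth": 6}]'
```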

I hope you had a pleasant read and, more importantly, that you learned something useful!

Nicolas Guinoiseau

Data Scientist

nicolas@distrikt.fi

Related posts

Getting started with Training Models using Argo Workflows (3.1.2024)

Argo Events: Event-Driven Workflow Automation (24.1.2024)
