The journey from Development to Production Part 1: Building with ‘The Right Approach’

This is a three-part series on how to develop an ML system rapidly while keeping it as production-ready as possible.

Why this series?

Building an ML system or app is not just about developing a ‘Great’ model; the model is only a tiny fraction of the whole. The above diagram explains this well. I am resisting the temptation to explain it myself, though I highly recommend that you read this paper.

So, back to the question. A ‘Great’ ML system or app does not necessarily need a ‘Great Model’, but it does need a ‘Good supporting ecosystem’. Even a mediocre model can be turned into a good system if the supporting ecosystem is good. So along the journey, we’ll discuss how to build such an ecosystem.

Part One: Building with ‘The Right Approach’

Let’s talk about the so-called ‘The Right approach’, shall we?

Before going ahead, I would suggest you go through this article by Jeremy Howard:

Creative Commons Attribution-ShareAlike 4.0 International. As we discussed in Designing…www.fast.ai

There are two major concepts that we are going to discuss:

Microservices
Packaging

Microservice

Microservices need no introduction, as the approach has proven its potential in traditional software development (though if you are not familiar with it, get a brief idea of it first).

Microservices Architecture Benefits

Software built as microservices can be broken down into multiple component services so that each of these services can be deployed and then redeployed independently without compromising the integrity of an application. That means that microservice architecture gives developers the freedom to independently develop and deploy services.
Better fault isolation; if one microservice fails, the others will continue to work.
The microservice architecture enables continuous delivery.
Microservices are easy to understand, since each one represents a small piece of functionality, and easy for developers to modify, so they help a new team member become productive quickly.
The code is organized around business capabilities.
Scalability and reusability, as well as efficiency, are by far its biggest benefits.
Microservices work very well with containers, such as Docker.
Microservices simplify security monitoring because the various parts of an app are isolated. A security problem could happen in one section without affecting other areas of the project.
Increase the autonomy of individual development teams within an organization, as ideas can be implemented and deployed without having to coordinate with a wider IT delivery function.

The list could go on, but you must have gotten the idea.

How will Microservices help in ML development?

All the above-mentioned points are valid in ML development.

Let’s discuss two generic scenarios (these are the two most common scenarios that I see at my company):

1. Building a reusable End-to-End Pipeline:

The worst curse for any developer is having to reproduce something by rewriting it. You must be asking: why rewrite instead of the old-school ctrl + c & ctrl + v? If the older code is monolithic, it may be so tightly coupled with ‘other customised’ or ‘specific’ processes that we cannot break a piece out of it and use it in vanilla form.

Look at the diagram below. Except for Model Training and Hyperparameter Tuning, all the other processes can be reused in other pipelines. So isolating each process and developing it separately makes it highly portable and very easy to reuse.

Important Note:

Every process or service must be generic and parameterized; otherwise it defeats the purpose of a microservice.
Some examples are the use of relative paths, common compliance rules, a common design language, etc.
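To make the note above concrete, here is a minimal sketch of what a generic, parameterized step could look like in Python. The names SplitConfig, split_rows and run_split, and the train/test split itself, are hypothetical and chosen only for illustration; the point is that every path is relative to a project root and every tunable value is a parameter, so the same step can be dropped into another pipeline unchanged.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import List, Tuple


@dataclass(frozen=True)
class SplitConfig:
    """Hypothetical parameters for a reusable train/test split step."""
    input_path: Path            # relative to the project root, never absolute
    output_dir: Path            # relative output folder
    test_fraction: float = 0.2  # share of rows that go to the test set


def split_rows(rows: List, test_fraction: float) -> Tuple[List, List]:
    """Split rows into (train, test); pure logic, no file system access."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = round(len(rows) * test_fraction)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


def run_split(config: SplitConfig, project_root: Path) -> Tuple[Path, Path]:
    """Read a CSV, split it, and write train.csv and test.csv under output_dir."""
    with open(project_root / config.input_path, newline="") as f:
        header, *rows = list(csv.reader(f))
    train, test = split_rows(rows, config.test_fraction)

    out_dir = project_root / config.output_dir
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, part in (("train.csv", train), ("test.csv", test)):
        path = out_dir / name
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part)
        written.append(path)
    return written[0], written[1]
```

Nothing in this step knows which model comes next, so another pipeline could reuse it by passing a different SplitConfig.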

2. Collaboration:

Microservices enable effective collaboration. In a team of multiple developers, we can divide the processes, isolate them from each other, and then design each one in isolation.

For example, one person can work on the training logic while another works on the inference logic, or several people can work on different processes of the same pipeline (as shown in the diagram). There can be any number of such scenarios.

Git is an essential tool in all of this, and I will cover it in the next part.
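One way to picture the training/inference split is a small shared contract that both people agree on, with each side developed independently behind it. The sketch below is only an illustration under that assumption: ModelArtifact, train, save, load and predict are hypothetical names, and a simple least-squares line stands in for a real model.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from statistics import mean


@dataclass
class ModelArtifact:
    """Hypothetical contract agreed between the training and inference owners."""
    slope: float
    intercept: float
    version: str = "0.1"


# --- Owned by the person working on training logic ---
def train(xs, ys) -> ModelArtifact:
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ModelArtifact(slope=slope, intercept=y_bar - slope * x_bar)


def save(artifact: ModelArtifact, path: Path) -> None:
    """Persist the artifact as JSON so inference never imports training code."""
    path.write_text(json.dumps(asdict(artifact)))


# --- Owned by the person working on inference logic ---
def load(path: Path) -> ModelArtifact:
    """Rebuild the artifact from the JSON file written by save()."""
    return ModelArtifact(**json.loads(path.read_text()))


def predict(artifact: ModelArtifact, x: float) -> float:
    """Apply the fitted line to a single input."""
    return artifact.slope * x + artifact.intercept
```

As long as the fields of ModelArtifact stay stable, the training side can change how it fits the model without the inference side needing to change at all.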

Packaging

Packaging is a direct application of Microservices. For Packaging, we can use a container technology like Docker (if you don’t know Docker, I would insist that you leave everything else and learn Docker first). For me, Docker was one of the most important inventions of the 2010s. Packaging and Docker go hand in hand.
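On the code side, a step is easier to package when its entry point takes all of its settings from the environment, since a container can then be configured at start-up (for example with docker run -e) without rebuilding the image. The following is a rough sketch under that assumption; the variable names STEP_INPUT_PATH, STEP_OUTPUT_PATH and STEP_LOG_LEVEL and their defaults are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StepSettings:
    """Hypothetical settings for one packaged pipeline step."""
    input_path: str
    output_path: str
    log_level: str


def settings_from_env(env=None) -> StepSettings:
    """Build settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    return StepSettings(
        input_path=env.get("STEP_INPUT_PATH", "data/input.csv"),
        output_path=env.get("STEP_OUTPUT_PATH", "data/output.csv"),
        log_level=env.get("STEP_LOG_LEVEL", "INFO").upper(),
    )


def main() -> None:
    """Entry point the container would run; the real step logic would go here."""
    settings = settings_from_env()
    print(f"Running step with {settings}")


if __name__ == "__main__":
    main()
```

Passing a plain dictionary to settings_from_env keeps the function easy to test outside a container.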

Packaging Benefits

ROI: The nature of Packaging is that fewer resources are necessary to run the same application. Docker allows engineering teams to be smaller and more effective.
Standardization and Productivity: Packaging containers ensure consistency across multiple developments and release cycles, standardizing your environment. Packaging provides repeatable development, build, test, and production environments. Standardizing service infrastructure across the entire pipeline allows every team member to work in a production parity environment.
CI Efficiency: Packaging enables you to build a container image and use that same image across every step of the deployment process. A huge benefit of this is the ability to separate non-dependent steps and run them in parallel. The length of time it takes from build to production can be sped up notably.
Compatibility and Maintainability: Eliminate the “it works on my machine” problem once and for all. Parity, in terms of Docker, means that your images run the same no matter which server or whose laptop they are running on. For your developers, this means less time spent setting up environments, debugging environment-specific issues, and a more portable and easy-to-set-up codebase. Parity also means your production infrastructure will be more reliable and easier to maintain.
Rapid Deployment: Docker can reduce deployment to seconds. This is because it creates a container for every process and does not boot an OS. Data can be created and destroyed without worrying that the cost of bringing it up again would be higher than what is affordable.
Multi-Cloud Platforms: One of Docker’s greatest benefits is portability. All major cloud computing providers, like AWS, GCP, etc. have embraced Docker’s availability and added individual support. Docker containers can be run in any cloud.
Isolation: Packaging ensures your applications and resources are isolated and segregated. Docker makes sure each container has its own resources that are isolated from other containers. You can have various containers for separate processes running completely different stacks.

On top of these benefits, Docker can also ensure that each application uses only the resources assigned to it, provided resource limits are configured for its container. A particular application then won’t use all of your available resources, which would normally lead to performance degradation or complete downtime for other applications.

So when we combine Microservice and Packaging, we get highly Portable, Modular, Reusable code.

I even have a one-liner for it: Break it → Package it → Move ahead.

This is the end of part one. In the next part, we’ll look at Project Governance.

The Journey from Development to Production Part 2: Project Governance
Project Governance & compliance is the most important factor in developing any problem-solving ML System. Here I will…medium.com

Thanks to Jeremy Howard for his awesome startup guide.