Philosophy of an Experimentation System — MLOps Intro
Intro
Antipatterns
Coping with changes
DS “Experiment”
DS “Experimentation System” philosophy
Personal anti-patterns

What project structure suits data-science “experiments”?

This is the first part of a five-part series (1/5) on MLOps, brought to you by the ML team at Loris.ai.

The Loris ML team consists of engineers with different skillsets; some lean more toward ML and some toward DS.
First ML/DS roles in startups tend to require a little bit of each. That's why I don't try to differentiate between the roles.
For brevity I'll stick with just "ML" for the rest of the series.

You might think this isn't relevant for you, since you're an algorithm developer and don't care about "systems" or "architecture". Well then, I have a surprise for you.

Did you know that most ML projects never make it to production?
That is due to various reasons. One of them is that production is a mess and requirements change frequently; if you have a bit more know-how about what "production" actually is, it can really help your "baby" get shipped. Another reason projects don't get deployed is that their value isn't tangible to stakeholders; on that front, some interactivity can give you a big boost of belief.

The following list is composed from my own experience, meaning that I have, at some point, done every bullet on it. It is not meant as a judgment, rather a chance to view our behavior critically, a necessary thing in order to improve at any practice.

  • Have you ever sent a notebook to an engineer?
  • You work mostly locally, in a single Conda env across all projects (exception: M1 users)
  • Folders on your local machine are called "client_a_2023_feb" and contain client data
  • You train models in a notebook and save them locally
  • Pre-processing and training live in the same script/notebook
  • You do error analysis via static CSVs / static W&B dashboards
  • You've committed data or models to .git (unless you're working with DVC-like paradigms)
  • In business-oriented meetings you show off confusion matrices and classification reports
  • You work in a silo even when you have other team members beside you
  • You try to avoid collaboration because explaining how to proceed from your results might take several days

Last disclaimer — although there are certain times when ad-hoc solutions are appropriate, mostly they aren't and should be avoided for the better.

Yes, it's 2023 and we're still talking about notebooks. Bear with me for a minute. 🐻

This is meant to provoke you! A little bit of trolling keeps attention high 📈

System vs Ad Hoc

There is some misconception about what the role of DS/ML actually is. First of all, we create business value, i.e., we improve on a KPI-related issue for our customers. Right?

right?

The same goes for the rest of the engineering teams: FS build apps, backend creates infrastructure, and so on. So if an FS engineer creates a button using HTML with jQuery, it doesn't matter that it's technology from the 2000s, as long as that button works. Right?
The same goes for a backend engineer who writes a huge pile of code without breaking it into functions, and the same goes for a data scientist developing in one huge notebook.

Looking at how different coding paradigms have progressed over the last 40 years (speaking from my perspective only), a pattern emerges.
Backend/FS/frontend/game/system developers have learned that development is not a one-time thing. Requirements will change over time and you'll have to cope with them; by the time your feature ships, those new requirements will have to be met in less time due to client expectations, catching you unprepared while juggling a bunch of other things as well.

First — what is an experiment?

  1. Inputs — data, hyperparameters, configurations, features
  2. Outputs — raw results, calculated metrics, models, charts
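The two halves above can be captured as plain data. A minimal sketch in Python — the class and field names (`ExperimentInput`, `ExperimentOutput`, etc.) are mine for illustration, not from any particular tracking library:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExperimentInput:
    """Everything an experiment consumes: data, hyperparameters, features."""
    data_path: str
    hyperparams: dict = field(default_factory=dict)
    features: tuple = ()


@dataclass
class ExperimentOutput:
    """Everything an experiment produces: metrics, model artifact, charts."""
    metrics: dict = field(default_factory=dict)
    model_path: str = ""
    charts: list = field(default_factory=list)


inp = ExperimentInput("data/train.csv", {"lr": 0.01}, ("age", "income"))
out = ExperimentOutput(metrics={"f1": 0.82}, model_path="models/run_1.pkl")
```

Keeping inputs frozen makes it harder to mutate a run's configuration after the fact, which is half the battle for reproducibility.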

Most experiments will produce unsatisfying results, hence we should track only a handful of successful experiments, although a failed experiment can be very interesting on its own.

So far, a notebook can totally suffice. You read some .csv, transform it into something, train a model, calculate metrics, and that's it.

But what if you want to track several experiments, or even save them and share them within the team?
Aren't W&B or Neptune.ai enough as an experiment tracker?
Just send your dashboard over and we'll talk about your metrics.
But what if another team member wants to run the very same thing, working on a different aspect of the project? Or you might have to go on vacation and someone must rerun your notebooks while you're away? What if the input has changed?

As I've mentioned before, going by the "I would like.." meme, we're trying to build a system. I hope the previous paragraphs have convinced you why systems, or proficient software paradigms, are better than the "I'm just sending notebooks in Slack to the eng team" ad-hoc attitude.

Here's a list of our requirements for an experimentation system (some of the software inspirations come from Clean Code):

  1. Reproducible — given the same input, you get the same output
  2. Robust to variations in the input — if data changes, code doesn't (open/closed principle from software engineering)
  3. Modular — pre-processing and training are separate modules
  4. There is an explicit main pipeline (pre-processing, training, saving)
  5. Code changes are tracked — using .git for code changes
  6. Persistent output storage — you upload outputs to your cloud provider
  7. You can go on vacation — every ML engineer can run this without you
  8. Ability to work in a team — multiple people can contribute code to the project
  9. Tracking and sharing specific experiments — another engineer should be able to poke at the experiments you chose to share
  10. Interactive outputs — you have a way for others to interact with the model (even before production)
  11. Consistent error analysis — your experimentation system marks some mispredictions automatically for you to check

Although it's possible to build a notebook-based experimentation system, it requires a lot of heavy lifting, even for the most basic requirements like tracking code changes. Modularity is also possible but mostly forgotten when using a monolithic notebook. Can you go on vacation when you have local notebooks and nobody knows where they are? Can others contribute to your pre-processing methods?

Why do these matter? How do you achieve them? How do they relate to the anti-patterns mentioned?

I'll share a story from my own experience.

I was working at a certain company as an ML engineer, making huge progress, developing all the POCs that we needed. After a year, demand grew and we had to bring in more people to help out. As the first ML engineer, I had to welcome everyone and introduce them to the projects.
Up until that point I hadn't needed to share code or work in a team, and to be honest I wasn't properly prepared for it. Fortunately, since day one I had been committing code into .git and the code was structured into modules. Without this, it would have been much more painful.

Having said that, when another developer had to improve a model based on my work, he couldn't find the "data" I had been working on. This wasn't surprising, because I had been working locally. This need sparked the move to saving everything in S3, cleaning up and improving our infrastructure.
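"Saving everything in S3" only helps teammates if the keys are predictable. A small sketch of one possible deterministic key scheme — the project name, bucket, and layout here are hypothetical, and the actual upload call is left as a comment since it needs real credentials:

```python
from datetime import date


def artifact_key(project: str, run_id: str, filename: str, day: date) -> str:
    """Deterministic S3 key so teammates can locate any run's outputs."""
    return f"{project}/{day.isoformat()}/{run_id}/{filename}"


key = artifact_key("churn-model", "run-42", "model.pkl", date(2023, 2, 1))
# The upload itself would then be one boto3 call (not executed here):
# boto3.client("s3").upload_file("model.pkl", "my-ml-bucket", key)
```

With a convention like this, "where is the model from February's run?" stops being a question only one person can answer.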

Another need was consistency.
Each project had a lot of configurations, hyperparameters, different features, etc. It meant that nobody except the lead maintainer could produce a new model or even just experiment.
We tackled this challenge by converting implicit expectations into explicit pipelines, introducing "python-invoke" tasks that are run from the command line. Afterwards, each project had one main method to run/build it.
This change helped tremendously with collaboration. Engineers could go on vacation and their project could still be used in the meantime.
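The team used python-invoke for this; to keep the example dependency-free, the sketch below uses the stdlib's argparse instead, but the idea is the same: one explicit, discoverable entry point per project, so any teammate can run it the same way. The command and flag names are illustrative:

```python
import argparse


def build_parser():
    """One explicit CLI: the project's single 'main method' to run/build it."""
    parser = argparse.ArgumentParser(prog="project")
    sub = parser.add_subparsers(dest="command", required=True)
    train = sub.add_parser("train", help="run the full training pipeline")
    train.add_argument("--config", default="configs/default.yaml")
    sub.add_parser("evaluate", help="run error analysis on the latest model")
    return parser


# A teammate (or CI) invokes the pipeline the same way every time:
args = build_parser().parse_args(["train", "--config", "configs/exp1.yaml"])
```

Whether it's invoke tasks, a Makefile, or argparse, the win is the same: the project's implicit "ask the maintainer" knowledge becomes an explicit, runnable command.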

Infrastructure became an enabler of cooperation.

Since then, this has become my default operating mode.
I tend to refine it given the many new libraries and methodologies that keep coming out. But that's always the case: the philosophy stays even though the implementation changes.

In the next episode we'll iron out the details:
1. What's our project structure?
2. How did we implement pipelines?
3. Where do we store inputs/outputs?

And way more!
