in machine learning are the same.
Coding, waiting for results, interpreting them, and going back to coding. Plus, some intermediate presentations of one's progress. But things mostly being the same doesn't mean that there's nothing to learn. Quite the opposite! Two or three years ago, I started a daily habit of writing down lessons that I learned from my ML work. Looking back through some of the lessons from this month, three practical ones stand out:
- Keep logging simple
- Use an experimental notebook
- Keep overnight runs in mind
Keep logging simple
For years, I used Weights & Biases (W&B)* as my go-to experiment logger. In fact, I was once in the top 5% of all active users. The stats in the figure below tell me that, as of now, I have trained close to 25,000 models, used a cumulative 5,000 hours of compute, and ran more than 500 hyperparameter searches. I used it for papers, for large projects like weather prediction with large datasets, and for tracking countless small-scale experiments.
And W&B really is a great tool: if you want beautiful dashboards and are collaborating** with a team, W&B shines. Until recently, while reconstructing data from trained neural networks, I ran multiple hyperparameter sweeps, and W&B's visualization capabilities were invaluable: I could directly compare reconstructions across runs.
But I realized that for most of my research projects, W&B was overkill. I rarely revisited individual runs, and once a project was done, the logs just sat there, never to be touched again. When I later refactored the data reconstruction project mentioned above, I deliberately removed the W&B integration. Not because anything was wrong with it, but because it wasn't necessary.
Now, my setup is much simpler. I just log selected metrics to CSV and text files, writing directly to disk. For hyperparameter searches, I rely on Optuna. Not even the distributed version with a central server, just local Optuna, saving study states to a pickle file. If something crashes, I reload and continue. Pragmatic and sufficient (for my use cases).
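A minimal sketch of what such a setup might look like is below; the objective function, metric names, and file paths are placeholders rather than the exact code from my projects:

```python
import csv
import os
import pickle

import optuna


def log_metrics(path, epoch, train_loss, val_loss):
    """Append one row of metrics to a CSV file on disk."""
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["epoch", "train_loss", "val_loss"])
        writer.writerow([epoch, train_loss, val_loss])


def train_and_evaluate(lr):
    # Stand-in for a real training loop; returns a dummy validation loss.
    return (lr - 1e-3) ** 2


def objective(trial):
    # Suggest a learning rate and report the resulting validation loss.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    return train_and_evaluate(lr)


STUDY_PATH = "study.pkl"

# Reload the study if a previous run crashed or was interrupted,
# otherwise start a fresh one.
if os.path.exists(STUDY_PATH):
    with open(STUDY_PATH, "rb") as f:
        study = pickle.load(f)
else:
    study = optuna.create_study(direction="minimize")

study.optimize(objective, n_trials=20)

# Persist the study state so it can be resumed later.
with open(STUDY_PATH, "wb") as f:
    pickle.dump(study, f)
```

The only design choice worth noting: pickling the study after each batch of trials is what makes the "crash, reload, continue" workflow possible without a database backend.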
The key insight here is this: logging is not the work. It's a support system. Spending 99% of your time deciding what you want to log (gradients? weights? distributions? and at which frequency?) can easily distract you from the actual research. For me, simple, local logging covers all needs, with minimal setup effort.
Maintain an experimental lab notebook
In December 1939, William Shockley wrote an idea in his lab notebook: replace vacuum tubes with semiconductors. Roughly 20 years later, Shockley and two colleagues at Bell Labs were awarded the Nobel Prize in Physics for the invention of the modern transistor.
While most of us aren't writing Nobel-worthy entries into our notebooks, we can still learn from the principle. Granted, in machine learning, our laboratories don't have chemicals or test tubes, as we all envision when we think about a laboratory. Instead, our labs often are our computers; the same device that I use to write these lines has trained countless models over the years. And these labs are inherently portable, especially when we develop remotely on high-performance compute clusters. Even better, thanks to highly skilled administrative staff, these clusters run 24/7, so there's always time to run an experiment!
But the question is: which experiment? Here, a former colleague introduced me to the idea of maintaining a lab notebook, and recently I've returned to it in the simplest form possible. Before starting long-running experiments, I write down:
what I’m testing, and why I’m testing it.
Then, when I come back later, usually the next morning, I can immediately see which results are ready and what I had hoped to learn. It's simple, but it changes the workflow. Instead of just "rerun until it works," these dedicated experiments become part of a documented feedback loop. Failures are easier to interpret. Successes are easier to reproduce.
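An entry doesn't need to be more than a couple of lines; even a tiny helper script can do the job. The file name and fields below are just one possible layout, not a prescribed format:

```python
from datetime import datetime
from pathlib import Path

NOTEBOOK = Path("lab_notebook.md")  # hypothetical location of the notebook file


def note_experiment(what: str, why: str) -> None:
    """Append a timestamped entry describing what is being tested and why."""
    entry = (
        f"\n## {datetime.now():%Y-%m-%d %H:%M}\n"
        f"- What: {what}\n"
        f"- Why: {why}\n"
        f"- Result: (fill in after the run finishes)\n"
    )
    with NOTEBOOK.open("a") as f:
        f.write(entry)


# Example entry written just before launching an overnight run.
note_experiment(
    what="rerun reconstruction sweep after Friday's bug fix",
    why="check whether the fix changes the best-performing hyperparameters",
)
```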
Run experiments overnight
This is a small but painful lesson that I (re-)learned this month.
On a Friday evening, I discovered a bug that could affect my experiment results. I patched it and reran the experiments to validate the fix. By Saturday morning, the runs had finished, but when I inspected the results, I realized I had forgotten to include a key ablation. Which meant … another full day of waiting.
In ML, the overnight hours are precious. For us programmers, they are rest. For our experiments, they are work. If we don't have an experiment running while we sleep, we're effectively wasting free compute cycles.
That doesn't mean you should run experiments just for the sake of it. But whenever there is a meaningful one to launch, the evening is the right time to start it. Clusters are often under-utilized then, resources become available more quickly, and, most importantly, you'll have results to analyze the next morning.
A simple trick is to plan this deliberately. As Cal Newport mentions in his book "Deep Work", good workdays start the night before. If you know tomorrow's tasks today, you can set up the right experiments in time.
* This isn't bashing W&B (it would have been the same with, e.g., MLflow), but rather asking users to evaluate what their project goals are, and then spend the majority of their time pursuing those goals with utmost focus.
** Footnote: mere collaboration is, in my eyes, not enough to warrant using such shared dashboards. The insights gained from such shared tools must outweigh the time spent setting them up.
