Managing large-scale data science and machine learning projects is difficult because they differ significantly from software engineering. Because we aim to find patterns in data without explicitly coding them, there is often more uncertainty involved, which can lead to issues such as:
- Stakeholders’ high expectations may go unmet
- Projects can take longer than initially planned
The uncertainty inherent in ML projects is a major reason for setbacks. In large-scale projects, which normally have higher expectations attached to them, these setbacks can be amplified and have catastrophic consequences for organizations and teams.
This blog post was born out of my experience managing large-scale data science projects with DareData. I’ve had the chance to lead diverse projects across various industries, collaborating with talented teams who’ve contributed to my growth and success along the way. It’s because of them that I could gather the following pointers and lay them out in writing.
Below are some core principles that have guided me in making many of my projects successful. I hope you find them valuable…