on a regular basis:
This question is flawed from the start.
A great project is personal to you, which means any project I suggest will automatically be a “bad” choice.
In this article, I want to break down the types of projects that actually help you get hired and the framework you can follow to find them.
4–5 easy projects
Start by building 4–5 smaller projects to give your portfolio some initial weight.
The main goal here is “optics”: making sure that your resume/CV, GitHub, and LinkedIn profiles look active and well-populated.
Take a few weeks to build these smaller projects, making sure they’re of sufficient quality and not something you quickly generated with ChatGPT.
Aim to build a wide variety of projects, each using different tools, datasets, and machine learning algorithms.
Algorithms and ML models
I recommend you have projects with the following algorithms:
- Gradient Boosted Trees: the gold standard algorithm for tabular data, so it’s something you will certainly use on the job.
- Neural Networks: a good understanding of deep learning frameworks like TensorFlow or PyTorch is valuable, especially if you want to work in computer vision, NLP, or AI.
- Clustering Algorithms: models like K-Means and DBSCAN demonstrate your grasp of unsupervised learning, which is required for some roles.
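For the clustering bullet, one way to demonstrate that grasp is to write the core loop yourself before reaching for a library. Below is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. The `kmeans` helper, its parameters, and the six sample points are all made up for illustration; in a real project you would more likely use a library implementation on your own dataset.

```python
import math
import random


def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster points with Lloyd's algorithm and return (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = []
    for _ in range(max_iterations):
        # Assignment step: give each point the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster where it was
        if updated == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = updated
    return centroids, labels


# Illustrative data only: two small, well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Which group ends up with label 0 depends on the random initialisation, so compare cluster membership rather than the label numbers themselves.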
Getting exciting and novel data
When choosing datasets for your projects, avoid overused datasets such as MNIST, Titanic, or Iris. If I saw these, it would be an instant rejection, or at the very least, put me off a lot.
It’s much better to find a messier, more realistic dataset that reflects the data you’ll encounter in the real world. This will impress employers and interviewers much more, directly demonstrating your abilities as a data scientist.
Some good places to get data:
- Use public and free APIs: you can check out the free-apis site for some ideas.
- Scrape data from relevant websites (make sure you are allowed to do this first!). Here is a list of websites that allow web scraping.
- Public government data sources: data.gov is one example you can use.
- Gather your own data through surveys and questionnaires.
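On the point about checking whether you are allowed to scrape, a site's robots.txt file is one signal you can test programmatically. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, the `portfolio-bot` user agent, and the `is_allowed` helper are all hypothetical. Passing this check does not replace reading a site's terms of service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. In practice you would fetch the real file
# from the site you intend to scrape instead of hard-coding it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt, url, user_agent="portfolio-bot"):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


public_ok = is_allowed(ROBOTS_TXT, "https://example.com/datasets/2024.csv")
private_ok = is_allowed(ROBOTS_TXT, "https://example.com/private/data.csv")
print(public_ok, private_ok)
```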
To decide what your projects should be about, it’s best to start with specific questions you think would be interesting to answer from the data.
I recommend showcasing your results using tools like Streamlit or deploying a simple model via GitHub Actions.
However, don’t stress about building a fully end-to-end production system using something like AWS or its services, such as EC2 or ECS. At this stage, it’s completely fine if you don’t know how to do that, and it’s not the goal of these small projects.
One big project
This is where you really need to focus and take your time.
After you’ve built your smaller projects, it’s time to make one big project. This one might take a few months if you’re working on it for an hour or two every day.
This may intimidate you, but you need to put in the effort if you want a project that stands out from the rest.
As I mentioned earlier, I can’t choose this project for you, but I can provide a framework to follow so you can find a great project yourself.
Example project
At my previous company, we were hiring a junior data scientist to work on optimisation and operations research problems.
The candidate we hired stood out for one main reason: they had a highly relevant and deeply personal project that closely matched the role.
They were passionate about NFL fantasy football and wanted to improve how they built their weekly lineups (this is similar to Fantasy Premier League in the UK).
So, they developed their own optimisation engine to allocate players more effectively within the constraints of the system.
It wasn’t just the engine itself; they read academic papers on optimisation strategies and studied how others were approaching the same problem. The project stood out for several reasons:
- It was a personal problem that they were interested in.
- It was unique, and we hadn’t seen anything like it before or since.
- It showed their passion and interest in optimisation and operations research.
- It was directly relevant to the job for which they were applying.
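To make the idea of a lineup optimiser concrete, here is a toy sketch. It is not the candidate's engine, whose details are not described here. The player names, salaries, projected points, roster slots, and salary cap are all invented, and `best_lineup` is a hypothetical helper. It brute-forces every valid lineup, which only works for tiny player pools; a realistic version would more likely use integer programming.

```python
from itertools import combinations, product
from typing import NamedTuple


class Player(NamedTuple):
    name: str
    position: str
    salary: int
    projected_points: float


def best_lineup(players, slots, salary_cap):
    """Exhaustively search for the highest-scoring lineup under the salary cap.

    slots maps each position to the number of players required there.
    Returns (lineup, points), or (None, -inf) if no lineup fits the cap.
    """
    # Every way of filling each position's slots, one list per position.
    per_position = [
        list(combinations([p for p in players if p.position == pos], count))
        for pos, count in slots.items()
    ]
    best, best_points = None, float("-inf")
    for choice in product(*per_position):
        lineup = [p for group in choice for p in group]
        if sum(p.salary for p in lineup) > salary_cap:
            continue  # over budget, so this lineup is not allowed
        points = sum(p.projected_points for p in lineup)
        if points > best_points:
            best, best_points = lineup, points
    return best, best_points


# Made-up player pool for illustration only.
pool = [
    Player("QB-1", "QB", 8000, 22.0), Player("QB-2", "QB", 6000, 18.0),
    Player("RB-1", "RB", 9000, 20.0), Player("RB-2", "RB", 5000, 12.0),
    Player("RB-3", "RB", 4000, 10.5),
    Player("WR-1", "WR", 8500, 19.0), Player("WR-2", "WR", 6000, 14.0),
    Player("WR-3", "WR", 3500, 9.0),
]
lineup, points = best_lineup(pool, slots={"QB": 1, "RB": 2, "WR": 1}, salary_cap=25000)
print([p.name for p in lineup], points)
```

Note that the best lineup here skips the highest-scoring quarterback: the cheaper one frees up enough budget for stronger players elsewhere, which is exactly the kind of trade-off an optimiser is there to find.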
My framework
Here’s a simple framework you can follow to come up with great project ideas:
- List at least five things you’re interested in outside of work and the data science or machine learning field.
- For each thing, come up with questions you would like answered or that other people may find interesting.
- Think about how machine learning could help answer those questions. Don’t worry if the question seems impossible; be as creative as possible.
- Pick the one question that excites you the most. Ideally, choose something that feels just slightly out of your reach; that way, you’ll really learn and push yourself out of your comfort zone.
Building complexity and scale
To make this project stand out, we need to add some complexity and scale to it. This can mean different things, and there are many ways to do it.
If you’re aiming for a role as a machine learning engineer, it’s especially valuable to build and deploy the project end-to-end.
Your project should ideally include the following:
- Data collection and storage.
- Data preprocessing.
- Model training and evaluation.
- Model deployment (via API, web app, etc).
- Evaluation and presentation of your results.
To do this, you will need to learn some of the following:
It may seem like a lot, but you don’t have to do everything on this list.
The main thing is to start and learn these things along the way; don’t try to learn everything at once, because that’s procrastination.
Document and communicate
The final and arguably most important part is to document your learning.
Communication is one of the most important skills to have as a machine learning engineer or data scientist, especially as you move up the ranks.
Show your project by:
- Adding your projects to GitHub with a well-documented README.
- Including instructions for setup and usage so that users can explore and interact with your project.
- Writing a blog post explaining your project and how you built it.
- Sharing it on LinkedIn, Twitter, Reddit, Discord, YouTube, or wherever people who may be interested in trying it are.
The more you share your work, the more visible you become to potential employers and collaborators.
It’s actually not that hard to create a solid portfolio of projects; it just requires consistent work and patience, which most people are unwilling to put in.
There is no “quick” project that gets you hired; what will get you hired is taking the time to build something personal, of good quality, and novel.
That’s the secret.
Another thing!
I offer 1:1 coaching calls where we can chat about whatever you need, whether it’s projects, career advice, or just figuring out your next step. I’m here to help you move forward!
1:1 Mentoring Call with Egor Howell (topmate.io)