How to Kickstart Your Career as a Data Scientist
Answering frequently asked questions by junior data scientists and data scientists-to-be about first
In case you missed it, there’s a pandemic out there, and it forces all of us to shut down all public events. As time goes by, we all begin to understand the impact of the lockdowns, social distancing and absence of gatherings.
One of the things we realized, and by “we” I refer to the Algo group at Taboola, where I work, is the impact this has on those who are just beginning their career path or are about to shift it.
We used to host and attend many data science meetups and conferences, and noticed that many junior data scientists and data scientists-to-be used these gatherings to ask for and receive guidance and unofficial consulting regarding their career paths. Now that all of these are canceled, they have no one to reach out to.
And so, we came up with a new initiative, which we named Algo Boost (algoboost.me), to allow everyone to schedule a 30-minute, one-on-one Zoom session with us, and get the guidance they seek.
We were stunned by how much the data science community here in Israel, where we’re located, was in need of this. All of our volunteers were fully booked for an entire month within six hours of launch.
I personally have already spent a few hours answering questions and providing guidance where I could, and found that there are certain questions and anxieties that are common to most — to be honest, I had them too when I started my career.
Therefore I thought it would be a good idea to write them all down, including my personal thoughts, as I believe more people will find them useful at these times, in other places across the globe too.
I would still like to emphasize: this is my personal view of things, and nothing but my own advice.
The actual meaning of data scientist varies a lot. You might have noticed that when going over data science job descriptions — each company interprets what data science is in a different way. In some places it means you’re going to work on deep-learning models, in others the role involves mostly SQL and Excel.
Make sure you understand what the specific role you’re looking at actually is. And if your definition of data science is working on machine-learning and deep-learning models, then:
Your first job won’t be as a data scientist, and that’s OK. If there’s anything I wish someone had told me when I was just starting my career, it’s this.
A data scientist is a person who knows how to model a problem, how to analyze both data and results, and can implement the code that performs it — and obviously tune it. If it sounds like a lot of skills, that’s because it is.
This is also why it’s not a first job.
You become a data scientist after gaining experience in a previous role as either an analyst or a software developer, and then filling in the gaps of the other role as part of your first data scientist job.
So if you’re looking for your first job, start by being an analyst or a software developer (preferably big-data related) — whichever suits you best.
This is how most of us began.
For example, I started my career as a data engineer at Appsflyer, and this was certainly one of the best things that could have happened to me on my career path, and one of the things I’m proud of to this day. And so, just in case it wasn’t clear enough, allow me to emphasize:
Programming is part of the job, a major part of it. Implementing machine-learning models means you need to code them. And test them. And deploy them.
And fix bugs, and upgrade them — and we haven’t even touched the input data processing and feature design. Coding is what data scientists do most of the day — and not necessarily coding a state-of-the-art model.
Not everyone is into coding, and that’s perfectly fine, because while I’m probably stating the obvious, I would like to make this clear —
It’s OK not to be a data scientist. These days, it seems like there’s this big halo surrounding data science. This is the hottest trend, and people sometimes get the feeling that being a data scientist is the best career path you can have. That is absolutely false.
The best career path for you is the one that suits you best, because this is where you will thrive.
My first question to anyone who asks me how to become a data scientist is: describe your working day five years from now.
If you’re into analyzing, figuring out data and using statistics to uncover interesting insights, but the coding is something you’d prefer to avoid — then go be an analyst. If you want to talk to people, present ideas and make decisions based on data — then you should be a product manager.
This is not “letting yourself down” or “settling for second best” — these are meaningful, demanding and extremely challenging roles with a lot of impact, and if this is what you actually want to do — go do it. A title is just a title. But if it really is data science you’re after, here are some of my personal tips:
Focus on models which are relevant to the industry. There are a ton of different model-types and fields under machine-learning and deep-learning, but only some of them are really being used in today’s industry.
These are, mostly, image recognition, natural language processing (NLP) and recommendation systems.
And so, while reinforcement learning might be the coolest thing you’ve ever seen (and I couldn’t agree with you more), this isn’t what I would recommend you focus on when kickstarting your career.
Go to Kaggle, pick up challenges of the types I mentioned, and try to solve them — yourself.
By that I mean the following: using external libraries to do the hard technical stuff for you is how we actually do things in the industry, but try to implement a simple version of what they do yourself at least once.
For example, using NLTK for NLP stemming is great, but try to see if you can implement a basic version of it yourself.
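To give a feel for what such a "basic version" might look like, here is a minimal sketch of a suffix-stripping stemmer in plain Python. It is not how NLTK's stemmers work internally; the suffix list, the `simple_stem` name and the `min_stem_len` parameter are all assumptions made up for this illustration.

```python
# A deliberately naive suffix-stripping stemmer.
# The suffix list below is a made-up, illustrative sample, not a real rule set.
SUFFIXES = ("edly", "ing", "ed", "ly", "es", "s")


def simple_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the longest matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    # Try longer suffixes first so "edly" wins over "ly".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


for token in ["Jumping", "jumped", "cats", "quickly", "is"]:
    print(token, "->", simple_stem(token))
```

A version this simple gets plenty of words wrong (it turns "running" into "runn", for instance), and working out why is a good way to appreciate what the library versions handle for you.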
Andrew Ng’s machine-learning course on Coursera even has an exercise where you implement back-propagation from scratch.
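As a rough idea of what implementing back-propagation yourself involves, here is a small sketch in plain Python. It is not the course's exercise: the tiny network (two sigmoid hidden units, one sigmoid output, no biases, squared-error loss), the example weights and all function names are assumptions chosen for illustration. It computes the gradients with the chain rule and compares them against a numerical estimate, which is a common way to check a hand-written implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, w2):
    """Return hidden activations and the scalar output of a tiny two-layer net."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)))
    return hidden, output


def loss(x, y, w1, w2):
    """Squared error on a single example."""
    _, output = forward(x, w1, w2)
    return 0.5 * (output - y) ** 2


def backprop(x, y, w1, w2):
    """Gradients of the loss with respect to w1 and w2, via the chain rule."""
    hidden, output = forward(x, w1, w2)
    # Error signal at the output: dL/d(output) times the sigmoid derivative.
    delta_out = (output - y) * output * (1.0 - output)
    grad_w2 = [delta_out * h for h in hidden]
    # Push the error signal back through w2 and each hidden sigmoid.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w2, hidden)]
    grad_w1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_w1, grad_w2


def numerical_gradient(x, y, w1, w2, params, index, eps=1e-5):
    """Central-difference estimate of d(loss)/d(params[index]).

    params must be w2 itself or one row of w1, so the in-place change is seen by loss().
    """
    original = params[index]
    params[index] = original + eps
    loss_plus = loss(x, y, w1, w2)
    params[index] = original - eps
    loss_minus = loss(x, y, w1, w2)
    params[index] = original  # restore the weight
    return (loss_plus - loss_minus) / (2 * eps)


# Arbitrary example values, chosen only for the demonstration.
x, y = [1.0, 0.5], 1.0
w1 = [[0.1, -0.2], [0.4, 0.3]]
w2 = [0.5, -0.6]

grad_w1, grad_w2 = backprop(x, y, w1, w2)
gaps = [abs(grad_w2[i] - numerical_gradient(x, y, w1, w2, w2, i)) for i in range(len(w2))]
gaps += [
    abs(grad_w1[r][c] - numerical_gradient(x, y, w1, w2, w1[r], c))
    for r in range(len(w1))
    for c in range(len(w1[r]))
]
print("largest gap between backprop and numerical gradient:", max(gaps))
```

If the two sets of gradients disagree by more than a tiny amount, there is most likely a mistake in the chain-rule step, and finding it is exactly the kind of understanding this exercise is meant to build.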
These things will really make you understand how things work, and will certainly show in your job interviews. And if you’re not sure how you should tackle these challenges:
Read, and make sure you understand. One of the most important skills of a data scientist is the ability to go look for solutions by themselves.
Many of the challenges we face, we face for the first time. Knowing what the relevant sources are and being able to read academic papers & technical blogposts is a must for this job.
Practice this, and if there’s anything in the paper you don’t fully understand, go find the answers. And once you’ve figured out the answers,
Write blogposts, with code. This tip is one of my personal favorites, as this is one of the things I used to do when I started looking for my first job as a data scientist, and still do to this day.
The audience to which I write my blogposts is always the same — me, in six months from now, after I’ve forgotten everything that is written in that blogpost. So whenever I write a blogpost, I make sure to explain everything I do starting from the very basics, provide examples, and make sure not to leave any holes or open questions.
I found this to be the best way to make sure I really understand what I think I understand, following the quote often attributed to Einstein: if you can’t explain it simply, you don’t understand it well enough.
Adding code as examples only makes it better, as it forces you to turn theory into practice.
Do you really need a higher degree (M.Sc/Ph.D)? The answer to this question varies by the country you live in. I can tell you that here in Israel, the answer will be: probably yes, and there’s a reason why. As the name implies, data scientists are scientists, meaning we perform research.
That means that mastering fields which were unknown to you just a month ago, keeping up-to-date with academic papers and designing & performing experiments is the core of data science.
These are the exact same things people do when working on an academic thesis, which is why having completed one is a major advantage. That being said, there are data scientists holding only a bachelor’s degree.
To be honest, my first manager at Taboola had only a B.Sc, and was still considered one of the brightest in the group.
Do you need an academic degree in machine-learning? The short answer is no. The longer answer will be: no, but you’ll have to work harder to fill in the gaps.
Truth be told, you don’t even need a degree in computer science, but it comes with a cost. I never studied computer science — my academic background is physics, yet here I am.
But I realized I had holes in my data science knowledge and experience, and worked extra hard to fill those. If you too don’t come from an academic data science background, you’ll have to catch up — learn to code, learn to work with data, learn to model, learn to analyze, and really understand what you’re doing and why.
The cost of not having the formal background is that the path to your first data science job may be longer than other people’s, and will require more effort, but it’s absolutely possible.
Wondering what steps you should be taking to land your first job as a data scientist? Well, read from the top :). Good luck!