This is a guest post by Matt S., founder @ atadataco.com, an analytics engineering freelancer marketplace. He shares insights from 12 years of working in data, including 5 years as a hiring director at top tech companies in San Francisco such as Airbnb, and from his time as a coach at tryexponent.com.
Hiring managers are looking foremost for proof of basic Analytics Engineering skills in candidates applying for their first Data Scientist job. A portfolio with a data cleaning project and a data storytelling project will get you hired quicker than only machine learning or competition projects.
Stop Competing in Kaggle Competitions
Don't get me wrong, Kaggle is great. But put yourself in the shoes of the hiring manager. Competitions do not show that you can set up an environment or that you can handle any file format other than a CSV of cleaned data.
Do one or two competitions, then focus on getting projects that are as close to the daily tasks of an analytics engineer as possible. I say this because your primary goal is to have 2-3 projects on your resume that demonstrate that you can tackle a problem from start to finish. Completing a 'full-stack' data project will always impress me more than demonstrating your ability to apply a model. I will give my opinion on the single best place to get this experience, but first, let's talk about what is needed for a good resume.
Show You Can Clean Data First
Hireability (or a good resume) comes down to proof of experience. I don't need to see how you tried 10 over-trained models to handle one training dataset. Think about how this works in production.
When would you ever expect to build a model that isn't retrained or used beyond a single run? This single-run model development is more like a proof-of-concept (POC) project than anything else, and it is prone to producing overfitted models that would never survive on real data.
Everywhere I look, everyone says data cleaning is the biggest part of data science, so I know I'm not overstating it here. Ask someone farther along in the industry than you:
"What are the core Analytics Engineering skills I would need in my first 90 days at a startup?"
Take the advice of experienced people and lay out the skills you would need for a project. For example, if I were trying to land a job at a SaaS B2C company, I would expect the company to need a particular set of skills. Here are some example projects:
Creating subscription or Life-Time-Value (LTV) analysis in cohort form
This one will be at least semi-relevant to all businesses, even those operating outside of the subscription model. Life-Time-Value measures the total revenue (or net income) a company expects to earn from a given customer segment.
LTV calculations are helpful when:
- Budgeting for sales and marketing
- Revenue forecasting
- Investing in customer retention strategies
This metric, along with Customer Acquisition Cost (CAC), is absolutely critical for businesses developing sustainable growth strategies. Why? Because gaining new customers is expensive. If LTV isn't significantly higher than CAC, then any growth will not be profitable and the company is in trouble.
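As a rough illustration of the LTV-versus-CAC comparison, here is a minimal sketch. The formula (margin-adjusted monthly revenue divided by monthly churn) is one common simplification rather than the only definition of LTV, and every number and function name below is hypothetical.

```python
def simple_ltv(avg_monthly_revenue: float, gross_margin: float, monthly_churn: float) -> float:
    """Estimate LTV as margin-adjusted monthly revenue times expected lifetime.

    Expected customer lifetime in months is approximated as 1 / monthly_churn.
    This is a simplifying assumption, not a universal definition.
    """
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in the interval (0, 1]")
    return avg_monthly_revenue * gross_margin / monthly_churn


def ltv_to_cac(ltv: float, cac: float) -> float:
    """Return the LTV/CAC ratio; higher means acquisition pays back more."""
    if cac <= 0:
        raise ValueError("cac must be positive")
    return ltv / cac


# Hypothetical inputs for one customer segment.
ltv = simple_ltv(avg_monthly_revenue=50.0, gross_margin=0.8, monthly_churn=0.05)
ratio = ltv_to_cac(ltv, cac=250.0)
print(f"LTV: {ltv:.2f}  LTV/CAC: {ratio:.2f}")
```

A ratio close to 1 would mean each customer roughly pays back only what it cost to acquire them, which is the unprofitable-growth situation described above.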
Companies with complex product portfolios and user behavior patterns might not get much out of a single LTV calculation. It’s much better to first group users who display similar behavior, and then run your LTV calculations. A popular way to do this is cohort analysis.
Here are a few resources to get you started:
- Estimating Customer Lifetime Value Via Cohort Retention, Part I and Part II: A two-part piece covering both theory and application using Python with Pandas and Numpy.
- Cohort Analysis Explained with an Excel Example: Exactly what it claims - a clean example of Cohort Analysis done in Excel.
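To make the cohort idea above concrete, here is a small sketch that groups customers by the month of their first payment and tracks cumulative revenue per customer as each cohort ages. It uses only the Python standard library so it runs anywhere; in practice you would more likely reach for Pandas. The payment records and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical payment records: (customer_id, "YYYY-MM" of payment, amount).
payments = [
    ("a", "2023-01", 10.0), ("a", "2023-02", 10.0), ("a", "2023-03", 10.0),
    ("b", "2023-01", 10.0), ("b", "2023-02", 10.0),
    ("c", "2023-02", 20.0), ("c", "2023-03", 20.0),
    ("d", "2023-02", 20.0),
]


def month_index(ym: str) -> int:
    """Convert 'YYYY-MM' to a running month count so months can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month)


def cohort_ltv(records):
    """Return {cohort_month: [cumulative revenue per customer at age 0, 1, ...]}."""
    # A customer's cohort is the month of their first payment.
    # Zero-padded 'YYYY-MM' strings sort chronologically, so '<' is safe here.
    first_month = {}
    for cust, ym, _ in records:
        if cust not in first_month or ym < first_month[cust]:
            first_month[cust] = ym

    # Count customers per cohort.
    cohort_size = defaultdict(int)
    for ym in first_month.values():
        cohort_size[ym] += 1

    # Sum revenue by (cohort, months since first payment).
    revenue = defaultdict(lambda: defaultdict(float))
    for cust, ym, amount in records:
        cohort = first_month[cust]
        age = month_index(ym) - month_index(cohort)
        revenue[cohort][age] += amount

    # Turn per-age revenue into a cumulative per-customer curve.
    result = {}
    for cohort, by_age in sorted(revenue.items()):
        running, curve = 0.0, []
        for age in range(max(by_age) + 1):
            running += by_age.get(age, 0.0)
            curve.append(running / cohort_size[cohort])
        result[cohort] = curve
    return result


ltv_by_cohort = cohort_ltv(payments)
for cohort, curve in ltv_by_cohort.items():
    print(cohort, [round(v, 2) for v in curve])
```

Each list is one cohort's observed revenue-per-customer curve. Comparing how quickly the curves rise across cohorts is the starting point for the kind of LTV estimate the resources above walk through.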
Cleaning financial trading data and presenting industry averages
If you’re looking to join a fintech company (or anything finance-adjacent), trading data offers endless opportunities for data cleaning and storytelling projects. For example, this project pulls together years of financial filings submitted by S&P 500 companies and allows for both aggregate and company-level queries on average revenue, cost, and important financial ratios like P/E.
Users can also build their own plots to compare industry segments and companies. This project stands out as it showcases competency in a variety of technologies including SQL, Python, Flask, and Kubernetes while serving up genuinely useful and relevant insights.
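If you want a feel for the cleaning step behind industry averages like these, the sketch below may help. It is not code from the project described above: the rows, field names, and helper functions are all made up for illustration, and it uses only the Python standard library. It normalizes inconsistent industry labels, parses numbers stored as messy strings, skips missing values, and then averages by industry.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw rows, roughly as they might arrive from scraped filings.
raw_rows = [
    {"ticker": "AAA", "industry": " Software ", "revenue": "1,200.5", "pe": "25.0"},
    {"ticker": "BBB", "industry": "software", "revenue": "800", "pe": "N/A"},
    {"ticker": "CCC", "industry": "Retail", "revenue": "", "pe": "12.5"},
    {"ticker": "DDD", "industry": "RETAIL", "revenue": "3,000", "pe": "17.5"},
]


def to_float(text: str):
    """Parse a number that may contain thousands separators; None if unusable."""
    cleaned = text.replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        # Covers empty strings and placeholders such as "N/A".
        return None


def industry_averages(rows, field: str) -> dict:
    """Average `field` per industry, ignoring rows where the value is missing."""
    grouped = defaultdict(list)
    for row in rows:
        value = to_float(row[field])
        if value is not None:
            # Normalize labels so " Software " and "software" land together.
            grouped[row["industry"].strip().lower()].append(value)
    return {industry: mean(values) for industry, values in grouped.items()}


avg_revenue = industry_averages(raw_rows, "revenue")
avg_pe = industry_averages(raw_rows, "pe")
print(avg_revenue)
print(avg_pe)
```

Dropping missing values silently is a design choice made here for brevity. In a portfolio project, it is worth reporting how many rows were excluded and why, since that is exactly the judgment a hiring manager wants to see.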
When I, as a hiring manager, look at the resume of the analytics engineer that built the Economic Industry Dashboard project and see someone is just starting in the field, I do not consider this a red flag against them. I immediately look at their school projects, internships, or alternate learning opportunities to see if they would be a good fit. I will always suggest a minimum of two projects for the candidate applying for their first analytics engineering job:
- a data cleaning project, and
- a data storytelling project.
Your Best Resume Option: Build a Portfolio with Consulting Projects
I've tried so many types of projects and presentations of project summaries on resumes over the years. My experience has taught me that the best experience can be found in consulting jobs. They are often already divided into that perfect 1-6 week time frame. Consulting projects seek to solve an actual business problem with actual business mess, which helps a lot when writing about it on a resume. Additionally, they are partially constrained by the fact that the project manager had to spend some time thinking about the definition and steps of the project.
My advice is to try to find 2-3 consulting projects or externships while looking and applying for jobs. As you finish up each stage of a project, stop to write down a short summary. Think about what skills a hiring manager would be looking for and craft a couple of sentences that show you understood the business problem and knew how to address it technically. By the time you have a couple of these summaries, you will have a complete project section for your resume and your new job.