
We all need a tape measure

By Santiago • Issue #5 • View online
This is the 5th issue, and I’m getting a nice rhythm going on here!
A lot of you have reached out with questions, ideas, and general feedback. I appreciate your comments, and I’m happy these short stories are helpful!
I’ll keep iterating. Trying to find my voice and hit on the message and value that I want you to get from this newsletter. Please keep the feedback coming, and let’s get this thing going!

The story behind a baseline
Finding whether your machine learning model is providing any value is not that simple. Yeah, of course, the loss is going down, and accuracy is through the roof, but that’s not enough.
Is this thing actually any good?
Yes, I’m one of those who have bragged about a model that did worse than a pair of nested if-else conditions. Focus too much on the trees 🌳, and you’ll certainly miss the forest.
Let’s get that fixed.
We all need a target. A finish line. A place to get to.
How good am I at doing this thing?
I’ve been building traditional software my entire life, and there’s something nice about it: it either works, or it doesn’t.
When you are building a machine learning model, things change quite a bit. Models make predictions, and understanding their quality requires a little bit more setup.
So I have an entire process before I start writing code. I like to start by measuring how good I am at solving the task. Manually. Like an animal 🐘 😎.
This is called a human baseline, and ideally, my model will have super-human abilities and kick my butt at some point. In the meantime, this baseline represents my North Star ✨.
An interesting outcome of trying to come up with a human baseline is finding out when I can’t do the task at all! Sometimes, there’s not enough information in the data for me to solve the problem. Finding this out early on saves a ton of pain and money trying to build a model to solve an impossible task.
To get the human baseline, I annotate the data twice (ideally between cups of coffee ☕️). I treat one set of annotations as my ground truth and the other set as my predictions. Then I can quickly compute any metric I want as the baseline.
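Here is what that looks like in practice. This is a minimal sketch, assuming a classification task; the two annotation passes below are made-up examples, and I’m using plain accuracy as the metric, but you can swap in whatever metric fits your problem.

```python
def accuracy(truth, preds):
    """Fraction of predictions that match the ground truth."""
    matches = sum(t == p for t, p in zip(truth, preds))
    return matches / len(truth)

# First pass over the data: treat these labels as ground truth.
first_pass = ["cat", "dog", "dog", "cat", "bird", "dog"]

# Second pass (a few coffees later): treat these as predictions.
second_pass = ["cat", "dog", "cat", "cat", "bird", "dog"]

# How often do I agree with myself? That's my human baseline.
human_baseline = accuracy(first_pass, second_pass)
print(f"Human baseline accuracy: {human_baseline:.2f}")
```

If you disagree with yourself on a big chunk of the data, that’s your early warning that there may not be enough signal in the data for a model to do much better.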
If you have a friend willing to help, your baseline will be even better. Yours can be the ground truth, and your friend’s the predictions. But of course, you will have to convince somebody for a long, boring night of looking at data, so good luck with that!
Now it’s time to build something dumb.
Now that I know how good I am at solving the problem, I try to develop a data-independent baseline. This is easy to do but very important. You’ll be surprised at how often a dumb baseline is enough to give your model a run for its money!
This baseline should be independent of the data, so I usually run a couple of experiments and pick the one with the best performance:
  • For each example, return a random 🎲 prediction.
  • For each example, return the same prediction.
I can hear you screaming at me right now, but trust me: this baseline is a great measuring stick for your model. Beating this baseline will become your first goal.
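Both experiments fit in a few lines of code. This is a sketch under the same assumptions as before (a classification task with made-up labels): one baseline predicts at random, the other always predicts the most common class, and the better of the two becomes the measuring stick.

```python
import random
from collections import Counter

labels = ["spam", "ham", "ham", "ham", "spam", "ham", "ham", "ham"]
classes = sorted(set(labels))

def accuracy(truth, preds):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(truth, preds)) / len(truth)

# Experiment 1: for each example, return a random prediction.
random.seed(42)  # fixed seed so the experiment is repeatable
random_preds = [random.choice(classes) for _ in labels]

# Experiment 2: for each example, return the same prediction
# (the most common class in the data).
majority_class = Counter(labels).most_common(1)[0][0]
constant_preds = [majority_class for _ in labels]

# The best of the two dumb experiments is the baseline to beat.
dumb_baseline = max(accuracy(labels, random_preds),
                    accuracy(labels, constant_preds))
print(f"Dumb baseline accuracy: {dumb_baseline:.2f}")
```

Notice that on imbalanced data the constant prediction can score surprisingly high, which is exactly why a model that doesn’t beat it isn’t providing any value yet.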
A start, a finish line, and a process to bridge the gap
This is the time to take a bathroom break 🚽, get back to the computer, and start putting together your first actual model.
You already have a good idea of where you are going: the human baseline. Your starting point is the dumb, data-independent baseline. We need to follow a process to connect the dots!
I start as simple as I can. This is not the time to throw everything I know at the problem. Instead, I focus on building something as simple as possible that beats my dumb baseline.
I start small, explore the possibilities, then exploit the ideas I think will pay off.
From here on out, it’s a matter of iterating until I’m happy with the final model. Every time I beat the lower-end baseline, that result becomes my next target.
This is fun. Sometimes it’s even addictive.
Too long; didn’t read.
You can’t make progress unless you know where you stand and what your target is. Building baseline models before you start putting in the hard work will give you the visibility you need.
Or you can also read a summary on Twitter. Fewer words, same message.
The boring stuff
This week I ran a promotion on Twitter: $5 for my “How to get started with Machine Learning” course. That’s less than what Starbucks charges for coffee, and you can even get a refund for my course if you don’t like it! (You cannot return your coffee.)
The promotion is no longer open on Twitter, but if you are a subscriber to this newsletter, you can get it here.
Every week, one story that tries really hard not to be boring and teach you something new about machine learning.

Underfitted is for people looking for a bit less theory and a bit more practicality. There's enough mathematical complexity out there already, and you won't find any here.

Come for a journey as I navigate a space that's becoming more popular than Apollo 13, Area 51, and the lousy sequel of Star Wars combined.
