How good am I at doing this thing?
I’ve been building traditional software my entire life, and there’s something nice about it: it either works, or it doesn’t.
When you are building a machine learning model, things change quite a bit. Models make predictions, and understanding their quality requires a little bit more setup.
So I have an entire process before I start writing code. I like to start by measuring how good I am at solving the task. Manually. Like an animal 🐘 😎.
This is called a human baseline, and ideally, my model will have super-human abilities and kick my butt at some point. In the meantime, this baseline represents my North Star ✨.
An interesting outcome of trying to come up with a human baseline is finding out when I can’t do the task at all! Sometimes, there’s not enough information in the data for me to solve the problem. Finding this out early on saves a ton of pain and money trying to build a model to solve an impossible task.
To get the human baseline, I annotate the data twice (ideally between cups of coffee ☕️). I treat one set of annotations as my ground truth and the other set as my predictions. Then I can quickly compute any metrics I want as the baseline.
If you have a friend willing to help, your baseline will be even better. Yours can be the ground truth, and your friend’s the predictions. But of course, you will have to convince somebody to spend a long, boring night looking at data, so good luck with that!
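Here is a minimal sketch of what that comparison could look like, assuming a classification task and plain accuracy as the metric. The labels and the `accuracy` helper are made up for illustration; swap in whatever metric fits your problem.

```python
def accuracy(truth, predictions):
    """Fraction of examples where the two annotation passes agree."""
    if len(truth) != len(predictions):
        raise ValueError("both annotation passes must cover the same examples")
    if not truth:
        raise ValueError("need at least one annotated example")
    matches = sum(t == p for t, p in zip(truth, predictions))
    return matches / len(truth)


# Hypothetical labels, invented for this example only.
first_pass = ["cat", "dog", "dog", "cat", "dog", "cat", "cat", "dog"]   # treated as ground truth
second_pass = ["cat", "dog", "cat", "cat", "dog", "cat", "dog", "dog"]  # treated as predictions

human_baseline = accuracy(first_pass, second_pass)
print(f"Human baseline accuracy: {human_baseline:.2f}")
```

If the two passes disagree a lot, that is already a hint the task may be harder (or more ambiguous) than it looks.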
Now it’s time to build something dumb.
Now that I know how good I am at solving the problem, I try to develop a data-independent baseline. This is easy to do but very important. You’ll be surprised at how often a dumb baseline is enough to give you a run for your money!
Since this baseline shouldn’t depend on the data, I usually run a couple of experiments and pick the one with the best performance:
- For each example, return a random 🎲 prediction.
- For each example, return the same prediction.
I can hear you screaming at me right now, but trust me: this baseline is a great measuring stick for your model. Beating this baseline will become your first goal.
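The two experiments above can be sketched in a few lines. This is only an illustration under some assumptions: a classification task, accuracy as the metric, a known label set (`classes`), and invented labels. The helper names are hypothetical.

```python
import random


def accuracy(truth, predictions):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(truth, predictions)) / len(truth)


def random_baseline(classes, n, seed=0):
    """Return a random class for each of the n examples (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.choice(classes) for _ in range(n)]


def constant_baseline(label, n):
    """Return the same class for each of the n examples."""
    return [label] * n


# Hypothetical ground truth, invented for this example only.
truth = ["spam", "ham", "ham", "ham", "spam", "ham", "ham", "ham"]
classes = ["ham", "spam"]  # assumed to be known up front

# Score the random baseline and one constant baseline per class.
scores = {"random": accuracy(truth, random_baseline(classes, len(truth)))}
for label in classes:
    scores[f"always {label}"] = accuracy(truth, constant_baseline(label, len(truth)))

# Keep whichever dumb baseline performs best: that is the number to beat.
best_name = max(scores, key=scores.get)
print(f"Best dumb baseline: {best_name} ({scores[best_name]:.2f})")
```

On imbalanced data, the "always the same prediction" baseline can score surprisingly high, which is exactly why it makes a useful measuring stick.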
A start, a finish line, and a process to bridge the gap
This is the time to take a bathroom break 🚽, get back to the computer, and start putting together your first actual model.
You already have a good idea of where you are going: the human baseline. Your starting point is the dumb, data-independent baseline. We need to follow a process to connect the dots!
I start as simple as I can. This is not the time to throw everything I know at the problem. Instead, I focus on building something as simple as possible that beats my dumb baseline.
I start small, explore the possibilities, then exploit the ideas I think will pay off.
From here on out, it’s a matter of iterating until I’m happy with the final model. Every time I beat the lower-end baseline, that new result becomes my next target to beat.
This is fun. Sometimes it’s even addictive.
Too long; didn’t read.
You can’t make progress unless you know where you stand and what your target is. Building baseline models before you start putting in the hard work will give you the visibility you need.