
Christmas in March

By Santiago • Issue #2
This caught me off guard.
I’m beyond words with the reception you gave to the first issue, Autoencoders and rotten bananas. The emails, the feedback, the tweets… I wasn’t expecting that much.
Thank you for that. I’m humbled, proud, and ready to punish you with another story 😜.
Let’s get into it!

A model that smokes everyone else
I didn’t pay too much attention when I heard about CLIP, a new neural network that learns visual concepts from natural language supervision. At some point, however, my social media feed was all about it. I mostly have to blame this guy. I had to look into it 👀 .
It immediately felt like Christmas in March (CLIP was released in January, but I was late to the party). Here you had this new technique kicking everyone's butt with a zero-shot approach!
Let’s try to unpack this a little bit.
What in the world is “zero-shot”?
If your model can predict classes that you didn’t see during training, you have a zero-shot-capable model.
For example, you might have heard of ImageNet, a dataset of more than 14 million images organized into more than 22,000 categories. CLIP can correctly classify those images with 76.2% accuracy without ever training on that dataset or its classes.
A specific example, straight from Wikipedia:
(…), given a set of images of animals to be classified, along with auxiliary textual descriptions of what animals look like, an AI which has been trained to recognize horses, but has never seen a zebra, can still recognize a zebra if it also knows that zebras look like striped horses.
Mic-drop, mind-blown moment. Take a minute and try to appreciate this.
Think about this: zero-shot capabilities indicate that the model is learning to relate visual concepts to categories at a much deeper level than what we have seen.
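To make the idea concrete, here's a minimal sketch of how CLIP-style zero-shot classification works: embed the image, embed a text description of each candidate label, and pick the label whose embedding is closest. The embeddings below are made up for illustration; in the real model, an image encoder and a text encoder trained on image-text pairs would produce them.

```python
# A toy sketch of zero-shot classification, CLIP-style.
# The embeddings are hand-picked stand-ins, not real CLIP outputs.
import numpy as np

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def zero_shot_classify(image_embedding, label_embeddings, labels):
    # Pick the label whose text embedding is closest to the image embedding.
    scores = [cosine_similarity(image_embedding, e) for e in label_embeddings]
    return labels[int(np.argmax(scores))]

# Labels the model was never explicitly trained to predict.
labels = ["a zebra", "a horse", "a banana"]
label_embeddings = [np.array([1.0, 0.9, 0.1]),
                    np.array([1.0, 0.1, 0.1]),
                    np.array([0.0, 0.1, 1.0])]
image_embedding = np.array([0.9, 0.8, 0.0])   # an unseen zebra photo, say

print(zero_shot_classify(image_embedding, label_embeddings, labels))  # → a zebra
```

Because the classes are just text, you can swap in new labels at inference time without retraining anything. That's what makes the zero-shot part possible.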
This changes the game, and here is why
There are three key advantages of CLIP over existing supervised techniques:
  1. Putting together a good dataset and labeling it is a pain in the rear end. We don’t need this with CLIP, and I can’t describe my happiness because of it.
  2. Even if we collect and label a good dataset, existing models don’t generalize very well outside of that. That’s not the case with CLIP, which we can use for all sorts of tasks unrelated to a specific dataset.
  3. And the cherry 🍒 on top is that CLIP’s real-world performance is consistent with its performance in vision benchmarks. Just in case you didn’t know, most of the current deep learning models do much better with toy problems than out in the wild. This sucks, but CLIP takes care of it.
This is a big deal! Not in the “oh-wow-we-just-discovered-something-that-will-be-useful-someday” sense, but more in the “holy-crap-we-can-use-this-now-and-it’s-awesome” way.
Well, this is all fine and dandy. Now what?
As they did with GPT-3, OpenAI didn't publish the full model, just a smaller version. They also warned against using it in production, citing the need for more specific tests and the potential for bias in the model.
"A cityscape in the style of Van Gogh" using CLIP — @advadnoun.
This hasn’t stopped a lot of cool experiments with CLIP. Vladimir put together a notebook to do image searches on the Unsplash dataset. And here is another notebook for generating images using CLIP and BigGAN courtesy of Ryan Murdock.
But of course, cool samples aren’t real applications. I’m sure we’ll start seeing them pop up in the coming weeks and months. Even the smaller CLIP version is powerful enough to be valuable, and I can’t wait to see what people do with it.
In the meantime, I took the model, containerized it, and put a RESTful API around it so you can deploy it and use it anywhere. You give it online images, and it selects the label that best represents each one of them. Even this smaller version is pretty impressive!
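If you're curious what that kind of wrapper looks like, here's a hypothetical sketch of a tiny JSON-over-HTTP classification service using only the standard library. The `classify()` stub, the labels, and the request shape are all made up for illustration; they're not the actual API I built, and the real version would call CLIP inside `classify()`.

```python
# Hypothetical sketch: a minimal REST-style wrapper around a classifier.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative labels; a real service might accept them in the request.
LABELS = ["a photo of a cat", "a photo of a dog", "a photo of a banana"]

def classify(image_url: str) -> str:
    """Stub standing in for CLIP: the real model would download the image,
    embed it, and return the label whose text embedding is most similar."""
    return LABELS[0]

class ClassifierHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, classify the image, and reply with the label.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"label": classify(payload["image_url"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def start_server(port: int = 0) -> HTTPServer:
    """Start the service on a background thread and return the server."""
    server = HTTPServer(("127.0.0.1", port), ClassifierHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

POST a body like `{"image_url": "..."}` and you get back the winning label. Once the model lives behind an endpoint like this, it's trivial to drop into a container and deploy anywhere.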
Too long; didn’t read.
CLIP is new. CLIP is awesome. CLIP is mind-blowing stuff.
Hordes of state-of-the-art computer vision models now seem arcane thanks to CLIP. I can't wait to see where this goes and what people build with it.
The future of computer vision is bright.
I've been up to something new
For the last three years, I've been working with Amazon SageMaker. That's where I go to train, experiment with, and deploy production-ready machine learning systems. It's a very comprehensive platform, but it has many moving parts and can become somewhat overwhelming for anyone just starting out.
MLOps architecture of a machine learning system deployed in SageMaker.
For the last few weeks, I’ve been working on a course to help with this. I’m trying to cover as many details as possible, and I’m confident it will be the most comprehensive course on SageMaker out there.
I’m not rushing it. It’s still a few weeks out, but I’m already proud of the way it’s coming together. I’ll keep you posted on my progress, but if this is something that interests you, reply to this email and let me know. I’d love to talk to you and understand how a course like this could help with your work.
Closing comments
Yeah, sponsoring my own newsletter sounds funny 🙃, but I still want to give you 30% off my course “How To Get Started With Machine Learning.” I’ll probably make this a permanent thing.
And before I say goodbye for today—if you have nothing better to do—feel free to reply and let me know what you like and dislike about the newsletter.
I’m still trying to find my voice and a format that resonates with you, so hearing what you think is the most important feature of this dataset.
Love ya, and see you in the next one!

Every week, I’ll teach you something new about machine learning.

Underfitted is for people looking for a bit less theory and a bit more practicality. There's enough mathematical complexity out there already, and you won't find any here.

Come along for the journey as I navigate a space that's becoming more popular than Apollo 13, Area 51, and the lousy Star Wars sequel combined.

Powered by Revue