AI/ML

Notes and experiments on AI/ML and design.

InstructGPT

New alignment paper from OpenAI. A few fascinating things in it:

Prompt engineering, one of the most interesting approaches to getting good results from GPT-3, is apparently no longer needed for many tasks with InstructGPT models. Before, you would need to “prime” GPT-3 with a few question/answer examples so it knew what you were getting at. Now it seems just the question will do.
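For instance (my own illustrative prompts, not examples from the paper), the difference looks something like this:

```python
# Before: GPT-3 usually needed a few "primer" examples in the prompt
# so it could infer what task was being asked of it.
few_shot_prompt = """\
Q: What is the capital of France?
A: Paris

Q: What is the capital of Japan?
A: Tokyo

Q: What is the capital of Kenya?
A:"""

# With InstructGPT, the bare question or instruction tends to be enough.
instruct_prompt = "What is the capital of Kenya?"
```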

The 1.3B-parameter InstructGPT model is over 100x smaller (!) than the 175B-parameter GPT-3, yet human labellers rate its outputs higher on these tasks. It’s somehow nice that the team that kicked off the current GIANT MODELS arms race is also taking steps in a different direction.

Thirdly – this is an academic paper showing good results in aligning language models to certain human requirements. And that’s great! But if you log in to your OpenAI account… the model is right there, launched, and is actually the default model for all requests. OpenAI is very much a product company now, not just a research company.

(Original Twitter thread here).

Weak Labelling

Clear and accessible overview by Raza Habib of a technique that sounds almost too good to be true: generating labelled data from unlabelled data.

When I first heard about weak labelling a little over a year ago, I was initially sceptical. The premise of weak labelling is that you can replace manually annotated data with data created from heuristic rules written by domain experts. This didn’t make any sense to me. If you can create a really good rules-based system, then why wouldn’t you just use that system?!

Over the past year, I’ve had my mind completely changed.

The technique works by:

  1. an expert user writing a number of simple classification rules that seem broadly reasonable but don’t have to be 100% correct (e.g. one rule might be “if a headline is in all caps, it is clickbait”, and another might be “if a headline starts with a number, it is clickbait” – both sketched as code after this list)
  2. then training a model to see how different rules agree/disagree/interact with each other when they are applied to unlabelled samples.
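To make step 1 concrete, here’s roughly what those two clickbait rules look like written as labelling functions. A minimal sketch, assuming the open-source Snorkel library (a popular implementation of weak labelling, though the post isn’t tied to it); the function and label names are mine:

```python
from snorkel.labeling import labeling_function

# Label conventions: Snorkel uses -1 to mean "this rule abstains".
CLICKBAIT, NOT_CLICKBAIT, ABSTAIN = 1, 0, -1

@labeling_function()
def lf_all_caps(x):
    # Rule: a headline written entirely in capitals is clickbait.
    return CLICKBAIT if x.headline.isupper() else ABSTAIN

@labeling_function()
def lf_starts_with_number(x):
    # Rule: a headline starting with a number ("7 ways to...") is clickbait.
    return CLICKBAIT if x.headline[:1].isdigit() else ABSTAIN
```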

Then you just throw huge amounts of unlabelled data at this process, and it seems to produce a classifier that is almost as good as one trained on a carefully curated dataset, with no manual labelling required.
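Continuing the sketch for step 2: apply every rule to a big pile of unlabelled headlines, then fit Snorkel’s LabelModel, which estimates each rule’s accuracy from where the rules agree and disagree, and emits denoised probabilistic labels (the tiny placeholder dataset here stands in for that big pile):

```python
import pandas as pd
from snorkel.labeling import PandasLFApplier
from snorkel.labeling.model import LabelModel

# A large *unlabelled* dataset: headlines only, no human annotations.
df_train = pd.DataFrame({"headline": [
    "7 THINGS YOU WON'T BELIEVE ABOUT CATS",
    "Bank of England holds interest rates at 0.1%",
    "3 tricks advertisers don't want you to know",
]})

# Apply every labelling function to every sample, producing an
# (n_samples x n_rules) matrix of votes, with -1 where a rule abstained.
applier = PandasLFApplier(lfs=[lf_all_caps, lf_starts_with_number])
L_train = applier.apply(df=df_train)

# The label model learns how accurate each rule is from the pattern of
# agreements/disagreements, then outputs probabilistic labels.
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train=L_train, n_epochs=500, seed=42)
probs = label_model.predict_proba(L=L_train)
```

Those probabilistic labels then stand in for human annotations when training whatever downstream classifier you like.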

Given that access to sufficient high-quality labelled data is such a bottleneck when training reliable models, this looks like a huge deal.