In-context learning by LLMs
Large language models such as OpenAI’s GPT-3 can learn to accomplish a new task after seeing only a few examples, a curious phenomenon known as in-context learning. Scientists from MIT, Google Research, and Stanford University are working to unravel how these models can learn without updating their parameters. They have found that these massive neural network models can contain smaller, simpler linear models buried inside them. The large model can then implement a simple learning algorithm to train this smaller linear model on a new task, using only information already contained within the larger model and without any retraining. With a better understanding of in-context learning, researchers could enable models to complete new tasks more efficiently.
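
To make the idea of training a small linear model from the examples in a prompt more concrete, here is a minimal toy sketch in plain Python. It is not the researchers' code and it does not involve a real transformer. The function names, the example data, and the choice of gradient descent on squared error as the "simple learning algorithm" are all assumptions made for illustration. What it does show is that the fitted weights are temporary, rebuilt from each prompt's examples, and that nothing persistent is updated.

```python
# Toy illustration only: a hypothetical stand-in for the "smaller linear model".
# Nothing here is taken from the researchers' implementation.

def in_context_linear_fit(examples, steps=500, lr=0.1):
    """Fit a linear map from inputs to outputs using only the prompt examples.

    Assumption: the simple learning algorithm is plain gradient descent
    on mean squared error.
    """
    dim = len(examples[0][0])
    n = len(examples)
    w = [0.0] * dim  # temporary inner weights, rebuilt for every prompt
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in examples:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(dim):
                grad[j] += 2.0 * err * x[j] / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def predict(w, x):
    """Apply the fitted linear model to a new query input."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical "prompt": four (input, output) pairs generated by y = 3*x1 - 2*x2.
prompt_examples = [
    ([1.0, 0.0], 3.0),
    ([0.0, 1.0], -2.0),
    ([1.0, 1.0], 1.0),
    ([2.0, 1.0], 4.0),
]
query = [1.0, 2.0]

w = in_context_linear_fit(prompt_examples)
prediction = predict(w, query)
print(w, prediction)
```

Because the four examples are exactly consistent with the rule y = 3*x1 - 2*x2, the recovered weights should land very close to 3 and -2, and the prediction for the query close to -1. Passing a different set of examples fits a different linear model from scratch, which loosely mirrors how a new prompt defines a new task without any retraining.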