IBL News | New York
The big data analytics firm Databricks last week open-sourced a new AI model called Dolly, along with all of its training code and instructions on how to recreate it.
“Dolly is a cheap-to-build LLM (large language model) that exhibits a surprising degree of the instruction following capabilities exhibited by ChatGPT,” the company announced in a blog post.
The model underlying Dolly has only 6 billion parameters, compared to 175 billion in GPT-3. It is also two years old, “making it particularly surprising that it works so well.”
In February 2023, Meta released the weights for a set of high-quality language models called LLaMA for academic researchers.
In March 2023, Stanford University built the Alpaca model, which was based on LLaMA, but tuned on a small dataset of 50,000 human-like questions and answers.
Databricks evaluated Dolly on the instruction-following capabilities described in the InstructGPT paper on which ChatGPT is based.
Dolly — named after Dolly the sheep, the first cloned mammal — is an open-source clone of an Alpaca, inspired by a LLaMA.
Instead of creating its own model from scratch or using LLaMA, Databricks took a much older open-source LLM called GPT-J, which EleutherAI had released two years earlier.
GPT-J was the foundation on which Dolly was built.
Databricks was able to take the EleutherAI model and make it “highly approachable” simply by training it on a small dataset of 50,000 question-and-answer examples in less than three hours using a single machine.
“This shows that the magic of instruction following does not lie in training models on gigantic datasets using massive hardware,” Databricks explained.
“Rather, the magic lies in showing these powerful open-source models specific examples of how to talk to humans, something anybody can do for a hundred dollars using this small 50,000 dataset of Q&A examples.”
According to the company, Dolly “exhibits many of the same qualitative capabilities” as ChatGPT, “including text generation, brainstorming, and open Q&A.”
“We believe models like Dolly will help democratize LLMs, transforming them from something very few companies can afford into a commodity every company can own and customize to improve their products,” Databricks said.
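For readers curious what “showing these models specific examples” can look like in practice, the sketch below turns question-and-answer records into the plain training text a model such as GPT-J could be fine-tuned on. It is a minimal illustration under stated assumptions, not Databricks' training code: the `Example` class, the function names, and the prompt wording (modeled on the publicly documented Alpaca-style template) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumed prompt wording, modeled on the Alpaca-style template.
# The exact text Databricks used may differ.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


@dataclass
class Example:
    """One instruction/response pair (hypothetical record layout)."""
    instruction: str
    response: str


def to_training_text(example: Example, eos: str = "<|endoftext|>") -> str:
    """Join prompt and answer into a single string for causal-LM fine-tuning.

    The end-of-sequence marker is an assumption; it should match whatever
    token the chosen tokenizer actually uses.
    """
    prompt = PROMPT_TEMPLATE.format(instruction=example.instruction.strip())
    return prompt + example.response.strip() + eos


def build_corpus(examples: Iterable[Example]) -> List[str]:
    """Format every record, skipping ones with an empty question or answer."""
    return [
        to_training_text(e)
        for e in examples
        if e.instruction.strip() and e.response.strip()
    ]


if __name__ == "__main__":
    sample = Example("Name three primary colors.", "Red, yellow, and blue.")
    print(to_training_text(sample))
```

The resulting strings would then be tokenized and passed to an ordinary fine-tuning loop. The formatting step is what teaches the model the question-then-answer pattern.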