Cerebras Open-Sources Seven LLMs with Up to 13 Billion Parameters

IBL News | New York

Cerebras Systems, the Silicon Valley–based maker of a dedicated AI computer and the world’s largest computer chip, released a series of seven GPT large language models (LLMs), along with the methodology, training weights, and a recipe, for open use under the permissive, industry-standard Apache 2.0 license. The release, called Cerebras-GPT, means that these models can be used for research or commercial ventures without royalties.

The company trained LLMs of up to 13 billion parameters on systems that are not GPU-based. All seven models were trained on the sixteen CS-2 systems in the Cerebras Andromeda AI supercomputer using the Chinchilla formula.

“These are the highest accuracy models for a computing budget and are available today open-source,” said the company.

In a first among AI hardware companies, Cerebras researchers trained a series of seven GPT models with 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters.
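As a rough illustration of what training "using the Chinchilla formula" implies for these seven sizes, the sketch below estimates a token budget per model. It assumes the commonly cited rule of thumb of about 20 training tokens per parameter and the usual 6 × parameters × tokens approximation for training compute. Both constants and all function names are illustrative assumptions, not figures taken from this article.

```python
# Parameter counts of the seven Cerebras-GPT models, as listed in the article.
MODEL_PARAMS = {
    "111M": 111_000_000,
    "256M": 256_000_000,
    "590M": 590_000_000,
    "1.3B": 1_300_000_000,
    "2.7B": 2_700_000_000,
    "6.7B": 6_700_000_000,
    "13B": 13_000_000_000,
}

# Assumption: the widely quoted Chinchilla rule of thumb (~20 tokens per parameter).
TOKENS_PER_PARAM = 20


def chinchilla_tokens(n_params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Estimate a compute-optimal training token count for a model size."""
    return n_params * tokens_per_param


def approx_train_flops(n_params: int, n_tokens: int) -> int:
    """Approximate training compute with the common 6 * N * D estimate."""
    return 6 * n_params * n_tokens


def budget_table() -> dict:
    """Return {size label: (tokens, approximate FLOPs)} for all seven models."""
    table = {}
    for label, n_params in MODEL_PARAMS.items():
        n_tokens = chinchilla_tokens(n_params)
        table[label] = (n_tokens, approx_train_flops(n_params, n_tokens))
    return table


if __name__ == "__main__":
    for label, (n_tokens, flops) in budget_table().items():
        print(f"{label:>5}: {n_tokens:.3e} tokens, ~{flops:.3e} FLOPs")
```

Under these assumptions, the token budget grows linearly with model size while the estimated compute grows quadratically, which is why the largest model dominates the total training cost.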

“Typically a multi-month undertaking, this work was completed in a few weeks thanks to the incredible speed of the Cerebras CS-2 systems that make up Andromeda, and the ability of Cerebras’ weight streaming architecture to eliminate the pain of distributed computing. These results demonstrate that Cerebras’ systems can train the largest and most complex AI workloads today.”

  • “The training weights provide a highly accurate pre-trained model for fine-tuning. By applying a modest amount of custom data, anyone can create powerful, industry-specific applications with minimal work.”
  • “The models’ various sizes and their accompanying checkpoints allow AI researchers to create and test new optimizations and workflows that broadly benefit the community.”

Traditional LLM training on GPUs requires a complex amalgam of pipeline, model, and data parallelism techniques. Cerebras’ weight streaming architecture is a data-parallel-only model that requires no code or model modification to scale to arbitrarily large models.

“We’ve worked to make this task easier with releases such as the Pile and the Eval Harness, and we are very excited to see Cerebras build on our work to produce a family of open models that will be useful to researchers around the world,” said Stella Biderman, Executive Director at EleutherAI.

All seven Cerebras-GPT models are available on Hugging Face and in the Cerebras Model Zoo on GitHub. The Andromeda AI supercomputer used to train these models is available on demand.
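For readers who want to try one of the released checkpoints, the following is a minimal sketch of loading a model from Hugging Face. It assumes the `transformers` library with a PyTorch backend is installed and that the repositories follow the naming pattern `cerebras/Cerebras-GPT-<size>`. That pattern and the helper names `repo_id` and `generate` are assumptions for illustration and are not stated in this article.

```python
# The seven model sizes named in the article.
SIZES = ("111M", "256M", "590M", "1.3B", "2.7B", "6.7B", "13B")


def repo_id(size: str) -> str:
    """Build the assumed Hugging Face repository name for a given model size."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    # Assumption: repositories are named cerebras/Cerebras-GPT-<size>.
    return f"cerebras/Cerebras-GPT-{size}"


def generate(prompt: str, size: str = "111M", max_new_tokens: int = 20) -> str:
    """Download a checkpoint and continue the prompt with greedy decoding."""
    # Imported lazily so that repo_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = repo_id(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Generative AI is ")` would download the smallest checkpoint on first use. The larger sizes need considerably more memory and download time.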

Cerebras published a technical blog post with the details of the seven models and the scaling laws that they produce. A research paper will be released shortly.

The company posted not only the models’ source code, in Python and TensorFlow format, but also the details of the training regimen used to bring the models to a working state.

Currently, a handful of companies hold the keys to LLMs. OpenAI is closed, with GPT-4 operating as a black box for the public. Meta’s LLaMA is closed to for-profit organizations, and Google is closed to a varying degree.

Cerebras, echoing the research community, says that AI needs to be open and reproducible for it to broadly benefit humanity.

• ZDNet: AI pioneer Cerebras opens up generative AI where OpenAI goes dark


