Training a Custom Model with OpenAI’s GPT-4 Costs $2-$3 Million

IBL News | New York

Training a custom model from scratch using OpenAI’s GPT-4 may take several months, with pricing starting at $2 to $3 million, according to the company.

The high price sparked discussion among practitioners on X, formerly known as Twitter. Many users agreed that fine-tuning a much smaller pre-trained base model would cost a tenth as much.

OpenAI justifies the price by stating:

“The Custom Models program gives selected organizations an opportunity to work with a dedicated group of OpenAI researchers to train custom GPT-4 models to their specific domain.”

“This includes modifying every step of the model training process, from doing additional domain-specific pre-training to running a custom RL post-training process tailored for the specific domain.”

“Organizations will have exclusive access to their custom models. This program is particularly applicable to domains with extremely large proprietary datasets—billions of tokens at minimum.”

Separately, OpenAI announced Data Partnerships, an initiative to work with organizations to produce public and private datasets for training AI models, as a way to reduce toxic language and bias in those models.

To process the documents and recordings in those large-scale datasets, OpenAI says it uses world-class OCR technology to digitize files such as PDFs and automatic speech recognition (ASR) to transcribe spoken words.