Researchers Build an OpenAI o1-Style Reasoning Model for Under $50

AI researchers from Stanford and the University of Washington have developed a new AI model called s1, which they trained for less than $50 in cloud computing credits, according to a recent research paper.

This model performs comparably to advanced reasoning models, like OpenAI’s o1 and DeepSeek’s R1, particularly in math and coding tests. You can find the s1 model, along with its training data and code, on GitHub.

The team created the s1 model by starting with an off-the-shelf base model and refining it through distillation, a method that extracts reasoning capabilities from another AI model by training on the answers it produces.
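
For illustration, the data-collection half of that distillation recipe might look like the sketch below. This is a minimal outline under stated assumptions: `query_teacher` is a hypothetical stand-in for calling the teacher model's API, and the file names are invented for the example.

```python
import json

def query_teacher(question: str) -> dict:
    """Hypothetical stand-in for a call to the teacher model's API
    (e.g., Gemini 2.0 Flash Thinking Experimental). Expected to return
    the teacher's reasoning trace and final answer."""
    raise NotImplementedError("wire this up to the teacher model's API")

# Collect (question, reasoning, answer) triples from the teacher model,
# then save them as the student's fine-tuning dataset.
with open("questions.txt") as qf, open("distillation_data.jsonl", "w") as out:
    for line in qf:
        question = line.strip()
        response = query_teacher(question)
        out.write(json.dumps({
            "question": question,
            "reasoning": response["reasoning"],  # the teacher's chain of thought
            "answer": response["answer"],        # the teacher's final answer
        }) + "\n")
```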

They used a reasoning model from Google, Gemini 2.0 Flash Thinking Experimental, as the teacher for this process. The approach is similar to one Berkeley researchers recently used to build another reasoning model for about $450.

The fact that a small group of researchers can innovate in AI without huge funding is exciting for many. However, it raises concerns about how easily AI models can be replicated, even those that cost millions to develop.

This situation has upset larger AI companies; OpenAI, for example, has accused DeepSeek of improperly using data from its API for distillation.

The s1 research team set out to find the simplest approach to strong reasoning performance and to “test-time scaling,” that is, letting a model think longer before it answers a question.

Both capabilities were central to the breakthrough seen in OpenAI’s o1, which other AI labs, including DeepSeek, have tried to replicate using a variety of methods.

The s1 paper indicates that reasoning models can be effectively distilled with a relatively small dataset through a technique called supervised fine-tuning (SFT), which is generally less expensive than the large-scale reinforcement learning method that DeepSeek used for its R1 model.
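
To make the SFT step concrete, here is a minimal sketch using the Hugging Face `transformers` library. The base model name, file path, prompt format, and hyperparameters are assumptions for illustration (the actual s1 run fine-tuned a larger Qwen model); this is not the team’s published training code.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"  # placeholder; s1 used a larger Qwen model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Each record holds a question plus the teacher's reasoning and answer.
dataset = load_dataset("json", data_files="distillation_data.jsonl")["train"]

def tokenize(example):
    # Fold question, reasoning trace, and answer into one training sequence.
    text = (f"Question: {example['question']}\n"
            f"Reasoning: {example['reasoning']}\n"
            f"Answer: {example['answer']}")
    tokens = tokenizer(text, truncation=True, max_length=2048)
    tokens["labels"] = tokens["input_ids"].copy()  # standard causal-LM loss
    return tokens

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="s1-sft",
                           num_train_epochs=5,
                           per_device_train_batch_size=1,
                           bf16=True),
    train_dataset=dataset.map(tokenize, remove_columns=dataset.column_names),
)
trainer.train()
```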

Google provides free access to its Gemini 2.0 Flash Thinking Experimental model, although there are daily usage limits.

However, Google’s terms of service prohibit reverse-engineering its models to create competing AI services. The researchers have reached out to Google for comment on the matter.

The s1 model is built on a small, freely downloadable AI model from Qwen, an AI lab owned by Chinese tech giant Alibaba.

To train s1, the researchers compiled a dataset of just 1,000 carefully selected questions, paired with the answers and the reasoning traces behind them, all sourced from Google’s Gemini 2.0 Flash Thinking Experimental.
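
For a sense of the dataset’s shape, a single training record might look like the following. The field names and the sample problem are illustrative assumptions, not the published s1 data schema.

```python
# One illustrative record: a question, the teacher model's reasoning trace,
# and its final answer. Field names are assumptions, not the s1 schema.
record = {
    "question": "What is the remainder when 2^10 is divided by 7?",
    "reasoning": "2^3 = 8, and 8 mod 7 = 1, so 2^10 = (2^3)^3 * 2, "
                 "which is congruent to 1 * 2 = 2 mod 7.",
    "answer": "2",
}
```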

After training s1, which took less than half an hour using 16 Nvidia H100 GPUs, the model showed strong results on various AI tests, according to the researchers.

Niklas Muennighoff, one of the Stanford researchers involved, mentioned that he could rent the necessary computing power today for around $20.
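
As a rough sanity check on that figure: at typical on-demand rates of around $2 to $3 per H100 GPU-hour (an assumed price, since no rate is quoted), 16 GPUs running for half an hour comes to roughly $16 to $24, consistent with Muennighoff’s estimate.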

The researchers used a clever technique to get s1 to double-check its answers and think for longer: they instructed it to “wait.” Appending the word “wait” during s1’s reasoning nudged the model toward slightly more accurate answers, the paper notes.
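
Below is a minimal sketch of how such a “wait” intervention could work at generation time, assuming the model separates its reasoning from its final answer with a known delimiter. The delimiter string, model name, and loop structure are assumptions for illustration, not the paper’s exact implementation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-7B-Instruct"  # placeholder base model
DELIMITER = "Final Answer:"         # assumed marker where reasoning ends

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def generate_with_wait(prompt: str, extra_rounds: int = 2) -> str:
    """Each time the model tries to wrap up its reasoning, truncate the
    attempted answer and append 'Wait' so it keeps thinking."""
    text = prompt
    for _ in range(extra_rounds):
        inputs = tokenizer(text, return_tensors="pt")
        output = model.generate(**inputs, max_new_tokens=512)
        text = tokenizer.decode(output[0], skip_special_tokens=True)
        if DELIMITER not in text:
            break  # the model never tried to stop; nothing to suppress
        # Cut off the attempted final answer and nudge the model onward.
        text = text.split(DELIMITER)[0] + "Wait"
    # Final pass: let the model finish and produce its answer.
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=512)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```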

In 2025, major companies like Meta, Google, and Microsoft plan to invest hundreds of billions of dollars into AI infrastructure, part of which will support the training of next-generation AI models.

That level of investment may still be crucial for pushing the frontier of AI. While distillation has proven to be a cheap and effective way to recreate an existing model’s capabilities, it doesn’t produce models that are significantly better than those already available.
