On the ninth day of a series of holiday product announcements known as the “12 Days of OpenAI,” OpenAI is rolling out its most advanced model, o1, to third-party developers through its API.
This represents a major step forward for developers looking to build new AI applications or integrate OpenAI's most advanced technology into their existing applications and workflows, whether enterprise- or consumer-oriented.
If you’re not familiar with OpenAI’s o1 series, here’s the short version: announced back in September 2024, it is the first in a new family of models from the maker of ChatGPT that goes beyond the large language models (LLMs) of the GPT series and introduces “reasoning” capabilities.
Essentially, the o1 model family (o1 and o1-mini) takes longer to respond to user prompts, but checks its own work as it formulates an answer, verifying its accuracy and avoiding hallucinations. At the time, OpenAI said o1 could handle more complex, PhD-level problems, something real-world users have since confirmed.
While developers previously had access to a preview version of o1 on which they could build their own applications (for example, a PhD advisor or lab assistant), the production-ready release of the full o1 model through the API brings improved performance, lower latency, and new features that make it easier to integrate into real-world applications.
OpenAI o1 has already been available to consumers on the ChatGPT Plus and Pro plans for about two and a half weeks, and it recently gained the ability to analyze and respond to images and files uploaded by users.
Alongside today’s launch, OpenAI announced updates to its Realtime API, as well as price cuts and a new fine-tuning method that gives developers greater control over their models.
The full o1 model is now available to developers through the OpenAI API
The new o1 model, available as o1-2024-12-17, is designed to excel at complex, multi-step reasoning tasks. Compared with the previous o1 preview version, this version improves accuracy, efficiency and flexibility.
OpenAI recorded significant gains across a range of benchmarks, including programming, mathematics, and visual reasoning tasks.
For example, programming scores on the SWE-bench Verified test increased from 41.3 to 48.9, while performance on the math-focused AIME test jumped from 42 to 79.2. These improvements make o1 well-suited for building tools that simplify customer support, improve logistics, or solve difficult analytical problems.
Several new features improve o1’s functionality for developers. Structured output allows responses to reliably match custom formats such as JSON schemas, ensuring consistency when interacting with external systems. Function calling simplifies the process of connecting o1 to APIs and databases. The ability to reason with visual input opens up use cases in manufacturing, science, and programming.
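For illustration, here is a minimal sketch of requesting a structured response from the new model with OpenAI’s Python SDK; the support_ticket schema and its fields are hypothetical, and the snippet assumes the openai package is installed and an OPENAI_API_KEY environment variable is set.

```python
# Minimal sketch: asking o1 for output that conforms to a JSON schema.
# Assumes the official openai Python SDK and an OPENAI_API_KEY env var.
# The "support_ticket" schema below is a made-up example.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-2024-12-17",
    messages=[
        {
            "role": "user",
            "content": "Summarize and classify this support email: "
                       "'My order arrived damaged and I need a replacement before Friday.'",
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "support_ticket",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "summary": {"type": "string"},
                    "category": {"type": "string"},
                    "urgency": {"type": "string", "enum": ["low", "medium", "high"]},
                },
                "required": ["summary", "category", "urgency"],
                "additionalProperties": False,
            },
        },
    },
)

print(response.choices[0].message.content)  # JSON string matching the schema above
```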
Developers can also adjust how much work o1 puts into a request using the new reasoning_effort parameter, which controls how long the model spends thinking about a task, balancing answer quality against response time.
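Below is a similarly hedged sketch that combines function calling with the reasoning_effort parameter; the get_shipment_status tool, its fields, and the example shipment ID are hypothetical, while the request shape follows OpenAI’s Chat Completions interface.

```python
# Sketch: function calling plus the reasoning_effort parameter.
# The tool definition is hypothetical; only the request shape follows
# OpenAI's documented Chat Completions interface.
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_shipment_status",  # hypothetical backend lookup
            "description": "Look up the current status of a shipment by ID.",
            "parameters": {
                "type": "object",
                "properties": {"shipment_id": {"type": "string"}},
                "required": ["shipment_id"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="o1-2024-12-17",
    reasoning_effort="low",  # "low" | "medium" | "high": trade reasoning depth for latency
    messages=[{"role": "user", "content": "Where is shipment 4F7-221 right now?"}],
    tools=tools,
)

# If the model decided to call the tool, the call (name + JSON arguments) appears here.
print(response.choices[0].message.tool_calls)
```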
OpenAI’s Realtime API is getting a boost to power intelligent voice/audio AI assistants
OpenAI also announced updates to its Realtime API, designed to power natural, low-latency conversational experiences such as voice assistants, live translation tools, or virtual teachers.
The new WebRTC integration simplifies the creation of voice-based applications by providing direct support for audio streaming, noise suppression, and congestion control. Developers can now integrate real-time capabilities with minimal setup, even in variable network conditions.
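By way of illustration, the sketch below shows the server-side half of a WebRTC setup, assuming OpenAI’s documented /v1/realtime/sessions endpoint is used to mint a short-lived client token that the browser then uses to negotiate the actual WebRTC connection; the model and voice names come from OpenAI’s announcement, but the exact request fields should be treated as an assumption.

```python
# Sketch (assumptions noted above): a small backend helper that mints an
# ephemeral Realtime session token for the browser's WebRTC handshake,
# so the long-lived API key never reaches the client.
# Requires the third-party 'requests' package.
import os
import requests

def create_realtime_session() -> dict:
    resp = requests.post(
        "https://api.openai.com/v1/realtime/sessions",
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "gpt-4o-realtime-preview-2024-12-17",  # Realtime model per OpenAI's release notes
            "voice": "verse",                                # assumed voice option
        },
        timeout=30,
    )
    resp.raise_for_status()
    # The response includes a short-lived client_secret the browser can use to
    # authenticate its WebRTC offer against the Realtime API.
    return resp.json()

if __name__ == "__main__":
    session = create_realtime_session()
    print(session.get("client_secret"))
```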
OpenAI is also introducing new pricing for the Realtime API, cutting the cost of GPT-4o audio by 60%, to $40 per million input tokens and $80 per million output tokens.
Cached audio input costs have been reduced by 87.5% and are now priced at $2.50 per million input tokens. To further improve affordability, OpenAI is adding GPT-4o mini, a smaller, cost-effective model priced at $10 per million audio input tokens and $20 per million audio output tokens.
GPT-4o mini’s text token prices are also significantly lower, at $0.60 per million input tokens and $2.40 per million output tokens.
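To make the new pricing concrete, here is a small back-of-the-envelope calculation using only the per-million-token rates quoted above; the token counts for the example session are hypothetical.

```python
# Back-of-the-envelope cost estimate from the GPT-4o audio rates quoted above.
# Token counts below are made up for illustration.
GPT_4O_AUDIO_IN = 40.00 / 1_000_000         # $ per audio input token
GPT_4O_AUDIO_OUT = 80.00 / 1_000_000        # $ per audio output token
GPT_4O_AUDIO_IN_CACHED = 2.50 / 1_000_000   # $ per cached audio input token

# Hypothetical voice session: 50k fresh input, 40k cached input, 20k output tokens.
cost = (
    50_000 * GPT_4O_AUDIO_IN
    + 40_000 * GPT_4O_AUDIO_IN_CACHED
    + 20_000 * GPT_4O_AUDIO_OUT
)
print(f"Estimated GPT-4o audio cost for the session: ${cost:.2f}")  # $3.70 with these counts
```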
In addition to pricing, OpenAI is giving developers more control over responses in the Realtime API. Features like out-of-band responses allow background tasks, such as content moderation, to run without interrupting the user's conversation. Developers can also customize the input context to focus the model on specific parts of a conversation and control when voice responses are triggered, enabling more accurate and natural interactions.
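As a rough illustration of what an out-of-band request might look like, the event below sketches a moderation check run alongside the live conversation; the field names follow OpenAI’s Realtime API event documentation for this release, but they should be read as an assumption rather than a definitive reference.

```python
# Sketch of an out-of-band response request for the Realtime API.
# This event would be sent as JSON over an already-established Realtime
# connection; field names are an assumption based on OpenAI's docs.
import json

moderation_check = {
    "type": "response.create",
    "response": {
        # "none" requests a response outside the default conversation,
        # so the user's voice session is not interrupted.
        "conversation": "none",
        "modalities": ["text"],
        "instructions": "Classify the user's last message as safe or unsafe.",
        "metadata": {"purpose": "moderation"},  # arbitrary tag to route the result
    },
}

payload = json.dumps(moderation_check)
# ws.send(payload)  # sent over the app's existing Realtime connection (not shown)
print(payload)
```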
Preference fine-tuning offers new customization options
Another major addition is preference fine-tuning, a new way to customize models based on user and developer preferences.
Unlike supervised fine-tuning, which relies on exact input-output pairs, preference fine-tuning uses pairwise comparisons to teach the model which responses are preferred. This approach is especially effective for subjective tasks, such as summarization, creative writing, or scenarios where tone and style matter.
Early testing with partners like Rogo AI, which builds assistants for financial analysts, shows promising results. Rogo reported that preference fine-tuning helped its model handle complex out-of-distribution queries better than traditional supervised fine-tuning, improving task accuracy by more than 5%. The feature is available now for gpt-4o-2024-08-06 and gpt-4o-mini-2024-07-18, with plans to expand support to newer models early next year.
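Here is a hedged sketch of what preparing data for preference fine-tuning might look like, assuming the pairwise JSONL format (a preferred and a non-preferred completion for the same input) that OpenAI describes for this method; the example record and file name are made up.

```python
# Sketch: one training record for preference fine-tuning, assuming the
# pairwise JSONL format OpenAI describes (a preferred and a non-preferred
# completion for the same input). The content itself is invented.
import json

record = {
    "input": {
        "messages": [
            {"role": "user", "content": "Summarize our Q3 results for the board."}
        ]
    },
    "preferred_output": [
        {"role": "assistant", "content": "Revenue grew 12% quarter over quarter, driven by..."}
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "Q3 happened. Numbers went up, I think."}
    ],
}

# Each line of the training file is one comparison like the one above.
with open("preferences.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```

Once such a file is uploaded, it could be used to create a fine-tuning job against one of the supported snapshots named above, such as gpt-4o-2024-08-06.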
New SDKs for Go and Java developers
To simplify integration, OpenAI is expanding its official SDK offerings with beta releases for Go and Java. These SDKs join existing Python, Node.js, and .NET libraries, making it easier for developers to interact with OpenAI models across more programming environments. The Go SDK is especially useful for building scalable back-end systems, while the Java SDK is designed for enterprise-level applications that rely on strong typing and robust ecosystems.
With these updates, OpenAI offers developers an expanded toolset for building advanced, customizable AI-powered applications. Whether through o1’s enhanced reasoning capabilities, the Realtime API improvements, or the new fine-tuning options, OpenAI’s latest releases aim to deliver better performance and cost-efficiency for companies pushing the boundaries of AI integration.