The final day of OpenAI’s “12 Days of Shipmas” has arrived with the unveiling of o3, a new chain-of-thought “reasoning” model that the company claims is its most advanced to date. The model is not yet available for public use, but safety researchers can sign up for a preview starting today.
OpenAI and others hope that reasoning models will go a long way toward solving the persistent problem of chatbots confidently producing wrong answers. Chatbots do not fundamentally “think” the way humans do, so different techniques are needed to better simulate a human thought process.
When asked a question, reasoning models pause to consider related questions that can help them produce an accurate answer. For example, if you ask o3, “Can habaneros be grown in the Pacific Northwest?”, the model might work through a series of sub-questions on its way to a conclusion, such as “Where do habaneros typically grow?”, “What are the ideal conditions for growing habaneros?”, and “What kind of climate does the Pacific Northwest have?” Anyone who has used a chatbot knows that you sometimes have to keep asking follow-up questions until it finally lands on the right answer. Reasoning models are supposed to do that extra work for you, as the sketch below illustrates.
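To make the idea concrete, here is a minimal sketch of that decompose-then-answer pattern using the OpenAI Python SDK. It is not how o3 works internally; it simply prompts an ordinary chat model to spell out its sub-questions before answering, and the “gpt-4o” model name and prompt wording are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "Can habaneros be grown in the Pacific Northwest?"

# Illustrative only: nudge an ordinary chat model to list and resolve
# sub-questions before answering, roughly the behavior reasoning models
# are meant to perform on their own.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model; o3 is not publicly available yet
    messages=[
        {
            "role": "system",
            "content": (
                "Before answering, list the sub-questions you need to resolve "
                "(e.g., where the plant typically grows, its ideal growing "
                "conditions, and the local climate), answer each one briefly, "
                "then give a final answer."
            ),
        },
        {"role": "user", "content": question},
    ],
)

print(response.choices[0].message.content)
```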
o3 is the successor to o1, OpenAI’s first chain-of-thought reasoning model. Representatives said they decided to skip the “o2” name “out of respect” for O2, the British telecom carrier, though it certainly doesn’t hurt that skipping a number makes the product sound more advanced. The company says the new model lets users set its thinking time: you can choose low, medium, or high compute, and the higher the setting, the better o3 performs. OpenAI says it will spend some time “red teaming” the new model with safety researchers to keep it from producing responses that may be harmful (because it is not human and does not know right from wrong).
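The o3 API has not been published at the time of writing, so the snippet below is only a guess at what choosing a thinking-time level might look like, modeled on the reasoning_effort setting OpenAI exposes for its other o-series reasoning models; the “o3” model name and the parameter’s availability for it are assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical: "o3" is not publicly available yet, and reasoning_effort is
# borrowed from OpenAI's existing o-series models ("low", "medium", "high").
response = client.chat.completions.create(
    model="o3",               # assumed model name, per the announcement
    reasoning_effort="high",  # more thinking time, better answers, higher cost
    messages=[
        {"role": "user", "content": "Can habaneros be grown in the Pacific Northwest?"}
    ],
)

print(response.choices[0].message.content)
```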
Reasoning is the buzzword of the moment in generative AI, with industry insiders betting it is the next step needed to improve the performance of large language models. Simply throwing more compute at training is no longer delivering proportional performance gains, so new techniques are required. Google DeepMind recently unveiled its own reasoning-based product, Gemini Deep Research, which can take 5 to 10 minutes to generate a report, analyzing many sources across the web to arrive at its findings.
OpenAI is confident in o3 and is touting impressive benchmarks: it says that on Codeforces, a test that measures programming ability, o3 earned a score of 2727. For context, a score of 2400 would put an engineer in the 99th percentile of programmers. The model also scored 96.7% on the 2024 American Invitational Mathematics Examination, missing only one question. We’ll have to see how the model holds up in real-world use, and it is still generally unwise to lean too heavily on AI models for important work where accuracy is essential. But optimists are confident the accuracy problem has been solved. Let’s hope so, because for now, Google’s AI Overviews in Search remain a frequent target of ridicule on social media.
AI model companies such as OpenAI and Perplexity are vying to become the next Google, aggregating the world’s knowledge and helping users make sense of it all. Both now offer search products that aim to take on Google directly by pulling in real-time web results.
For now, all of these players seem to be one-upping each other by the day. The moment is somewhat reminiscent of the late 1990s, when there were countless search engines to choose from – Google, Yahoo, AltaVista, and Ask Jeeves, to name a few – all of which crawled the web and presented it with a slightly different user experience. Most of them disappeared once a product that was much better than the rest came along: Google.
OpenAI clearly has a strong lead at the moment, with hundreds of millions of monthly active users and a partnership with Apple, but Google has drawn plenty of praise lately for advances in its Gemini models. The Verge reports that the company will soon integrate Gemini more deeply into its search interface.