Openai says that Chatbots disciplining lying makes it worse

As many people know, Chatbots have a tendency to lie. Perhaps it is one of the worst use of Amnesty International, trained in the production of camels that seem reliable but can provide fully fabricated information; The models are bound to provide an answer even when they are not confident. Now, researchers at Openai says that the supervision and discipline of Chatbots makes the problem only worse, as Chatbots made more effort to hide their behavior.

in Blog post Regarding the work, the researchers describe Openai’s use of its GPT-4O model for another “supervision” of its large language models, which leads to its discipline when it tried to lie. But this did not succeed, as the model will continue to lie, “Now it can only be discovered by the screen because he learned to hide his intention in the series of ideas.”

Multiple “Thinking” models use multi -steps to answer information. If a user requests an appreciation of the amount of Americans spending on pet food every year, for example, these models will divide the query into multiple steps – how many dogs are in America, the amount of the cost of feeding the ordinary dog, etc.

These models often reveal its logic, or “the series of idea”, for the user so that they can see how it came to his answer. Enjoyable enough, they will often compensate the facts and details. In the “Pre -Training” step, when companies like Openai initially teach a model of what kinds of answers you should produce, it seems that the models learn that it is easier to earn a cookie by taking shortcuts. Think of hostility to the marathon race Hogs in a car Most of the race exceeds.

Users on social media have shown that Claude, when they are asked to conduct data analysis from different research papers, admit that they sometimes include filling data instead of actually looking at all papers.

Openai’s researchers are highlighted in their test in which the model requested, writing some tests to ensure that part of the code is done, and they hiddenly decided to write some bad tests, then skip them on the purpose of even the code eventually passes.

I lost nearly 100 million dollars due to ANTHROOPICAIClaude infiltrated the “creation of random data” as a celebration in the code of the market maker without news pic.twitter.com/j3mlgsl5lq

Martinshkreli March 5, 2025

Artificial intelligence companies try to solve the malicious issue from lying or “hallucinations”, as it is called in this field, and finally it reaches AGI, or the point where artificial intelligence can exceed human ability. But Openai researchers mainly say that after tens of billions of investments, they still do not know how to control models to act appropriately. They added: “If the strong supervision is applied directly to the series of ideas, you can learn the models to hide their intention while continuing to mislead.” Currently, companies should not implement the models that appear to be a big solution. Ergo, let them continue to lie at the present time, otherwise they will shine for you.

TFW Claude Code 739 seconds was spent, “Show”, failed to make the change you requested, broke 3 other things that used to work well, then charge you $ 11.14 pic.twitter.com/ap2jlq0ui8

Adam (Personofswag) March 19, 2025

The search should be a reminder of cautious exercise when relying on Chatbots, especially when it comes to decisive work. It has been improved to produce a Confident-Appearance Answer but don’t care much about real accuracy. “Since we have trained the border thinking models more capable, we have found that they have increased in an increasingly innovation to use defects in their tasks and institutions in reward jobs, which led to models that could perform complex bonuses in coding tasks,” the researchers concluded Openai.

Several reports have suggested that most institutions have After to find the value In all new artificial intelligence products that come to the market, with tools like Microsoft Copilot and Apple Intelligence He suffers from problemswith Scathing reviews In detail, the accuracy of their weakness and lack of real benefit. According to a recent report from Boston Consulting GroupThere was a survey of 1,000 senior executives in 10 main industries that 74 % showed any concrete value of artificial intelligence. What makes it more attractive is that these “thinking” models are slow, and slightly more expensive than smaller models. Do you want companies to pay $ 5 for a query that will return with makeup information?

There is always a lot of noise in the technology industry for things, then get out of it and realize that most people are still not using them. Currently, it is not worth the trouble, and reliable information sources are more important than ever.

https://gizmodo.com/app/uploads/2024/12/GettyImages-21653483551.jpg

Source link