OpenAI's latest AI models have a new safeguard to prevent biorisks



OpenAI says it has deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI's safety report.

O3 and o4-mini represent a meaningful capability increase over OpenAI's previous models, the company says, and therefore pose new risks in the hands of bad actors. According to OpenAI's internal benchmarks, o3 is more skilled at answering questions about creating certain types of biological threats in particular. For this reason, and to mitigate other risks, OpenAI created the new monitoring system, which the company describes as a "safety-focused reasoning monitor."

The monitor, custom-trained to reason about OpenAI's content policies, runs on top of o3 and o4-mini. It's designed to identify prompts related to biological and chemical risk and instruct the models to refuse to offer advice on those topics.
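
OpenAI hasn't published the monitor's implementation, but the behavior it describes (a second model that flags risky prompts and forces a refusal) can be sketched at a high level. The following is a minimal sketch under those assumptions; every name in it (classify_prompt, generate, answer, the keyword checks) is hypothetical and merely stands in for components OpenAI has not detailed.

```python
# Hypothetical sketch of a reasoning monitor gating a model's output.
# None of these names or heuristics come from OpenAI; they are stand-ins.

RISK_CATEGORIES = {"biological_threat", "chemical_threat"}

REFUSAL = "I can't help with that request."

def classify_prompt(prompt: str) -> set[str]:
    """Stand-in for the safety-focused reasoning monitor: in OpenAI's
    description this is a model trained to reason about content policy.
    Here, crude keyword checks serve only as a placeholder."""
    flags = set()
    lowered = prompt.lower()
    if "pathogen" in lowered or "toxin" in lowered:
        flags.add("biological_threat")
    if "nerve agent" in lowered:
        flags.add("chemical_threat")
    return flags

def generate(prompt: str) -> str:
    """Stand-in for the underlying model (e.g., o3 or o4-mini)."""
    return f"[model response to: {prompt!r}]"

def answer(prompt: str) -> str:
    # The monitor runs "on top of" the model: if the prompt is flagged
    # as bio/chem risk, the model is instructed to refuse.
    if classify_prompt(prompt) & RISK_CATEGORIES:
        return REFUSAL
    return generate(prompt)
```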

To establish a baseline, OpenAI had red teamers spend around 1,000 hours flagging "unsafe" biorisk-related conversations from o3 and o4-mini. In a test in which OpenAI simulated the "blocking logic" of its safety monitor, the models declined to respond to risky prompts 98.7% of the time, according to OpenAI.
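
For context, that 98.7% figure is a simple proportion over the flagged test prompts. A toy illustration follows, with made-up outcomes chosen only to reproduce the reported rate:

```python
# Illustration only: how a refusal rate like the reported 98.7% might be
# computed from labeled test results. The outcomes list is fabricated.

# One entry per risky test prompt: True if the model declined to respond.
outcomes = [True] * 987 + [False] * 13

refusal_rate = sum(outcomes) / len(outcomes)
print(f"{refusal_rate:.1%}")  # 98.7%
```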

OpenAI acknowledges that its test didn't account for people who might try new prompts after getting blocked by the monitor, which is why the company says it will continue to rely in part on human monitoring.

O3 and o4-mini don't cross OpenAI's "high risk" threshold for biorisks, according to the company. However, compared to o1 and GPT-4, OpenAI says that early versions of o3 and o4-mini proved more helpful at answering questions about developing biological weapons.

Chart from the o3 and o4-mini system card (Screenshot: OpenAI)

The company is actively tracking how its models could make it easier for malicious users to develop chemical and biological threats, according to OpenAI's recently updated Preparedness Framework.

OpenAI is increasingly relying on automated systems to mitigate the risks from its models. For example, to prevent GPT-4o's native image generator from creating child sexual abuse material (CSAM), OpenAI says it uses a reasoning monitor similar to the one the company deployed for o3 and o4-mini.

Still, several researchers have raised concerns that OpenAI isn't prioritizing safety as much as it should. Metr, one of the company's red-teaming partners, said it had relatively little time to test o3 on a benchmark for deceptive behavior. Meanwhile, OpenAI decided not to release a safety report for its GPT-4.1 model, which launched earlier this week.



