OpenAI on Monday launched a new family of models called GPT-4.1. Yes, “4.1”, as if the company’s naming scheme wasn’t confusing enough already.
There’s GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, all of which OpenAI says “excel” at coding and instruction following. Available through OpenAI’s API but not ChatGPT, the multimodal models have a 1-million-token context window, which means they can take in roughly 750,000 words in one go (longer than “War and Peace”).
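To make the context-window figure concrete, here is a rough sketch that derives a words-per-token ratio from the two numbers quoted above (1 million tokens, roughly 750,000 words) and uses it to guess whether a text would fit. The ratio is an approximation implied by the article, not a tokenizer; the function names and the `reserved_output_tokens` parameter are hypothetical and exist only for illustration.

```python
import math

# Figures quoted in the article: a 1-million-token window holds
# roughly 750,000 words, which implies about 0.75 words per token.
CONTEXT_WINDOW_TOKENS = 1_000_000
WORDS_PER_TOKEN = 750_000 / CONTEXT_WINDOW_TOKENS  # 0.75 (assumed average)


def estimate_tokens(text: str) -> int:
    """Estimate a token count from a whitespace word count.

    This is a rule-of-thumb estimate, not a real tokenizer.
    """
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_in_context(text: str, reserved_output_tokens: int = 0) -> bool:
    """Return True if the estimated input plus reserved output fits the window."""
    needed = estimate_tokens(text) + reserved_output_tokens
    return needed <= CONTEXT_WINDOW_TOKENS
```

Real token counts vary with language and content, so an estimate like this is only useful for a first sanity check before measuring with an actual tokenizer.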
GPT-4.1 arrives as OpenAI rivals such as Google and Anthropic ratchet up their efforts to build sophisticated programming models. Google’s recently released Gemini 2.5 Pro, which also has a 1-million-token context window, ranks highly on popular coding benchmarks. So do Anthropic’s Claude 3.7 Sonnet and Chinese AI startup DeepSeek’s upgraded V3.
It is the goal of many tech giants, including OpenAI, to train AI coding models capable of performing complex software engineering tasks. OpenAI’s grand ambition is to create an “agentic software engineer,” as CFO Sarah Friar put it during a tech summit in London last month. The company asserts that its future models will be able to program entire apps end-to-end, handling aspects such as quality assurance, bug testing, and documentation writing.
GPT-4.1 is a step in this direction.
“We’ve optimized GPT-4.1 for real-world use based on direct feedback to improve in areas that developers care most about: frontend coding, making fewer extraneous edits, following formats reliably, adhering to response structure and ordering, consistent tool usage, and more,” an OpenAI spokesperson told TechCrunch via email. “These improvements enable developers to build agents that are considerably better at real-world software engineering tasks.”
OpenAI claims that the full GPT-4.1 model outperforms its GPT-4o and GPT-4o mini models on coding benchmarks, including SWE-bench. GPT-4.1 mini and nano are said to be more efficient and faster at the cost of some accuracy, with OpenAI saying that GPT-4.1 nano is its fastest and cheapest model ever.
GPT-4.1 costs $2 per million input tokens and $8 per million output tokens. GPT-4.1 mini is $0.40 per million input tokens and $1.60 per million output tokens, and GPT-4.1 nano is $0.10 per million input tokens and $0.40 per million output tokens.
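As a quick illustration of what those rates mean for a single request, the sketch below multiplies token counts by the per-million prices quoted above. The dictionary keys and the `estimate_cost` function are hypothetical names chosen for this example, and the prices are only the ones reported in this article, which may change.

```python
# USD per one million tokens, as quoted in the article.
# The keys are illustrative labels, not guaranteed API identifiers.
PRICES_PER_MILLION = {
    "gpt-4.1": {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price = PRICES_PER_MILLION[model]  # KeyError for an unknown model label
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Compare the three tiers for the same hypothetical workload.
    for name in PRICES_PER_MILLION:
        cost = estimate_cost(name, input_tokens=100_000, output_tokens=10_000)
        print(f"{name}: ${cost:.4f}")
```

At these rates, output tokens cost four times as much as input tokens on every tier, so long generations dominate the bill faster than long prompts do.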
According to OpenAI’s internal testing, GPT-4.1, which can generate more tokens at once than GPT-4o (32,768 versus 16,384), scored between 52% and 54.6% on SWE-bench Verified, a human-validated subset of SWE-bench. (OpenAI noted in a blog post that some solutions to SWE-bench Verified problems couldn’t run on its infrastructure, hence the range of scores.) Those figures are slightly below the scores reported by Google and Anthropic for Gemini 2.5 Pro (63.8%) and Claude 3.7 Sonnet (62.3%), respectively, on the same benchmark.
In a separate evaluation, OpenAI probed GPT-4.1 using Video-MME, which is designed to measure a model’s ability to “understand” the content in videos. GPT-4.1 reached 72% accuracy in the “long, no subtitles” video category, OpenAI claims.
Although GPT-4.1 scores reasonably well on benchmarks and has a more recent “knowledge cutoff,” which gives it a better frame of reference for current events (up to June 2024), it is important to keep in mind that even some of the best models today struggle with tasks that wouldn’t trip up experts. For example, many studies have shown that code-generating models often fail to fix, and even introduce, security vulnerabilities and bugs.
OpenAI also acknowledges that GPT-4.1 becomes less reliable (that is, likelier to make mistakes) the more input tokens it has to deal with. On one of the company’s own tests, OpenAI-MRCR, the model’s accuracy decreased from about 84% with 8,000 tokens to 50% with 1 million tokens. GPT-4.1 also tends to be more “literal” than GPT-4o, the company says, which sometimes necessitates more specific and explicit prompts.