What you need to know about Amazon Nova Act: The new AI agent SDK challenging OpenAI, Microsoft, Salesforce






The sleeping giant has awoken!

For a while, Amazon seemed to be lagging in the race to offer its users, especially the millions of developers building on Amazon Web Services (AWS) cloud infrastructure, first-party AI models and tools.

But in late 2024, it debuted its first in-house family of foundation models, Amazon Nova, with text, image, and even video generation capabilities, and last month saw the arrival of a new Amazon Alexa assistant powered in part by the Claude family of models.

Then, on Monday, the e-commerce and cloud giant’s Amazon AGI division announced the release of Amazon Nova Act, an experimental developer kit for building AI agents that can navigate the web and complete tasks autonomously, powered by a custom version of the Nova large language model (LLM). The software development kit (SDK) is open source under the permissive Apache 2.0 license, although the SDK is designed to work only with Amazon’s custom Nova model, not any third-party ones.

The goal is to enable third-party developers to build AI agents capable of reliably performing tasks inside web browsers.

But how does Amazon’s Nova Act stack up against other agent-building platforms on the market, such as Microsoft’s AutoGen, Salesforce’s Agentforce and, of course, OpenAI’s recently open-sourced Agents SDK?

A different, more thoughtful approach to AI agents

Since LLMs rose to public prominence, most “agent” systems have been limited to responding in natural language or providing information by querying knowledge bases.

Nova Act is part of a larger industry shift toward action-based agents: systems that can complete actual tasks across digital environments on behalf of the user. OpenAI’s new Responses API, which gives users access to its autonomous browser navigator, is one leading example of this, and developers can integrate it into AI agents through the OpenAI Agents SDK.

Amazon AGI emphasizes that current agent systems, while promising, struggle with reliability and often require human supervision, especially when handling multi-step or complex workflows.

Nova Act is designed specifically to address these limitations by providing a set of atomic, prescriptive commands that can be chained together into reliable workflows.

Deniz Birlikci, a member of Amazon’s technical staff, described the broader vision in a video introducing Nova Act: soon, there will be more AI agents than people browsing the web, carrying out tasks on behalf of users.

https://www.youtube.com/watch?

David Luan, vice president of Amazon’s Autonomy Team and AGI SF Lab, framed the mission directly in a recent video call interview with VentureBeat.

Luan, previously co-founder and CEO of Adept AI, joined Amazon in 2024 as part of an acqui-hire. Luan said he has long been a proponent of AI agents. He added: “With Adept, we were the first company to start working on AI agents. At this point, everyone knows how important agents are. It was great to be a little ahead of our time.”

What Nova Act offers devs

The Nova Act SDK provides a framework for building web-based automation agents using natural language prompts broken down into clear, controllable steps.

Unlike typical LLM-powered agents that try to complete an entire workflow from a single prompt, which often leads to unreliable behavior, Nova Act is designed to perform smaller tasks that can be verified incrementally.
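
The step-by-step approach described above can be sketched in plain Python. This is a minimal illustration, not the Nova Act API: `StubAgent` and its `act` method are hypothetical stand-ins for a real browser agent, and the prompts are invented for the example. The point is only the control flow, in which each small instruction is issued on its own and checked before the next one runs.

```python
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records prompts."""
    fail_on: set = field(default_factory=set)  # prompts that should "fail"
    history: list = field(default_factory=list)

    def act(self, prompt: str) -> bool:
        """Pretend to carry out one small instruction and report success."""
        self.history.append(prompt)
        return prompt not in self.fail_on


def run_workflow(agent: StubAgent, steps: list[str]) -> list[str]:
    """Run steps one at a time, stopping at the first step that fails."""
    completed = []
    for step in steps:
        if not agent.act(step):
            raise RuntimeError(f"step failed: {step!r}")
        completed.append(step)
    return completed


# Invented example prompts, each describing one small browser action.
steps = [
    "search for a coffee maker",
    "select the first result",
    "add the item to the cart",
]
done = run_workflow(StubAgent(), steps)
```

Because a failure surfaces at the exact step that went wrong, a developer can retry or inspect that step instead of rerunning the whole task.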

Some of the key features of Nova Act include:

  • Fine-grained task decomposition: Developers can break complex digital workflows into smaller act() calls, each directing the agent to perform specific user interface interactions.
  • Direct browser manipulation via Playwright: Nova Act integrates with Playwright, an open-source browser automation framework developed by Microsoft. Playwright lets developers control web browsers programmatically (clicking elements, filling out forms, or navigating pages) without relying solely on AI predictions. This integration is especially useful for handling sensitive tasks such as entering passwords or credit card details. For example, instead of sending sensitive information to the model, developers can instruct Nova Act to focus on the password field and then use Playwright APIs to enter the password securely, without the model ever “seeing” it. This approach helps strengthen security and privacy when automating web interactions.
  • Python integration: The SDK lets developers interleave Python code with Nova Act commands, including standard Python tools such as breakpoints, assertions, or thread pools for parallel execution.
  • Structured information extraction: The SDK supports structured data extraction through Pydantic schemas, allowing agents to convert screen content into structured formats.
  • Parallelization and scheduling: Developers can run multiple Nova Act instances concurrently and schedule automated workflows without the need for constant human oversight.
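
To make the structured-extraction feature more concrete, here is a minimal sketch of schema-checked parsing. It assumes the agent returns its answer as a JSON string; the `Listing` fields and the sample response are invented for illustration. A standard-library dataclass stands in for the Pydantic model the SDK would use, so the example runs without extra dependencies.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """Hypothetical schema for one rental listing (stand-in for a Pydantic model)."""
    address: str
    monthly_rent: int
    bedrooms: int


def parse_listings(raw_json: str) -> list[Listing]:
    """Turn the agent's JSON answer into typed records, rejecting bad shapes."""
    records = json.loads(raw_json)
    listings = []
    for record in records:
        listing = Listing(**record)  # raises TypeError on missing or extra keys
        if not isinstance(listing.monthly_rent, int):
            raise TypeError("monthly_rent must be an integer")
        listings.append(listing)
    return listings


# Invented sample of what an agent might return after reading a results page.
sample = '[{"address": "12 Elm St", "monthly_rent": 2400, "bedrooms": 2}]'
listings = parse_listings(sample)
```

Validating against a schema at this boundary means downstream Python code can rely on field names and types rather than re-parsing free text.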

Luan emphasized that Nova Act is a tool for developers rather than a general-purpose chatbot. He said: “Nova Act is designed for developers. It is not a chatbot that you talk to for fun. It is designed to allow developers to start building useful products.”

For example, one of the workflows shown in Amazon’s documentation demonstrates how Nova Act can automate an apartment search by scraping rental listings and calculating biking distances to train stations, then sorting the results into a structured table.
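
The shape of that apartment-search workflow can be outlined as follows. This is a sketch under stated assumptions: `fetch_bike_minutes` is a hypothetical placeholder for an agent session that looks up one address on a maps site, and the addresses and travel times are made up. A thread pool runs the lookups concurrently, and the results are then sorted into a simple table.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up travel times standing in for what a browser agent would look up.
FAKE_BIKE_MINUTES = {"12 Elm St": 14, "9 Oak Ave": 6, "40 Pine Rd": 22}


def fetch_bike_minutes(address: str) -> tuple[str, int]:
    """Hypothetical placeholder for one agent session per listing."""
    return address, FAKE_BIKE_MINUTES[address]


def rank_by_bike_time(addresses: list[str]) -> list[tuple[str, int]]:
    """Look up every address in parallel, then sort by biking time."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(fetch_bike_minutes, addresses))
    return sorted(results, key=lambda row: row[1])


ranked = rank_by_bike_time(list(FAKE_BIKE_MINUTES))
for address, minutes in ranked:
    print(f"{address:<12} {minutes:>3} min")
```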

Another example uses Nova Act to order a specific salad from Sweetgreen every Tuesday, entirely hands-free and on a schedule, illustrating how developers can automate repetitive, personalized digital tasks.
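
Scheduling a recurring task like that weekly order comes down to computing the next run time. The sketch below does this with the standard library; `next_tuesday` and the noon order time are assumptions for illustration, and a real deployment would more likely hand this job to cron or another scheduler.

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday(): Monday is 0, so Tuesday is 1


def next_tuesday(now: datetime, hour: int = 12) -> datetime:
    """Return the next Tuesday at `hour`:00 strictly after `now`."""
    days_ahead = (TUESDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if candidate <= now:  # this week's slot has already passed
        candidate += timedelta(days=7)
    return candidate


# Monday, March 31, 2025 at 09:00 resolves to the following day at noon.
run_at = next_tuesday(datetime(2025, 3, 31, 9, 0))
```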

Benchmark performance and a focus on reliability

A central message in Amazon’s announcement is that reliability, not just intelligence, is the main barrier to broad agent adoption.

Current state-of-the-art models are in fact quite brittle when powering AI agents, with agents typically achieving success rates of 30% to 60% on multi-step, browser-based tasks, according to Amazon.

Nova Act, however, emphasizes a building-block approach, scoring above 90% on internal evaluations of tasks that challenge other models, such as interacting with dropdowns, date pickers, or pop-ups.
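
One way to see why per-step reliability matters so much: if a workflow chains several steps and every one must succeed, small per-step failure rates compound. The sketch below assumes steps succeed independently with the same probability, which is a simplification for illustration and not a claim from Amazon; the step counts are also illustrative.

```python
def workflow_success(per_step: float, steps: int) -> float:
    """Chance that all steps succeed, assuming independent, equal steps."""
    return per_step ** steps


for per_step in (0.90, 0.99):
    for steps in (5, 10):
        rate = workflow_success(per_step, steps)
        print(f"per-step {per_step:.2f}, {steps:>2} steps -> {rate:.1%}")
```

Under these assumptions, a 90% per-step rate over ten steps works out to roughly 35% end to end, while 99% per step stays above 90%.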

Luan stressed the importance of focusing on reliability. He said: “What we really focused on is how to make agents actually reliable.”

Amazon AGI evaluated Nova Act against competing models including Anthropic’s Claude 3.7 Sonnet and OpenAI’s CUA. On the ScreenSpot Web Text benchmark, which tests instruction following on text-based screen elements, Nova Act scored 0.939, outperforming Claude 3.7 Sonnet (0.900) and OpenAI CUA (0.883).

Amazon Nova Act benchmarks. Credit: Amazon

On the ScreenSpot Web Icon benchmark, which focuses on visual user interface elements, Nova Act scored 0.879, again ahead of the other models.

However, on the GroundUI Web benchmark, which tests general UI interaction, Nova Act scored 0.805, slightly behind its competitors.

These scores were measured internally by Amazon using fixed prompts and evaluation criteria.

Amazon also highlighted early results on Nova Act’s ability to generalize beyond standard environments.

For example, team member Rick Leo showed how the agent, without explicit training, successfully interacted with a web game: allocating stats, battling opponents, and progressing through the game.

According to Luan, this generalization is essential to the long-term vision. He said: “Our goal with Nova Act is to be the universal solution for browser use. We want an agent that can do anything you want to do on a computer for you.”

Flexible to use across different clouds, but locked to Amazon’s Nova model

While Nova Act is available to developers worldwide through nova.amazon.com, Luan explained that the system is tightly coupled to Amazon’s Nova foundation models.

Developers cannot plug in external LLMs such as OpenAI’s GPT-4o or Anthropic’s Claude 3.7 Sonnet, unlike with the OpenAI Agents SDK and, to a lesser extent, Microsoft’s AutoGen and Salesforce’s Agentforce platforms (which allow a select few partner companies and model families).

“Nova Act is a custom-trained version of the Nova model,” he said. “It is not just scaffolding on a general LLM. It is natively trained to act on the internet on your behalf.”

However, Nova Act is not limited to AWS environments. Developers can download and run the SDK locally, in the cloud, or anywhere they choose. “You do not need to be on AWS to use it,” Luan stated.

Thus, for companies looking for maximum flexibility in the models behind their agents, Nova Act may not be the best option. However, for those looking for a purpose-built model designed specifically to navigate the web and perform actions across a wide range of websites with very different user interfaces (UIs), it is likely worth a look, especially if you are already in the Amazon or AWS developer ecosystem.

Security, licensing and pricing

The Nova Act SDK is released under the Apache License, Version 2.0 (January 2004), which is a permissive open-source license. However, this applies only to the SDK.

The Nova Act model itself, along with its weights and training data, is proprietary and remains closed. This is intentional, according to Luan, who explained that the model is tightly integrated and co-trained with the SDK to achieve reliability.

At launch, Nova Act is offered as a free research preview. There is no announced pricing for production use yet.

Luan described this stage as an opportunity for developers to experiment and build with the technology. He said: “Our belief is that the majority of the most useful agent products have not yet been built. We want to enable anyone to build a truly useful agent, whether for themselves or as a product.”

In the long run, Amazon plans to offer production-grade terms, including usage-based billing and scaling guarantees, but these are not yet available.

What’s next for Nova Act?

The Nova Act release reflects Amazon’s broader ambition to make action-oriented AI agents a core computing component.

Luan summed up the opportunity ahead: “My personal dream is for agents to become a building block of computing, and for the coolest new startups and products to be built on top of what our team is building.”

The Nova Act SDK is now available for experimentation and prototyping via Amazon’s website and on GitHub.


