The Nova Act AI Agent System Revealed by Amazon AGI Labs Can Manage Browsers to Carry Out Tasks

In Brief

In a recent announcement, Amazon AGI Labs has rolled out the Nova Act AI model specifically engineered to handle tasks directly within web browsers. Alongside this, the company has shared a research preview of the SDK, providing developers with an early opportunity to test its capabilities.

Amazon AGI Labs, the company's division dedicated to advancing artificial general intelligence (AGI), has launched Amazon Nova Act, an AI model built to carry out tasks within a web browser.

As part of the launch, Amazon AGI Labs has also released a research preview of the Amazon Nova Act SDK, which gives developers access to an early version of the model. With this toolkit, developers can build agents that handle a range of browser tasks, such as submitting out-of-office requests in internal systems, scheduling calendar appointments, or sending notifications about being away from the office.

The Nova Act SDK lets developers break complex workflows into smaller, more manageable commands, such as searching, checking out, or answering questions about what is visible on screen. Developers can also embed specific instructions in those commands (for example, 'do not accept insurance upsell offers'), call APIs, and use Playwright for direct browser manipulation, which improves reliability for steps such as entering passwords. Because commands can be interleaved with ordinary Python code, developers can add tests, breakpoints, assertions, or parallel threads. Parallelism matters because even the fastest agents are limited by web page load times.
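The decomposition described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Nova Act SDK's actual API: `StubAgent`, `act`, `run_steps`, `run_in_parallel`, and the example prompts are hypothetical stand-ins, and a stub replaces the real browser so the snippet runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records commands."""
    name: str
    log: list = field(default_factory=list)

    def act(self, prompt: str) -> str:
        # A real agent would drive a browser here; the stub just logs the prompt.
        self.log.append(prompt)
        return f"done: {prompt}"


def run_steps(agent: StubAgent, steps: list) -> list:
    """Run small commands one at a time, asserting on each result."""
    results = []
    for step in steps:
        result = agent.act(step)
        assert result.startswith("done"), f"step failed: {step}"
        results.append(result)
    return results


# One workflow split into small commands, with an instruction embedded in a step.
CHECKOUT_STEPS = [
    "search for a coffee maker",
    "select the first result",
    "check out; do not accept insurance upsell offers",
]


def run_in_parallel(queries: list) -> dict:
    """Run one independent agent per query so slow page loads can overlap."""
    def worker(query: str) -> list:
        agent = StubAgent(name=query)
        return run_steps(agent, [f"search for {query}", "read the top price"])

    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order, so zip pairs each query with its results.
        return dict(zip(queries, pool.map(worker, queries)))


if __name__ == "__main__":
    print(run_steps(StubAgent(name="shopper"), CHECKOUT_STEPS))
    print(run_in_parallel(["kettle", "toaster"]))
```

Keeping each command small makes a failed assertion point at a single step, and giving every thread its own agent avoids sharing browser state between parallel runs.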

Nova Act: An AI Model Promising Over 90% Accuracy for Complicated Web Interactions

Nova Act is designed to provide dependable building blocks that can be assembled into more sophisticated workflows, an approach that sets it apart from the many agent benchmarks that focus on high-level task completion. On those benchmarks, state-of-the-art models typically achieve only 30% to 60% task completion accuracy in web browsers. Nova Act prioritizes reliability instead: Amazon AGI Labs targets more than 90% accuracy on internal evaluations of capabilities that commonly trip up other models, such as selecting dates, navigating dropdowns, and handling pop-ups. The model is also built to perform well on benchmarks such as ScreenSpot and GroundUI Web, which measure how well an AI interacts with web interfaces. It scores 0.939 for handling text on screenshots, 0.879 for engaging with visual components, and 0.805 for understanding and interacting with UI elements across web pages.

Beyond raw capability, Nova Act puts a premium on reliability. Once an agent has been set up, users do not need to monitor it constantly. They can switch on headless mode, which effectively turns the agent into an API that other systems can call, or schedule it to run tasks asynchronously.
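One way to picture the "agent as an API" idea is a plain function that other code can call and schedule. The sketch below is an illustration under stated assumptions rather than Nova Act's interface: `run_agent_task`, `submit`, and `run_all` are hypothetical names, the agent run is a placeholder, and the scheduling uses only Python's standard `asyncio` module.

```python
import asyncio
import time


def run_agent_task(task: str) -> dict:
    """Hypothetical placeholder for a blocking, headless agent run."""
    time.sleep(0.05)  # stands in for page loads and browser actions
    return {"task": task, "status": "completed"}


async def submit(task: str) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(run_agent_task, task)


async def run_all(tasks: list) -> list:
    # Schedule every task concurrently; gather returns results in input order.
    return await asyncio.gather(*(submit(t) for t in tasks))


if __name__ == "__main__":
    jobs = ["book a meeting room", "set an out-of-office reply"]
    print(asyncio.run(run_all(jobs)))
```

Wrapping the blocking call in `asyncio.to_thread` (Python 3.9+) lets a caller start several runs and collect the results later without watching each one.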

Although the model is still at an early stage, Amazon AGI Labs is optimistic about Nova Act's ability to transfer its understanding of user interfaces to new environments. Initial tests indicate that it performs well in unfamiliar settings such as web-based games, even without any prior exposure to gaming.

Thanks to these reliable building blocks and this adaptability, Nova Act is already being integrated into Alexa+, where it navigates the web autonomously and handles tasks when integrated services do not provide the required APIs.

I’m excited to announce the debut release from the AGI SF Lab: Meet Nova Act — the easiest way to create agents that can efficiently utilize browsers, unlocking access to a significant portion of our digital universe.

This initiative brings us a step closer to developing universal agents that can operate seamlessly in both digital and physical realms.
— Pieter Abbeel (@pabbeel) March 31, 2025

Nova Act marks a significant step in Amazon AGI Labs' mission to build the core capabilities that scalable, effective agents need. This first release is only one stage of a broader training plan for the model. To make agents genuinely intelligent and reliable on complex, multi-step tasks, Amazon AGI Labs argues that they should be trained with reinforcement learning across varied real-world environments rather than with supervised fine-tuning on simple demonstrations alone. The team says it looks forward to sharing further research and updates as the model progresses.
