A Closer Look at the ARC-AGI Test for AGI and Its Flaws
2024-12-10
In the realm of artificial general intelligence (AGI), a significant milestone has been reached. A well-regarded test, the ARC-AGI benchmark, which was introduced by Francois Chollet in 2019, is now closer to being solved. However, Chollet and his team claim that this development actually highlights flaws in the test's design rather than marking a genuine research breakthrough.

ARC-AGI: A Key Test for AGI Progress

In 2019, Francois Chollet, a prominent figure in the AI world, introduced the ARC-AGI benchmark. The benchmark, short for “Abstraction and Reasoning Corpus for Artificial General Intelligence,” was designed to assess whether an AI system can effectively acquire new skills outside the data it was trained on. Chollet asserts that ARC-AGI remains the only AI test that measures progress towards general intelligence, although other tests have been proposed.

Before this year, the top-performing AI could solve only about one-third of the tasks in ARC-AGI. Chollet attributed this to the industry's overemphasis on large language models (LLMs), which he believes lack true “reasoning” capabilities.

“As LLMs mainly rely on memorization, they struggle with generalization,” he stated in a series of posts on X in February. “They fail when faced with anything not in their training data.”

Indeed, LLMs are statistical machines. By training on a large number of examples, they learn patterns to make predictions, such as the typical sequence of “to whom” followed by “it may concern” in an email.

Chollet argues that although LLMs may be able to memorize “reasoning patterns,” it is unlikely that they can generate “new reasoning” in novel situations. “If you need to be trained on numerous examples of a pattern to learn a reusable representation, you are essentially memorizing,” he contended in another post.

To encourage research beyond LLMs, Chollet and Mike Knoop, the co-founder of Zapier, launched a $1 million competition in June to develop open-source AI capable of beating ARC-AGI. Out of 17,789 submissions, the best-performing model scored 55.5%, about 20 percentage points higher than the 2023 top scorer but still short of the 85% “human-level” threshold required to win.

However, Knoop emphasizes that this does not mean we are approximately 20% closer to achieving AGI.

In announcing the ARC Prize 2024 winners on X, along with an extensive technical report on what was learned from the competition, the organizers noted that state-of-the-art performance on ARC-AGI rose from 33% to 55.5%, the largest single-year improvement since 2020.

In a blog post, Knoop pointed out that many of the submissions to ARC-AGI achieved their results through “brute force” rather than true intelligence. He suggested that a “large fraction” of ARC-AGI tasks “[do] not carry much useful signal towards general intelligence.”

ARC-AGI consists of puzzle-like problems in which an AI, given a grid of different-colored squares, must generate the correct “answer” grid. The problems are designed to force the AI to adapt to tasks it has not encountered before, but it remains unclear whether they truly achieve this goal.

[Figure: Tasks in the ARC-AGI benchmark. Models must solve “problems” in the top row; the bottom row shows solutions. Image credits: ARC-AGI]

Knoop acknowledged that “[ARC-AGI] has been unchanged since 2019 and is not perfect.”

Chollet and Knoop have also faced criticism for overemphasizing ARC-AGI as a benchmark for AGI at a time when the definition of AGI is highly debated. One OpenAI staff member recently claimed that if AGI is defined as AI “better than most humans at most tasks,” then AGI has “already” been achieved.

Chollet and Knoop plan to release a second-generation ARC-AGI benchmark and organize a 2025 competition to address these issues. “We will continue to guide the efforts of the research community towards what we consider the most important unsolved problems in AI and accelerate the path to AGI,” Chollet wrote in an X post.

Fixing the shortcomings of the first ARC-AGI test will not be an easy task. If the flaws of the initial test are any indication, defining intelligence for AI will be as challenging and controversial as it has been for human beings.
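For readers curious what the grid puzzles described above look like in practice, here is a minimal sketch in Python. It assumes the JSON-style layout of the publicly released ARC data, where each task holds "train" and "test" lists of input/output grids and each cell is an integer color code. The toy task, the mirror_rows and identity solvers, and the score helper are all hypothetical illustrations and are not taken from the benchmark or from any competition entry. The exact-match check is a simplification of how an answer grid would be judged.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # each cell is an integer color code

# A made-up toy task (NOT from the real benchmark), laid out like the
# public ARC JSON files. The hidden rule: mirror each row left to right.
toy_task: Dict[str, List[Dict[str, Grid]]] = {
    "train": [
        {"input": [[1, 0, 0], [2, 2, 0]], "output": [[0, 0, 1], [0, 2, 2]]},
        {"input": [[3, 0], [0, 4]], "output": [[0, 3], [4, 0]]},
    ],
    "test": [
        {"input": [[5, 0, 6]], "output": [[6, 0, 5]]},
    ],
}


def mirror_rows(grid: Grid) -> Grid:
    """Hypothetical solver: reverse every row of the grid."""
    return [list(reversed(row)) for row in grid]


def identity(grid: Grid) -> Grid:
    """Baseline solver that returns a copy of the input unchanged."""
    return [row[:] for row in grid]


def fits_training_pairs(solver: Callable[[Grid], Grid], task: dict) -> bool:
    """Check that the solver reproduces every demonstration pair."""
    return all(solver(p["input"]) == p["output"] for p in task["train"])


def score(solver: Callable[[Grid], Grid], task: dict) -> float:
    """Fraction of test grids the solver reproduces exactly, cell for cell."""
    tests = task["test"]
    solved = sum(solver(p["input"]) == p["output"] for p in tests)
    return solved / len(tests)


if __name__ == "__main__":
    for name, solver in [("mirror_rows", mirror_rows), ("identity", identity)]:
        print(name, fits_training_pairs(solver, toy_task), score(solver, toy_task))
```

The point of the format is that a solver only gets credit when the whole output grid matches, and the rule behind each task has to be inferred from a handful of demonstration pairs.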
Automattic Buys WPAI to Enhance WordPress with AI
2024-12-10
WordPress hosting giant Automattic announced on Monday that it has acquired WPAI, a startup specializing in AI solutions for WordPress. The price of the acquisition was not disclosed.

Unlock the Potential of WordPress with Automattic's AI Acquisition

Acquisition Details

Automattic, a renowned name in the WordPress hosting space, announced the acquisition of WPAI. WPAI's products include CodeWP, which uses AI to create WordPress plugins; AgentWP, an AI assistant for WordPress site builders; and WP Chat, an AI-powered chat for WordPress-related questions. However, WPAI has stated that CodeWP and AgentWP will be phased out in their current form and integrated into Automattic's offerings in the future.

The founding team of WPAI will be joining Automattic to lead the efforts in developing AI features for WordPress. As Automattic mentioned in their announcement, "They’ll be working on testing, building, and integrating innovative AI solutions into the core ecosystem to redefine how users and developers work with WordPress." This acquisition is set to bring a new wave of technological advancements to the WordPress platform.

CEO's Perspective

Automattic's CEO Matt Mullenweg also shared the news on his personal blog. He emphasized the importance of this acquisition in shaping the future of WordPress. Mullenweg believes that by integrating WPAI's AI solutions, Automattic can offer users and developers even more powerful tools to enhance their WordPress experience.

WPAI, on its blog, expressed its focus on creating applied AI solutions for the WordPress ecosystem. This includes developing AI standards for WordPress, improving the platform's core functionality, and creating tools that help users build and manage better websites. The company will work closely with the WordPress community to ensure that these improvements are implemented thoughtfully while maintaining open-source values.

Previous AI Initiatives and Future Plans

Over the past few years, Automattic has introduced AI tools that help users write better and more concise posts. With this acquisition, the company is likely to focus on creating AI-powered developer and site-building tools, giving users more options to customize and optimize their websites.

It's worth noting that this is Automattic's second acquisition in two months. Last month, the company acquired Harper, a Grammarly competitor for developers that checks grammar locally on the device. These acquisitions demonstrate Automattic's commitment to staying at the cutting edge of technology and providing the best solutions for WordPress users.

Legal Battle with WP Engine

Both Automattic and Mullenweg are involved in a legal battle with rival WordPress hosting company WP Engine. WP Engine has accused Mullenweg of anti-competitive behavior, while Automattic and Mullenweg have argued that WP Engine infringed the "WordPress" trademark and did not contribute enough to the ecosystem. The judge in the case indicated last month that the court would grant some form of preliminary injunction, but the specifics of the order still need to be finalized.

This legal battle adds an interesting dimension to the story and highlights the competitive landscape in the WordPress hosting industry. However, Automattic's acquisition of WPAI shows its determination to continue driving innovation and growth in the WordPress ecosystem.

Amazon Establishes New R&D Lab for AI Agents Led by Adept Co-founder
2024-12-09
Amazon is making significant strides in the field of artificial intelligence by establishing a new R&D lab in San Francisco. This move is set to focus on building "foundational" capabilities for AI agents, marking a new era in the company's technological pursuits.

Unleashing the Potential of AI Agents with Amazon's New Lab

Building AI Agents for Real-World Actions

Amazon is establishing the Amazon AGI SF Lab, led by David Luan, the co-founder of AI startup Adept. The lab aims to build agents that can take actions in both the digital and physical worlds and handle complex workflows using various tools. "Our initial focus is on several key research bets that will enable AI agents to perform real-world actions, learn from human feedback, self-course-correct, and infer our goals," wrote Luan and Pieter Abbeel. The lab will be seeded by Adept employees, and Amazon is looking to hire a few dozen additional researchers in fields like quantitative finance, physics, and math.

This is a crucial step in Amazon's journey towards more advanced AI capabilities. By focusing on real-world actions, the company hopes to create agents that can have a tangible impact on various aspects of our lives.

Amazon's Quasi-Acquisition of Adept and Its Implications

In June, Adept, which is developing AI-powered agents, agreed to license its tech to Amazon, and Luan and portions of Adept's team joined the e-commerce giant. This quasi-acquisition resembles the deal Microsoft struck with AI company Inflection in March and has come under regulatory scrutiny.

Adept was founded with the goal of creating an AI model that can perform actions on any software tool using natural language. Many others now share this vision, as evidenced by the growing interest in agentic AI. According to Emergen Research, "agentic" AI could be worth $31 billion by the end of the year. Eighty-two percent of organizations plan to integrate AI agents within three years, attracted by the possible efficiency boosts.

The deal and the focus on agentic AI show Amazon's commitment to staying at the forefront of technological innovation and leveraging the power of AI to drive business growth and improve user experiences.

Amazon's Existing and Future Ventures in the Agent Space

Amazon has dabbled in the agent space but has yet to make a serious play. In July, the company announced conversational agents for its Bedrock AI development platform, and just last week, it brought agents to its Amazon Q Business assistant platform for business customers and developers. Amazon CEO Andy Jassy has hinted at a more agentic Alexa, one capable of not only responding to questions but taking actions.

This indicates that Amazon is gradually expanding its presence in the agent space and exploring new ways to integrate AI agents into its existing products and services. By doing so, the company aims to provide more personalized and efficient experiences to its users.