* A version of this opinion piece was published by Daily Maverick on 2 April 2025 and can be accessed here.
Until recently the idea of agentic artificial intelligence (AI) systems operating in the real world seemed like science fiction. But that’s no longer the case.
A case in point is the Chinese startup Butterfly Effect’s Manus, launched recently. Unlike many conventional AI models, Manus integrates multiple AI systems and is designed to operate with minimal human oversight. Demand has been overwhelming, with millions of people on the waiting list.
Agentic AI systems offer clear societal benefits. Yet, their risks will be particularly difficult for policymakers to manage.
What is an agentic AI system?
To understand agentic AI, we need to start with defining AI more broadly. Despite ongoing debate, the OECD definition remains widely accepted:
“An AI system is a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.”
What makes agentic AI different from conventional AI systems? A key distinction is autonomy. These systems require far less human supervision than traditional AI. While most AI models execute predefined tasks in response to human input, agentic AI can initiate and complete tasks independently. Closely linked to this is adaptiveness. Unlike conventional AI, which functions best in static environments, agentic AI dynamically adjusts its behaviour based on changing conditions.
A further distinguishing factor is the complexity of objectives. Traditional AI systems optimise for clear-cut goals, but agentic AI must navigate multiple, evolving, and sometimes conflicting objectives. At the heart of agentic AI is reinforcement learning (RL), a machine learning approach that allows these systems to optimise their behaviour through trial and error. Instead of being programmed with fixed rules, RL-based systems refine their actions over time by optimising for rewards based on past experiences.
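The trial-and-error logic described above can be sketched in a few lines of Python. This is a stylised illustration, not a production RL system: the two actions, their hidden reward probabilities, and the exploration rate are all invented for the example. The agent is never told which action is better; it discovers this by acting, observing rewards, and updating its estimates.

```python
import random

random.seed(0)

ACTIONS = ["A", "B"]
TRUE_REWARD = {"A": 0.3, "B": 0.7}  # hidden from the agent (invented values)

values = {a: 0.0 for a in ACTIONS}  # the agent's learned reward estimates
counts = {a: 0 for a in ACTIONS}

for step in range(2000):
    # Explore occasionally; otherwise exploit the best-known action.
    if random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=values.get)

    # The environment pays out a reward with the action's hidden probability.
    reward = 1.0 if random.random() < TRUE_REWARD[action] else 0.0

    # Incremental average: refine the estimate from past experience.
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]

# After enough trials, the agent's estimates favour the higher-reward action.
best = max(ACTIONS, key=values.get)
```

No fixed rule ever says "choose B"; the preference emerges entirely from accumulated experience, which is the property that makes agentic systems adaptive and, as discussed below, hard to fully anticipate.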
Agentic AI in action
Agentic AI is already in use across multiple sectors. In healthcare, AI-driven systems manage chronic illnesses by tracking patient histories, reminding individuals to take medication, and even adjusting prescriptions in response to treatment outcomes. Researchers have also developed multi-agent diagnostic systems where multiple AI models collaborate like a team of specialists, improving the accuracy of diagnoses for rare or complex diseases.
In finance, agentic AI analyses market data in real time, and executes trades at speeds that humans simply cannot match. Cybersecurity has also benefited, with autonomous AI systems that not only detect threats but also respond instantly, sometimes even patching vulnerabilities before they can be exploited.
In manufacturing, agentic AI is transforming operational efficiency. Predictive maintenance models now detect potential equipment failures before they happen, reducing costly downtime. Across these sectors, agentic AI reduces human workload, increases precision, and improves responsiveness to complex challenges.
The novel risks of agentic AI systems
Despite major societal benefits, agentic AI introduces new risks that will be particularly difficult to regulate.
Many of these risks are driven by the proliferation problem. Open-source AI models, while promoting innovation, also make it nearly impossible to track their use. In some cases, the technology underlying agentic AI is stolen, further driving proliferation. The problem, of course, is not proliferation in itself. Rather, the challenge arises when proliferation leaves policymakers unable to regulate the use of these systems.
The proliferation of agentic AI systems powers the malicious use problem, where bad actors exploit agentic AI to cause large-scale harm. Already, agentic AI has been used for voice-cloning scams and the mass generation of fake news.
Malicious use, in turn, is exacerbated by the unexpected capabilities problem. As AI models become more sophisticated, they sometimes develop unanticipated abilities that could be misused, with developers only realising the risks after deployment.
Over the medium term, the overuse of agentic AI could contribute to overreliance and disempowerment. As agentic AI becomes embedded in high-stakes fields like finance and law, it could become impossible for human operators to detect failures or intervene effectively. In some cases, humans might not even understand when or why an AI system is malfunctioning, let alone how to correct it.
The risks that are most difficult to regulate
Yet the most challenging risks, in my view, stem from how reinforcement learning (RL) shapes agentic AI behaviour.
RL agents optimise their actions based on a reward function, learning through trial and error. While developers define high-level goals, AI systems often develop instrumental goals, namely intermediate objectives that help them achieve their broader tasks. Already in 2008, Stephen Omohundro argued that sufficiently advanced AI would pursue instrumental goals such as acquiring resources or increasing computing power to improve performance. More recent research has confirmed this intuition.
A particularly concerning category of instrumental goals is convergent instrumental goals, which are objectives that are useful across many different AI tasks. These may include accumulating influence over an environment or even manipulating users to ensure goal completion. The challenge is that these goals emerge without human oversight, making them difficult for policymakers to detect, let alone regulate.
Reward hacking is also fiendishly difficult to regulate. This happens when an AI system finds unintended shortcuts to maximise its reward, sometimes in harmful ways. A well-documented example is when engagement-optimised AI systems (such as those used in social media) promote extreme or emotionally manipulative content because it increases watch time or user interactions—despite the broader harm it may cause.
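The reward-hacking dynamic described above can be made concrete with a deliberately simple sketch. The content categories and payoff numbers below are invented for illustration: the optimiser sees only the engagement score it is rewarded for, while the welfare effect, the thing we actually care about, is invisible to it.

```python
# Toy illustration of reward hacking: the system maximises a proxy
# metric (engagement), not the outcome society cares about (welfare).
# All names and numbers here are hypothetical.
CONTENT = {
    # name: (engagement reward seen by the AI, welfare effect unseen by it)
    "balanced_news": (1.0, +1.0),
    "cat_videos":    (2.0, +0.5),
    "outrage_bait":  (5.0, -2.0),
}

# The optimiser ranks purely on the reward signal it observes...
chosen = max(CONTENT, key=lambda name: CONTENT[name][0])
engagement, welfare = CONTENT[chosen]

# ...so it picks the option with the highest engagement, even though
# that option carries a negative welfare effect it cannot see.
```

The failure is not a bug in the optimiser, which does exactly what it was rewarded to do; the harm enters through the gap between the proxy reward and the true objective, which is precisely what makes reward hacking so hard to regulate after the fact.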
Over the medium to long term, specific types of reinforcement learning (RL) agents present particular challenges. As discussed in Science, one of the world’s most highly cited journals, sufficiently capable long-term planning agents may have incentives to ‘thwart human control’. These RL agents, with extended planning horizons, could make human oversight close to impossible. Moreover, the potential for humans to withhold rewards ‘strongly incentivises the AI system to take humans out of the loop’. Mechanisms by which long-term planning agents could do so include taking control over human infrastructure and creating other agents to act on their behalf.
What should policymakers do?
A growing community of researchers is exploring how to regulate agentic AI, often within the broader field of frontier AI governance. However, much work remains, especially in Africa.
One promising approach is a regulatory model that combines principles-based and rules-based regulation. This approach acknowledges two key realities: first, that we do not yet fully understand the risks of agentic AI, and second, that existing safety mechanisms remain underdeveloped. Given this uncertainty, policymakers must urgently build regulatory capacity while fostering much closer collaboration between AI developers and regulators.
There is also recognition that pure self-regulation is insufficient. While the AI industry has made real efforts to prioritise safety, the fundamental problems remain: the incentives of AI developers are misaligned with the public interest, and the potential scale of the negative externalities produced by AI systems weakens the case for relying on self-regulation.
Agentic AI represents a major technological leap, offering immense benefits but also introducing unpredictable risks. Unlike traditional AI, agentic systems act autonomously, adapt to new environments, and pursue complex, self-generated objectives, often in ways that are difficult to regulate. For policymakers, the challenge is twofold: understanding these risks and developing governance structures that can keep pace with rapid technological advancements. It sounds simple, but putting this into practice will be no easy feat.