Securing an AI agent is unlike securing a normal application. The agent is non-deterministic, it reads untrusted content, and — unlike a chatbot — it can take actions. That combination creates a new attack surface. The reassuring part is that the defences are well understood: they are about boundaries and visibility, not about a perfect model.
A new attack surface
A traditional app does what its code says. An agent decides what to do based partly on the content it reads — which means a malicious document or email can attempt to influence its behaviour. So agent security has two jobs: limit what the agent is allowed to do, and limit what an attacker can achieve even if they steer it.
Least-privilege by default
The single most important control is access. An agent's permissions are its blast radius. Scope its credentials to only the tools and data its workflow needs, prefer read-only access, and never reuse a broad service account. If an agent only needs to read three systems and write to one, that is all it should be able to touch. This is the foundation of the AgentOps control layer.
Prompt injection and untrusted input
Because agents act on what they read, any external content — documents, emails, web pages, tool outputs — is a potential injection vector. You cannot fully prevent a model from being influenced by text, so the defence is architectural: treat all input as untrusted, and constrain the agent so that even a successfully injected instruction cannot reach a dangerous action. The agent's capabilities, not its prompt, are your real boundary.
Guardrails and the human gate
On top of access, constrain the actions themselves: allow-lists for what the agent can call, validation and limits on each action, and human approval for anything irreversible or high-consequence. We cover how to design that approval well in human-in-the-loop AI. The point is defence in depth — least privilege, guardrails and a human gate together, because no single layer is sufficient.
Logging, monitoring and rollback
Finally, assume something will eventually go wrong and prepare for it. Log every tool call, decision and approval so any action can be reconstructed. Monitor for unusual behaviour and cost. And build the ability to pause, roll back or restrict the agent instantly, with a named owner and a defined process. Security is not just prevention; it is the speed and clarity of your response. That readiness is part of the assurance and evidence we build into every deployment.