From: aidotengineer

AI agents are defined as autonomous systems capable of reasoning, creating their own workflows, calling tasks, and using tools to make API calls [00:00:57]. Tasks are specific actions that may use LLMs, RAGs, or tools [00:01:13]. Tools are functions used to retrieve data from the internet, databases, or service APIs [00:01:24]. Memories are contexts shared between agents, tasks, and tools [00:01:36].

Inherent Security Risks of Current AI Agent Implementation

Current agent frameworks often run as a single process where the agent, task, and tools reside in the same process [00:02:03]. This architecture presents several challenges:

  • Shared Credentials: If a tool needs access to a database or an API, it requires credentials or share tokens, which are often service user credentials with super admin privileges [00:02:15].
  • Process-level Access: Within the same process, one tool can technically access credentials meant for another [00:02:33]. Similarly, any third-party library running within the process can access prompts and other data [00:02:47]. This creates an unsecure, “zero trust” environment [00:02:51].
  • Insecure LLMs: If agents and tasks communicate with unsecured LLMs, these can be exploited [00:02:58].
  • Autonomous and Non-Deterministic Behavior: Agents are autonomous by definition, meaning they generate their own workflows based on tasks [00:03:10]. This non-deterministic nature introduces “unknown unknowns” in security, making it difficult to predict what an agent might do [00:03:21]. The attack vectors in a typical AI agent are significantly higher compared to traditional software [00:03:30].

Major Challenges with AI Agent Security

Due to these inherent risks, several challenges arise when building and deploying AI agents:

Security Risks

  • Improperly designed or implemented agents can lead to unauthorized access [00:03:46].
  • Data leakages of sensitive or confidential information are a significant concern [00:03:51].

Safety and Trust

  • Using unreliable models or having an unsafe environment can lead to wrong results, especially if prompts are maliciously altered [00:04:00].

Compliance and Governance

  • Many AI engineers are focused solely on getting agents to work, often overlooking crucial aspects like compliance and governance necessary for enterprise readiness [00:04:16].
  • Organizations, such as credit bureaus, treat AI agents similarly to human users [00:04:40]. This means agents must adhere to the same onboarding, training, and regulatory compliance processes as human employees [00:04:45]. This includes regulations around data access for specific regions (e.g., California resident data, Europe data) [00:04:52]. Without meeting these regulatory requirements, agents cannot be moved into production [00:05:23].

Addressing the Challenges: A Multi-Layered Approach

There is no “silver bullet” for security and compliance [00:05:38]. A layered approach involving evaluation, enforcement, and observability is recommended [00:05:43].

1. Evaluations (Evals)

Evals determine the criteria for deploying an agent to production and aim to generate a risk score [00:05:53]. This risk score helps decide if an agent, even a third-party one, is safe enough to be promoted [00:06:16].

Similar to traditional software development (e.g., test coverage, vulnerability scanning for Docker containers, CVE scanning for third-party software, pen testing for cross-site scripting) [00:07:22]:

  • Baseline Testing: Establish a baseline using appropriate use cases and ground truth to ensure that changes (e.g., prompt changes, new libraries, frameworks, or LLMs) do not alter the expected behavior [00:08:11].
  • LLM and Library Vulnerability Scanning: Ensure third-party LLMs are not poisoned and have been scanned for vulnerabilities [00:08:31]. Similarly, third-party libraries must meet minimum vulnerability criteria [00:08:38].
  • Prompt Injection Testing: Test for prompt injection attacks and ensure the application has controls to block them [00:08:48].
  • Data Leakage Evaluation: Verify that agents do not leak sensitive data [00:09:04]. For example, an HR agent should not allow an employee to access another’s salary benefits, or prevent malicious users from exploiting loopholes [00:09:39].
  • Unauthorized Actions: For agents that perform actions or change things, ensure these actions are performed by authorized personnel [00:09:55].
  • Runaway Agents: Test for scenarios where agents can enter a tight loop due to bad prompts or task/agent design issues, preventing them from going into production [00:10:14].

2. Enforcement

Enforcement ensures the proper implementation of security controls, particularly in a zero-trust environment where libraries can access various resources [00:10:39].

  • Authentication and Authorization: These are crucial for enterprise-level security [00:11:18].
    • Authentication: Prevents impersonation and theft of confidential information by ensuring proper user identification throughout the agent’s interaction flow (user request to agent, task, tools, and API/database calls) [00:11:25]. The user’s identity must be propagated to the final data access point [00:13:30].
    • Authorization: Ensures that access control policies are properly applied [00:11:46]. Agents have their own roles and should not exceed their defined scope [00:11:54]. If an agent acts on behalf of a user, the user’s specific role and permissions must be enforced, preventing access to data or APIs beyond their authorized scope [00:12:06].
  • Approvals: Integrate workflows where agents can automatically approve certain actions based on defined thresholds and guard rails [00:12:30]. For actions exceeding a certain limit, a human can be brought into the loop for approval [00:13:06].

3. Observability

Observability is critical in the agent world due to its dynamic nature, unlike traditional software which is more static [00:13:56].

  • Continuous Monitoring:
    • Model Changes: Models evolve rapidly [00:14:18].
    • Framework and Library Evolution: Agent frameworks and third-party libraries constantly change their behavior [00:14:23].
    • User Input Variability: Agent behavior is highly subjective to user input [00:14:32]. Agents must be monitored to see how they behave with diverse user queries beyond initial testing [00:14:51].
    • Data Leakage: Monitor for anomalies in the amount of PII or confidential data being sent out, allowing for quick action if detected [00:15:03].
  • Metrics and Thresholds: It’s impractical to monitor every request [00:15:15]. Instead, define thresholds and metrics such as failure rates [00:15:21]. Alerts can be triggered when failure rates exceed tolerance levels, indicating potential misbehaving agents or malicious users [00:15:37].
  • Anomaly Detection: Similar to User Behavior Analytics in traditional security, anomaly detection for agents involves monitoring whether an agent behaves within accepted boundaries [00:15:53]. This will contribute to a security posture report, providing near real-time confidence in how an agent is performing in a live environment [00:16:27].

Recap

To ensure safe and reliable AI agents:

  • Preemptive Vulnerability Evaluation: Conduct comprehensive evals to get a risk score, providing confidence to promote agents (or use third-party agents) to production [00:16:44].
  • Proactive Enforcement: Implement robust guard rails, enforcement mechanisms, and sandboxing to run agents securely [00:16:59].
  • Real-time Observability: Maintain strong observability to know how agents are performing in real-time and quickly fine-tune them if anomalies are detected [00:17:12].

The speaker mentions that Private AI has open-sourced their Safety and Security Solutions called “page.ai” and is looking for design partners and contributors [00:17:28].