Agent-to-Agent Communication Security Patterns: Defending Against Session Smuggling and Conversational Manipulation
The security conversation around AI agents focused for most of 2025 on authentication. Can the sending agent prove its identity? Is it authorized to make this request? Standards like Google's Agent2Agent (A2A) protocol and Anthropic's Model Context Protocol invested significant effort in identity verification. The assumption was straightforward: if you know who is calling and they are authorized to call, the interaction is safe.
In November 2025, Palo Alto Networks' Unit 42 research team demolished that assumption. Their Agent Session Smuggling research demonstrated an attack that does not exploit a vulnerability in A2A's authentication. It does not bypass credential checks. It weaponizes something more fundamental: the fact that LLM-based agents maintain conversational context across interactions, and that context shapes their behavior in ways that authentication alone cannot control.
A compromised agent with valid credentials can inject instructions into another agent's session history, causing the target agent to execute unauthorized actions (transferring funds, exfiltrating data, escalating privileges) while the interaction appears completely normal to the end user watching the conversation unfold.
The variable that determines your exposure is conversational surface area: the total accumulated context that influences an agent's decision-making during a multi-turn interaction. The more context an agent carries, the more sophisticated its collaboration. And the more context it carries, the more room there is for that context to be manipulated.
Authentication tells you who is in the conversation. Authorization tells you what they are allowed to do. Neither tells you whether the conversation itself has been manipulated to make unauthorized actions appear authorized.
A2A Threat Model Reference
Agent-to-agent communication introduces a threat model distinct from traditional API-based integration. The differences matter for defense planning.
How A2A Differs from MCP
MCP follows a stateless client-server model. Each request from a client to an MCP server is independent. The server does not maintain conversational context between requests. This means session smuggling, as Unit 42 described it, is structurally harder against pure MCP interactions. An attacker can exploit individual requests, but cannot build up manipulative context across exchanges because MCP does not maintain that context.
A2A, by design, is stateful and conversational. Agents engage in multi-turn task negotiations where context accumulates across interactions. This is what makes A2A powerful for complex collaborative tasks. Agents can refer back to earlier exchanges, build on previous agreements, and adapt their behavior based on conversational history. It is also what makes A2A vulnerable to session smuggling: the conversational memory that enables sophisticated collaboration is the same mechanism that enables multi-turn manipulation.
An enterprise running both protocols, which is increasingly common as organizations adopt multiple agent frameworks, needs distinct threat models and protocol-appropriate defenses for each.
Threat Categories for A2A
Session Smuggling. A compromised agent engages a target agent in what appears to be a legitimate multi-turn conversation, gradually building context that makes subsequent unauthorized requests seem reasonable. The attack distributes malicious intent across multiple benign-looking exchanges. There is no single malicious input to filter.
Context Poisoning. An agent's accumulated conversational context is modified, either by a compromised peer agent or through manipulation of the context storage mechanism, to alter the agent's interpretation of subsequent legitimate requests.
Credential Harvesting via Conversation. A compromised agent uses conversational interactions to extract tool schemas, session history, configuration details, and authentication tokens from target agents. The extraction occurs through what appears to be normal task coordination.
Behavioral Manipulation. Hidden instructions embedded in conversational exchanges (through tool descriptions, data payloads, or formatting artifacts) alter the target agent's behavior without modifying its authentication or authorization state.
Invisible Layer Exploitation. In production multi-agent deployments, intermediate agent-to-agent conversations are typically not displayed to end users. The attack operates in this invisible layer, where manipulated exchanges occur without any visible indication to the user or operator.
Session Integrity Verification Patterns
Pattern 1: Scope Boundary Verification at Every Handoff
When Agent A passes a task to Agent B, Agent B should not just verify Agent A's identity. It should verify that the conversational context is consistent with the task's declared scope.
Implementation: define a scope declaration for each agent interaction. The scope includes the task type, the data domains involved, and the expected action categories. At each handoff, the receiving agent compares the incoming request against the declared scope. If a research assistant suddenly introduces financial transaction requests into a conversation that started as a literature review, the scope violation triggers an alert and blocks execution.
Scope Declaration (Research Task):
task_type: information_retrieval
data_domains: [knowledge_base, public_research]
allowed_actions: [search, summarize, cite]
→ Financial transaction request VIOLATES scope
→ Trigger: alert + block + log
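A minimal sketch of the handoff-time check is below. The scope structure mirrors the declaration above; the field names and the `check_request` helper are illustrative, not from any specific agent framework.

```python
from dataclasses import dataclass, field

# Hypothetical scope declaration mirroring the example above.
@dataclass
class ScopeDeclaration:
    task_type: str
    data_domains: set = field(default_factory=set)
    allowed_actions: set = field(default_factory=set)

def check_request(scope, action, domain):
    """Return (allowed, reason); block anything outside the declared scope."""
    if action not in scope.allowed_actions:
        return False, f"action '{action}' outside declared scope"
    if domain not in scope.data_domains:
        return False, f"data domain '{domain}' outside declared scope"
    return True, "ok"

research_scope = ScopeDeclaration(
    task_type="information_retrieval",
    data_domains={"knowledge_base", "public_research"},
    allowed_actions={"search", "summarize", "cite"},
)

# A knowledge-base search passes; a funds transfer violates scope.
print(check_request(research_scope, "search", "knowledge_base"))
print(check_request(research_scope, "transfer_funds", "financial"))
```

The receiving agent runs this check before execution; a violation is the alert + block + log trigger, not a silent rejection.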
Pattern 2: Context Hash Verification
At defined checkpoints during a multi-turn agent interaction, compute a cryptographic hash of the current conversational context. Store these hashes in a tamper-evident log. If the context changes between checkpoints in ways that cannot be attributed to legitimate conversational additions, the session has been manipulated.
This does not prevent smuggling, but it creates a forensic trail that enables post-incident analysis. If an unauthorized action occurred at turn 15, and the context hash diverged from expected values at turn 8, you know where the manipulation began.
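A sketch of checkpoint hashing, assuming conversational turns are serializable to JSON; the turn structure is illustrative.

```python
import hashlib
import json

def context_hash(turns):
    """Deterministically hash the accumulated conversational context."""
    payload = json.dumps(turns, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

# Checkpoint log: (turn_number, hash) pairs appended as the session runs.
# In production these would go to a tamper-evident store, not a local list.
turns, checkpoints = [], []
for n, msg in enumerate([
    {"role": "agent_a", "text": "Begin literature review."},
    {"role": "agent_b", "text": "Searching knowledge base."},
], start=1):
    turns.append(msg)
    checkpoints.append((n, context_hash(turns)))

# Forensic replay: recompute the hash of the context as of each checkpoint.
# A mismatch pinpoints the turn before which the stored context was altered.
for n, expected in checkpoints:
    assert context_hash(turns[:n]) == expected, f"context tampered before turn {n}"
print("all checkpoints verified")
```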
Pattern 3: Session Reset at Task Boundaries
Rather than allowing unlimited conversational history between agents, implement mandatory context resets at defined task boundaries. When an agent completes one task and begins another, its conversational history with collaborating agents is cleared or compartmentalized. This limits the window within which smuggling can accumulate manipulative context.
The tradeoff is real: session resets reduce coordination continuity. An agent that forgets the previous task's context cannot reference it when starting a new task. For workflows where tasks are genuinely independent, this cost is minimal. For workflows where tasks build on each other, partial resets (clearing certain context categories while preserving others) may be necessary.
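A partial reset can be sketched as a whitelist over context categories; the category names here are illustrative placeholders, not a standard schema.

```python
# Categories that survive a task boundary; everything else is cleared.
PRESERVE = {"agent_identity", "standing_policies"}

def reset_at_task_boundary(context):
    """Drop every context category not explicitly preserved."""
    return {k: v for k, v in context.items() if k in PRESERVE}

session = {
    "agent_identity": "research-assistant-01",
    "standing_policies": ["no financial actions"],
    "task_history": ["searched knowledge_base", "summarized 12 papers"],
    "peer_messages": ["accumulated inter-agent exchanges"],
}
session = reset_at_task_boundary(session)
print(sorted(session))  # ['agent_identity', 'standing_policies']
```

The whitelist form is deliberate: a new, unanticipated context category defaults to being cleared rather than carried forward.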
Behavioral Baseline Methodology for Agent Communication
Just as User and Entity Behavior Analytics (UEBA) systems establish normal behavior patterns for human users, multi-agent systems need conversation behavior analytics for agents.
Baseline Dimensions
Topic distribution. What subjects does each agent typically discuss in its inter-agent communications? A customer data agent that suddenly begins discussing financial transaction procedures is deviating from baseline.
Request type frequency. What types of requests does each agent typically make? Track the distribution of request categories (data queries, tool invocations, status updates, task delegations) over a baseline period. Deviations from the established distribution indicate potential manipulation.
Data volume per exchange. How much data does each agent typically share in a single exchange? An agent that normally returns 500-byte responses suddenly returning 50KB responses warrants investigation.
Conversation turn count. How many turns does a typical interaction take for each agent pair? A session smuggling attack often requires more conversational turns than a legitimate interaction to build sufficient manipulative context.
Scope stability. Does the topic and action scope of an interaction remain consistent, or does it drift? Legitimate interactions tend to stay within their declared scope. Smuggling attacks require scope escalation.
Building the Baseline
Collect interaction data for a baseline period (minimum two weeks of normal operation). For each agent pair, compute statistical distributions for each dimension above. Set alert thresholds at two standard deviations from the mean. Adjust thresholds based on operational experience: thresholds that are too sensitive generate alert fatigue, while thresholds that are too permissive miss genuine attacks.
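The per-dimension computation is standard descriptive statistics; a sketch using data volume per exchange, with illustrative sample values:

```python
import statistics

def build_baseline(samples):
    """Mean, sample stdev, and a two-standard-deviation alert band."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return {"mean": mean, "stdev": stdev,
            "low": mean - 2 * stdev, "high": mean + 2 * stdev}

def deviates(value, baseline):
    """True when an observation falls outside the alert band."""
    return value < baseline["low"] or value > baseline["high"]

# Illustrative baseline: bytes per exchange for one agent pair.
volumes = [480, 510, 495, 520, 505, 490, 515, 500, 485, 525]
baseline = build_baseline(volumes)

print(deviates(505, baseline))     # typical exchange, within band
print(deviates(50_000, baseline))  # 50KB response: flagged
```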
Ongoing Monitoring
Compare real-time interaction data against baselines continuously. Alert on single-dimension deviations if they are extreme (more than three standard deviations). Alert on multi-dimension deviations at lower thresholds (two standard deviations across two or more dimensions simultaneously, which correlates with orchestrated manipulation rather than natural variation).
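The two-tier alert rule can be sketched with z-scores; the dimension names and baseline values are illustrative.

```python
def zscore(value, mean, stdev):
    return abs(value - mean) / stdev

def alert(observations, baselines):
    """Three-sigma on any single dimension, or two-sigma on two or more
    dimensions simultaneously (correlated deviation)."""
    zs = {d: zscore(observations[d], *baselines[d]) for d in observations}
    if any(z > 3 for z in zs.values()):
        return "extreme_single_dimension"
    if sum(z > 2 for z in zs.values()) >= 2:
        return "correlated_multi_dimension"
    return None

# Illustrative (mean, stdev) baselines per dimension for one agent pair.
baselines = {"turns": (6, 1.5), "bytes": (500, 15), "scope_changes": (0.2, 0.1)}

print(alert({"turns": 7, "bytes": 505, "scope_changes": 0.25}, baselines))
print(alert({"turns": 10, "bytes": 540, "scope_changes": 0.5}, baselines))
```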
Session Context Change Detection (Smuggling Indicators)
Agent Session Smuggling follows a predictable pattern. Recognizing the pattern enables detection even when individual exchanges appear benign.
Phase 1: Reconnaissance
The compromised agent engages the target in normal-seeming interactions to extract information about the target's capabilities, tools, configuration, and session history. Indicators: requests for tool schemas, questions about available actions, queries about previous interactions or current session state.
Detection signal: An agent requesting metadata about another agent's capabilities outside of the established task context. Track metadata requests separately from task requests.
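Tracking metadata requests separately can be as simple as a classifier over request kinds; the category names here are illustrative.

```python
from collections import Counter

# Request kinds treated as capability probes rather than task work.
METADATA_KINDS = {"tool_schema", "capability_list", "session_state"}

def classify(request_kind):
    return "metadata" if request_kind in METADATA_KINDS else "task"

# Illustrative request stream from one peer agent during a session.
stream = ["search", "tool_schema", "summarize", "session_state",
          "capability_list", "tool_schema"]

counts = Counter(classify(k) for k in stream)
metadata_ratio = counts["metadata"] / len(stream)

# Metadata probes outnumbering task requests is a reconnaissance signal.
print(counts["metadata"], counts["task"], round(metadata_ratio, 2))
```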
Phase 2: Context Building
The compromised agent establishes conversational precedent through multiple exchanges that make subsequent unauthorized requests appear natural. The agent might discuss financial processes in general terms before requesting a specific transaction. It might reference a previous (legitimate) data access to justify a broader (unauthorized) data request.
Detection signal: Gradual scope expansion across conversational turns. The scope of requests broadens monotonically without a corresponding broadening of the declared task.
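One hypothetical way to quantify this signal: score scope width per turn as the count of distinct action categories requested so far, then compare against the declared task scope.

```python
def scope_widths(turn_actions):
    """Cumulative count of distinct action categories, turn by turn."""
    seen, widths = set(), []
    for actions in turn_actions:
        seen |= set(actions)
        widths.append(len(seen))
    return widths

def exceeds_declared_scope(widths, declared_limit):
    """Flag when accumulated scope has widened past what the task declared."""
    return widths[-1] > declared_limit

# Illustrative session: each turn introduces a new action category.
turns = [["search"], ["search", "summarize"], ["summarize", "export"],
         ["export", "transfer_funds"]]

widths = scope_widths(turns)
print(widths)  # widens on every turn
print(exceeds_declared_scope(widths, declared_limit=3))
```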
Phase 3: Exploitation
The compromised agent leverages the established context to issue unauthorized requests that the target agent executes because they appear consistent with the conversational history.
Detection signal: Actions that would be flagged as anomalous in isolation but that appear consistent with the conversation's accumulated context. This is the hardest phase to detect because the attack has specifically constructed the context to make the unauthorized action appear authorized. Detection at this phase relies on the behavioral baselines and scope monitoring from earlier phases catching the buildup.
Agent Communication Logging Architecture
Most agent monitoring systems log inputs and outputs: what was requested and what was produced. Session smuggling lives in the intermediate instructions. Detecting and investigating it requires instruction-level logging that captures every inter-agent message, including the conversational context that accompanied each request.
Logging Requirements
Every inter-agent message, complete. Not summaries. Not truncated versions. The full message payload, including any metadata, formatting, and embedded content. Smuggling payloads can hide in formatting artifacts, tool description fields, or metadata that would be stripped from a summarized log.
Conversational context at each turn. The accumulated context window that the agent is operating within when it processes each message. This is necessary for forensic analysis: understanding why the agent made a specific decision requires seeing the full context it was operating under.
Tool call details. Every tool invocation, including the full parameter set, the tool definition at the time of invocation, and the tool's response. Log the tool definition specifically because rug pull attacks modify definitions, and you need the historical record to detect retrospectively when a definition changed.
Decision reasoning. If the agent framework supports it, log the agent's reasoning trace: why it selected a particular tool, why it shared specific data, why it approved a specific action. This trace is the primary forensic artifact for understanding whether an agent's decision was influenced by smuggled context.
Storage and Retention
Agent communication logs are high-volume. A multi-agent system with 20 agents making 100 inter-agent calls per hour generates substantial data. Design your logging architecture for write-heavy workloads with append-only semantics. Compress logs after a hot retention period but maintain them for the full investigation window (90 days minimum for security events, longer for regulated industries).
Ensure logs are tamper-evident. Use hash chains or append-only storage systems that prevent retroactive modification. If an incident investigation reveals that logs have been altered, the forensic value is destroyed.
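A hash chain can be sketched in a few lines: each entry's hash covers the previous entry's hash, so editing any record invalidates everything after it. The record structure is illustrative.

```python
import hashlib
import json

def chain_entry(prev_hash, record):
    """Build a log entry whose hash covers the previous entry's hash."""
    body = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    return {"prev": prev_hash, "record": record, "hash": digest}

def verify_chain(log):
    """Recompute every link; any retroactive edit breaks the chain."""
    prev = "genesis"
    for entry in log:
        body = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log, prev = [], "genesis"
for record in [{"turn": 1, "msg": "task start"}, {"turn": 2, "msg": "tool call"}]:
    entry = chain_entry(prev, record)
    log.append(entry)
    prev = entry["hash"]

print(verify_chain(log))            # intact chain
log[0]["record"]["msg"] = "edited"  # simulate retroactive modification
print(verify_chain(log))            # chain broken: tampering detected
```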
Sandbox Testing Methodology for Session Smuggling
Before an attacker tests your multi-agent system for session smuggling, test it yourself.
Test 1: Basic Context Manipulation
Deploy a controlled compromised agent into a sandbox multi-agent environment. Program the compromised agent to engage a target agent in a multi-turn conversation that gradually escalates from benign requests to unauthorized ones. Measure whether the target agent executes the unauthorized request and whether your monitoring systems detect the escalation.
Test 2: Invisible Layer Exploitation
Configure the test environment with the same user-facing interfaces as production. Verify that the session smuggling exchanges are invisible to the end user interface. If the manipulated conversations are visible to users, your UI is providing an accidental defense that the architecture should not rely on.
Test 3: Credential Harvesting
Program the compromised agent to extract configuration information, tool schemas, and session state from target agents through conversational interactions. Measure what information the target agents reveal and whether the extraction triggers any alerts.
Test 4: Cross-Protocol Smuggling
If your environment uses both A2A and MCP, test whether a compromised agent can leverage one protocol to influence behavior in the other. An agent that communicates via A2A might manipulate the conversational context to cause the target agent to make different MCP tool calls than it otherwise would. The attack crosses protocol boundaries, and defenses must account for this.
Test 5: Behavioral Baseline Evasion
Program the compromised agent to execute the smuggling attack while staying within the established behavioral baselines. Spread the reconnaissance, context building, and exploitation across a longer time period. Use smaller scope escalations. Stay within normal data volume ranges. This tests the robustness of your detection systems against a patient, sophisticated attacker.
Cross-Vendor Agent Handoff Security Checklist
Enterprise multi-agent deployments increasingly span multiple vendors. A Microsoft Copilot agent hands off to a custom agent built on LangChain, which invokes tools through an MCP server, which calls a third-party API. Each vendor boundary is a trust boundary with distinct security properties.
Pre-Handoff Verification
Before Agent A (Vendor 1) hands off a task to Agent B (Vendor 2):
Identity verification. Agent B's identity must be verified through a mechanism independent of the conversational channel. Do not rely on Agent B's self-identification within the conversation.
Capability verification. Agent B's declared capabilities must be validated against an approved registry. Do not trust capability claims made within the conversational exchange.
Context sanitization. Before passing conversational context to Agent B, strip content that could function as injection: embedded instructions, formatting artifacts with hidden content, metadata fields with suspicious values.
Scope declaration. Explicitly declare the scope of the handoff: what task Agent B should perform, what data it needs, and what actions it is authorized to take. Anything outside this scope should be rejected by Agent B's authorization layer.
During Handoff
Token scoping. If Agent B needs access to backend resources, issue tokens scoped to the specific resources and permissions required for the declared task. Do not pass Agent A's tokens to Agent B.
Context compartmentalization. Agent B should receive only the context necessary for its specific task, not Agent A's full conversational history. Limit the context to the declared scope.
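Token scoping and context compartmentalization can be sketched together as a handoff builder. The payload fields, the `issue_token` stand-in, and all names are illustrative, not from any vendor's protocol.

```python
def issue_token(actions):
    """Stand-in for a real token service issuing task-scoped credentials."""
    return {"permissions": sorted(actions)}

def build_handoff(full_context, scope, issue_token):
    """Agent B gets a fresh scoped token and only the declared context keys."""
    return {
        "scope": scope,
        "context": {k: full_context[k] for k in scope["context_keys"]
                    if k in full_context},
        "token": issue_token(scope["allowed_actions"]),  # never Agent A's token
    }

full_context = {
    "task_brief": "summarize Q3 research notes",
    "relevant_docs": ["doc-17", "doc-22"],
    "unrelated_history": ["prior financial discussion"],
    "agent_a_token": "SECRET",
}
scope = {"context_keys": ["task_brief", "relevant_docs"],
         "allowed_actions": {"search", "summarize"}}

handoff = build_handoff(full_context, scope, issue_token)
print(sorted(handoff["context"]))  # only the declared context keys cross over
```

Note the inclusion-list shape again: Agent A's token and unrelated history never cross the boundary unless explicitly declared.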
Post-Handoff Monitoring
Output validation. Agent A should validate Agent B's response before incorporating it into its own context or presenting it to the user. Check for embedded instructions, unexpected content types, and scope violations.
Result scope verification. Verify that Agent B's output is consistent with the declared task scope. If Agent B was asked to retrieve a product recommendation and returns financial account details, the result has violated scope.
Audit trail continuity. Ensure that the full handoff chain is logged with a single trace identifier. The audit trail should connect the user's original request through every agent handoff to the final response.
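Trace continuity amounts to threading one identifier through every hop; a minimal sketch with illustrative agent names:

```python
import uuid

def new_trace():
    """One trace id for the user's original request and every hop after it."""
    return str(uuid.uuid4())

def log_hop(audit_log, trace_id, source, target, action):
    audit_log.append({"trace": trace_id, "source": source,
                      "target": target, "action": action})

audit_log = []
trace = new_trace()
log_hop(audit_log, trace, "user", "copilot_agent", "summarize quarterly notes")
log_hop(audit_log, trace, "copilot_agent", "langchain_agent", "retrieve docs")
log_hop(audit_log, trace, "langchain_agent", "mcp_server", "search tool call")

# Continuity check: every hop in the chain carries the same trace id.
assert all(entry["trace"] == trace for entry in audit_log)
print(len(audit_log), "hops under one trace")
```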
The Third Pillar
Authentication and authorization, the traditional pillars of access control, are necessary but insufficient for securing multi-agent systems. Identity tells you who is in the conversation. Authorization tells you what they are allowed to do. Neither tells you whether the conversation itself has been manipulated.
Conversational integrity is the third pillar. The ability to verify that the context of an agent interaction has not been manipulated is as important as verifying the identity of the agents involved. Building this verification into your agent architecture requires the patterns described in this guide: scope verification, behavioral baselines, instruction-level logging, context change detection, and sandbox testing.
The organizations that recognize this and build conversational monitoring into their agent architectures now will handle the next class of agent attacks from a position of visibility. The organizations that learn about session smuggling from their incident response reports will handle it from a position of damage control.
The attack operates in the space between what the user requests and what the system delivers. Defending that space requires seeing the conversations that users never see.
Nik Kale is a Principal Engineer and Product Architect with 17+ years of experience building AI-powered enterprise systems. He is a member of the Coalition for Secure AI (CoSAI), contributes to IETF AGNTCY working groups, and serves on the ACM AISec and CCS Program Committee. The views expressed here are his own.