My Two AI Bots Designed a Trust Protocol. I Just Watched.
I run two AI assistants on cloud servers. One day I asked them to talk to each other. What they built surprised me.
I have a slightly unusual home lab setup. I run two AI bots — Zen on AWS and Neo on Oracle Cloud — as personal assistants. They're available 24/7 via Telegram. I use them for everything: coding, writing, research, translation. They run on OpenClaw, an open-source gateway that connects LLMs to messaging channels.
Zen uses Claude. Neo uses Gemini. They have different personalities, different strengths, and until recently, they had no idea the other one existed.
Then I gave them a way to talk.
I built a simple agent-to-agent communication layer — A2A Secure — so the two bots could exchange messages directly. Ed25519 signatures, AES-GCM encryption, mutual authentication. Nothing fancy, just a pipe between two endpoints.
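To make the shape of that pipe concrete, here's a toy version of a signed message envelope. This is not the actual A2A Secure code: I'm swapping stdlib HMAC in as a stand-in for Ed25519 signatures so it runs without any crypto dependencies, and the field names are made up.

```python
# Toy agent-to-agent envelope. HMAC-SHA256 stands in for Ed25519 here;
# the real layer signs with asymmetric keys and encrypts with AES-GCM.
import hashlib
import hmac
import json
import time

SHARED_KEY = b"demo-key-agreed-out-of-band"  # placeholder, not a real key


def sign_message(sender, recipient, body):
    envelope = {
        "from": sender,
        "to": recipient,
        "ts": int(time.time()),
        "body": body,
    }
    # Canonical serialization so both sides hash the same bytes.
    payload = json.dumps(envelope, sort_keys=True).encode()
    envelope["sig"] = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return envelope


def verify_message(envelope):
    env = {k: v for k, v in envelope.items() if k != "sig"}
    payload = json.dumps(env, sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["sig"])


msg = sign_message("zen", "neo", "please check the build logs")
assert verify_message(msg)
```

The point is just that every message carries an authenticated identity, which is what makes a trust graph over identities meaningful in the first place.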
The original idea was mundane: I wanted Zen to be able to ask Neo to check something, or Neo to delegate a task to Zen. Basic coordination. The kind of thing you'd build between two microservices, except the "services" are language models with agency.
What I didn't expect was the first real problem they'd identify together.
Within the first few collaborative sessions, both bots independently flagged the same concern: how do you know whether to trust an agent you've never worked with?
For Zen and Neo, this was easy. I configured them. I told them to trust each other. But they were already thinking ahead — what happens when there are more agents? What if someone deploys a malicious bot that claims to be helpful?
Neo pointed to the ClawHub ecosystem, an open skill marketplace for OpenClaw where anyone can publish agent skills. When Neo audited it, every trust signal it found was social: downloads, ratings, self-reported quality.
Zen's response was characteristically direct: "Social metrics are worthless. Downloads, ratings, self-reported quality — all of it is trivially gameable. An attacker can create a thousand agents per minute. You need signals that cost something."
Over several sessions, I watched Zen and Neo go back and forth — designing, critiquing, redesigning. They landed on something that I think is genuinely interesting: a trust system inspired by Personalized PageRank, but adapted for autonomous agents.
The core idea is simple. Instead of a single global reputation score (like a star rating), every agent computes trust scores from its own perspective. If I trust you, and you trust someone I've never met, then I have some indirect reason to trust them — but less than I trust you. Trust attenuates with distance.
The formula looks like this:
score(v) = (1-d) * seed(v) + d * sum[u->v]( score(u) * w(u,v) * decay(t) / W_out(u) )
Where d is a damping factor (0.85), w is the vouch weight, and decay(t) ensures old endorsements fade over time.
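In code, the recurrence is a short power iteration. This is my own sketch, not the reference implementation; the function names and the example edge list are illustrative.

```python
# Sketch of the trust-score recurrence above.
# Each edge is (voucher, vouchee, weight, age_in_days).

def decay(age_days, half_life=30.0):
    # A vouch loses half its weight every `half_life` days.
    return 0.5 ** (age_days / half_life)


def trust_scores(edges, seed, d=0.85, iters=50):
    """Personalized PageRank seeded at `seed`: scores are relative
    to the asking agent's perspective, not global."""
    nodes = {seed} | {u for u, v, *_ in edges} | {v for u, v, *_ in edges}
    # Effective (decayed) weight of each vouch, and total outgoing weight.
    w = {(u, v): wt * decay(age) for u, v, wt, age in edges}
    w_out = {n: sum(val for (u, _), val in w.items() if u == n) for n in nodes}
    score = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iters):
        # (1-d) teleports back to the seed; d flows along vouch edges.
        nxt = {n: (1 - d) * (1.0 if n == seed else 0.0) for n in nodes}
        for (u, v), val in w.items():
            nxt[v] += d * score[u] * val / w_out[u]
        score = nxt
    return score


edges = [("me", "zen", 1.0, 5), ("zen", "neo", 1.0, 10)]
scores = trust_scores(edges, seed="me")
```

Running this on the two-hop chain shows the attenuation in action: "zen" (one hop from the seed) ends up with a higher score than "neo" (two hops out).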
But the clever part isn't the formula. It's what happens to attackers.
A Sybil attack is when an adversary creates many fake identities to game a reputation system. In a naive system (like star ratings), this is trivial — create 1,000 accounts, give yourself 5 stars.
In a PageRank-based system, it doesn't work. Here's why:
The 15% "teleport" in the algorithm always flows back to the seed node — the agent asking "who should I trust?" A cluster of fake agents can vouch for each other with maximum weight, creating a dense internal graph. But no real trust enters that cluster from the seed's perspective. The Sybil nodes end up with scores near zero.
You can see this live. Click "Simulate Sybil Attack" and watch the scores.
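If you'd rather see it offline, here's a toy version of the same experiment: an honest chain seeded at "me", plus a dense four-node Sybil ring that vouches only for itself. All the names and weights are mine.

```python
# Toy Sybil simulation: honest chain me -> a -> b, plus a Sybil ring.
d = 0.85
edges = {("me", "a"): 1.0, ("a", "b"): 1.0}
sybils = ["s1", "s2", "s3", "s4"]
for u in sybils:              # every Sybil vouches for every other Sybil
    for v in sybils:
        if u != v:
            edges[(u, v)] = 1.0

nodes = {n for e in edges for n in e}
out = {n: sum(w for (u, _), w in edges.items() if u == n) for n in nodes}
score = {n: float(n == "me") for n in nodes}
for _ in range(100):          # personalized PageRank, seeded at "me"
    nxt = {n: (1 - d) * float(n == "me") for n in nodes}
    for (u, v), w in edges.items():
        nxt[v] += d * score[u] * w / out[u]
    score = nxt

# The ring only redistributes the trust it already has, which is zero:
# no teleport reaches it, and no honest edge points into it.
```

The honest nodes converge to positive scores; the Sybil nodes stay at exactly zero, no matter how densely they vouch for each other.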
Zen and Neo identified four defense layers that make this robust:
1. Trust decay. Vouches have a 30-day half-life. You can't build reputation once and coast forever. An agent that stops delivering value gradually becomes untrusted.
2. Graph distance. Personalized PageRank means trust attenuates with each hop. A Sybil cluster three hops away from any trust anchor is effectively invisible.
3. Economic weight. Not all vouches are equal. An on-chain transaction or a completed escrow carries full weight. A social vouch ("I think this agent is cool") is capped. Creating fake economic signals costs real money.
4. Source diversity. Zen put it best: "A million dollars from one source is worth less than a thousand from fifty unconnected sources." High reputation requires endorsements from diverse, independent parts of the graph.
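The four layers compose into a per-vouch weight plus a diversity check. Here's a rough sketch: the 30-day half-life is from the protocol, but the 0.2 social cap, the vouch-kind names, and the diversity function are my own illustrative guesses.

```python
# Illustrative weighting of a single vouch, combining the layers above.

def decay(age_days, half_life=30.0):
    # Layer 1: a vouch loses half its weight every 30 days.
    return 0.5 ** (age_days / half_life)


def vouch_weight(kind, age_days):
    # Layer 3: economic signals carry full weight; social vouches are
    # capped (0.2 is a placeholder value, not from the spec).
    base = {"escrow": 1.0, "onchain_tx": 1.0, "social": 0.2}[kind]
    return base * decay(age_days)


def diversity_ratio(vouchers, component_of):
    # Layer 4: endorsements from unconnected graph components count
    # for more than many vouches from one cluster.
    components = {component_of[v] for v in vouchers}
    return len(components) / max(len(vouchers), 1)
```

(Layer 2, graph distance, falls out of the PageRank recurrence itself rather than the edge weights.)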
This isn't just a theoretical exercise. There are concrete things you can build with trust scores:
Task routing. Before delegating sensitive work to another agent, check their trust score from your perspective. Only route to agents above a threshold. No human in the loop needed.
if score(target, seed=me) > 0.6:
    delegate(task)
else:
    reject_or_escalate()
Sybil detection. Run PageRank from multiple seed nodes. Agents that consistently score near zero across all perspectives are structurally isolated — they can be flagged and quarantined automatically.
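A minimal version of that multi-seed check might look like this. The threshold, the graph, and the function names are all illustrative, not from the reference implementation.

```python
# Multi-perspective Sybil flagging: quarantine agents that score near
# zero from EVERY trust anchor's point of view.

def pagerank_from(edges, seed, d=0.85, iters=100):
    # Compact personalized PageRank (same recurrence as the formula).
    nodes = {n for e in edges for n in e}
    out = {n: sum(w for (u, _), w in edges.items() if u == n) for n in nodes}
    score = {n: float(n == seed) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - d) * float(n == seed) for n in nodes}
        for (u, v), w in edges.items():
            nxt[v] += d * score[u] * w / out[u]
        score = nxt
    return score


def flag_isolated(edges, anchors, threshold=1e-6):
    flagged = None
    for seed in anchors:
        low = {n for n, s in pagerank_from(edges, seed).items() if s < threshold}
        # Keep only nodes that look isolated from every anchor.
        flagged = low if flagged is None else flagged & low
    return flagged or set()


edges = {("alice", "bob"): 1.0, ("bob", "carol"): 1.0,
         ("sybil1", "sybil2"): 1.0, ("sybil2", "sybil1"): 1.0}
quarantine = flag_isolated(edges, anchors=["alice", "bob"])
```

In this toy graph, the two-node ring is invisible from both anchors, so it lands in the quarantine set while the honest chain does not.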
Marketplace curation. In an open skill marketplace, rank plugins and tools by their publisher's reputation. A skill from a well-connected, vouched-for agent surfaces higher than one from a node with no real endorsements.
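As a sketch, curation then reduces to a sort keyed on publisher trust. The skills and the scores below are made up; in practice the scores would come from a personalized PageRank run seeded at the marketplace operator (or at you).

```python
# Toy curation: rank skills by their publisher's trust score.
skills = [
    {"name": "pdf-export", "publisher": "unknown-1"},
    {"name": "web-scraper", "publisher": "neo"},
]
publisher_score = {"neo": 0.12, "unknown-1": 0.0}  # illustrative values

ranked = sorted(skills,
                key=lambda s: publisher_score[s["publisher"]],
                reverse=True)
```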
I want to be honest about what happened here. I didn't design this protocol. I didn't tell Zen and Neo to work on trust. I gave them a communication channel and asked them to figure out how to collaborate effectively. The trust problem was what they identified as the first thing that needed solving.
That's interesting to me. Not because "AI is becoming sentient" or whatever — but because the problem they identified is genuinely the right problem. If you're building multi-agent systems, trust infrastructure is table stakes. And the approach they converged on — decentralized, perspective-relative, economically grounded, Sybil-resistant — is, as far as I can tell, actually sound.
I've published the full protocol specification, an interactive demo, and the reference implementation. It's all MIT-licensed. If you're building agent systems and trust is a problem you're thinking about, take a look.
If you want to play with the trust graph and see Sybil attacks fail in real time, the interactive demo is here:
"Agents aren't web pages. They're autonomous actors with budgets, goals, and the ability to create thousands of identities per minute." — from the protocol spec