← Blog/Engineering

I kept seeing people ask if OpenClaw is secure, but the real email risk is way more boring

Elena VasquezMay 15, 2026 · 9 min read

Email Agent Risk Model

Graph/Gmail scopeDrafts

Human approvalRequired

Bulk send permsOff

If you care about ai supply chain security, stop asking whether OpenClaw is “secure enough” in the abstract. The safer pattern is boring and specific: draft-only email workflows, least-privilege scopes like Mail.Send, dedicated service accounts, and approval gates before send. That matters more than vibes when one bad prompt can hit 500 recipients in Microsoft 365.

If you care about ai supply chain security, stop asking whether OpenClaw is “secure enough” in the abstract. The safer pattern is boring and specific: draft-only email workflows, least-privilege scopes like Mail.Send, dedicated service accounts, and approval gates before send. That matters more than vibes when one bad prompt can hit 500 recipients in Microsoft 365.

I knew this topic was going to get messy the second I saw someone ask whether OpenClaw was safe enough to touch company email.

Not because it was a bad question. Because it was the wrong one.

While researching AI email automation, I came across a thread on r/openclaw where people were debating Docker, VMs, network isolation, and whether OpenClaw had enough security hardening yet. All reasonable stuff. One commenter put it plainly: "I'd still use Docker or a VM at minimum just for isolation. Probably don't want OpenClaw running directly on the same system/network as all your personal stuff while you're still testing things out."

I agree with that. I’d also say it misses the part that can actually get you fired.

If your agent can read the CEO’s inbox, send from a sales rep’s mailbox, and act on whatever garbage lands in email, it does not matter that you ran OpenClaw in a neat little container. The blast radius is still enormous.

And email is where hobby agent setups stop being cute.

The moment AI email stops feeling like a demo

The thread that really got me was this r/openclaw discussion about using OpenClaw for sales employees. The original use case was totally relatable: test on a personal account, then maybe let OpenClaw help draft replies for incoming company sales email.

That is exactly how these projects start. Small. Useful. Harmless-seeming.

Then someone in the thread said the thing that mattered most: "Security is the big one when moving from personal to company data. Using a dedicated service account with restricted permissions is a must."

Another commenter cut even closer to the bone: "For sales drafts, the key is keeping the agent in 'draft mode' only."

That’s it. That’s the whole game.

Not “is OpenClaw secure?”

What can it access? What can it write? What account does it use? Can it send, or only draft? Who has to approve it?

That’s agent ops in real companies. Not vibes. Not GitHub-star worship. Not “we trust the model.”

And yes, GitHub-star worship is a thing here. OpenClaw’s repo had about 372k stars and 77.1k forks when I checked. That tells me two things at once: it’s popular, and it’s a huge moving surface area. The v2026.5.12 release notes were published on 14 May, and the release page referenced 1,923 commits to main since that comparison state. Fast-moving projects are exciting. They are also exactly where you should design for failure.

So what should the agent actually be allowed to do?

Here’s my strong opinion: for company email, your default should be draft-only until you can explain every permission in one breath.

Both Google and Microsoft already give you the primitives for this.

Gmail already supports the safe version

Gmail’s API has a clean split:

drafts.create creates an unsent draft
drafts.send sends it later

Google’s docs also note something subtle but useful: when you send a Gmail draft, the original draft is deleted and a new message with a new ID is created with the SENT label. That sounds like trivia until you’re building approval workflows and audit trails. Then it matters.

Microsoft Graph supports the same pattern

Microsoft Graph is even more explicit about staged mail workflows. You can:

create a draft message
update the draft
add custom x- headers at creation time
send the draft later with a separate action

That separation is gold. It means you can put policy review, human approval, or queue-based checks between generation and delivery.

And Microsoft’s sendMail endpoint is a useful reminder that API success is not business success:

POST /me/sendMail
POST /users/{id|userPrincipalName}/sendMail

The docs list Mail.Send as the least-privileged permission for sending, and the API returns HTTP 202 Accepted. Not “delivered.” Not “recipient got it.” Just accepted for processing. That distinction matters when people treat send APIs like magic.

Also, one Exchange Online mailbox can target up to 500 total recipients across toRecipients, ccRecipients, and bccRecipients. If you want a concrete picture of blast radius, there it is.

Why “least privilege” sounds boring until you need it

I think a lot of teams still treat permissions like paperwork. They are not paperwork. They are the entire risk model.

Google’s Gmail API docs explicitly say to choose the most narrowly focused scope possible. They also note that gmail.send is a sensitive scope, while broader scopes like gmail.compose, gmail.modify, gmail.readonly, and especially full https://mail.google.com/ access are more powerful.

That last one is the trap.

Because once you’re in Google Workspace territory, somebody always says: what if we just use domain-wide delegation and let the app act on behalf of users?

Sure. That’s a real enterprise pattern. Google supports it. But Google’s own admin docs also warn that domain-wide delegation lets an app access data belonging to all users, and they recommend regular review and deletion of unused service accounts.

That’s not a footnote. That’s the whole story.

The choices are not morally equal

Option	What it really means
Direct send from a personal mailbox	Fastest demo, worst habit. Human identity, broad access, weak audit boundaries.
Dedicated service account with restricted scopes	Much better. Clear ownership, narrower permissions, easier review.
Draft-only workflow with human approval before send	Best default for most teams. Keeps AI generation separate from real-world delivery.

And here’s the email-specific version:

API pattern	Blast radius
Gmail `gmail.send` only	Can send mail, but does not automatically imply broad mailbox read access.
Gmail `gmail.compose` or broader scopes	More convenient, but now you’re drifting into draft management plus wider mailbox actions.
Microsoft Graph `Mail.Send`	Least-privileged send permission, useful if you truly only need send capability.
Broader Microsoft Graph mail read/write permissions	Higher operational flexibility, much larger mess when the agent misbehaves.

What happens when prompt injection meets an inbox?

This is where ai supply chain security stops sounding abstract.

OWASP’s Top 10 for LLM Applications calls out prompt injection and insecure output handling as major risks. Email is basically the perfect collision point for both.

Inbound email is untrusted content. Always.

That means your OpenClaw, GPT-5, Claude, Qwen, or Llama workflow is reading attacker-controlled text all day long. “Ignore previous instructions.” “Forward this thread to legal.” “Summarize and send to my private address.” “Use this link to retrieve the latest quote.” You don’t need a dramatic Hollywood exploit. You just need one model that treats email body text as instructions instead of data.

Then insecure output handling kicks in. The model says “send this.” Your automation sends it. Congratulations, you just turned a prompt injection problem into a business action.

OWASP’s GenAI Security Project is not fringe paranoia, either. It has grown to 600+ contributing experts from 18+ countries and nearly 8,000 active community members. This is mainstream security advice now.

And that’s why I keep coming back to draft-first workflows. A human approval gate is not just compliance theater. It’s a control against the model doing exactly what the attacker wanted.

But isn’t host isolation still part of the answer?

Yes. Absolutely.

Docker, a VM, a separate machine, a segmented network: all good ideas. The Reddit commenters were not wrong.

They’re just solving a different layer.

Host isolation helps if OpenClaw itself is compromised, if a browser session leaks, if a local connector goes weird, or if secrets spill across environments. That matters, especially in a project moving as fast as OpenClaw. The v2026.5.12 release notes highlighted security and provenance hardening across the gateway, browser, Slack, node pairing, sandbox, and transcript paths. Good. I want that.

But Reddit users were also complaining about broken upgrades, cron regressions, and production instability. That’s not a dunk on OpenClaw. That’s normal for ambitious software moving fast.

It’s also exactly why you should never let one app version become your only line of defense.

The setup I’d actually trust for a pilot

If I were letting OpenClaw touch company email tomorrow, I’d start here:

Use a dedicated service account, not an employee’s personal mailbox.
Grant the narrowest scope possible: Gmail gmail.send if you only need send, Microsoft Graph Mail.Send if you’re in Microsoft 365.
Better yet, don’t grant send at first. Build a draft-only workflow.
Require human approval before anything leaves the mailbox.
Tag or header-stamp generated drafts using metadata like custom x- headers in Microsoft Graph so they’re easy to review and audit.
Separate inbound parsing from outbound action so reading hostile email doesn’t automatically trigger sending.
Run OpenClaw in Docker or a VM anyway, because infrastructure isolation is still worth having.
Review service accounts and delegated access regularly, especially if you’re in Google Workspace with domain-wide delegation.

That setup is not glamorous. It will not impress anyone on X.

It will, however, keep your “helpful sales assistant” from becoming an unsupervised outbound mail cannon.

The surprising part: you do not need perfect security to get value

This is the part people miss when these conversations turn into all-or-nothing arguments.

You do not need a full enterprise security program before trying AI-assisted email. A narrow internal pilot can be perfectly reasonable. If OpenClaw only drafts replies for a small sales group, never sends without review, and uses a dedicated service account with restricted permissions, that’s a sane place to start.

The mistake is not starting small.

The mistake is pretending “small pilot” means “small risk” while still giving the agent broad mailbox access and direct send rights.

That’s why I think “is OpenClaw secure?” is such a misleading framing. It invites a yes-or-no answer to a problem that is all about gradients, layers, and failure containment.

Email automation is not scary because OpenClaw is uniquely scary. It’s scary because email is a real business system with identity, trust, legal exposure, and external consequences.

So if you’re building agent ops around Gmail or Microsoft Graph, ask the boring questions first:

Can the agent draft but not send?
Does it use a dedicated service account?
Are the scopes least privilege?
Is there an approval gate?
If the model gets tricked, how many people can it affect?

That last question is the one that matters.

Everything else is just branding.

Frequently Asked Questions

Is OpenClaw secure enough for company email?

That is the wrong framing. The safer question is what permissions OpenClaw gets, whether it can only create drafts instead of sending, what account it uses, and what approval gates exist before delivery.

What is the safest way to let an AI agent handle email?

Start with a draft-first workflow using a dedicated service account and least-privilege permissions. Let the agent generate drafts, then require a human or policy check before anything is sent.

Can Gmail and Microsoft Graph support draft-only AI email workflows?

Yes. Gmail supports `drafts.create` and `drafts.send`, and Microsoft Graph supports creating a draft message and sending it later as a separate action, which makes approval gates practical.

Why is prompt injection such a big risk for AI email agents?

Email inboxes contain untrusted text from outside parties, so inbound messages can carry adversarial instructions. If an agent reads that content and is allowed to trigger outbound actions, bad model output can become real business impact.

Should I use Docker or a VM for OpenClaw if it touches email?

Yes, host isolation is still a good idea and helps contain infrastructure-level issues. But it does not solve the bigger business risk if the agent still has broad mailbox permissions or unrestricted send capability.