I found the dumbest way to burn 500 LLM calls a day: polling an inbox every 5 minutes

Marcus Chen · May 12, 2026 · 9 min read

I’ve built enough scrappy automations to know why this happens.

You wire up an inbox, tell your OpenClaw agent to check every 5 minutes, and call it done. It works in the demo, it feels harmless, and for about a week you feel clever.

Then I ran into a thread on r/openclaw that put the problem in one painfully specific sentence: “At the moment, I have Openclaw job where agent checks its ms365 mailbox every 5 minutes... Wasted calls to LLM (nearly 500 calls to LLM per day).” That number stuck with me because it captures a pattern a lot of people quietly normalize.
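For a sense of scale, here is the back-of-the-envelope arithmetic. The function name is mine, and the reading of the quoted figure afterward is my inference, not something the thread spells out.

```python
def polls_per_day(interval_minutes: int) -> int:
    """Number of timer-driven checks in 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes


checks = polls_per_day(5)
print(checks)  # 288
```

A 5-minute timer fires 288 times a day, so a figure of nearly 500 calls suggests that a fair number of checks cost more than one model call each. Either way, the baseline is set by the timer, not by how much mail arrived.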

This is not one of those abstract architecture debates where somebody on Hacker News yells about purity. This is about your agent spending real time and real model calls nervously re-checking an inbox that hasn’t changed, then occasionally reprocessing something it already saw.

And honestly, that’s the part I hate most about polling. It’s not just wasteful. It makes the whole workflow feel flaky.

The toy version of email automation is easy to love. Connect OpenClaw to Gmail or Microsoft 365, scan on a timer with IMAP or Microsoft Graph, pass anything new to GPT-5.4 or Claude Opus 4.6, and hope your dedupe logic is good enough.

For a tiny internal workflow, that can be fine. If one mailbox is involved, volume is low, and you have a little SQLite or Postgres table to track message IDs, polling can absolutely survive longer than people admit.

But it has a short shelf life. The minute the inbox matters, you start paying for all the corners you cut at the beginning.

You keep checking when nothing changed. You introduce delays by design. You get nervous about duplicate processing, so now you’re adding sender filters and mailbox rules and all kinds of duct tape just to feel safe.

Then the really annoying bugs show up. A scan runs twice, or late, or out of order, and suddenly your “helpful” agent replies twice to the same email or misses one entirely.

That exact failure mode showed up in the same r/openclaw discussion. Another user said they abandoned interval-based scanning because if it got out of sync, they saw repeated responses, more wasted calls, or ignored messages, and they couldn’t get it reliable.

That’s the whole argument right there. Polling doesn’t just cost more. It makes your agent feel haunted.

What surprised me while digging into this is that the strongest anti-polling argument isn’t coming from random Reddit opinions. Microsoft and Google are both pretty direct about it.

Microsoft Graph has change notifications so apps can react to mailbox events instead of hammering the API on a timer. Gmail’s push notification docs are even clearer: push exists to eliminate the extra network and compute costs of polling resources to see if they changed.

When both Microsoft 365 and Gmail are telling you to stop polling, that should probably end the debate. The inbox providers themselves are telling you the grown-up path is event-driven.

For Gmail, that usually means Gmail API watch on the INBOX label, then Google Cloud Pub/Sub, then your webhook. It’s cleaner and faster, but it’s not magic. You still have to manage the Pub/Sub topic, permissions, subscriptions, and watch renewal before expiration.
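Here is a rough sketch of the webhook end of that chain, assuming a Pub/Sub push subscription. The push body wraps a base64-encoded payload that carries the mailbox address and a history ID. The function name and the sample values are made up for illustration.

```python
import base64
import json


def decode_gmail_push(body: bytes) -> tuple[str, int]:
    """Extract (email_address, history_id) from a Pub/Sub push request body."""
    envelope = json.loads(body)
    data = envelope["message"]["data"]
    # The payload arrives base64-encoded; restore padding in case it was stripped.
    padded = data + "=" * (-len(data) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return payload["emailAddress"], int(payload["historyId"])


# Hypothetical sample body, shaped like a Pub/Sub push delivery.
sample_payload = json.dumps({"emailAddress": "agent@example.com", "historyId": "12345"})
sample_body = json.dumps({
    "message": {
        "data": base64.urlsafe_b64encode(sample_payload.encode()).decode(),
        "messageId": "1",
    },
    "subscription": "projects/demo/subscriptions/gmail-push",
}).encode()

print(decode_gmail_push(sample_body))  # ('agent@example.com', 12345)
```

Note that the notification only says something changed. You still have to ask the Gmail API's history endpoint what happened since the last history ID you stored.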

For Microsoft 365, the equivalent is Microsoft Graph change notifications for Outlook messages. Same story: better architecture, but only if you handle subscription validation, renewal, and lifecycle management like it’s part of the product instead of an afterthought.
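The validation step is the piece people tend to trip over, so here is a minimal sketch of it. When a subscription is created, Graph calls your notification URL with a `validationToken` query parameter and expects the decoded token echoed back as plain text with a 200. The handler name and the `(status, headers, body)` return shape are my own framework-agnostic assumptions.

```python
from urllib.parse import parse_qs


def handle_graph_request(query_string: str) -> tuple[int, dict, str]:
    """Return (status, headers, body) for a request to the notification URL."""
    params = parse_qs(query_string)
    token = params.get("validationToken", [None])[0]
    if token is not None:
        # Subscription handshake: echo the URL-decoded token as plain text.
        return 200, {"Content-Type": "text/plain"}, token
    # Otherwise treat it as a change notification: acknowledge right away
    # and leave the real work to a background job.
    return 202, {}, ""


print(handle_graph_request("validationToken=abc%20123"))
```

Renewal is a separate concern. This sketch only covers the handshake and the quick acknowledgment.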

That setup is annoying, and I don’t want to pretend otherwise. But I would still take “annoying upfront” over “silently wasteful forever” every single time.

This is usually where somebody says, fine, I’ll just use n8n. I get the instinct because n8n is genuinely useful, but it doesn’t magically change the underlying trigger model.

If you use the n8n Email Trigger over IMAP, you still have mailbox-checking infrastructure. Better mailbox-checking infrastructure, sure. You get nicer controls like mailbox selection, mark-as-read behavior, attachments, custom filters, and reconnect settings.

That is absolutely better than a homemade cron job glued to a Python script and hope. But if the workflow still depends on repeatedly asking the mailbox whether anything happened, you have improved the polling loop, not escaped it.

For some teams, that’s enough. For a production OpenClaw agent that needs to react quickly and reliably, I don’t think it is.

The cleaner mental model is simple: a real inbound event means the email provider tells you when a message arrives. You do not keep knocking on the door every few minutes asking if something changed.

That can look a few different ways. Twilio SendGrid Inbound Parse Webhook is one of the clearest examples because it receives the email, parses it, and POSTs the content and attachments straight to your endpoint.

I like SendGrid’s model because the contract is sharp. If your endpoint returns a 5XX, SendGrid retries. If you return 2XX, retries stop. It won’t follow redirects, and it retries for up to 3 days before dropping the message.

That’s a much more adult failure mode than polling. Instead of vague questions like “did the scan run?” or “did my dedupe logic catch it?” you get a system that either delivered the event or is retrying the delivery.
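To make that contract concrete, here is a small sketch of how I'd map it onto a handler. It assumes the parsed form fields arrive as a mapping and that `persist` is whatever function writes the event to durable storage. Both names are hypothetical.

```python
from typing import Callable, Mapping


def respond_to_inbound(
    fields: Mapping[str, str],
    persist: Callable[[Mapping[str, str]], None],
) -> int:
    """Return the HTTP status to send back for one inbound email POST."""
    try:
        persist(fields)
    except Exception:
        # Storage failed: a 5XX asks the provider to retry the delivery later.
        return 503
    # Stored safely: a 2XX tells the provider to stop retrying.
    return 200


def broken_store(fields: Mapping[str, str]) -> None:
    raise RuntimeError("database unavailable")


saved = []
print(respond_to_inbound({"from": "a@example.com", "subject": "hi"}, saved.append))  # 200
print(respond_to_inbound({"from": "a@example.com", "subject": "hi"}, broken_store))  # 503
```

The point of the design is that the status code reflects only whether the event was stored, not whether the agent finished thinking about it.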

Of course there are constraints. SendGrid documents a 30 MB total message size limit including attachments, and you need an MX record on a dedicated receiving subdomain pointing to mx.sendgrid.net.

That is more setup than polling an inbox. It is also how people build email intake when they expect it to keep working.

If I had to compare the options in plain English, here’s how I’d frame it.

Polling with IMAP or a cron job

Setup is easy
The trigger model is a timer, not a true event
The common failure modes are duplicate checks, delayed reactions, and wasted model calls
It works best for low-stakes internal workflows

n8n Email Trigger over IMAP

Setup is still relatively easy
You get better controls around mailbox handling, filters, attachments, and reconnect behavior
It is more civilized than a DIY polling loop
But it is still polling underneath, so the core tradeoff remains

Webhook or push intake with SendGrid, Gmail watch, or Microsoft Graph notifications

Setup is more involved
The trigger model is event-driven
You waste less idle compute, react faster, and get better retry or lifecycle controls
This is the version that holds up when the workflow actually matters

That’s the real tradeoff. It’s not simple versus advanced. It’s demo-friendly versus production-friendly.

If I were building an OpenClaw email workflow today, I’d split it into two layers. First, intake. Second, idempotent processing.

For intake, I’d use Twilio SendGrid Inbound Parse Webhook if I wanted straightforward email-to-HTTP. If the mailbox already lived in Google Workspace, I’d use Gmail watch plus Google Cloud Pub/Sub. If it lived in Microsoft 365, I’d use Microsoft Graph change notifications.

I would only use an IMAP polling trigger, whether directly or through n8n, if I knew I was building a proof of concept and I was willing to live with proof-of-concept tradeoffs. That’s a valid choice. It just shouldn’t be mistaken for a finished design.

Then comes the part people skip: idempotent processing. No matter how the event arrives, your OpenClaw job should extract a stable message ID, check a dedupe store before calling any model, persist processing state, acknowledge receipt quickly, and hand off heavy work asynchronously.

That last part matters more than people think. A lot of teams try to fully process the email inside the webhook request itself, which is a great way to make retries painful and duplicate replies more likely.

Accept the event, store it, dedupe it, then process it. That one decision saves a shocking amount of chaos later.
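A minimal sketch of that accept, store, dedupe step, assuming SQLite as the dedupe store and an in-process queue standing in for a real job queue. The table name, function names, and sample message ID are all hypothetical.

```python
import sqlite3
from collections import deque


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the dedupe store and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_messages ("
        " message_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'received')"
    )
    return conn


def accept_event(conn: sqlite3.Connection, queue: deque, message_id: str) -> bool:
    """Record the message once; enqueue it for processing only if it is new."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_messages (message_id) VALUES (?)",
            (message_id,),
        )
    if cur.rowcount == 0:
        return False  # duplicate delivery: skip it before any model call
    queue.append(message_id)
    return True


conn = open_store()
work = deque()
print(accept_event(conn, work, "<abc@example.com>"))  # True
print(accept_event(conn, work, "<abc@example.com>"))  # False
```

Because the primary key does the dedupe, a retried or repeated delivery costs one cheap insert attempt instead of a second model call and a second reply.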

There’s another wrinkle here that gets buried under the triggering debate, and it’s trust. The Reddit user with the MS365 polling loop also mentioned relying on sender-based whitelisting and feeling uneasy about it.

They were right to feel uneasy. Sender checks alone are flimsy, and once an email workflow matters, you start caring about dedicated receiving addresses, provider validation tokens or signatures, attachment handling rules, message size limits, and mailbox isolation per workflow.
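As one small example of what that looks like in code, here is a sketch of a pre-processing gate. It assumes the provider sends back a shared secret you registered (for instance, the `clientState` value Microsoft Graph includes on notifications) and that you enforce your own size cap. The function name and the exact cap are assumptions.

```python
import hmac

# Assumed cap, chosen to mirror the 30 MB provider limit mentioned earlier.
MAX_MESSAGE_BYTES = 30 * 1024 * 1024


def passes_intake_checks(received_secret: str, expected_secret: str, size_bytes: int) -> bool:
    """Reject events with the wrong shared secret or an oversized message."""
    # compare_digest avoids leaking how many leading characters matched.
    secret_ok = hmac.compare_digest(received_secret.encode(), expected_secret.encode())
    return secret_ok and size_bytes <= MAX_MESSAGE_BYTES


print(passes_intake_checks("s3cret", "s3cret", 1024))   # True
print(passes_intake_checks("guess", "s3cret", 1024))    # False
```

This does not replace sender checks or attachment rules. It just means the first gate is something an outside sender cannot forge by typing a different From address.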

This is part of why polling is so seductive. It postpones the real design questions. But postponing those questions doesn’t remove them. It just means they come back later carrying duplicate messages and wasted LLM calls.

To be fair, polling is not always wrong. If you have one internal mailbox, low volume, tolerance for a few minutes of delay, and nobody who will care if you rebuild it later, polling can be the right shortcut.

That’s a proof of concept. I’m not against proofs of concept. I’m against pretending the proof of concept is production-ready because it happened to work on a quiet Tuesday.

The line between a toy automation and a production agent is not whether OpenClaw can read email. Of course it can. The line is how the email arrives and whether your processing is idempotent after it does.

A toy automation asks the mailbox every few minutes if anything happened. A production agent gets an event, validates it, records it once, and processes it once.

That sounds boring compared to model benchmarks or prompt tricks, but boring infrastructure decisions are usually what decide whether your agent still feels solid three months later. If your OpenClaw workflow still polls an inbox every 5 minutes, I wouldn’t call it broken.

I’d call it unfinished.

And if you’re running always-on agents, that unfinished architecture has a habit of showing up as surprise usage, wasted calls, and a lot of prompt-cost anxiety you shouldn’t have to think about in the first place. That’s exactly why I think flat, predictable compute matters for OpenClaw workflows. If your agents are supposed to run all day, the billing model should support that instead of punishing every background check and retry.

That’s the part Standard Compute gets right for OpenClaw users. You can run always-on agents on a flat monthly plan instead of watching per-token costs pile up every time an automation gets a little more ambitious.

Still, the better lesson here is not “polling is expensive, so buy cheaper calls.” It’s that you should stop wasting calls in the first place. Once you’ve seen someone burn nearly 500 LLM calls a day just by checking a mailbox, it’s hard to unsee.
