Hermes Agent: What It Is and Why Devs Switched
I ignored OpenClaw. Tried it during the hype, poked at it for an afternoon, and quietly went back to my normal setup because I couldn't find a single thing it did that Claude Desktop and Claude Code weren't already doing for me. So when people started telling me Hermes Agent was the new thing, my first reaction was a tired sigh. Another agent. Great.
I was wrong to sigh. Hermes Agent is different in ways that actually matter, and after a few weeks of running it for real work, I get why the dev crowd moved over. This is the honest version: what it is, what you can actually do with it, how it stacks up against OpenClaw, and the security tradeoff nobody seems to want to talk about.
What exactly is Hermes Agent?
Hermes Agent is an open-source, model-agnostic autonomous agent from Nous Research that plans, calls tools, and keeps state across sessions, while teaching itself new skills as it goes.
That one sentence is the whole pitch. It's not a coding copilot bolted to your editor. It's a persistent agent that you point at a task, and it goes off and does it, remembering what happened last time.
Now, the naming. The naming is a mess and you need to keep three things straight or you'll confuse yourself reading the docs:
- Hermes 4 is the LLM weights, the actual model.
- Hermes Agent is the framework, the thing this article is about.
- Hermes Desktop is the GUI app that hit public preview in June 2026.
The desktop release shipped no new model and no new architecture. It's just a door into the existing agent that doesn't make you live in a terminal. Useful, but don't mistake it for a reinvention.
What makes Hermes Agent interesting is the closed learning loop. It auto-creates skills after hard tasks, searches its own past sessions (it keeps them in SQLite with full-text search), and builds a model of you over time. Most agents forget you the second the session ends. This one tries not to. That's the part worth paying attention to.
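To make the "searches its own past sessions" idea concrete, here is a minimal sketch of session search on SQLite's FTS5 extension, using Python's standard sqlite3 module. The table name, columns, and helper functions are my own assumptions for illustration, not Hermes's actual schema, and it needs an SQLite build with FTS5 enabled (most are).

```python
import sqlite3

def open_session_store(path=":memory:"):
    """Open a store with a hypothetical FTS5 table for session text."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS sessions "
        "USING fts5(session_id UNINDEXED, content)"
    )
    return conn

def log_turn(conn, session_id, content):
    """Store one chunk of session text so it becomes searchable."""
    conn.execute(
        "INSERT INTO sessions (session_id, content) VALUES (?, ?)",
        (session_id, content),
    )
    conn.commit()

def search_sessions(conn, query, limit=5):
    """Return (session_id, content) rows matching the query, best match first."""
    rows = conn.execute(
        "SELECT session_id, content FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()

conn = open_session_store()
log_turn(conn, "2026-06-01", "Fixed a Node.js install conflict on the Mac")
log_turn(conn, "2026-06-02", "Built the morning research digest for Telegram")
hits = search_sessions(conn, "digest")
```

The point of full-text search over a plain LIKE query is that matching is tokenized and ranked, so the agent can ask "what did I do about the digest?" and get the most relevant past session back first.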
What can you do with a Hermes agent?
Here's where I'm going to disappoint the hype merchants. The genuinely good use cases are boring, and that's a feature.
Where Hermes shines is always-on scheduled work. Morning research digests, competitive-intelligence sweeps, watching a trend API or a search feed and pinging you when something moves. The canonical example everyone reaches for is a daily briefing dropped into Telegram or Slack at 8am, and honestly, that example is popular because it works.
My actual usage looks like this. I've got it collecting information on a schedule, running a paper-trading agent, and watching trend APIs and search results so I don't have to babysit a dozen browser tabs. None of that is glamorous. All of it saves me time every single day.
What it's bad at is worth saying plainly. Complex multi-file refactors that need real architectural judgment are not its strength, because the open models I run it on are good but not Claude-Opus good. And for tight interactive coding where I want to see and control every bit of context, the CLI or Claude Code still beats it. Hermes is for the work you'd otherwise forget to do, not the work you're actively thinking hard about.
This is also where I'll gently push back on the discourse. Half of what I read on X during the OpenClaw wave was people breathlessly announcing they'd "automated their business," and when you looked closer it was a script that renamed some files. If you want a realistic sense of what these tools do for an actual small operation, I wrote up where AI agents help small businesses in 2026 without the fairy dust. Hermes is more capable than the OpenClaw stuff was. The honest wins are still the unglamorous scheduled ones.
Hermes Agent vs OpenClaw
So why did OpenClaw bounce off me and Hermes stick?
With OpenClaw, every task I imagined throwing at it had an answer along the lines of "yeah, Claude Desktop already does that, or Claude Code does it better." It never carved out its own reason to exist on my machine. I'm sure people built real things with it. I just wasn't one of them, and I didn't find the hype convincing enough to keep trying.
Hermes is different because of three things: persistent memory, a "Soul" identity layer, and automatic skill creation. The point isn't any single feature. It's that the agent compounds. Recurring work gets cheaper and faster as it accrues skills, so the thing you set up this week is better next week without you touching it. That's a fundamentally different value proposition than "wrapper around a model that runs commands."
I want to be fair, though, because I'm not here to sell you anything. The memory is small. We're talking roughly a 2,200-character memory file and a 1,375-character user file, with no version history by default. Early facts get overwritten as new ones come in. So the "it remembers you" story is real but bounded, and you'll feel the ceiling on long-running personal-assistant work. This is early infrastructure, not a finished memory system, and I'll come back to that.
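Here is a rough sketch of why a small, unversioned memory file loses early facts. It assumes oldest-first eviction, which is my guess at the behavior and not something I've confirmed in the Hermes source; the 2,200 figure is the approximate limit mentioned above, and the function name is hypothetical.

```python
MEMORY_LIMIT = 2200  # characters; the rough memory-file budget quoted above

def remember(entries, new_fact, limit=MEMORY_LIMIT):
    """Append a fact, then drop the oldest facts until the file fits.

    Assumption: eviction is oldest-first. With no version history,
    anything dropped here is gone for good.
    """
    updated = entries + [new_fact]
    while updated and len("\n".join(updated)) > limit:
        updated.pop(0)
    return updated

# Tiny limit so the effect is visible in three facts.
memory = []
for fact in ["prefers Python", "runs on macOS", "uses Telegram for alerts"]:
    memory = remember(memory, fact, limit=40)
```

With a 40-character budget the first fact is already gone by the third write. Scale that up to 2,200 characters and weeks of use, and you get exactly the ceiling I described.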
Hermes agent skills, memory, and the self-improving loop
This is the most technically interesting part of the system, so let's actually dig in.
How skills work
Skills are just markdown files. Each one is a SKILL.md file living under ~/.hermes/skills/, following the open agentskills.io standard. They load by progressive disclosure, which is a fancy way of saying the agent reads only the short description of each skill for free, and pulls the full content into context only when a task actually matches. That keeps the context lean, which matters more than people realize.
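Progressive disclosure is easier to see in code. The sketch below assumes each SKILL.md opens with a small front-matter header containing a description field, and it uses naive word overlap to decide whether a skill matches a task. Both are simplifications of mine: a real agent would let the model pick skills from their descriptions, and the function names here are hypothetical.

```python
def split_skill(text):
    """Split a SKILL.md document into (metadata dict, body text)."""
    meta, body = {}, text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta, body.strip()

def build_index(skills):
    """Cheap step: only the short descriptions go into the index."""
    return {
        name: split_skill(text)[0].get("description", "")
        for name, text in skills.items()
    }

def load_matching(skills, task):
    """Expensive step: return full bodies only for skills that match the task."""
    words = set(task.lower().split())
    loaded = {}
    for name, description in build_index(skills).items():
        if words & set(description.lower().split()):
            loaded[name] = split_skill(skills[name])[1]
    return loaded

skills = {
    "daily-briefing": (
        "---\nname: daily-briefing\n"
        "description: Compile a morning research digest\n---\n"
        "1. Fetch feeds\n2. Summarize\n"
    ),
    "node-repair": (
        "---\nname: node-repair\n"
        "description: Diagnose Node.js install conflicts\n---\n"
        "1. Check PATH\n"
    ),
}
loaded = load_matching(skills, "send my morning digest")
```

Only the briefing skill's body ends up in context here; the Node repair skill costs nothing beyond its one-line description.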
The clever bit is that the agent writes its own skills. After a non-trivial task (think five or more tool calls, or some non-obvious workaround it had to figure out), it can save what it just did as a reusable skill via skill_manage. There's also an autonomous Curator that grades, consolidates, and prunes the skill library on a cycle, so it doesn't just accumulate junk.
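The trigger for saving a skill can be pictured as a simple predicate. This is a sketch built only from the rule of thumb above (five or more tool calls, or a non-obvious workaround); the TaskTrace type and worth_saving function are hypothetical, and the real decision inside Hermes is presumably fuzzier than a boolean.

```python
from dataclasses import dataclass

@dataclass
class TaskTrace:
    """Hypothetical summary of a finished task."""
    tool_calls: int
    used_workaround: bool

def worth_saving(trace, min_tool_calls=5):
    """True when the task was long enough, or tricky enough, to reuse."""
    return trace.tool_calls >= min_tool_calls or trace.used_workaround

quick_lookup = TaskTrace(tool_calls=2, used_workaround=False)
long_pipeline = TaskTrace(tool_calls=7, used_workaround=False)
odd_fix = TaskTrace(tool_calls=3, used_workaround=True)
decisions = [worth_saving(t) for t in (quick_lookup, long_pipeline, odd_fix)]
```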
The mental model that finally made it click for me: memory is for "what" (facts), skills are for "how" (procedures). Keep those two ideas separate and the whole architecture suddenly makes sense.
Where auto-generated skills fall short
Here's my slightly contrarian take, because almost nobody is saying this. The auto-created skills are convenient, and sometimes they're genuinely good. But more than once I looked at one and thought, I could write this better myself.
When I hand-author a skill with proper prompting and the right context, the result is cleaner and more reliable than what the agent scribbles after a task. Claude even has a dedicated skill-creation skill that helps structure new ones, and if you care about the output, a few minutes of deliberate authoring beats auto-generation. (If you want to get the prompting right, I put my approach in prompt engineering best practices for 2026.) Treat auto-generation as a fast first draft, not the finished article.
Same energy on memory and Soul. They're interesting, and I'm glad they exist, but they feel like the first rung on a tall ladder. The memory is too small and not situationally aware enough yet. I'm fairly convinced the next generation of agents will have far more sophisticated, context-aware memory systems, and we'll look back on this as the charmingly primitive beginning. Which is fine. Everything starts somewhere.
How to get Hermes agent (and run it cheaply)
Getting started is straightforward, and the cost story is genuinely good if you set it up right.
You've got two front ends. The Desktop app (public preview since June 2026) and the CLI are the same agent core sharing the same ~/.hermes config, so anything you set up in one shows up in the other. Desktop adds a settings UI so you're not editing YAML, plus a file browser, voice mode, and live subagent watch-windows that let you see parallel work happening. The CLI keeps finer control over context and compaction. I bounce between both. Desktop for monitoring and review, CLI when I want my hands on the wheel.
The part that makes this cheap is the inference. I run mine on my OpenCode Go subscription, which Hermes supports as a first-class provider. Setup is three things:
- Add OPENCODE_GO_API_KEY=<your-key> to ~/.hermes/.env.
- Set the provider to opencode-go (base URL https://opencode.ai/zen/go/v1).
- Pick your model in Settings, or in config.yaml.
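Before blaming the provider for a failed first run, it's worth confirming the key actually landed in the .env file. This is a small sketch of my own, not a Hermes command: it parses .env-style text and reports which required keys are missing or empty. The helper names are hypothetical; the key name is the one from the steps above.

```python
def parse_env(text):
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def missing_keys(text, required=("OPENCODE_GO_API_KEY",)):
    """Return the required keys that are absent or set to an empty value."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]

good = missing_keys("# provider keys\nOPENCODE_GO_API_KEY=sk-example\n")
bad = missing_keys("OPENCODE_GO_API_KEY=\n")
```

To run it against the real file, read the text with pathlib, for example Path.home() / ".hermes" / ".env", and pass the contents to missing_keys.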
Then route deliberately. I use DeepSeek V4 Flash for roughly 95% of what I throw at Hermes, because the easy, high-volume cron and subagent work doesn't need a frontier model and Flash is absurdly cost-effective. I reserve the stronger models (Qwen3.7 Max, Kimi K2.6) for the genuinely hard reasoning. The caps on OpenCode Go are dollar-based rather than request-based, so cheap models stretch almost comically far. You can run an enormous amount of agent work for not much money.
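"Route deliberately" boils down to a default-cheap rule with a short escalation list. The sketch below shows the shape of it. The model ID strings and task labels are placeholders I made up for illustration, so check your provider's model list for the real identifiers.

```python
# Placeholder IDs, not verified provider identifiers.
CHEAP_MODEL = "deepseek-v4-flash"
STRONG_MODEL = "qwen3.7-max"

# Hypothetical task labels that justify paying for a stronger model.
HARD_TASKS = {"architecture-review", "multi-step-reasoning"}

def pick_model(task_kind):
    """Default to the cheap model; escalate only for known hard task kinds."""
    return STRONG_MODEL if task_kind in HARD_TASKS else CHEAP_MODEL

routes = {
    kind: pick_model(kind)
    for kind in ("cron-digest", "trend-watch", "architecture-review")
}
```

The design choice that matters is the default. Unknown task kinds fall through to the cheap model, so a new cron job never silently burns through the expensive one.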
Two gotchas to flag before they bite you, because they bit me. Dotted model IDs sometimes get mangled into hyphens (minimax-m2.7 turning into minimax-m2-7), which produces a confusing "model not supported" 401. And there are routing edge cases where Hermes picks the wrong endpoint format and you get a 401 for no obvious reason. The fix is the same in both cases: pin your provider and model explicitly, start a fresh session, and update Hermes regularly because these are being patched. Check the project's GitHub issues if a specific model misbehaves; the relevant ones were still in flux at the time I set this up.
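If you suspect the dot-to-hyphen mangling, a quick check like this one can confirm it. It's a diagnostic sketch of mine, not part of Hermes: given the ID that showed up in the error and the IDs you know are valid, it looks for exactly one valid ID that would produce the mangled form. The function name is hypothetical.

```python
def recover_model_id(reported, known_ids):
    """Return the known ID that `reported` was mangled from, or None.

    A match means replacing dots with hyphens in a known ID yields
    exactly the reported string. Ambiguous or missing matches return None.
    """
    matches = [
        model_id
        for model_id in known_ids
        if model_id != reported and model_id.replace(".", "-") == reported
    ]
    return matches[0] if len(matches) == 1 else None

known = ["minimax-m2.7", "kimi-k2.6"]
fixed = recover_model_id("minimax-m2-7", known)
untouched = recover_model_id("minimax-m2.7", known)
```

If this returns an ID, the name got rewritten somewhere between your config and the request, which is your cue to pin the provider and model explicitly as described above.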
The security tradeoff nobody mentions
Now the part I actually want you to remember.
Early on, Hermes ran into a Node.js install conflict on my Mac and just... fixed it. Diagnosed it, sorted out the mess, moved on. My first reaction was delight. My second reaction, about thirty seconds later, was a small cold feeling in my stomach.
Because an agent that can fix your machine can also break it. Unsandboxed, with free run of your filesystem and shell, Hermes is a real security risk. It's not malicious, but it's powerful and autonomous, and "powerful and autonomous with root-ish access to your laptop" is exactly the sentence that should make you pause. I've written before about the danger of skipping permissions on coding agents, and the same logic applies here, except Hermes runs on a schedule when you're not watching.
So harden it before you expose it to anything. Switch the terminal backend to Docker so it's containerized. Use gateway allowlists. If you're running the agent on a VPS or home server and reaching it from your laptop, bind the dashboard to a Tailscale IP and put OAuth in front of it, and never, ever expose a dashboard to the open internet. Tailscale plus OAuth is the clean pattern here and it's not much work.
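One cheap guardrail is to sanity-check the dashboard's bind address before starting anything. The sketch below accepts only loopback addresses and addresses in 100.64.0.0/10, the carrier-grade NAT range Tailscale assigns from. It's a standalone helper of my own, not a Hermes setting, and it deliberately rejects plain LAN addresses to match the Tailscale-only pattern.

```python
import ipaddress

# Tailscale hands out addresses from the CGNAT block.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")

def is_safe_bind(address):
    """True for loopback or Tailscale addresses, False for everything else."""
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or ip in TAILSCALE_RANGE

checks = {
    addr: is_safe_bind(addr)
    for addr in ("127.0.0.1", "100.101.102.103", "0.0.0.0", "192.168.1.10")
}
```

The one to watch for is 0.0.0.0, which listens on every interface and is how dashboards end up on the open internet by accident.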
And here's the bigger-picture thought that the Node incident kicked off for me. Google, Microsoft, and Apple are all going to bake agents directly into their operating systems. When that happens, a complex, less-sandboxed, set-it-up-yourself tool like Hermes becomes obsolete for casual and non-professional users, because they'll get something safer and simpler for free, built into the thing they already use. Hermes's real audience isn't the casual user. It's devs and tinkerers who want control, can manage the risk, and have workflows worth the setup cost. That's not a knock. It's just who this is for.
So is it worth it?
Hermes Agent earns the place on my machine that OpenClaw never did. The compounding skills, the persistent memory, the always-on scheduled autonomy that quietly handles the work I'd otherwise forget. It's a genuinely useful tool and I'm keeping it.
But it's preview-stage and it's rough in real places. The context compression can get lossy on long sessions, the memory is too small, and the provider routing has sharp edges. Treat it as powerful early infrastructure, not a polished product, and you won't be disappointed.
Honest bottom line: if you're a dev running cheap inference and you've got unglamorous, always-on workflows that eat your time, set it up this weekend. If you want a casual assistant that just works with zero fuss, sit tight, because the OS vendors are coming for exactly that lane, and they'll do the babysitting for you.