Hermes Agent Skills Advanced Guide: SKILL.md, GEPA Self-Evolution & Skill Bundles (2026)

If your Hermes Agent still relies on one-shot prompts, you are paying full context cost every session while procedural knowledge never compounds. This guide is for advanced users and team leads building evolvable skill libraries. It covers the agentskills.io SKILL.md format, Progressive Disclosure tiers, Skill Bundles YAML, conditional activation rules, GEPA + DSPy self-evolution ($2–10 per run), Skill Tap publishing, a hosting decision matrix, a five-step Mac cloud Runbook, and five FAQ answers, so your Gateway keeps learning after you close the lid.

[Figure: Hermes Agent Skills architecture, showing SKILL.md, Skill Bundles, the GEPA evolution pipeline, and a Mac cloud Gateway running 7×24]

Pain points: why Skills need a dedicated deep dive

  1. Prompts do not survive sessions or repos. Deployment checklists and PR templates live in chat history. Every new engineer re-pastes the same twelve steps. Skills move procedural knowledge into Git where review applies—but only if you understand SKILL.md routing and Progressive Disclosure.
  2. Loading everything burns context. Dumping all instructions into system prompts costs tokens on every turn. Hermes Skills load on demand via Level 0 descriptions (~3K tokens for the full catalog), yet most teams never tune descriptions or split references—so the wrong Skill fires or none fires at all.
  3. Evolution and uptime are disconnected. GEPA can improve SKILL.md text from execution traces, but if your Gateway sleeps on a laptop or runs on Linux without native macOS tooling, Skill scripts fail silently and evolution data never accumulates. See our Hermes three-layer memory and always-on hardware guide for why uptime compounds Skill value.

1. Why Hermes Skills deserve their own guide

In early 2026, Nous Research open-sourced Hermes Agent. Within two months it surpassed 160k GitHub stars—one of the fastest-growing AI agent projects. The thesis is not a bigger model; it is the agent that grows with you. Skills are the procedural memory layer that makes that growth real: standardized, evolvable, cross-session documents—not disposable prompts.

This post skips install basics. We go straight into SKILL.md authoring, Bundles, conditional activation, community Taps, and GEPA self-evolution—the mechanics that separate a demo Agent from a production skill library.

2. Skills ≠ Memory ≠ Prompts

| Dimension | Prompt | Memory | Skill |
| --- | --- | --- | --- |
| Persistence | Current conversation | Cross-session, permanent | Cross-session, permanent |
| Load timing | Always in context | Injected each session | On demand (key difference) |
| Token cost | Every turn | Small and stable | Near zero until activated (Level 0 entry only) |
| Content type | Any intent | User preferences / facts | Procedural steps (how to do X) |
| Maintained by | User manually | Agent automatically | User + Agent |
| Shareability | Hard to share | Private | Publishable as community Tap |

Mnemonic: Prompt = sticky note (single use). Memory = notebook (always nearby). Skill = SOP manual (pulled when needed).

3. SKILL.md format & Progressive Disclosure

All Hermes Skills follow the agentskills.io open standard—portable across Hermes, Claude Code, and Cursor.

    ---
    name: my-skill
    description: |
      Use when the user needs to [...]. Handles [...] and [...].
    version: 1.0.0
    license: MIT
    compatibility: Requires git, docker
    metadata:
      hermes:
        tags: [devops, automation]
        requires_toolsets: [terminal]
    ---
    # My Skill Title

    ## Procedure
    1. Step one (exact command)
    2. Step two

    ## Common Pitfalls
    - Failure mode + fix

Directory layout under ~/.hermes/skills/:

    ~/.hermes/skills/my-category/my-skill/
    ├── SKILL.md        # Core steps (≤500 lines recommended)
    ├── references/     # Loaded on demand
    ├── templates/
    └── scripts/        # Executed locally; only output enters context

Progressive Disclosure — three loading tiers

| Level | Content | When loaded | Token cost |
| --- | --- | --- | --- |
| Level 0 | name + description | Session start (all skills) | ~3K total catalog |
| Level 1 | Full SKILL.md body | /skill-name or LLM match | Depends on file length |
| Level 2 | references/, scripts/ | During execution | Per file, on demand |

Write descriptions for when, not what. The LLM routes on Level 0 text alone. Vague descriptions cause misfires; precise trigger phrases save tokens downstream.
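To see what the router actually has to work with at Level 0, it can help to pull just the name and description out of a SKILL.md and estimate their token footprint. The sketch below is an illustration only, not Hermes's loader: `read_level0` and `estimate_tokens` are hypothetical helper names, the parser handles only the frontmatter shapes shown in this guide, and the 4-characters-per-token ratio is a rough assumption.

```python
def read_level0(skill_md_text: str) -> dict:
    """Extract name and description from SKILL.md frontmatter.

    Handles only `key: value` and `key: |` followed by indented lines.
    This is not a general YAML parser.
    """
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    current_key = None  # key whose block scalar we are currently collecting
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        if line.startswith((" ", "\t")):
            # Indented line: either a block-scalar continuation or nested metadata.
            if current_key is not None:
                fields[current_key] = (fields[current_key] + " " + line.strip()).strip()
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        current_key = None
        if key in ("name", "description"):
            if value in ("|", ">"):
                fields[key] = ""
                current_key = key
            else:
                fields[key] = value
    return fields


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the default ratio is an assumption."""
    return round(len(text) / chars_per_token)


SAMPLE = """---
name: my-skill
description: |
  Use when the user needs to review a pull request.
  Do NOT use for writing new code.
version: 1.0.0
metadata:
  hermes:
    tags: [devops]
---
# My Skill
"""

level0 = read_level0(SAMPLE)
print(level0["name"], estimate_tokens(level0["description"]))
```

Running something like this across a whole skills directory gives a quick view of which descriptions are vague one-liners and which are eating a disproportionate share of the catalog budget.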

4. Skill Bundles: one command, full workflow

Skill Bundles (2026) pack multiple skills into a single slash command. File location: ~/.hermes/skill-bundles/<slug>.yaml.

    name: backend-dev
    description: |
      Full backend feature workflow — code review, TDD, and PR management.
    skills:
      - github-code-review
      - test-driven-development
      - github-pr-workflow
    instruction: |
      Always write failing tests first before implementation.
      Never push directly to main.

Research session bundle

    name: research-session
    skills:
      - arxiv
      - deep-research
      - plan
      - excalidraw
    instruction: |
      Start every session by checking recent papers on the topic.

Priority rules: Bundle beats a same-named Skill; missing skills are skipped with a warning; Bundles do not modify system prompts (prompt-cache friendly).
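The first two priority rules above can be sketched as a small resolver. This is a hedged illustration of the stated behavior, not Hermes's implementation: `resolve_command` is a hypothetical function, and bundles are represented as plain dicts rather than parsed YAML files.

```python
def resolve_command(name, bundles, installed_skills):
    """Return (skills_to_load, instruction, warnings) for a slash command.

    bundles: dict mapping bundle name -> {"skills": [...], "instruction": "..."}
    installed_skills: set of installed skill names
    """
    warnings = []
    if name in bundles:  # a Bundle beats a same-named Skill
        bundle = bundles[name]
        loaded = []
        for skill in bundle.get("skills", []):
            if skill in installed_skills:
                loaded.append(skill)
            else:
                # Missing skills are skipped with a warning, not treated as fatal.
                warnings.append(f"bundle '{name}': skill '{skill}' not installed, skipped")
        return loaded, bundle.get("instruction", ""), warnings
    if name in installed_skills:
        return [name], "", warnings
    return [], "", [f"no bundle or skill named '{name}'"]


bundles = {
    "backend-dev": {
        "skills": ["github-code-review", "test-driven-development", "github-pr-workflow"],
        "instruction": "Always write failing tests first before implementation.",
    }
}
# A skill that happens to share the bundle's name, plus one skill not installed.
installed = {"backend-dev", "github-code-review", "github-pr-workflow"}

loaded, instruction, warnings = resolve_command("backend-dev", bundles, installed)
print(loaded)
print(warnings)
```

Here the bundle wins over the same-named skill, and test-driven-development is dropped with a warning because it is not installed.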

CLI quick create:

    hermes bundles create backend-dev \
      --skills github-code-review,test-driven-development,github-pr-workflow \
      --instruction "Always write failing tests first"

5. Conditional Activation — four rules

Skills can auto-hide or show based on available toolsets. Configure under metadata.hermes:

    metadata:
      hermes:
        requires_toolsets: [web]
        requires_tools: [web_search]
        fallback_for_toolsets: [browser]
        fallback_for_tools: [browser_navigate]

| Field | Behavior |
| --- | --- |
| requires_toolsets | Hide skill when listed toolsets are missing |
| requires_tools | Hide skill when listed tools are missing |
| fallback_for_toolsets | Hide skill when listed toolsets exist (fallback only) |
| fallback_for_tools | Hide skill when listed tools exist (fallback only) |

Classic pattern: duckduckgo-search sets fallback_for_tools: [web_search]—when Firecrawl/Brave keys activate paid search, the free fallback disappears automatically, saving tokens.
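The four rules read naturally as a single visibility predicate. The sketch below is one plausible interpretation, not Hermes's code: `is_visible` is a hypothetical name, and it assumes that any one missing requirement hides the skill and that any one present fallback target hides it. The source table does not say whether the check is any-of or all-of.

```python
def is_visible(hermes_meta, active_toolsets, active_tools):
    """Decide whether a skill should appear in the catalog.

    hermes_meta: the dict found under metadata.hermes in SKILL.md
    active_toolsets / active_tools: sets of names currently available
    """
    # requires_*: hide when something the skill depends on is missing.
    if any(ts not in active_toolsets for ts in hermes_meta.get("requires_toolsets", [])):
        return False
    if any(t not in active_tools for t in hermes_meta.get("requires_tools", [])):
        return False
    # fallback_for_*: hide when the thing this skill substitutes for exists.
    if any(ts in active_toolsets for ts in hermes_meta.get("fallback_for_toolsets", [])):
        return False
    if any(t in active_tools for t in hermes_meta.get("fallback_for_tools", [])):
        return False
    return True


duckduckgo_meta = {"fallback_for_tools": ["web_search"]}

# No paid search configured: the free fallback stays visible.
print(is_visible(duckduckgo_meta, set(), set()))
# Paid search active: the fallback drops out of the catalog.
print(is_visible(duckduckgo_meta, {"web"}, {"web_search"}))
```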

6. Skills Hub & open-source repos

    hermes skills install official/research/arxiv
    hermes skills install https://example.com/SKILL.md --name my-skill
    hermes skills install github:openai/skills/k8s
    hermes skills tap add github:my-org/my-skills

| Repository | Description | Highlight |
| --- | --- | --- |
| ChuckSRQ/awesome-hermes-skills | Production-grade curated skills | Deep Research, MLOps, Apple integration |
| amanning3390/hermeshub | Community registry with security scan | Prompt-injection detection per skill |
| kevinnft/ai-agent-skills | 191 skills, 28 categories | Cross-agent: Hermes / Claude / Cursor |
| NousResearch/hermes-agent | Official repo | Authoritative built-in skills |

Validate format compliance: skills-ref validate ./my-skill.

7. Publishing your Skill Tap

    my-skills-tap/
    ├── skills.sh.json
    ├── mlops/vllm-deploy/SKILL.md
    └── research/paper-summarizer/SKILL.md

    hermes skills tap add github:your-org/your-skills-tap
    hermes skills tap add github:your-org/private-skills --token $GH_TOKEN
    hermes skills tap update
    hermes skills tap list

Version-control ~/.hermes/skills/ in Git for cross-device sync. After a pull, run hermes skills reset to rebuild the built-ins.

8. GEPA + DSPy self-evolution

GEPA (Genetic-Pareto Prompt Evolution)—ICLR 2026 Oral—lives in hermes-agent-self-evolution. It improves SKILL.md text from execution traces without touching model weights. Cost: $2–10 per optimization run (API only, no GPU).

Five-stage pipeline

  1. Trace collection — SQLite stores full reasoning traces (tool calls, branches, errors).
  2. Reflective failure analysis — LLM generates actionable side information, not just "failed."
  3. Targeted mutation — 10–20 SKILL.md variants per failure root cause.
  4. Multi-objective Pareto evaluation — Optimize success rate × token efficiency × speed.
  5. Human PR review — Best variant opens a PR; ship after approval.

    git clone https://github.com/NousResearch/hermes-agent-self-evolution
    cd hermes-agent-self-evolution && pip install -r requirements.txt
    export HERMES_AGENT_PATH=~/.hermes
    python -m evolution.skills.evolve_skill \
      --skill github-code-review \
      --iterations 10 \
      --eval-source sessiondb
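
Stage 4 of the pipeline is the least intuitive, so here is a generic sketch of Pareto selection over the three objectives named above. It shows the textbook dominance test, not the actual selection logic in hermes-agent-self-evolution. The `Variant` class, its field names, and all numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    success_rate: float  # higher is better
    tokens: int          # lower is better
    seconds: float       # lower is better


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    no_worse = (a.success_rate >= b.success_rate
                and a.tokens <= b.tokens
                and a.seconds <= b.seconds)
    strictly_better = (a.success_rate > b.success_rate
                       or a.tokens < b.tokens
                       or a.seconds < b.seconds)
    return no_worse and strictly_better


def pareto_front(variants):
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


candidates = [
    Variant("baseline", 0.70, 5000, 40.0),
    Variant("terse", 0.72, 3500, 35.0),     # beats baseline on all three
    Variant("thorough", 0.90, 6000, 55.0),  # best success rate, higher cost
    Variant("bloated", 0.85, 7000, 60.0),   # worse than thorough on all three
]

front = pareto_front(candidates)
print([v.name for v in front])
```

Both "terse" and "thorough" survive because neither beats the other on every axis. That trade-off is why a human still picks the variant that opens the PR.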

Four guardrails (all must pass)

  1. Full test suite: pytest tests/ -q at 100%
  2. Size limits: Skills ≤ 15KB; tool descriptions ≤ 500 chars
  3. Prompt-cache compatibility: no mid-session invalidation
  4. Semantic preservation: core purpose unchanged
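
Of the four guardrails, the size limits are the easiest to check locally before spending money on an evolution run. The following is a minimal pre-flight sketch under stated assumptions: `size_violations` is a hypothetical helper, 15KB is taken as 15 × 1024 bytes of UTF-8, and tool descriptions are passed in as a plain dict.

```python
MAX_SKILL_BYTES = 15 * 1024        # assumes 1 KB = 1024 bytes
MAX_TOOL_DESCRIPTION_CHARS = 500


def size_violations(skill_text, tool_descriptions):
    """Return a list of human-readable size-limit problems (empty if none)."""
    problems = []
    size = len(skill_text.encode("utf-8"))  # bytes, not characters
    if size > MAX_SKILL_BYTES:
        problems.append(f"SKILL.md is {size} bytes (limit {MAX_SKILL_BYTES})")
    for tool, description in tool_descriptions.items():
        if len(description) > MAX_TOOL_DESCRIPTION_CHARS:
            problems.append(
                f"tool '{tool}' description is {len(description)} chars "
                f"(limit {MAX_TOOL_DESCRIPTION_CHARS})"
            )
    return problems


ok = size_violations("# Small skill\n1. Do the thing\n", {"web_search": "Search the web."})
too_big = size_violations("x" * 16000, {"web_search": "d" * 501})
print(ok)
print(len(too_big))
```

Measuring bytes rather than characters matters for non-English skills, where a single character can take several bytes in UTF-8.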

Evolution roadmap

| Phase | Target | Engine | Status |
| --- | --- | --- | --- |
| Phase 1 | SKILL.md files | DSPy + GEPA | ✅ Shipped |
| Phase 2 | Tool descriptions | DSPy + GEPA | Planned |
| Phase 3 | System prompt fragments | DSPy + GEPA | Planned |
| Phase 4 | Tool implementation code | Darwinian Evolver | Planned |
| Phase 5 | Fully automated loop | Pipeline | Planned |

9. Plugin-bundled skills

Plugins namespace skills as plugin:skill—hidden from default skills_list, opt-in only, with sibling awareness:

    skill_view("superpowers:writing-plans")
    # Agent: "This plugin also includes: superpowers:editing, superpowers:research"

In plugin.yaml:

    name: my-hermes-plugin
    skills:
      - name: writing-plans
        path: skills/writing-plans/SKILL.md
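
The plugin:skill naming and the sibling hint can be sketched in a few lines. This is an illustration of the behavior described above, not the plugin loader itself: `sibling_skills` is a hypothetical function and the registry is a hand-written dict.

```python
def sibling_skills(qualified_name, plugin_registry):
    """Given 'plugin:skill', return the other skills shipped by the same plugin.

    plugin_registry: dict mapping plugin name -> list of skill names
    """
    plugin, separator, skill = qualified_name.partition(":")
    if not separator or not plugin or not skill:
        raise ValueError(f"expected 'plugin:skill', got '{qualified_name}'")
    names = plugin_registry.get(plugin, [])
    if skill not in names:
        raise KeyError(f"plugin '{plugin}' has no skill '{skill}'")
    # Keep the namespace on each sibling so the agent can call it directly.
    return [f"{plugin}:{name}" for name in names if name != skill]


registry = {"superpowers": ["writing-plans", "editing", "research"]}
siblings = sibling_skills("superpowers:writing-plans", registry)
print("This plugin also includes: " + ", ".join(siblings))
```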

10. Authoring tips & skill_manage

Description precision: Wrong: "Helps with code." Right: "Use when reviewing a pull request… Do NOT use for writing new code."

Pitfalls section separates good Skills from great ones—specific failure modes, root causes, and fixes (rate limits, selector brittleness, token overflow on large diffs).

Size guidance: keep SKILL.md under 500 lines; at 500–1000 lines, split material out to references/; files over 15KB are blocked from GEPA evolution.

Agents can maintain skills programmatically:

    skill_manage(
        action='patch',
        name='github-code-review',
        old_string='Check for obvious bugs',
        new_string='Check for: null pointers, SQL injection, XSS, logic errors'
    )

    # config.yaml approval gate:
    skills:
      agent_writes_require_approval: true
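
A patch expressed as old_string/new_string is only safe if the old text is found unambiguously. The sketch below shows one way such a patch could be applied. It is an assumption about sensible semantics, not a description of how skill_manage works internally, and `apply_patch` is a hypothetical helper.

```python
def apply_patch(text, old_string, new_string):
    """Replace old_string with new_string, requiring exactly one match.

    Refusing zero or multiple matches avoids silently editing the wrong step.
    """
    if not old_string:
        raise ValueError("old_string must not be empty")
    count = text.count(old_string)
    if count != 1:
        raise ValueError(f"expected exactly 1 match for old_string, found {count}")
    return text.replace(old_string, new_string, 1)


body = "## Procedure\n1. Fetch the diff\n2. Check for obvious bugs\n"
patched = apply_patch(
    body,
    "Check for obvious bugs",
    "Check for: null pointers, SQL injection, XSS, logic errors",
)
print(patched)
```

With agent_writes_require_approval enabled, a diff between `body` and `patched` is the natural thing to show the human reviewer.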

11. Blog workflow case study

    # ~/.hermes/skill-bundles/blog-workflow.yaml
    name: blog-workflow
    skills:
      - seo-keyword-research
      - outline-generator
      - code-example-validator
      - bilingual-checker
      - publish-to-platform
    instruction: |
      Always research SEO keywords before writing.
      Ensure all code examples are tested and runnable.

The seo-keyword-research skill uses requires_toolsets: [web] and outputs a keyword matrix (3–5 primary + 10–15 long-tail per language) before any outline work begins—exactly the workflow behind multi-language VPSMAC blog production.

12. Hosting decision matrix: where Skills actually run

| Host | 7×24 uptime | GEPA trace collection | Native macOS / Xcode | Best fit |
| --- | --- | --- | --- | --- |
| Local MacBook | ❌ Lid close drops Gateway | ❌ Gaps in session DB | ✅ | Authoring, short tests |
| Linux VPS | ✅ systemd | ✅ CLI-only skills | ❌ | Text agents, no Apple toolchain |
| VPSMAC Mac cloud | ✅ launchd | ✅ Continuous traces | ✅ Bare-metal SSH | Hermes Gateway + GEPA loop |

13. Five-step Runbook: production Skill library on Mac cloud

Step 1 — Audit and install base skills. Run hermes skills tap add for team Taps; validate with skills-ref validate. Document which Bundles map to which workflows.

Step 2 — Author or patch SKILL.md. Write Level 0 descriptions with trigger phrases; split references/ for anything over 500 lines. Enable agent_writes_require_approval in production.

Step 3 — Create Bundles and conditional rules. hermes bundles create blog-workflow --skills …; set fallback_for_tools for free/paid tool switching.

Step 4 — Deploy Gateway on VPSMAC Mac node. Sync ~/.hermes/skills/ and skill-bundles/ via Git; install Hermes with launchd KeepAlive. Confirm Cron and IM channels stay connected 7×24.

Step 5 — Enable GEPA evolution loop. Point HERMES_AGENT_PATH at the synced directory; run evolve_skill --eval-source sessiondb weekly; review PRs before merge. Backup ~/.hermes before instance changes.
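
For Step 4, the launchd side can be sketched with Python's standard plistlib. The keys used (Label, ProgramArguments, RunAtLoad, KeepAlive, StandardOutPath, StandardErrorPath) are standard launchd job keys. The label, binary path, "gateway" argument, and log path are placeholders and assumptions, since this guide does not state the exact Gateway start command.

```python
import plistlib


def gateway_plist(label, program_arguments, log_path):
    """Build a launchd job definition that restarts the process if it exits."""
    job = {
        "Label": label,
        "ProgramArguments": program_arguments,
        "RunAtLoad": True,    # start when the job is loaded
        "KeepAlive": True,    # relaunch whenever the process exits
        "StandardOutPath": log_path,
        "StandardErrorPath": log_path,
    }
    return plistlib.dumps(job)  # XML plist as bytes


plist_bytes = gateway_plist(
    label="com.example.hermes-gateway",                   # placeholder label
    program_arguments=["/usr/local/bin/hermes", "gateway"],  # hypothetical command
    log_path="/tmp/hermes-gateway.log",                   # placeholder path
)
print(plist_bytes.decode("utf-8"))
```

The output would typically be saved under ~/Library/LaunchAgents/ and loaded with launchctl. Substitute the real start command from your Hermes install before using it.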

14. Citeable technical facts (2026-06)

15. FAQ

Skills vs MCP? Skills teach procedure; MCP supplies tools. Use both—Skills orchestrate MCP calls.

Skill changed but Agent uses old version? Edits take effect only in a new session (/reset) or when the skill is installed with --now, which invalidates the prompt cache.

Are GEPA-evolved Skills safe? Four guardrails + human PR review; still inspect every diff.

Reuse in Claude Code? Copy to ~/.claude/skills/ or use cross-platform install scripts.

Chinese content token cost? ~1–1.5 tokens per character; keep description in English for routing accuracy.
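
The 1–1.5 tokens-per-character figure turns into a quick budget check. The helper below is a back-of-the-envelope sketch: `cjk_token_range` is a hypothetical name, it counts only characters in the basic CJK Unified Ideographs block, and real tokenizer counts will differ by model.

```python
import math


def cjk_token_range(text, low=1.0, high=1.5):
    """Return (min, max) estimated tokens for the CJK characters in text."""
    cjk_chars = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    return math.floor(cjk_chars * low), math.ceil(cjk_chars * high)


description_zh = "代码审查"  # 4 characters
print(cjk_token_range(description_zh))  # (4, 6)
```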

16. Resources

17. Conclusion: Skills compound only when the Gateway keeps running

Laptop authoring, Docker on cheap VPS, and WSL2 can all host Hermes—but each leaves gaps: sleep interrupts trace collection, Linux lacks native Apple tooling for signing and Metal-backed scripts, and local hardware ties GEPA data to one machine with no clean backup story. Skill Bundles and conditional activation save tokens; GEPA turns failures into better SKILL.md text—but only if session databases grow continuously on stable hardware.

For teams treating Skills as infrastructure—not chat tricks—renting a VPSMAC Apple Silicon Mac cloud node delivers launchd 7×24 uptime, Git-synced ~/.hermes, and monthly RAM upgrades without buying new silicon. Ship the skill library on bare-metal macOS; let GEPA iterate while you review PRs—not while you chase uptime.