
OpenClaw Plugins for Local AI Workflows

IAIP Research

Date: April 15, 2026
For: Guillaume Descoteaux-Isabelle
Context: Mac Mini purchase planning for local AI model inference via OpenClaw, alongside existing OpenAI Codex and GitHub Copilot subscriptions


Executive Summary

  • OpenClaw is a modular AI automation agent (~250K GitHub stars) with a provider-plugin architecture. All model providers — local and cloud — are decoupled plugins that can be installed, enabled, and routed independently. It is a general-purpose AI agent framework, not a coding-specific tool. (Source: docs.openclaw.ai/tools/plugin, verdent.ai comparison)
  • @openclaw/ollama-provider is the essential plugin for local AI. It connects to a local Ollama instance, auto-discovers all pulled models via /api/tags, and enables fully offline, private inference on Mac Mini hardware. (Source: OpenClaw Provider Architecture)
  • The GitHub Copilot provider is a bundled, first-class extension (not a community plugin) — it ships in the OpenClaw monorepo at extensions/github-copilot/ and is enabled by default. It gives Guillaume $0 marginal cost access to GPT-4.1, GPT-5 family, Claude Sonnet/Opus, and Gemini models through his Copilot subscription. (Source: OpenClaw monorepo — extensions/github-copilot/models-defaults.ts)
  • Hermes Agent (Nous Research) is a separate Python-based framework, not plugin-compatible with OpenClaw at runtime, but both can share the same Ollama instance simultaneously. A migration tool (hermes claw migrate) enables importing OpenClaw configs. (Source: hermes-agent.nousresearch.com)
  • ⚠️ SECURITY WARNING: 12–20% of community-published plugins on ClawHub have been found to contain malicious code (credential theft, backdoors, malware). Only install verified, official plugins. Run openclaw security audit --deep regularly. (Source: OpenClaw Security Crisis reporting, eastondev.com alert)

What is OpenClaw?

OpenClaw is an open-source, general-purpose AI automation agent framework originally created by Peter Steinberger and launched in November 2025. By March 2026 it had reached ~250,000 GitHub stars, making it one of the most popular open-source AI projects. After Steinberger joined OpenAI in February 2026, the project transitioned to an independent community foundation (with OpenAI providing financial support). (Source: verdent.ai guide, theclawstreetjournal.com)

OpenClaw is not a coding agent (unlike Claude Code or Cursor). It is an AI orchestrator that can:

  • Route requests between multiple AI model providers (local and cloud)
  • Execute multi-step agentic workflows with tool use
  • Manage persistent memory and context across sessions
  • Connect to 25+ messaging platforms (Telegram, Discord, Slack, etc.)
  • Support image, video, and music generation through provider plugins

The architecture is built on Node.js/TypeScript with a modular plugin system where every model provider, tool, and capability is a separate plugin registered through the Plugin SDK.


Plugin-by-Plugin Analysis

Local AI Plugins (Essential)

@openclaw/ollama-provider

Status: Bundled first-class provider — ships with OpenClaw
Type: LOCAL inference
Relevance for Guillaume: ★★★★★ ESSENTIAL

The Ollama provider is the cornerstone of local AI workflows in OpenClaw. It connects to a running Ollama instance (default: http://localhost:11434), queries the /api/tags endpoint for automatic model discovery, and supports streaming responses and tool calling. (Source: OpenClaw Provider Architecture)

What it does:

  • Bridges OpenClaw to any model Ollama can serve — Llama 3.3, Gemma 4, Mistral, Phi-3, CodeLlama, Qwen, DeepSeek, and hundreds more
  • Auto-discovers models: any model pulled via ollama pull becomes immediately available in OpenClaw without configuration changes
  • Supports tool calling for agentic workflows
  • Works fully offline — no internet required after model download
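
Auto-discovery works by polling Ollama's /api/tags endpoint, which returns a JSON list of pulled models. A minimal Python sketch of that discovery step (the payload shape follows Ollama's public API; the OpenClaw-side ID naming is illustrative):

```python
import json

def discover_models(tags_payload: str, provider: str = "ollama") -> list[str]:
    """Parse an Ollama /api/tags response and return provider-prefixed model IDs."""
    data = json.loads(tags_payload)
    # Each entry carries a "name" like "llama3.3:latest"; strip the default tag
    # so IDs read as "ollama/llama3.3", matching the examples in this document.
    ids = []
    for m in data.get("models", []):
        name = m["name"].removesuffix(":latest")
        ids.append(f"{provider}/{name}")
    return ids

# Example payload, shaped like a real /api/tags response
payload = '{"models": [{"name": "llama3.3:latest"}, {"name": "gemma4:latest"}]}'
print(discover_models(payload))  # ['ollama/llama3.3', 'ollama/gemma4']
```

Because discovery is a cheap read of this endpoint, "models": "auto" can refresh the list without restarting anything.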

Configuration:

# 1. Start Ollama and pull models
ollama serve
ollama pull llama3.3
ollama pull gemma4

# 2. Configure OpenClaw
openclaw config set 'models.providers.ollama' --json '{
  "baseUrl": "http://localhost:11434",
  "api": "ollama",
  "models": "auto"
}'

# 3. Set default model
openclaw models set ollama/llama3.3

Key advantage: The "models": "auto" setting means new models pulled via ollama pull are automatically available without restarting or reconfiguring OpenClaw. This makes it trivial to experiment with different models on Mac Mini hardware.

@openclaw/huggingface-provider

Status: Available via plugin install
Type: CLOUD API (not local)
Relevance for Guillaume: ★★☆☆☆ LOW

Important clarification: Despite the name, this plugin wraps the HuggingFace Inference API — a cloud-hosted service at api-inference.huggingface.co. It does NOT load or serve HuggingFace models locally. (Source: OpenClaw docs pattern, HuggingFace API documentation)

For local HuggingFace models: Convert them to GGUF format and serve them through Ollama instead. Once converted, Ollama runs them like any other local model, and the Ollama provider handles the rest.
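
The conversion path is roughly: convert the HuggingFace checkpoint to GGUF with llama.cpp's converter, then register it with Ollama via a Modelfile. A hedged workflow sketch (paths and model names are illustrative; the script and Modelfile directives follow llama.cpp and Ollama conventions):

```
# Convert a HuggingFace checkpoint to GGUF (llama.cpp converter script)
python convert_hf_to_gguf.py ./my-hf-model --outfile my-model.gguf

# Modelfile — tells Ollama where the weights live
cat > Modelfile <<'EOF'
FROM ./my-model.gguf
EOF

# Register with Ollama; the model is then auto-discovered by OpenClaw
ollama create my-model -f Modelfile
ollama run my-model "Hello"
```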

Configuration (cloud only):

openclaw config set 'models.providers.huggingface' --json '{
  "apiKey": "$HUGGINGFACE_API_TOKEN",
  "api": "huggingface-inference"
}'

When to use: Only when you need access to HuggingFace-hosted inference endpoints for models too large to run locally or for specialized models not available in Ollama-compatible format.

vLLM Provider (Bundled)

Status: Bundled — ships with OpenClaw
Type: LOCAL high-performance inference
Relevance for Guillaume: ★★★★☆ HIGH (for production use)

Connects to a local vLLM server via its OpenAI-compatible /v1/models endpoint. vLLM excels at high-concurrency local inference — serving multiple simultaneous requests more efficiently than Ollama. (Source: OpenClaw Provider Architecture blog)

When to use: When Ollama's single-request throughput isn't sufficient, particularly for batch processing or multi-agent OpenClaw setups where several agents query local models simultaneously.

SGLang Provider (Bundled)

Status: Bundled — ships with OpenClaw
Type: LOCAL structured generation
Relevance for Guillaume: ★★★☆☆ MODERATE

Connects to a local SGLang inference server. Specialized for constrained/structured output generation at high speed. (Source: OpenClaw Provider Architecture blog)

When to use: When you need JSON-schema-constrained output or other structured generation that benefits from SGLang's grammar-guided decoding.
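
Grammar-guided decoding means the server only emits tokens that keep the output valid against a schema, so the client never has to retry. For contrast, here is the client-side fallback pattern it replaces — parse, validate required keys, retry — sketched in Python with a stubbed model call (names are illustrative, not an SGLang API):

```python
import json

REQUIRED_KEYS = {"title", "tags"}

def valid(raw: str) -> bool:
    """Accept only parseable JSON objects carrying every required key."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and REQUIRED_KEYS <= obj.keys()

def generate_structured(model_call, max_retries: int = 3) -> dict:
    """Call the model until it yields schema-conforming JSON."""
    for _ in range(max_retries):
        raw = model_call()
        if valid(raw):
            return json.loads(raw)
    raise ValueError("model never produced valid structured output")

# Stub: first attempt is malformed, second conforms
attempts = iter(['{"title": "x"', '{"title": "x", "tags": ["ai"]}'])
print(generate_structured(lambda: next(attempts)))  # {'title': 'x', 'tags': ['ai']}
```

With SGLang the retry loop disappears: invalid continuations are pruned during decoding itself.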


Cloud AI Plugins (Complementary)

@openclaw/github-copilot-provider

Status: Bundled first-class extension — enabled by default
Type: CLOUD (via GitHub Copilot subscription)
Relevance for Guillaume: ★★★★★ ESSENTIAL (he has a Copilot subscription)

Correction from initial research: This is not an unofficial/community integration. The Copilot provider is maintained in the official OpenClaw monorepo at extensions/github-copilot/ and ships enabled by default. (Source: OpenClaw monorepo extensions/github-copilot/openclaw.plugin.json, confirmed by reviewer verification)

What it does: Turns Guillaume's existing GitHub Copilot subscription into a full LLM provider for OpenClaw. Authentication uses a GitHub device-login OAuth flow — no API key needed. All model usage is covered by the subscription at $0 marginal cost (subject to premium request quotas on some models).

Models available (April 2026, corrected and complete):

| Model | Family | Notes |
| --- | --- | --- |
| gpt-4.1 | OpenAI | Current default — unlimited for paid users |
| gpt-4.1-mini | OpenAI | Lighter variant |
| gpt-4.1-nano | OpenAI | Lightest variant |
| gpt-5-mini | OpenAI | Fast, cheap — new GPT-5 family |
| gpt-5.2 | OpenAI | Standard GPT-5 |
| gpt-5.3-codex | OpenAI | Code-optimized |
| gpt-5.4 | OpenAI | Latest flagship |
| claude-sonnet-4.5 | Anthropic | Via Anthropic Messages transport |
| claude-sonnet-4.6 | Anthropic | GA since Feb 17, 2026 |
| claude-opus-4.5 | Anthropic | Advanced reasoning |
| claude-opus-4.6 | Anthropic | Premium reasoning |
| claude-haiku-4.5 | Anthropic | Fast, lightweight |
| o1 | OpenAI | Reasoning model |
| o3-mini | OpenAI | Reasoning model |
| gemini-2.5-pro | Google | Via Copilot |
| gemini-3-flash | Google | Via Copilot |

(Source: GitHub Copilot Docs — Supported Models, reviewer verification April 2026)

⚠️ Note: gpt-4o is deprecated in GitHub Copilot — replaced by gpt-4.1. Do not use gpt-4o in configuration examples. (Source: GitHub blog changelog)

⚠️ Premium request quotas: While all models are "$0 marginal cost," some premium models (Claude Opus, GPT-5.4) consume limited monthly premium request quotas. Pro plan: 300/month; Pro+: 1,500/month. Extras: $0.04/request. GPT-4.1 is unlimited for paid users. (Source: GitHub Copilot Plans & Pricing)
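
The quota math is worth making concrete. Using the figures above (Pro: 300 premium requests/month, $0.04 per extra request), a small sketch:

```python
def monthly_overage_cost(requests_used: int, quota: int = 300,
                         extra_rate: float = 0.04) -> float:
    """USD cost of premium requests beyond the included monthly quota."""
    overage = max(0, requests_used - quota)
    return round(overage * extra_rate, 2)

print(monthly_overage_cost(250))  # 0.0 — within the Pro quota
print(monthly_overage_cost(500))  # 8.0 — 200 extra requests at $0.04
```

Routing routine traffic to unlimited GPT-4.1 (or local Ollama) and reserving premium models for hard tasks keeps this overage near zero.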

Configuration:

# Authenticate with GitHub
openclaw models auth login-github-copilot
# Follow device-login flow: visit URL → enter code → done

# Set as primary model
openclaw models set github-copilot/gpt-4.1

Forward compatibility: The provider has a catch-all mechanism — any unknown model ID is accepted and synthesized dynamically. If GitHub adds new models, you can use them immediately without waiting for an OpenClaw update.
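
The catch-all behaviour can be pictured as a registry lookup that falls through to a synthesized entry. An illustrative sketch (not the provider's actual internals; the metadata fields are assumptions):

```python
KNOWN_MODELS = {
    "gpt-4.1": {"family": "openai", "premium": False},
    "claude-opus-4.6": {"family": "anthropic", "premium": True},
}

def resolve_model(model_id: str) -> dict:
    """Return metadata for a known model, or synthesize a permissive default
    so newly released model IDs work without a registry update."""
    if model_id in KNOWN_MODELS:
        return {"id": model_id, **KNOWN_MODELS[model_id]}
    return {"id": model_id, "family": "unknown", "premium": True,
            "synthesized": True}

print(resolve_model("gpt-4.1"))
print(resolve_model("gpt-6-preview"))  # unknown ID still resolves
```

The trade-off of a catch-all is that typos also "resolve", so a misspelled model ID fails at request time rather than at configuration time.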

@openclaw/google-plugin

Status: Bundled first-class extension — enabled by default
Type: CLOUD (via Google API key)
Relevance for Guillaume: ★★★☆☆ MODERATE (adds multimodal capabilities)

A multi-capability extension providing access to the full Google Gemini ecosystem — not just search. It registers under two provider IDs: google (API key) and google-gemini-cli (OAuth). (Source: OpenClaw monorepo extensions/google/openclaw.plugin.json)

Capabilities: Gemini LLM chat, image generation (Gemini Flash Image), video generation (Veo 3.1), music generation (Lyria 3), media understanding (image/audio/video analysis), and web search via Gemini Grounding.

Configuration:

# Set API key (from Google AI Studio — free tier available)
export GEMINI_API_KEY="your-key-here"
# Or use openclaw onboard:
openclaw onboard --auth-choice gemini-api-key

Free tier limits: 5–15 RPM depending on model (Gemini 2.5 Pro: 5 RPM / 100 RPD; Flash: 10 RPM / 250 RPD). Paid tier unlocks higher throughput. (Source: Google AI Studio pricing, April 2026)

@openclaw/perplexity-plugin

Status: Available via plugin install
Type: CLOUD — web search only (not an LLM provider)
Relevance for Guillaume: ★★★☆☆ MODERATE

Adds structured web search with domain/date filtering (native API) or AI-synthesized answers with citations (OpenRouter/Sonar). Complements local models that have no internet access. (Source: OpenClaw extensions/perplexity/openclaw.plugin.json)

Configuration:

openclaw plugins install @openclaw/perplexity-plugin
openclaw config set 'models.providers.perplexity' --json '{
  "apiKey": "$PERPLEXITY_API_KEY"
}'

Pricing: No free API tier. Pro subscription ($20/month) includes $5 API credits. Token pricing: Sonar budget at $1/M input tokens; Sonar Pro at $3/M input + $15/M output tokens. Search API: $5/1,000 requests. (Source: Perplexity Pricing)
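
To make the token pricing concrete, a small cost sketch using the Sonar Pro rates above ($3/M input, $15/M output):

```python
def sonar_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one Sonar Pro call at $3/M input, $15/M output."""
    return round(input_tokens / 1e6 * 3 + output_tokens / 1e6 * 15, 4)

# A 4K-token query with a 1K-token answer:
print(sonar_pro_cost(4000, 1000))  # 0.027
```

At that rate, the $5 of included API credits covers roughly 185 such calls per month.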

Other Plugins

@openclaw/minimax-provider, @openclaw/moonshot-provider

Status: Available via plugin install
Type: CLOUD — Chinese AI cloud services
Relevance for Guillaume: ★☆☆☆☆ SKIP

Both are cloud-only integrations for Chinese AI models (MiniMax M2.5, Moonshot Kimi K2.5) using OpenAI-compatible API format. They require separate API keys and are primarily relevant for Chinese-language ecosystems. (Source: OpenClaw China providers docs)

Recommendation: Skip unless you have specific needs for Chinese-language AI models or want budget-priced cloud alternatives.

ACPX Runtime

Status: Core OpenClaw subsystem — not a plugin
Acronym expansion: Unconfirmed (some sources suggest "Active Claw Plugin eXecution", but this is not verified in official documentation)

ACPX is OpenClaw's internal plugin execution engine. It is the runtime layer that hosts and manages ALL plugins — including the Ollama provider. (Source: OpenClaw Architecture docs)

Key capabilities:

  • Sandbox isolation: Plugins run in controlled environments preventing unauthorized system access
  • Dynamic lifecycle: Plugins can be loaded, unloaded, updated, and hot-reloaded without restarting OpenClaw
  • Resource management: Enforces memory, CPU, and execution time limits per plugin
  • Security monitoring: Active monitoring with heuristic threat detection (critical given the malicious plugin crisis)

Why it matters for Guillaume: All providers — including local Ollama — run within ACPX's managed runtime. This provides isolation and monitoring, which is especially important given the documented security issues with community plugins.


Multi-Provider Routing

OpenClaw supports running multiple model providers simultaneously with an ordered failover chain. This is not round-robin — it's a priority-based routing system. (Source: OpenClaw model providers documentation)

How it works:

  1. OpenClaw always tries the primary model first
  2. On failure (rate limit, timeout, context overflow), it falls back to the next model in fallbacks
  3. Each provider plugin classifies its own error types via classifyFailoverReason
  4. Cooldown probes prevent hammering a failed provider
  5. Session-override persistence lets a specific conversation stick to a provider mid-session
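
The five steps above reduce to a small priority loop. A Python sketch of the routing logic (hedged: classifyFailoverReason and the cooldown-probe details are simplified to a single timestamp check here):

```python
import time

def route_request(prompt, providers, call, cooldowns, cooldown_s=60.0):
    """Try providers in priority order, skipping any still cooling down
    after a recent failure; record a cooldown timestamp on each failure."""
    now = time.monotonic()
    for model_id in providers:
        if now - cooldowns.get(model_id, -cooldown_s) < cooldown_s:
            continue  # provider recently failed; don't hammer it
        try:
            return model_id, call(model_id, prompt)
        except Exception:  # stand-in for classifyFailoverReason()
            cooldowns[model_id] = now
    raise RuntimeError("all providers failed or cooling down")

chain = ["github-copilot/gpt-4.1", "ollama/llama3.3"]

def fake_call(model_id, prompt):
    if "copilot" in model_id:
        raise TimeoutError("rate limited")  # simulate primary failure
    return f"{model_id}: ok"

used, reply = route_request("hi", chain, fake_call, cooldowns={})
print(used)  # ollama/llama3.3
```

Because the cooldown map persists between calls, a flapping primary is skipped outright until its window expires, rather than retried on every request.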

Recommended multi-provider configuration for Guillaume:

{
  agents: {
    defaults: {
      // LLM: Copilot primary (GPT-4.1, unlimited for paid users)
      // Ollama as free local fallback
      model: {
        primary: "github-copilot/gpt-4.1",
        fallbacks: ["github-copilot/gpt-5.2", "ollama/gemma4", "ollama/llama3.3"]
      },
      // Memory embeddings: Copilot (free with subscription)
      memorySearch: {
        provider: "github-copilot",
        model: "text-embedding-3-small"
      }
    }
  },
  // Provider configurations
  models: {
    providers: {
      ollama: {
        baseUrl: "http://localhost:11434",
        api: "ollama",
        models: "auto"
      }
    },
    // Privacy-based routing policy
    routing: {
      policy: {
        sensitive: "ollama/*",     // Private/sensitive data stays local
        general: "github-copilot/gpt-4.1"  // General tasks use Copilot
      }
    }
  }
}
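
The routing policy above maps a sensitivity label to a provider pattern. A minimal sketch of how such a policy could be evaluated (fnmatch stands in for OpenClaw's actual glob matching, which is not documented here; the available-model list is illustrative):

```python
from fnmatch import fnmatch

POLICY = {
    "sensitive": "ollama/*",              # private data stays local
    "general": "github-copilot/gpt-4.1",  # everything else uses Copilot
}

AVAILABLE = ["ollama/llama3.3", "ollama/gemma4", "github-copilot/gpt-4.1"]

def pick_model(label: str) -> str:
    """Resolve a routing label to the first available model matching its pattern."""
    pattern = POLICY.get(label, POLICY["general"])
    for model in AVAILABLE:
        if fnmatch(model, pattern):
            return model
    raise LookupError(f"no available model matches {pattern!r}")

print(pick_model("sensitive"))  # ollama/llama3.3
print(pick_model("general"))   # github-copilot/gpt-4.1
```

The key property is that "ollama/*" can never resolve to a cloud provider, so sensitive requests stay on the Mac Mini even as the local model list changes.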

Capability-specific routing:

  • LLM inference: One model per request, failover chain spans providers
  • Web search: Google (Gemini Grounding) or Perplexity — configured separately via tools.web.search.provider
  • Image generation: Google plugin provides this; not available through Ollama
  • Embeddings: Copilot (priority 15, free), Ollama (local), or OpenAI — auto-detected in priority order

Hermes Compatibility

Is Hermes a Different System?

Yes. Hermes Agent is a completely separate AI agent framework built by Nous Research, released in February 2026. It is written in Python (vs. OpenClaw's Node.js/TypeScript) and focuses on self-improving agents that learn from their own task execution. (Source: hermes-agent.nousresearch.com, NousResearch GitHub)

| Dimension | OpenClaw | Hermes Agent |
| --- | --- | --- |
| Developer | Community Foundation (ex-Steinberger) | Nous Research |
| Language | Node.js / TypeScript | Python |
| Philosophy | Gateway/orchestrator | Self-improving agent |
| Plugin format | openclaw.plugin.json + TypeScript modules | plugin.yaml + Python modules + lifecycle hooks |
| Plugin registry | ClawHub marketplace | agentskills.io + pip distribution |
| Skills | Human-authored SKILL.md | Auto-generated from successful workflows + SKILL.md |
| Built-in tools | Varies by plugins | 40+ built-in tools |
| Security | File-based auth (⚠️ security concerns) | Hardened container isolation, zero telemetry, OAuth 2.1 |
| Learning | Static — no self-improvement | Continuous via replay, RL, skill extraction |
| Model support | Via provider plugins | 200+ LLM models |

(Source: vibesparking.com comparison, petronellatech.com)

Can They Share Ollama?

Yes. Both OpenClaw and Hermes can connect to the same Ollama instance running on the Mac Mini. Ollama handles concurrent requests natively, so both frameworks can query models simultaneously without conflict. They run as separate processes with separate memory/context — no shared state.

Plugin Cross-Compatibility

Plugins are NOT cross-compatible at runtime. OpenClaw's TypeScript plugin modules will not run in Hermes's Python environment, and vice versa.

What IS portable:

  • SKILL.md format — both projects use Markdown-based skill specifications, enabling partial skill portability
  • Model provider configurations — API keys and Ollama endpoints work with both
  • Migration tools — Hermes provides hermes claw migrate to import OpenClaw settings, memories, skills, and API keys. Markdown-defined skills transfer cleanly; complex TypeScript plugins require manual Python rewrites. (Source: Hermes migration docs)
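
Since both projects read Markdown skill files, the portable core is just frontmatter plus body. A sketch of splitting a SKILL.md into those parts (the exact frontmatter keys each framework expects are assumptions here):

```python
def parse_skill(text: str) -> tuple[dict, str]:
    """Split a SKILL.md into (frontmatter dict, markdown body).
    Frontmatter is the leading '---'-delimited block of 'key: value' lines."""
    meta: dict[str, str] = {}
    body = text
    if text.startswith("---"):
        header, _, body = text[3:].partition("\n---")
        for line in header.strip().splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, body.strip()

skill = """---
name: summarize-notes
model: ollama/llama3.3
---
Summarize the user's notes in three bullet points."""

meta, body = parse_skill(skill)
print(meta["name"])  # summarize-notes
```

Anything beyond this — TypeScript tool code, lifecycle hooks — is what hermes claw migrate cannot translate automatically.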

Running Both Together

Many advanced users run both:

  • OpenClaw as the messaging gateway and workflow orchestrator
  • Hermes as a specialist agent that learns and improves at specific tasks
  • Both connect to the same local Ollama instance
  • They maintain completely separate runtime environments, memory, and context

Recommended Plugin Stack for Guillaume

Priority 0 — Install First

| Plugin | Why |
| --- | --- |
| @openclaw/ollama-provider (bundled) | Core local inference — connects to all local models on the Mac Mini |
| GitHub Copilot provider (bundled, enabled by default) | $0 access to GPT-4.1, GPT-5 family, Claude, and Gemini via existing subscription |
| memory-core (bundled) | Local persistent memory for agent context |

Priority 1 — Recommended

| Plugin | Why |
| --- | --- |
| OpenAI provider (bundled) | Leverages existing Codex subscription for direct OpenAI API access |
| @openclaw/google-plugin (bundled) | Multimodal capabilities — image/video/music generation, Gemini LLM, web search grounding. Free tier available |

Priority 2 — Optional

| Plugin | Why |
| --- | --- |
| @openclaw/perplexity-plugin | Search-grounded answers when local models lack web context. Requires $20/mo subscription |
| vLLM provider (bundled) | Higher-performance local inference for multi-agent workloads |
| LanceDB memory plugin | Local vector memory stored on disk — runs entirely locally |

Not Needed

| Plugin | Why Skip |
| --- | --- |
| @openclaw/huggingface-provider | Cloud API only — use Ollama for local HF models |
| @openclaw/minimax-provider | Chinese cloud AI — irrelevant for Guillaume's use case |
| @openclaw/moonshot-provider | Chinese cloud AI — irrelevant for Guillaume's use case |

Security Recommendations

Given the documented security crisis with OpenClaw community plugins:

  1. Only install bundled/official plugins from the OpenClaw monorepo
  2. Never install unvetted community plugins from ClawHub without thorough code review
  3. Run openclaw security audit --deep after any plugin installation
  4. Rotate all API keys if you suspect compromise
  5. Keep OpenClaw updated to the latest version for security patches

(Source: OpenClaw Security Guide, OpenClaw Security Crisis)


Sources

Official Documentation & Source Code

  1. OpenClaw Plugin Docs — https://docs.openclaw.ai/tools/plugin
  2. OpenClaw Plugin Architecture (DeepWiki) — https://deepwiki.com/openclaw/openclaw/5.1-plugin-architecture
  3. OpenClaw GitHub Repository — https://github.com/openclaw/openclaw (~250K stars)
  4. Hermes Agent (Nous Research) — https://hermes-agent.nousresearch.com/
  5. Hermes Agent GitHub — https://github.com/NousResearch/hermes-agent
  6. GitHub Copilot Supported Models — https://docs.github.com/en/copilot/reference/ai-models/supported-models
  7. GitHub Copilot Plans & Pricing — https://github.com/features/copilot/plans

Security Sources

  1. OpenClaw Security Crisis Report — https://openclaw.nasseroumer.com/blog/openclaw-security-crisis-2026/
  2. 800+ Malicious Plugins Alert — https://eastondev.com/blog/en/posts/ai/20260227-openclaw-security-alert/
  3. OpenClaw Security Guide (CVEs) — https://www.bitdoze.com/openclaw-security-guide/
  4. OpenClaw Security (Adversa AI) — https://adversa.ai/blog/openclaw-security-101-vulnerabilities-hardening-2026/

Comparison & Analysis

  1. Verdent.ai: Claw Code vs Claude Code vs OpenClaw — https://www.verdent.ai/guides/claw-code-claude-code-vs-openclaw
  2. OpenClaw vs Hermes Agent Deep Comparison — https://www.vibesparking.com/en/blog/ai/openclaw/2026-04-09-openclaw-vs-hermes-agent-deep-comparison/
  3. Hermes Migration from OpenClaw — https://hermes-agent.nousresearch.com/docs/guides/migrate-from-openclaw/
  4. OpenClaw Provider Architecture Blog — https://cheesecat.net/blog/2026-03-14-provider-architecture/
  5. OpenClaw China Providers — https://openclawcn.com/en/docs/providers/china/
  6. GPT-4o Deprecation in Copilot — https://github.blog/changelog/2026-01-13-upcoming-deprecation-of-select-github-copilot-models-from-claude-and-openai/

Pricing Sources

  1. Perplexity Pricing 2026 — https://screenapp.io/blog/perplexity-pricing
  2. Google Gemini API Free Tier — https://findskill.ai/blog/gemini-api-pricing-guide/
  3. Anthropic OAuth Block (April 4, 2026) — Confirmed via dev.to, TechCrunch, aitoolsrecap.com

Final document compiled April 15, 2026. All reviewer corrections incorporated. All claims sourced.