
OpenClaw Plugins for Local AI Workflows

IAIP Research

Date: April 15, 2026
For: Guillaume Descoteaux-Isabelle
Context: Mac Mini purchase planning for local AI model inference via OpenClaw, alongside existing OpenAI Codex and GitHub Copilot subscriptions


Executive Summary

  • OpenClaw is a modular AI automation agent (~250K GitHub stars) with a provider-plugin architecture. All model providers — local and cloud — are decoupled plugins that can be installed, enabled, and routed independently. It is a general-purpose AI agent framework, not a coding-specific tool. (Source: docs.openclaw.ai/tools/plugin, verdent.ai comparison)
  • @openclaw/ollama-provider is the essential plugin for local AI. It connects to a local Ollama instance, auto-discovers all pulled models via /api/tags, and enables fully offline, private inference on Mac Mini hardware. (Source: OpenClaw Provider Architecture)
  • The GitHub Copilot provider is a bundled, first-class extension (not a community plugin) — it ships in the OpenClaw monorepo at extensions/github-copilot/ and is enabled by default. It gives Guillaume $0 marginal cost access to GPT-4.1, GPT-5 family, Claude Sonnet/Opus, and Gemini models through his Copilot subscription. (Source: OpenClaw monorepo — extensions/github-copilot/models-defaults.ts)
  • Hermes Agent (Nous Research) is a separate Python-based framework, not plugin-compatible with OpenClaw at runtime, but both can share the same Ollama instance simultaneously. A migration tool (hermes claw migrate) enables importing OpenClaw configs. (Source: hermes-agent.nousresearch.com)
  • ⚠️ SECURITY WARNING: 12–20% of community-published plugins on ClawHub have been found to contain malicious code (credential theft, backdoors, malware). Only install verified, official plugins. Run openclaw security audit --deep regularly. (Source: OpenClaw Security Crisis reporting, eastondev.com alert)

What is OpenClaw?

OpenClaw is an open-source, general-purpose AI automation agent framework originally created by Peter Steinberger and launched in November 2025. By March 2026 it had reached ~250,000 GitHub stars, making it one of the most popular open-source AI projects. After Steinberger joined OpenAI in February 2026, the project transitioned to an independent community foundation (with OpenAI providing financial support). (Source: verdent.ai guide, theclawstreetjournal.com)

OpenClaw is not a coding agent (unlike Claude Code or Cursor). It is an AI orchestrator that can:

  • Route requests between multiple AI model providers (local and cloud)
  • Execute multi-step agentic workflows with tool use
  • Manage persistent memory and context across sessions
  • Connect to 25+ messaging platforms (Telegram, Discord, Slack, etc.)
  • Support image, video, and music generation through provider plugins

The architecture is built on Node.js/TypeScript with a modular plugin system where every model provider, tool, and capability is a separate plugin registered through the Plugin SDK.


Plugin-by-Plugin Analysis

Local AI Plugins (Essential)

@openclaw/ollama-provider

Status: Bundled first-class provider — ships with OpenClaw
Type: LOCAL inference
Relevance for Guillaume: ★★★★★ ESSENTIAL

The Ollama provider is the cornerstone of local AI workflows in OpenClaw. It connects to a running Ollama instance (default: http://localhost:11434), queries the /api/tags endpoint for automatic model discovery, and supports streaming responses and tool calling. (Source: OpenClaw Provider Architecture)

What it does:

  • Bridges OpenClaw to any model Ollama can serve — Llama 3.3, Gemma 4, Mistral, Phi-3, CodeLlama, Qwen, DeepSeek, and hundreds more
  • Auto-discovers models: any model pulled via ollama pull becomes immediately available in OpenClaw without configuration changes
  • Supports tool calling for agentic workflows
  • Works fully offline — no internet required after model download
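
Auto-discovery works by polling Ollama's /api/tags endpoint, which returns a JSON list of pulled models. A minimal Python sketch of that discovery step (the payload shape follows Ollama's public API; the OpenClaw-side ID naming is illustrative):

```python
import json

def discover_models(tags_payload: str, provider: str = "ollama") -> list[str]:
    """Parse an Ollama /api/tags response and return provider-prefixed model IDs."""
    data = json.loads(tags_payload)
    # Each entry carries a "name" like "llama3.3:latest"; strip the default tag
    # so IDs read as "ollama/llama3.3", matching the examples in this document.
    ids = []
    for m in data.get("models", []):
        name = m["name"].removesuffix(":latest")
        ids.append(f"{provider}/{name}")
    return ids

# Example payload, shaped like a real /api/tags response
payload = '{"models": [{"name": "llama3.3:latest"}, {"name": "gemma4:latest"}]}'
print(discover_models(payload))  # ['ollama/llama3.3', 'ollama/gemma4']
```

Because discovery is a cheap read of this endpoint, "models": "auto" can refresh the list without restarting anything.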

Configuration:

# 1. Start Ollama and pull models
ollama serve
ollama pull llama3.3
ollama pull gemma4

# 2. Configure OpenClaw
openclaw config set 'models.providers.ollama' --json '{
  "baseUrl": "http://localhost:11434",
  "api": "ollama",
  "models": "auto"
}'

# 3. Set default model
openclaw models set ollama/llama3.3

Key advantage: The "models": "auto" setting means new models pulled via ollama pull are automatically available without restarting or reconfiguring OpenClaw. This makes it trivial to experiment with different models on Mac Mini hardware.

@openclaw/huggingface-provider

Status: Available via plugin install
Type: CLOUD API (not local)
Relevance for Guillaume: ★★☆☆☆ LOW

Important clarification: Despite the name, this plugin wraps the HuggingFace Inference API — a cloud-hosted service at api-inference.huggingface.co. It does NOT load or serve HuggingFace models locally. (Source: OpenClaw docs pattern, HuggingFace API documentation)

For local HuggingFace models: Convert them to GGUF format and serve them through Ollama instead. Once converted, Ollama runs them like any other local model, and the Ollama provider handles the rest.
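
The conversion path is roughly: convert the HuggingFace checkpoint to GGUF with llama.cpp's converter, then register it with Ollama via a Modelfile. A hedged workflow sketch (paths and model names are illustrative; the script and Modelfile directives follow llama.cpp and Ollama conventions):

```
# Convert a HuggingFace checkpoint to GGUF (llama.cpp converter script)
python convert_hf_to_gguf.py ./my-hf-model --outfile my-model.gguf

# Modelfile — tells Ollama where the weights live
cat > Modelfile <<'EOF'
FROM ./my-model.gguf
EOF

# Register with Ollama; the model is then auto-discovered by OpenClaw
ollama create my-model -f Modelfile
ollama run my-model "Hello"
```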

Configuration (cloud only):

openclaw config set 'models.providers.huggingface' --json '{
  "apiKey": "$HUGGINGFACE_API_TOKEN",
  "api": "huggingface-inference"
}'

When to use: Only when you need access to HuggingFace-hosted inference endpoints for models too large to run locally or for specialized models not available in Ollama-compatible format.

vLLM Provider (Bundled)

Status: Bundled — ships with OpenClaw
Type: LOCAL high-performance inference
Relevance for Guillaume: ★★★★☆ HIGH (for production use)

Connects to a local vLLM server via its OpenAI-compatible /v1/models endpoint. vLLM excels at high-concurrency local inference — serving multiple simultaneous requests more efficiently than Ollama. (Source: OpenClaw Provider Architecture blog)

When to use: When Ollama's single-request throughput isn't sufficient, particularly for batch processing or multi-agent OpenClaw setups where several agents query local models simultaneously.

SGLang Provider (Bundled)

Status: Bundled — ships with OpenClaw
Type: LOCAL structured generation
Relevance for Guillaume: ★★★☆☆ MODERATE

Connects to a local SGLang inference server. Specialized for constrained/structured output generation at high speed. (Source: OpenClaw Provider Architecture blog)

When to use: When you need JSON-schema-constrained output or other structured generation that benefits from SGLang's grammar-guided decoding.
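
Grammar-guided decoding means the server only emits tokens that keep the output valid against a schema, so the client never has to retry. For contrast, here is the client-side fallback pattern it replaces — parse, validate required keys, retry — sketched in Python with a stubbed model call (names are illustrative, not an SGLang API):

```python
import json

REQUIRED_KEYS = {"title", "tags"}

def valid(raw: str) -> bool:
    """Accept only parseable JSON objects carrying every required key."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and REQUIRED_KEYS <= obj.keys()

def generate_structured(model_call, max_retries: int = 3) -> dict:
    """Call the model until it yields schema-conforming JSON."""
    for _ in range(max_retries):
        raw = model_call()
        if valid(raw):
            return json.loads(raw)
    raise ValueError("model never produced valid structured output")

# Stub: first attempt is malformed, second conforms
attempts = iter(['{"title": "x"', '{"title": "x", "tags": ["ai"]}'])
print(generate_structured(lambda: next(attempts)))  # {'title': 'x', 'tags': ['ai']}
```

With SGLang the retry loop disappears: invalid continuations are pruned during decoding itself.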


Cloud AI Plugins (Complementary)

@openclaw/github-copilot-provider

Status: Bundled first-class extension — enabled by default
Type: CLOUD (via GitHub Copilot subscription)
Relevance for Guillaume: ★★★★★ ESSENTIAL (he has a Copilot subscription)

Correction from initial research: This is not an unofficial/community integration. The Copilot provider is maintained in the official OpenClaw monorepo at extensions/github-copilot/ and ships enabled by default. (Source: OpenClaw monorepo extensions/github-copilot/openclaw.plugin.json, confirmed by reviewer verification)

What it does: Turns Guillaume's existing GitHub Copilot subscription into a full LLM provider for OpenClaw. Authentication uses a GitHub device-login OAuth flow — no API key needed. All model usage is covered by the subscription at $0 marginal cost (subject to premium request quotas on some models).

Models available (April 2026, corrected and complete):

| Model | Family | Notes |
| --- | --- | --- |
| gpt-4.1 | OpenAI | Current default — unlimited for paid users |
| gpt-4.1-mini | OpenAI | Lighter variant |
| gpt-4.1-nano | OpenAI | Lightest variant |
| gpt-5-mini | OpenAI | Fast, cheap — new GPT-5 family |
| gpt-5.2 | OpenAI | Standard GPT-5 |
| gpt-5.3-codex | OpenAI | Code-optimized |
| gpt-5.4 | OpenAI | Latest flagship |
| claude-sonnet-4.5 | Anthropic | Via Anthropic Messages transport |
| claude-sonnet-4.6 | Anthropic | GA since Feb 17, 2026 |
| claude-opus-4.5 | Anthropic | Advanced reasoning |
| claude-opus-4.6 | Anthropic | Premium reasoning |
| claude-haiku-4.5 | Anthropic | Fast, lightweight |
| o1 | OpenAI | Reasoning model |
| o3-mini | OpenAI | Reasoning model |
| gemini-2.5-pro | Google | Via Copilot |
| gemini-3-flash | Google | Via Copilot |

(Source: GitHub Copilot Docs — Supported Models, reviewer verification April 2026)

⚠️ Note: gpt-4o is deprecated in GitHub Copilot — replaced by gpt-4.1. Do not use gpt-4o in configuration examples. (Source: GitHub blog changelog)

⚠️ Premium request quotas: While all models are "$0 marginal cost," some premium models (Claude Opus, GPT-5.4) consume limited monthly premium request quotas. Pro plan: 300/month; Pro+: 1,500/month. Extras: $0.04/request. GPT-4.1 is unlimited for paid users. (Source: GitHub Copilot Plans & Pricing)
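
The quota math is worth making concrete. Using the figures above (Pro: 300 premium requests/month, $0.04 per extra request), a small sketch:

```python
def monthly_overage_cost(requests_used: int, quota: int = 300,
                         extra_rate: float = 0.04) -> float:
    """USD cost of premium requests beyond the included monthly quota."""
    overage = max(0, requests_used - quota)
    return round(overage * extra_rate, 2)

print(monthly_overage_cost(250))  # 0.0 — within the Pro quota
print(monthly_overage_cost(500))  # 8.0 — 200 extra requests at $0.04
```

Routing routine traffic to unlimited GPT-4.1 (or local Ollama) and reserving premium models for hard tasks keeps this overage near zero.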

Configuration:

# Authenticate with GitHub
openclaw models auth login-github-copilot
# Follow device-login flow: visit URL → enter code → done

# Set as primary model
openclaw models set github-copilot/gpt-4.1

Forward compatibility: The provider has a catch-all mechanism — any unknown model ID is accepted and synthesized dynamically. If GitHub adds new models, you can use them immediately without waiting for an OpenClaw update.
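
The catch-all behaviour can be pictured as a registry lookup that falls through to a synthesized entry. An illustrative sketch (not the provider's actual internals; the metadata fields are assumptions):

```python
KNOWN_MODELS = {
    "gpt-4.1": {"family": "openai", "premium": False},
    "claude-opus-4.6": {"family": "anthropic", "premium": True},
}

def resolve_model(model_id: str) -> dict:
    """Return metadata for a known model, or synthesize a permissive default
    so newly released model IDs work without a registry update."""
    if model_id in KNOWN_MODELS:
        return {"id": model_id, **KNOWN_MODELS[model_id]}
    return {"id": model_id, "family": "unknown", "premium": True,
            "synthesized": True}

print(resolve_model("gpt-4.1"))
print(resolve_model("gpt-6-preview"))  # unknown ID still resolves
```

The trade-off of a catch-all is that typos also "resolve", so a misspelled model ID fails at request time rather than at configuration time.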

@openclaw/google-plugin

Status: Bundled first-class extension — enabled by default
Type: CLOUD (via Google API key)
Relevance for Guillaume: ★★★☆☆ MODERATE (adds multimodal capabilities)

A multi-capability extension providing access to the full Google Gemini ecosystem — not just search. It registers under two provider IDs: google (API key) and google-gemini-cli (OAuth). (Source: OpenClaw monorepo extensions/google/openclaw.plugin.json)

Capabilities: Gemini LLM chat, image generation (Gemini Flash Image), video generation (Veo 3.1), music generation (Lyria 3), media understanding (image/audio/video analysis), and web search via Gemini Grounding.

Configuration:

# Set API key (from Google AI Studio — free tier available)
export GEMINI_API_KEY="your-key-here"
# Or use openclaw onboard:
openclaw onboard --auth-choice gemini-api-key

Free tier limits: 5–15 RPM depending on model (Gemini 2.5 Pro: 5 RPM / 100 RPD; Flash: 10 RPM / 250 RPD). Paid tier unlocks higher throughput. (Source: Google AI Studio pricing, April 2026)

@openclaw/perplexity-plugin

Status: Available via plugin install
Type: CLOUD — web search only (not an LLM provider)
Relevance for Guillaume: ★★★☆☆ MODERATE

Adds structured web search with domain/date filtering (native API) or AI-synthesized answers with citations (OpenRouter/Sonar). Complements local models that have no internet access. (Source: OpenClaw extensions/perplexity/openclaw.plugin.json)

Configuration:

openclaw plugins install @openclaw/perplexity-plugin
openclaw config set 'models.providers.perplexity' --json '{
  "apiKey": "$PERPLEXITY_API_KEY"
}'

Pricing: No free API tier. Pro subscription ($20/month) includes $5 API credits. Token pricing: Sonar budget at $1/M input tokens; Sonar Pro at $3/M input + $15/M output tokens. Search API: $5/1,000 requests. (Source: Perplexity Pricing)
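
To make the token pricing concrete, a small cost sketch using the Sonar Pro rates above ($3/M input, $15/M output):

```python
def sonar_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one Sonar Pro call at $3/M input, $15/M output."""
    return round(input_tokens / 1e6 * 3 + output_tokens / 1e6 * 15, 4)

# A 4K-token query with a 1K-token answer:
print(sonar_pro_cost(4000, 1000))  # 0.027
```

At that rate, the $5 of included API credits covers roughly 185 such calls per month.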

Other Plugins

@openclaw/minimax-provider, @openclaw/moonshot-provider

Status: Available via plugin install
Type: CLOUD — Chinese AI cloud services
Relevance for Guillaume: ★☆☆☆☆ SKIP

Both are cloud-only integrations for Chinese AI models (MiniMax M2.5, Moonshot Kimi K2.5) using OpenAI-compatible API format. They require separate API keys and are primarily relevant for Chinese-language ecosystems. (Source: OpenClaw China providers docs)

Recommendation: Skip unless you have specific needs for Chinese-language AI models or want budget-priced cloud alternatives.

ACPX Runtime

Status: Core OpenClaw subsystem — not a plugin
Acronym expansion: Unconfirmed (some sources suggest "Active Claw Plugin eXecution", but this is not verified in official documentation)

ACPX is OpenClaw's internal plugin execution engine. It is the runtime layer that hosts and manages ALL plugins — including the Ollama provider. (Source: OpenClaw Architecture docs)

Key capabilities:

  • Sandbox isolation: Plugins run in controlled environments preventing unauthorized system access
  • Dynamic lifecycle: Plugins can be loaded, unloaded, updated, and hot-reloaded without restarting OpenClaw
  • Resource management: Enforces memory, CPU, and execution time limits per plugin
  • Security monitoring: Active monitoring with heuristic threat detection (critical given the malicious plugin crisis)

Why it matters for Guillaume: All providers — including local Ollama — run within ACPX's managed runtime. This provides isolation and monitoring, which is especially important given the documented security issues with community plugins.


Multi-Provider Routing

OpenClaw supports running multiple model providers simultaneously with an ordered failover chain. This is not round-robin — it's a priority-based routing system. (Source: OpenClaw model providers documentation)

How it works:

  1. OpenClaw always tries the primary model first
  2. On failure (rate limit, timeout, context overflow), it falls back to the next model in fallbacks
  3. Each provider plugin classifies its own error types via classifyFailoverReason
  4. Cooldown probes prevent hammering a failed provider
  5. Session-override persistence lets a specific conversation stick to a provider mid-session
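
The five steps above reduce to a small priority loop. A Python sketch of the routing logic (hedged: classifyFailoverReason and the cooldown-probe details are simplified to a single timestamp check here):

```python
import time

def route_request(prompt, providers, call, cooldowns, cooldown_s=60.0):
    """Try providers in priority order, skipping any still cooling down
    after a recent failure; record a cooldown timestamp on each failure."""
    now = time.monotonic()
    for model_id in providers:
        if now - cooldowns.get(model_id, -cooldown_s) < cooldown_s:
            continue  # provider recently failed; don't hammer it
        try:
            return model_id, call(model_id, prompt)
        except Exception:  # stand-in for classifyFailoverReason()
            cooldowns[model_id] = now
    raise RuntimeError("all providers failed or cooling down")

chain = ["github-copilot/gpt-4.1", "ollama/llama3.3"]

def fake_call(model_id, prompt):
    if "copilot" in model_id:
        raise TimeoutError("rate limited")  # simulate primary failure
    return f"{model_id}: ok"

used, reply = route_request("hi", chain, fake_call, cooldowns={})
print(used)  # ollama/llama3.3
```

Because the cooldown map persists between calls, a flapping primary is skipped outright until its window expires, rather than retried on every request.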

Recommended multi-provider configuration for Guillaume:

{
  agents: {
    defaults: {
      // LLM: Copilot primary (GPT-4.1, unlimited for paid users)
      // Ollama as free local fallback
      model: {
        primary: "github-copilot/gpt-4.1",
        fallbacks: ["github-copilot/gpt-5.2", "ollama/gemma4", "ollama/llama3.3"]
      },
      // Memory embeddings: Copilot (free with subscription)
      memorySearch: {
        provider: "github-copilot",
        model: "text-embedding-3-small"
      }
    }
  },
  // Provider configurations
  models: {
    providers: {
      ollama: {
        baseUrl: "http://localhost:11434",
        api: "ollama",
        models: "auto"
      }
    },
    // Privacy-based routing policy
    routing: {
      policy: {
        sensitive: "ollama/*",     // Private/sensitive data stays local
        general: "github-copilot/gpt-4.1"  // General tasks use Copilot
      }
    }
  }
}
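
The routing policy above maps a sensitivity label to a provider pattern. A minimal sketch of how such a policy could be evaluated (fnmatch stands in for OpenClaw's actual glob matching, which is not documented here; the available-model list is illustrative):

```python
from fnmatch import fnmatch

POLICY = {
    "sensitive": "ollama/*",              # private data stays local
    "general": "github-copilot/gpt-4.1",  # everything else uses Copilot
}

AVAILABLE = ["ollama/llama3.3", "ollama/gemma4", "github-copilot/gpt-4.1"]

def pick_model(label: str) -> str:
    """Resolve a routing label to the first available model matching its pattern."""
    pattern = POLICY.get(label, POLICY["general"])
    for model in AVAILABLE:
        if fnmatch(model, pattern):
            return model
    raise LookupError(f"no available model matches {pattern!r}")

print(pick_model("sensitive"))  # ollama/llama3.3
print(pick_model("general"))   # github-copilot/gpt-4.1
```

The key property is that "ollama/*" can never resolve to a cloud provider, so sensitive requests stay on the Mac Mini even as the local model list changes.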

Capability-specific routing:

  • LLM inference: One model per request, failover chain spans providers
  • Web search: Google (Gemini Grounding) or Perplexity — configured separately via tools.web.search.provider
  • Image generation: Google plugin provides this; not available through Ollama
  • Embeddings: Copilot (priority 15, free), Ollama (local), or OpenAI — auto-detected in priority order

Hermes Compatibility

Is Hermes a Different System?

Yes. Hermes Agent is a completely separate AI agent framework built by Nous Research, released in February 2026. It is written in Python (vs. OpenClaw's Node.js/TypeScript) and focuses on self-improving agents that learn from their own task execution. (Source: hermes-agent.nousresearch.com, NousResearch GitHub)

| Dimension | OpenClaw | Hermes Agent |
| --- | --- | --- |
| Developer | Community Foundation (ex-Steinberger) | Nous Research |
| Language | Node.js / TypeScript | Python |
| Philosophy | Gateway/orchestrator | Self-improving agent |
| Plugin format | openclaw.plugin.json + TypeScript modules | plugin.yaml + Python modules + lifecycle hooks |
| Plugin registry | ClawHub marketplace | agentskills.io + pip distribution |
| Skills | Human-authored SKILL.md | Auto-generated from successful workflows + SKILL.md |
| Built-in tools | Varies by plugins | 40+ built-in tools |
| Security | File-based auth (⚠️ security concerns) | Hardened container isolation, zero telemetry, OAuth 2.1 |
| Learning | Static — no self-improvement | Continuous via replay, RL, skill extraction |
| Model support | Via provider plugins | 200+ LLM models |

(Source: vibesparking.com comparison, petronellatech.com)

Can They Share Ollama?

Yes. Both OpenClaw and Hermes can connect to the same Ollama instance running on the Mac Mini. Ollama handles concurrent requests natively, so both frameworks can query models simultaneously without conflict. They run as separate processes with separate memory/context — no shared state.

Plugin Cross-Compatibility

Plugins are NOT cross-compatible at runtime. OpenClaw's TypeScript plugin modules will not run in Hermes's Python environment, and vice versa.

What IS portable:

  • SKILL.md format — both projects use Markdown-based skill specifications, enabling partial skill portability
  • Model provider configurations — API keys and Ollama endpoints work with both
  • Migration tools — Hermes provides hermes claw migrate to import OpenClaw settings, memories, skills, and API keys. Markdown-defined skills transfer cleanly; complex TypeScript plugins require manual Python rewrites. (Source: Hermes migration docs)
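
Since both projects read Markdown skill files, the portable core is just frontmatter plus body. A sketch of splitting a SKILL.md into those parts (the exact frontmatter keys each framework expects are assumptions here):

```python
def parse_skill(text: str) -> tuple[dict, str]:
    """Split a SKILL.md into (frontmatter dict, markdown body).
    Frontmatter is the leading '---'-delimited block of 'key: value' lines."""
    meta: dict[str, str] = {}
    body = text
    if text.startswith("---"):
        header, _, body = text[3:].partition("\n---")
        for line in header.strip().splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, body.strip()

skill = """---
name: summarize-notes
model: ollama/llama3.3
---
Summarize the user's notes in three bullet points."""

meta, body = parse_skill(skill)
print(meta["name"])  # summarize-notes
```

Anything beyond this — TypeScript tool code, lifecycle hooks — is what hermes claw migrate cannot translate automatically.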

Running Both Together

Many advanced users run both:

  • OpenClaw as the messaging gateway and workflow orchestrator
  • Hermes as a specialist agent that learns and improves at specific tasks
  • Both connect to the same local Ollama instance
  • They maintain completely separate runtime environments, memory, and context

Recommended Plugin Stack for Guillaume

Priority 0 — Install First

| Plugin | Why |
| --- | --- |
| @openclaw/ollama-provider (bundled) | Core local inference — connects to all local models on the Mac Mini |
| GitHub Copilot provider (bundled, enabled by default) | $0 access to GPT-4.1, GPT-5 family, Claude, and Gemini via existing subscription |
| memory-core (bundled) | Local persistent memory for agent context |

Priority 1 — Recommended

| Plugin | Why |
| --- | --- |
| OpenAI provider (bundled) | Leverages existing Codex subscription for direct OpenAI API access |
| @openclaw/google-plugin (bundled) | Multimodal capabilities — image/video/music generation, Gemini LLM, web search grounding. Free tier available |

Priority 2 — Optional

| Plugin | Why |
| --- | --- |
| @openclaw/perplexity-plugin | Search-grounded answers when local models lack web context. Requires $20/mo subscription |
| vLLM provider (bundled) | Higher-performance local inference for multi-agent workloads |
| LanceDB memory plugin | Local vector memory stored on disk — runs entirely locally |

Not Needed

| Plugin | Why Skip |
| --- | --- |
| @openclaw/huggingface-provider | Cloud API only — use Ollama for local HF models |
| @openclaw/minimax-provider | Chinese cloud AI — irrelevant for Guillaume's use case |
| @openclaw/moonshot-provider | Chinese cloud AI — irrelevant for Guillaume's use case |

Security Recommendations

Given the documented security crisis with OpenClaw community plugins:

  1. Only install bundled/official plugins from the OpenClaw monorepo
  2. Never install unvetted community plugins from ClawHub without thorough code review
  3. Run openclaw security audit --deep after any plugin installation
  4. Rotate all API keys if you suspect compromise
  5. Keep OpenClaw updated to the latest version for security patches

(Source: OpenClaw Security Guide, OpenClaw Security Crisis)


Sources

Official Documentation & Source Code

  1. OpenClaw Plugin Docs — https://docs.openclaw.ai/tools/plugin
  2. OpenClaw Plugin Architecture (DeepWiki) — https://deepwiki.com/openclaw/openclaw/5.1-plugin-architecture
  3. OpenClaw GitHub Repository — https://github.com/openclaw/openclaw (~250K stars)
  4. Hermes Agent (Nous Research) — https://hermes-agent.nousresearch.com/
  5. Hermes Agent GitHub — https://github.com/NousResearch/hermes-agent
  6. GitHub Copilot Supported Models — https://docs.github.com/en/copilot/reference/ai-models/supported-models
  7. GitHub Copilot Plans & Pricing — https://github.com/features/copilot/plans

Security Sources

  1. OpenClaw Security Crisis Report — https://openclaw.nasseroumer.com/blog/openclaw-security-crisis-2026/
  2. 800+ Malicious Plugins Alert — https://eastondev.com/blog/en/posts/ai/20260227-openclaw-security-alert/
  3. OpenClaw Security Guide (CVEs) — https://www.bitdoze.com/openclaw-security-guide/
  4. OpenClaw Security (Adversa AI) — https://adversa.ai/blog/openclaw-security-101-vulnerabilities-hardening-2026/

Comparison & Analysis

  1. Verdent.ai: Claw Code vs Claude Code vs OpenClaw — https://www.verdent.ai/guides/claw-code-claude-code-vs-openclaw
  2. OpenClaw vs Hermes Agent Deep Comparison — https://www.vibesparking.com/en/blog/ai/openclaw/2026-04-09-openclaw-vs-hermes-agent-deep-comparison/
  3. Hermes Migration from OpenClaw — https://hermes-agent.nousresearch.com/docs/guides/migrate-from-openclaw/
  4. OpenClaw Provider Architecture Blog — https://cheesecat.net/blog/2026-03-14-provider-architecture/
  5. OpenClaw China Providers — https://openclawcn.com/en/docs/providers/china/
  6. GPT-4o Deprecation in Copilot — https://github.blog/changelog/2026-01-13-upcoming-deprecation-of-select-github-copilot-models-from-claude-and-openai/

Pricing Sources

  1. Perplexity Pricing 2026 — https://screenapp.io/blog/perplexity-pricing
  2. Google Gemini API Free Tier — https://findskill.ai/blog/gemini-api-pricing-guide/
  3. Anthropic OAuth Block (April 4, 2026) — Confirmed via dev.to, TechCrunch, aitoolsrecap.com

Final document compiled April 15, 2026. All reviewer corrections incorporated. All claims sourced.