WASM Plugins
LLMProxy supports WebAssembly plugins via the Extism SDK, enabling Rust, Go, and C plugins to run in a memory-safe sandbox.
Why WASM?
- Memory-safe sandboxing: WASM plugins run in an isolated VM — crashes cannot affect the Python process
- Language freedom: Write plugins in Rust, Go, C, or any language that compiles to WASM
- Same guarantees: Timeout enforcement, per-plugin metrics, and fail policies apply identically to WASM and Python plugins
Prerequisites
bash
pip install extismIf the extism python package or the underlying system shared library libextism is missing or incompatible, WASM plugins are skipped to prevent crashes, but LLMProxy logs clear diagnostic warnings/errors at startup detailing the root cause to assist operators in troubleshooting.
JSON I/O Protocol
WASM plugins communicate via JSON:
Input:
json
{
"body": { "messages": [...] },
"metadata": {},
"session_id": "abc123",
"config": { "my_setting": "value" }
}Output:
json
{
"action": "ALLOW",
"body": { "messages": [...] },
"status_code": 200,
"message": "Processed"
}Actions
| WASM Action | Maps To | Effect |
|---|---|---|
ALLOW | passthrough | Let request continue |
BLOCK | block | Stop chain, return error |
MODIFIED | modify | Body was mutated, continue |
Rust Plugin Template
rust
// lib.rs
use extism_pdk::*;
use serde::{Deserialize, Serialize};
#[derive(Deserialize)]
struct PluginInput {
body: serde_json::Value,
metadata: serde_json::Value,
session_id: String,
config: serde_json::Value,
}
#[derive(Serialize)]
struct PluginOutput {
action: String,
body: serde_json::Value,
status_code: u32,
message: String,
}
#[plugin_fn]
pub fn execute(input: String) -> FnResult<String> {
let ctx: PluginInput = serde_json::from_str(&input)?;
let output = PluginOutput {
action: "ALLOW".to_string(),
body: ctx.body,
status_code: 200,
message: "Processed by WASM plugin".to_string(),
};
Ok(serde_json::to_string(&output)?)
}Cargo.toml
toml
[package]
name = "my-wasm-plugin"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[dependencies]
extism-pdk = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"Build
bash
cargo build --release --target wasm32-wasi
# Output: target/wasm32-wasi/release/my_wasm_plugin.wasmExecution Model
- All WASM calls run via
asyncio.to_thread(), releasing the GIL - The event loop stays free for other requests
- Timeout enforcement applies identically to Python plugins
- Non-blocking by design
Registration
Register WASM plugins in manifest.yaml:
yaml
- name: "My WASM Plugin"
hook: "pre_flight"
priority: 25
enabled: true
type: "wasm"
entrypoint: "plugins/wasm/my_plugin.wasm"
version: "0.1.0"
timeout_ms: 100
config:
my_setting: "value"