Introduction

l0-cache is a lightweight CLI proxy that sits between your AI coding assistant and shell commands. It filters, truncates, and compresses command output to reduce token consumption by 50-80%.

Why

AI coding assistants (Claude Code, Gemini CLI, Cursor) read the full output of every shell command they execute. Most of this output is noise:

Hundreds of passing test lines when only the failures matter
ANSI color codes that consume tokens but carry no semantic value
Repeated identical lines from build tools
Blank lines that waste token budget

The useful information is almost always concentrated in the first few lines (headers, command echo) and the last few lines (errors, summary, exit status).

How

Instead of running:

cargo test

Run:

l0-cache cargo test

The output is filtered through a streaming pipeline:

ANSI stripping -- removes color codes, cursor movement, etc.
Identical line collapsing -- repeated lines become (×N)
Blank line squeezing -- consecutive blanks reduced to one
Head/tail buffering -- keeps first 30 and last 30 lines, discards the middle

The result is a compact output that preserves the information an LLM needs to understand what happened.

Design Principles

Universal: works with any command, no per-tool parsers
Unintrusive: the binary never rewrites your shell, aliases, or $PATH -- you invoke it explicitly. An optional, off-by-default Claude Code hook can do the prefixing for you, and is just as conservative about what it touches.
Safe: never modifies the child command's behavior
Lightweight: single Rust binary, zero runtime dependencies
Observable: every invocation logs metrics for analysis

Introduction ​

Why ​

How ​

Design Principles ​

Introduction

Why

How

Design Principles