ai-summary

Web search & summarization CLI for AI coding agents

Reduces token consumption by compressing web content through local LLMs before feeding it to Claude Code.

cargo install ai-summary

How It Works

Pipeline:

Web Search → Fetch Pages → Readability Extract → LLM Summary → Compressed output (60–98% smaller)

Instead of sending a raw 50K+ token page to Claude, ai-summary returns a focused 1–4K-token summary.
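As a quick sanity check on those numbers, the reduction for a 50K-token page summarized down to between 1K and 4K tokens works out to:

```shell
# Percent reduction when a 50K-token page becomes a 4K or a 1K summary
awk 'BEGIN { printf "%.0f%% to %.0f%%\n", (1 - 4000/50000) * 100, (1 - 1000/50000) * 100 }'
# → 92% to 98%
```

That sits inside the 60–98% range quoted above; smaller pages (see the Wikipedia row in the benchmarks) compress less, large raw pages more.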

Features

Quick Start

ai-summary "what is the latest Rust version"
ai-summary fetch https://example.com/article -p "key points"
echo "large text" | ai-summary compress -m 4000
ai-summary stats

Subcommands

search (default)   ai-summary "<query>" searches the web and summarizes the results
fetch              summarize a single URL, optionally guided by -p "<prompt>"
compress           compress stdin text without an LLM
stats              show cumulative token savings

Flags: --deep, --raw, --json, --browser, --cf, --api-url, --api-key, --model

Claude Code Integration

Hooks
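A minimal sketch of wiring ai-summary into Claude Code's hook system via ~/.claude/settings.json. The PostToolUse event, the WebFetch matcher, and whether a hook can rewrite the tool's output are assumptions to check against your Claude Code version; the command itself is the compress mode shown in Quick Start:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "WebFetch",
        "hooks": [
          { "type": "command", "command": "ai-summary compress -m 4000" }
        ]
      }
    ]
  }
}
```

The "hook-bash (auto)" row in the By Mode stats comes from this kind of automatic invocation.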

Benchmarks (v1.2.1)

Real-world evaluation run on 2026-03-14. All tests use default config with free LLM backends.

Scenario               Input                     Output                  Compression  Time
Search (Gemini CLI)    ~12K tokens (raw pages)   ~700 tokens (summary)   94%          28s
Fetch (docs.rs/tokio)  ~4.1K tokens              ~445 tokens             89%          10s
Fetch (Wikipedia)      ~1K tokens                ~445 tokens             57%          88s
Compress 5K chars      5,085 chars               2,001 chars             61%          0.02s
Compress 20K chars     20,585 chars              2,001 chars             90%          0.02s
Deep Search (5 pages)  ~15K tokens               ~670 tokens             96%          31s

Cumulative Stats (41 queries)

Metric                       Value
Total tokens saved           67,929
Overall compression          77.6%
Estimated Claude cost saved  $0.20 (at $3/M input tokens)
LLM cost                     $0.02 (mostly free backends)
ROI                          10x return

By Mode

Mode                 Queries  Tokens Saved  Avg Compression
gemini-cli (search)  14       39,385        88%
fetch (single URL)   13       10,576        71%
compress (no LLM)    5        10,039        90%
gemini API           1        3,265         93%
stdin pipe           5        1,672         71%
hook-bash (auto)     2        867           30%

Compress mode uses no LLM: it is pure text extraction at near-instant speed. Search and fetch use free LLM backends (opencode, Gemini CLI).

Installation

From crates.io

cargo install ai-summary

From source

git clone https://github.com/sunoj/ai-summary
cd ai-summary
cargo install --path .

Releases: GitHub Releases