pakka. v0.19.0 · shipped

pakka / compress / the diet

compress.

Fewer tokens, in and out. Without losing the load-bearing context.

What it does

Token diet, semantically aware.

Long sessions accrete weight: stale tool output, half-typed plans, repeated file dumps. Compress walks the conversation and rewrites it down to the parts the model actually needs to keep going.

in detail

Compress is a turn-level pass. It distinguishes load-bearing spans (decisions, current diff, open spec) from scaffolding (raw file dumps, tool retries, model thinking out loud). Scaffolding gets distilled. Decisions get preserved verbatim and indexed in recall.

A 14-turn session in super-ultra rides at 30–40% of the unmodified token cost, with no measurable drop in code quality.

/01

Lossless on decisions.

Anything that affected a diff is preserved verbatim and re-anchored into recall — never paraphrased away.

/02

Distilled on scaffolding.

Tool retries, raw file dumps, the model's thinking aloud — collapsed into a one-line summary the agent can still reference.

/03

Mode-driven aggressiveness.

lite trims, strict folds more aggressively, ultra condenses prior turns, super-ultra re-summarises and extracts decisions.

/04

Reversible, always.

The full transcript is kept on disk in .pakka/. Restore any turn from the raw JSONL store — useful for debugging or audit.

Modes

Four dials. Pick once.

Mode is a session-level knob. Default is super-ultra. Switch with /pakka:compress <level>.

/ 01 lite

Minimal compression. Trims obvious noise — repeated preambles, stale greetings. Good for very short sessions where you want near-zero behavioural change.

token reduction~27%

per 1M out~$4.05

quality+ within noise

/ 02 strict

Trims redundant tool output and stale file reads. The conservative default for short sessions where you want zero behavioural change.

token reduction~33%

per 1M out~$4.95

qualityidentical

/ 03 ultra

Adds turn-level summarization for prior tool calls and folds repeated file reads. Recommended for sessions over six turns.

token reduction~55%

per 1M out~$8.25

quality+ within noise

/ 04 · default super-ultra

Aggressive: prior-turn re-summarization, decision-grade extraction to recall, scoped tool-output retention. The shipping default.

token reduction~66%

per 1M out~$9.90

quality+ within noise

One example

Same turn. Different weight.

A real turn from the rate-limit engagement, captured before/after compress. Tool output trimmed; decisions preserved verbatim and indexed.

Without compressturn 6

Input tokens14,820

Output tokens3,142

File-read repeats5

Tool output keptall

Spend$0.064

→

With compresssuper-ultra

Input tokens5,210 ↓ 65%

Output tokens1,287

File-read repeats0 (folded)

Tool output keptscoped

Spend$0.022

next primitive gate.

Exit-code evidence before "done." Reviewer + security + architect agents, anchored to the spec.

→ read pairs with recall.

Compress hands decisions to recall. Ask weeks later why you dropped Redis — get the original turn back.

→ read