PII Shield
Anonymize documents before any LLM sees them. Restore real data after analysis.
Built for Claude Desktop. No coding required.
MCP plugin for Claude Desktop. CLI for any other LLM. Same engine, same session, same disk. Anonymize on one surface, deanonymize on the other — sessions are portable.
Your data stays on your computer. Claude only sees placeholders.
Pipeline: local → API → local
The document is anonymized locally; only placeholder text reaches the LLM (Claude via the MCP plugin, or any other model via the CLI — ChatGPT, Gemini, local Llama, internal gateway). When the model is done, PII is stitched back in — without ever leaving your disk.
Document → Shield: anonymize_file({ file_path, session_id? }) → { session_id, entity_count, output_path, … }
Shield → Claude API: Claude reads output_path from the connected folder. Only placeholders cross the wire.
Claude API → Shield: model output flows back as text — still placeholder-only.
Shield → Result: deanonymize_docx({ file_path, session_id }) or deanonymize_text({ text, session_id }) stitches the real PII back in, locally.
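The round trip above can be illustrated with a minimal sketch. This is not PII Shield's engine or its session format: the `Entity` and `Session` types and the `anonymize` / `deanonymize` functions here are hypothetical, and the entity list is supplied by hand rather than detected by a model. It only shows the idea that the placeholder-to-value mapping stays local while the placeholder text is what travels.

```typescript
// Hypothetical sketch: not PII Shield's actual engine, API, or session format.
type Entity = { text: string; type: string };
// Each pair is [placeholder, original value]; it would stay on the local disk.
type Session = { pairs: Array<[string, string]> };

function anonymize(text: string, entities: Entity[]): { output: string; session: Session } {
  const pairs: Array<[string, string]> = [];
  const counters: Record<string, number> = {};
  let output = text;
  // Replace longer values first so a short value cannot clip a longer one.
  const ordered = entities.slice().sort((a, b) => b.text.length - a.text.length);
  for (const entity of ordered) {
    // Identical values share one placeholder.
    if (pairs.some((pair) => pair[1] === entity.text)) continue;
    const n = (counters[entity.type] || 0) + 1;
    counters[entity.type] = n;
    const placeholder = `<${entity.type}_${n}>`;
    pairs.push([placeholder, entity.text]);
    output = output.split(entity.text).join(placeholder);
  }
  return { output, session: { pairs } };
}

function deanonymize(text: string, session: Session): string {
  let output = text;
  for (const pair of session.pairs) {
    output = output.split(pair[0]).join(pair[1]);
  }
  return output;
}

// Usage: only `anonymized.output` would be shown to a model.
const original = "Jane Doe signed for Acme Corp. Jane Doe is the CEO.";
const anonymized = anonymize(original, [
  { text: "Jane Doe", type: "PERSON" },
  { text: "Acme Corp", type: "ORG" },
]);
const restored = deanonymize(anonymized.output, anonymized.session);
```

Keeping the mapping in a separate session object is what lets the restore step run later, on model output that still contains only placeholders.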
The problem with "just use a model"
Pasting client material into a consumer LLM is both a privacy and a privilege problem.
A federal court has already confirmed the privilege side directly: in United States v. Heppner (SDNY, February 2026), documents a defendant created with consumer Claude were ruled outside attorney-client privilege. An AI tool is not an attorney and owes no duty of confidentiality. The same principle generalizes to ChatGPT, Gemini, and every other consumer LLM.
PII Shield runs the redaction step before any model sees anything. Two surfaces: an MCP plugin for Claude Desktop (drop a file, Claude receives placeholders like <PERSON_1>, real values stitched back on save) and a CLI for any other LLM (anonymize → paste into ChatGPT, Gemini, local Llama, internal gateway → deanonymize on return). Same engine, same on-disk session format. The original text never leaves your disk.
v1 was Python + Presidio + spaCy — every install fought Claude Desktop's bundled runtime. v2.1.0 is a complete Node.js rewrite over onnxruntime-node + @xenova/transformers: same detection coverage, no Python.
Plugin: 700 KB on Windows/Linux (host Node), or 83 MB on macOS (bundled Node 24.15.0). The 634 MB gliner-pii-base-v1.0 model installs on-demand via an in-chat panel — never bundled in the artefact. CI runs the matrix on Ubuntu/Windows/macOS × Node 22/24.
Both v1.0.0 and v2.1.0 release tags live in gregmos/PII-Shield — the v1 Python build remains downloadable for legacy installs.
Six legal-document workflows out of the box
The companion pii-contract-analyze skill ships six task-shaped modes. Each runs the same anonymize → analyze → deanonymize pipeline, just optimized for a different deliverable.
Four steps. No terminal.
Download two files, drop them into Claude Desktop, connect a folder. The 634 MB GLiNER model installs itself in-chat on the first anonymization — no PowerShell, no bash, no scripts.
Three commands. Same engine.
For terminal users, scripting, CI gates, and round-trips through any LLM. Sessions, mappings, and the model live in ~/.pii_shield/ — shared with the MCP plugin.
Not sure? → MCP, CLI, or both?
.mcpb for your OS plus the contract-analyze skill. Direct links to the v2.1.0 release ↗.
Settings → Extensions → Advanced Settings → Install extension and select your .mcpb. On first call, PII Shield runs npm ci --ignore-scripts to install pinned runtime deps (onnxruntime-node, @xenova/transformers, gliner) into ~/.pii_shield/deps/. This takes 2–3 minutes once per machine and is instant thereafter.
Customize → Skills → + → Upload a skill and select pii-contract-analyze.skill. The skill orchestrates the full anonymize → review → analyze → deanonymize flow without you spelling out each step.
pii-contract-analyze skill, connect a folder with your documents, then tell Claude what you need.
you > analyze risks for the purchaser in contract.pdf
and prepare a short memo
[Skill: pii-contract-analyze · Folder: ~/Documents/contracts]
claude > // calls anonymize_file → sees <PERSON_1>, <ORG_1>…
// HITL review — you confirm/edit detected entities
// runs MEMO mode, drafts the memo
// calls deanonymize_docx on the output
here's contract-risks.memo.docx
~/.pii_shield/models/). Subsequent runs skip the panel entirely.
node -v).
npm install -g pii-shield
pii-shield --version
npm uninstall; lives at ~/.pii_shield/models/.
pii-shield install-model
# add --yes for non-interactive (CI)
anonymize or scan takes ~1–2 min while the engine deps install (deterministic, cached). Subsequent runs are instant.
pii-shield doctor # green checks across the board
pii-shield anonymize contract.pdf --no-review
# → contract_anonymized.txt with <PERSON_1>, <ORG_1>, …
# → Session: 2026-04-29_120000_ab12
anonymize or scan call, the engine installs onnxruntime-node, @xenova/transformers, and gliner at pinned versions into ~/.pii_shield/deps/installs/<hash>/. About 300 MB, 1–2 min, deterministic. Everything (model, deps, mappings, audit logs) survives npm uninstall -g pii-shield.
Full reference for every flag, env var, and exit code: → CLI command reference
npm ci --ignore-scripts --legacy-peer-deps
npm run build # MCP server (esbuild bundle)
npm run build:cli # CLI binary
npm run smoke # node scripts/smoke-protocol.mjs
npm test # 8 suites, incl. multi-doc HITL & session archival
CI runs the same matrix on Ubuntu / Windows / macOS × Node 22 / 24.
→ Full install reference (configuration, paths, advanced flags)
One engine. Two front-ends.
Same 33 entity types, same session format on disk, same ~/.pii_shield/. Pick the front-end that matches how you work — sessions exported from one open in the other.
pii-shield verify ./out as a compliance gate; round-trip via terminal.
sessions export → transfer → MCP import_session.
What's in the box
anonymize_file reads the document on your machine and returns only a file path + session id. Claude reads the anonymized file from disk, so PII never enters an API request.
gliner-pii-base-v1.0 over onnxruntime-node. Handles ALL-CAPS, domain-specific names, multilingual text. No Python, no PyTorch.
<ORG_1>, "Acme Corp." → <ORG_1a>, "Acme Corporation" → <ORG_1b>. One canonical form; every variant maps back correctly on restore.
.docx carries its session_id in Word custom properties. Weeks later, in a brand-new chat, drop the file in and PII is restored from the embedded id.
session_id; identical entities share placeholders across files. One deanonymize call restores PII everywhere.
export_session(passphrase) packs the mapping + anonymized documents into an encrypted .pii-session archive (AES-GCM via scrypt). Colleague runs import_session. PII never transits.
~/.pii_shield/audit/. NER bootstrap, session lifecycle, dropped stderr: all on disk, appendable, off-network.
Every flag. Every command. Every env var.
Configuration, full command reference for both surfaces, workflows, HITL walkthrough, Python integration, troubleshooting — on a separate docs page so this one stays readable.
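The "AES-GCM via scrypt" wording used for the encrypted session archive can be sketched with Node's built-in crypto module. This is an illustration of the general technique only: the `sealSession` and `openSession` names, the key and salt sizes, and the byte layout (salt, IV, auth tag, ciphertext) are assumptions for this example and are not taken from the actual .pii-session format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical layout, assumed for this sketch only:
// salt (16 bytes) | iv (12 bytes) | auth tag (16 bytes) | ciphertext
function sealSession(plaintext: Buffer, passphrase: string): Buffer {
  const salt = randomBytes(16);
  // scrypt stretches the passphrase into a 256-bit key.
  const key = scryptSync(passphrase, salt, 32);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]);
}

function openSession(archive: Buffer, passphrase: string): Buffer {
  const salt = archive.subarray(0, 16);
  const iv = archive.subarray(16, 28);
  const tag = archive.subarray(28, 44);
  const ciphertext = archive.subarray(44);
  const key = scryptSync(passphrase, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the passphrase is wrong or the archive was modified.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Usage: seal a mapping, then open it again with the same passphrase.
const mappingJson = Buffer.from('{"<PERSON_1>":"Jane Doe"}', "utf8");
const sealed = sealSession(mappingJson, "correct horse");
const opened = openSession(sealed, "correct horse").toString("utf8");
```

Because GCM is authenticated, a wrong passphrase or a tampered archive fails loudly at decryption instead of yielding garbage, which is the property one would want before restoring PII from an imported mapping.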