PII Shield

Anonymize documents before any LLM sees them. Restore real data after analysis.

Built for Claude Desktop. No coding required.

surfaces

MCP plugin for Claude Desktop. CLI for any other LLM. Same engine, same session, same disk. Anonymize on one surface, deanonymize on the other — sessions are portable.

v2.1.0 · Live · MCP server · CLI · Node.js 22+ · Windows · macOS · Linux · MIT

Install from GitHub · Documentation · See how it works

$ npm install -g pii-shield

§ 01 / How it flows

Your data stays on your computer. Claude only sees placeholders.

Pipeline: local → API → local

The document is anonymized locally; only placeholder text reaches the LLM (Claude via the MCP plugin, or any other model via the CLI — ChatGPT, Gemini, local Llama, internal gateway). When the model is done, PII is stitched back in — without ever leaving your disk.

Document (pdf · docx · txt) · has PII
→ PII Shield · anonymize · local
→ LLM API (claude · gpt · gemini · llama) · analyze placeholders · remote
→ PII Shield · restore · local
→ Result · analysis with real names · has PII

PII never leaves your machine. Anonymization and restoration both happen locally.

Original on disk

John Smith signed the NDA.

Acme Corp. v. Widget Inc.

Contact: john@acme.com

Amount: €120,000 via IBAN CY17…

UK NIN: AB123456C · DOB 1984-06-12

What the LLM sees over the API

<PERSON_1> signed the NDA.

<ORG_1> v. <ORG_2>.

Contact: <EMAIL_1>

Amount: <MONEY_1> via IBAN <IBAN_1>

UK NIN: <UK_NIN_1> · DOB <DATE_1>
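
The before/after panels above can be read as a simple substitution with a reversible mapping. The sketch below is only an illustration of that idea: the `Entity` type and the `anonymize` and `restore` helpers are hypothetical names, the entity list is supplied by hand instead of coming from the GLiNER detector, and none of this is PII Shield's actual implementation.

```typescript
// Hypothetical helpers for illustration only; not PII Shield's real API.
type Entity = { type: string; text: string };
type Mapping = Map<string, string>; // placeholder -> original value

function anonymize(text: string, entities: Entity[]): { text: string; mapping: Mapping } {
  const mapping: Mapping = new Map();
  const byValue = new Map<string, string>(); // original value -> placeholder
  const counters = new Map<string, number>(); // per-type running index

  // Number placeholders in the order entities are supplied.
  for (const e of entities) {
    if (byValue.has(e.text)) continue;
    const n = (counters.get(e.type) ?? 0) + 1;
    counters.set(e.type, n);
    const placeholder = `<${e.type}_${n}>`;
    byValue.set(e.text, placeholder);
    mapping.set(placeholder, e.text);
  }

  // Replace longer values first so a short value never clobbers part of a longer one.
  const values = Array.from(byValue.keys()).sort((a, b) => b.length - a.length);
  let out = text;
  for (const value of values) {
    out = out.split(value).join(byValue.get(value)!);
  }
  return { text: out, mapping };
}

function restore(text: string, mapping: Mapping): string {
  // Unknown placeholders are left untouched.
  return text.replace(/<[A-Z_]+_\d+[a-z]?>/g, (ph) => mapping.get(ph) ?? ph);
}
```

Only the output of `anonymize` would be sent to a model; the mapping stays local and is applied to whatever text comes back.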

MCP call sequence

Document → Shield: anonymize_file({ file_path, session_id? }) → { session_id, entity_count, output_path, … }

Shield → Claude API: Claude reads output_path from the connected folder. Only placeholders cross the wire.

Claude API → Shield: model output flows back as text — still placeholder-only.

Shield → Result: deanonymize_docx({ file_path, session_id }) or deanonymize_text({ text, session_id }) stitches the real PII back in, locally.
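
The call sequence above can be sketched as typed shapes plus a small orchestration function. This is a rough sketch under assumptions: the `Shield` interface, `runPipeline`, and the `readFile` and `model` callbacks are hypothetical names, calls are written synchronously for brevity (real MCP tool calls are asynchronous), the string return type of `deanonymize_text` is assumed, and result fields hidden behind "…" in the sequence are deliberately left out.

```typescript
// Argument and result shapes as named in the call sequence; elided fields are omitted.
interface AnonymizeFileArgs { file_path: string; session_id?: string }
interface AnonymizeFileResult { session_id: string; entity_count: number; output_path: string }
interface DeanonymizeTextArgs { text: string; session_id: string }

// Hypothetical local stand-in for the MCP server (synchronous for brevity).
interface Shield {
  anonymize_file(args: AnonymizeFileArgs): AnonymizeFileResult;
  deanonymize_text(args: DeanonymizeTextArgs): string; // return type is an assumption
}

function runPipeline(
  shield: Shield,
  readFile: (path: string) => string, // reads from the connected folder
  model: (prompt: string) => string,  // any LLM call
  filePath: string
): string {
  // Step 1: anonymize locally; only a path and a session id come back.
  const { session_id, output_path } = shield.anonymize_file({ file_path: filePath });
  // Steps 2 and 3: the model reads the anonymized file and replies with placeholders.
  const reply = model(readFile(output_path));
  // Step 4: real values are restored locally.
  return shield.deanonymize_text({ text: reply, session_id });
}
```

The point of the shape is that `model` is only ever handed the contents of `output_path`, never the original file.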

§ 02 / Why this exists

The problem with "just use a model"

Pasting client material into a consumer LLM is both a privacy and a privilege problem.

A federal court has already confirmed the privilege side directly: in United States v. Heppner (SDNY, February 2026), documents a defendant created with consumer Claude were ruled outside attorney-client privilege. An AI tool is not an attorney and owes no duty of confidentiality. The same principle generalizes to ChatGPT, Gemini, and every other consumer LLM.

PII Shield runs the redaction step before any model sees anything. Two surfaces: an MCP plugin for Claude Desktop (drop a file, Claude receives placeholders like <PERSON_1>, real values stitched back on save) and a CLI for any other LLM (anonymize → paste into ChatGPT, Gemini, local Llama, internal gateway → deanonymize on return). Same engine, same on-disk session format. The original text never leaves your disk.

v1 → v2 · the rewrite

v1 was Python + Presidio + spaCy — every install fought Claude Desktop's bundled runtime. v2.1.0 is a complete Node.js rewrite over onnxruntime-node + @xenova/transformers: same detection coverage, no Python.

Plugin: 700 KB on Windows/Linux (host Node), or 83 MB on macOS (bundled Node 24.15.0). The 634 MB gliner-pii-base-v1.0 model installs on-demand via an in-chat panel — never bundled in the artefact. CI runs the matrix on Ubuntu/Windows/macOS × Node 22/24.

Both v1.0.0 and v2.1.0 release tags live in gregmos/PII-Shield — the v1 Python build remains downloadable for legacy installs.

§ 03 / Skill modes

Six legal-document workflows out of the box

The companion pii-contract-analyze skill ships six task-shaped modes. Each runs the same anonymize → analyze → deanonymize pipeline, just optimized for a different deliverable.

MEMO

Risk analysis & legal memorandum.

"Review this NDA for red-flag clauses."

REDLINE

Tracked-changes contract markup with safer language.

"Redline clauses 4.2 and 7.1."

SUMMARY

Brief plain-English overview, key terms surfaced.

"Summarize this DPA in 200 words."

COMPARISON

Diff two documents — obligations and clauses, not whitespace.

"Compare v1 and v2 of this MSA."

BULK

Up to 5 files in one shared session — placeholders consistent across docs.

"Anonymize all 5 NDAs as one batch."

ANONYMIZE

Output a redacted file, no analysis. For external sharing.

"Just anonymize, don't analyze."

§ 04 / Quick start

MCP plugin: four steps, no terminal.

Download two files, drop them into Claude Desktop, connect a folder. The 634 MB GLiNER model installs itself in-chat on the first anonymization; no PowerShell, bash, or scripts are needed.

CLI: three commands, same engine.

For terminal users, scripting, CI gates, and round-trips through any LLM. Sessions, mappings, and the model live in ~/.pii_shield/, shared with the MCP plugin. The CLI steps come after the MCP steps below.

Not sure? → MCP, CLI, or both?

Download the artefacts

Pick the .mcpb for your OS plus the contract-analyze skill. Direct links to the v2.1.0 release ↗.

WIN · LINUX: pii-shield-v2.1.0-windows-linux.mcpb (700 KB, host Node)

MACOS: pii-shield-v2.1.0-macos.mcpb (83 MB, bundled Node 24)

SKILL · ANY OS: pii-contract-analyze.skill (25 KB, contract analysis)

Install the MCP extension

In Claude Desktop: Settings → Extensions → Advanced Settings → Install extension and select your .mcpb. On first call, PII Shield runs npm ci --ignore-scripts to install pinned runtime deps (onnxruntime-node, @xenova/transformers, gliner) into ~/.pii_shield/deps/ — 2–3 minutes once per machine, instant thereafter.

Upload the skill

In Claude Desktop: Customize → Skills → + → Upload a skill and select pii-contract-analyze.skill. The skill orchestrates the full anonymize → review → analyze → deanonymize flow without you spelling out each step.

Use it from a chat

Start a new conversation, pick the pii-contract-analyze skill, connect a folder with your documents, then tell Claude what you need.

you    > analyze risks for the purchaser in contract.pdf
         and prepare a short memo
         [Skill: pii-contract-analyze · Folder: ~/Documents/contracts]
claude > // calls anonymize_file → sees <PERSON_1>, <ORG_1>…
         // HITL review — you confirm/edit detected entities
         // runs MEMO mode, drafts the memo
         // calls deanonymize_docx on the output
         here's contract-risks.memo.docx

First-run model install — handled in-chat

The first time you ask Claude to anonymize, PII Shield notices the GLiNER model isn't on disk and opens an in-chat install panel with two buttons: Download model (~634 MB, fetched by your browser — no SmartScreen or Gatekeeper issues with unsigned scripts) and Install downloaded ZIP (PII Shield finds it in Downloads / Desktop / Documents, validates, atomic-extracts into ~/.pii_shield/models/). Subsequent runs skip the panel entirely.

Connect a folder. Don't drag-attach.

Use Claude Desktop → Settings → Connected folders to grant access. If you drag-attach a file directly into the chat, the raw document hits the API before PII Shield can intercept — and that defeats the entire design. Connected folders are read by the local plugin first.

Install the npm package

Node.js 22 or newer required (node -v).

npm install -g pii-shield
pii-shield --version

Download the GLiNER model

One-off, ~634 MB. Survives npm uninstall; lives at ~/.pii_shield/models/.

pii-shield install-model
# add --yes for non-interactive (CI)

Health-check, then run

First anonymize or scan takes ~1–2 min while the engine deps install (deterministic, cached). Subsequent runs are instant.

pii-shield doctor          # green checks across the board
pii-shield anonymize contract.pdf --no-review
# → contract_anonymized.txt with <PERSON_1>, <ORG_1>, …
# → Session: 2026-04-29_120000_ab12

First-run engine deps

On the first anonymize or scan call, the engine installs onnxruntime-node, @xenova/transformers, and gliner at pinned versions into ~/.pii_shield/deps/installs/<hash>/. About 300 MB, 1–2 min, deterministic. Everything (model, deps, mappings, audit logs) survives npm uninstall -g pii-shield.

Full reference for every flag, env var, and exit code: → CLI command reference

Build & verify locally (from source)

npm ci --ignore-scripts --legacy-peer-deps
npm run build              # MCP server (esbuild bundle)
npm run build:cli          # CLI binary
npm run smoke              # node scripts/smoke-protocol.mjs
npm test                   # 8 suites, incl. multi-doc HITL & session archival

CI runs the same matrix on Ubuntu / Windows / macOS × Node 22 / 24.

→ Full install reference (configuration, paths, advanced flags)

§ 05 / MCP, CLI, or both?

One engine. Two front-ends.

Same 33 entity types, same session format on disk, same ~/.pii_shield/. Pick the front-end that matches how you work — sessions exported from one open in the other.

Use MCP

You work in Claude Desktop and want a natural-language flow. The skill picks the right tool calls; you stay in the chat.

"Drop a contract. Ask Claude to find the indemnity clauses. Restore on save."

Use CLI

Scripting, CI/CD, non-Claude LLMs (ChatGPT / Gemini / local models), batch jobs across many files, headless servers.

pii-shield verify ./out as a compliance gate; round-trip via terminal.

Use both

Mixed teams; portable sessions across machines. Anonymize on a partner's laptop, review on yours — the same encrypted archive.

CLI sessions export → transfer → MCP import_session.

→ Full reference for both surfaces

§ 06 / Features

What's in the box

Zero PII in API

anonymize_file reads the document on your machine and returns only a file path + session id. Claude reads the anonymized file from disk — PII never enters an API request.

GLiNER zero-shot NER

gliner-pii-base-v1.0 over onnxruntime-node. Handles ALL-CAPS, domain-specific names, multilingual text. No Python, no PyTorch.

Human-in-the-loop review

MCP Apps iframe UI rendered directly in Claude Desktop. Remove false positives, add missed entities — no localhost browser detour.

Entity deduplication

"Acme" → <ORG_1>, "Acme Corp." → <ORG_1a>, "Acme Corporation" → <ORG_1b>. One canonical form; every variant maps back correctly on restore.

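One way to picture the variant scheme described under Entity deduplication is an allocator that groups surface forms by a normalized key and hands out letter suffixes within each group. The sketch below is an assumption-laden illustration: `canonicalKey`, `createAllocator`, and the list of corporate suffixes are hypothetical, and the rule that the first-seen form gets the bare placeholder is a guess, not PII Shield's documented behavior.

```typescript
// Hypothetical normalization: strip a trailing corporate suffix and lowercase.
const CORPORATE_SUFFIX = /\b(corporation|corp|incorporated|inc|limited|ltd|llc)\.?$/i;

function canonicalKey(name: string): string {
  return name.trim().replace(CORPORATE_SUFFIX, "").trim().toLowerCase();
}

// Returns a function that maps each surface form to a placeholder of the given type.
function createAllocator(type: string): (surface: string) => string {
  const groups = new Map<string, { index: number; variants: string[] }>();
  let nextIndex = 1;

  return (surface: string): string => {
    const key = canonicalKey(surface);
    let group = groups.get(key);
    if (group === undefined) {
      group = { index: nextIndex++, variants: [] };
      groups.set(key, group);
    }
    let pos = group.variants.indexOf(surface);
    if (pos === -1) pos = group.variants.push(surface) - 1;
    // Assumption: first-seen form gets no suffix, later forms get "a", "b", ...
    // (this simple scheme only covers up to 26 extra variants per entity).
    const suffix = pos === 0 ? "" : String.fromCharCode(96 + pos);
    return `<${type}_${group.index}${suffix}>`;
  };
}
```

Because each variant keeps its own placeholder, restoring can put back the exact surface form that appeared at each position in the source.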
Cross-session deanonymize

Each anonymized .docx carries its session_id in Word custom properties. Weeks later, in a brand-new chat, drop the file in — PII is restored from the embedded id.
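
In a .docx package, custom document properties are stored as XML in the `docProps/custom.xml` part. The sketch below shows how a session id could be read back out of that XML once the part has been extracted from the archive. The function name `readCustomProperty` and the property name `pii_shield_session_id` are hypothetical; the source does not say what the property is called, and a real implementation would unzip the file and use a proper XML parser (this regex does not decode XML entities).

```typescript
// Minimal sketch: pull a string-valued custom property out of docProps/custom.xml text.
function readCustomProperty(customXml: string, name: string): string | undefined {
  // Escape regex metacharacters in the property name.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<property\\b[^>]*\\bname="${escaped}"[^>]*>\\s*<vt:lpwstr>([^<]*)</vt:lpwstr>`
  );
  const match = pattern.exec(customXml);
  return match ? match[1] : undefined;
}

// Example input shaped like a Word custom-properties part (property name is assumed).
const sampleCustomXml =
  '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>' +
  '<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" ' +
  'xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">' +
  '<property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="2" name="pii_shield_session_id">' +
  "<vt:lpwstr>2026-04-29_120000_ab12</vt:lpwstr></property></Properties>";

const embeddedSessionId = readCustomProperty(sampleCustomXml, "pii_shield_session_id");
```

With the id recovered from the file itself, a new chat needs nothing but the document and the local mapping store to restore the original values.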

Multi-file sessions

Anonymize N related documents under one session_id; identical entities share placeholders across files. One deanonymize call restores PII everywhere.
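
Placeholder consistency across files follows naturally if the mapping belongs to the session and not to any single document. The sketch below illustrates that with a hypothetical `createSession` helper and hand-supplied entities; it is not the on-disk session format, and keying the mapping by text alone is a simplification.

```typescript
// Hypothetical in-memory session; the real session lives on disk under ~/.pii_shield/.
type Detected = { type: string; text: string };

function createSession(sessionId: string) {
  const forward = new Map<string, string>(); // original -> placeholder
  const reverse = new Map<string, string>(); // placeholder -> original
  const counters = new Map<string, number>(); // per-type running index

  function placeholderFor(e: Detected): string {
    const known = forward.get(e.text);
    if (known !== undefined) return known; // same entity, same placeholder, any file
    const n = (counters.get(e.type) ?? 0) + 1;
    counters.set(e.type, n);
    const placeholder = `<${e.type}_${n}>`;
    forward.set(e.text, placeholder);
    reverse.set(placeholder, e.text);
    return placeholder;
  }

  return {
    sessionId,
    anonymize(text: string, entities: Detected[]): string {
      const pairs = entities.map((e) => [e.text, placeholderFor(e)] as const);
      pairs.sort((a, b) => b[0].length - a[0].length); // longest values first
      let out = text;
      for (const [value, placeholder] of pairs) out = out.split(value).join(placeholder);
      return out;
    },
    restore(text: string): string {
      return text.replace(/<[A-Z_]+_\d+[a-z]?>/g, (ph) => reverse.get(ph) ?? ph);
    },
  };
}
```

Calling `anonymize` once per document on the same session object is what keeps `<PERSON_1>` pointing at the same person in every file, and a single `restore` works on output that mixes placeholders from all of them.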

Encrypted team handoff

export_session(passphrase) packs the mapping + anonymized documents into an encrypted .pii-session archive (AES-GCM, with the key derived from the passphrase via scrypt). A colleague who knows the passphrase runs import_session. PII never transits in the clear.
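
For readers unfamiliar with the scrypt plus AES-GCM combination, here is a minimal sketch using Node's built-in `node:crypto` module. The `seal` and `unseal` names, the 256-bit key size, and the byte layout (salt, IV, auth tag, ciphertext) are all assumptions made for illustration; the actual .pii-session archive format is not described on this page.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Assumed layout: salt (16 bytes) | iv (12 bytes) | auth tag (16 bytes) | ciphertext.
function seal(plaintext: Buffer, passphrase: string): Buffer {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const key = scryptSync(passphrase, salt, 32); // 32-byte key -> AES-256 (assumed size)
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), body]);
}

function unseal(archive: Buffer, passphrase: string): Buffer {
  const salt = archive.subarray(0, 16);
  const iv = archive.subarray(16, 28);
  const tag = archive.subarray(28, 44);
  const body = archive.subarray(44);
  const key = scryptSync(passphrase, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the passphrase is wrong or the archive was tampered with.
  return Buffer.concat([decipher.update(body), decipher.final()]);
}
```

GCM is an authenticated mode, so a wrong passphrase or a modified archive fails loudly at `final()` and never yields garbage plaintext.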

Audit logging

Every tool call + response logged locally to ~/.pii_shield/audit/. NER bootstrap, session lifecycle, dropped stderr — all on disk, appendable, off-network.

§ 07 / Full documentation

Every flag. Every command. Every env var.

Configuration, full command reference for both surfaces, workflows, HITL walkthrough, Python integration, troubleshooting — on a separate docs page so this one stays readable.

Open full documentation · GitHub repo

MCP tools → · CLI commands → · Configuration → · Workflows → · Python integration → · Troubleshooting →