§ DOCUMENTATION

PII Detection

Checksum-validated national IDs across the US, EU, and India, plus opt-in name and address NER. Two tiers: structured IDs block synchronously; names and addresses are an opt-in, latency-bounded NER pass.

§ 01

Coverage

Region | Detected types | Tier
US | SSN, credit card (Luhn), phone, email, IP (octet-bounded), date of birth | Tier-1
India | Aadhaar (Verhoeff), PAN, GSTIN | Tier-1
EU / UK | IBAN (mod-97), UK NINO, Spanish DNI/NIE, Italian Codice Fiscale, German IdNr, French INSEE | Tier-1
Any locale | Names (PERSON), addresses (LOCATION), organizations, via Presidio + GLiNER | Tier-2 (opt-in)

§ 02

Two tiers, by design

Tier 1 — deterministic + checksum. Sub-millisecond pattern recognizers, each gated by a checksum where one exists (Luhn for cards, Verhoeff for Aadhaar, mod-97 for IBAN, ISO 7064 for the German IdNr, and so on). Always synchronous; keeps false positives low because a candidate must actually validate.
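
To illustrate checksum gating, here is a minimal Python sketch of two of the checks named above: Luhn for card numbers and mod-97 for IBANs. The function names, the 13 to 19 digit length bound, and the IBAN shape pattern are assumptions made for this example; it is not Execlave's implementation.

```python
import re


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not 13 <= len(digits) <= 19:  # assumed card-length bounds
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(candidate: str) -> bool:
    """Return True if `candidate` passes the IBAN mod-97 check."""
    iban = re.sub(r"\s+", "", candidate).upper()
    # Assumed shape: country code, two check digits, 11 to 30 more characters.
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", iban):
        return False
    # Move the first four characters to the end, then map letters to 10..35.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

With these helpers, the common test number 4111 1111 1111 1111 validates while 4111 1111 1111 1112 does not, which is the sense in which a candidate must validate before it counts as a detection.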

Tier 2 — name/address NER. Microsoft Presidio with a GLiNER recognizer (urchade/gliner_multi_pii-v1, Apache-2.0) detects PERSON, LOCATION, and ORGANIZATION. It is opt-in per policy via sync_ner: the model is warm-loaded at startup, the call is Redis-cached by input hash, fired in parallel with the rule loop, hard-timeout-bounded, and governed by the policy's failure_mode.
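
The control flow described above (cache by input hash, run alongside the rule loop, hard timeout, then apply the failure mode) could be sketched as follows. This is a hedged illustration only: `ner_detect` and `tier1_rules` are hypothetical placeholders, an in-memory dict stands in for Redis, the 50 ms budget is invented, and `"fail_open"` is an assumed name for the non-blocking failure mode.

```python
import hashlib
import re
import time
from concurrent.futures import ThreadPoolExecutor

NER_BUDGET_S = 0.05                      # assumed hard timeout, not a documented value
_pool = ThreadPoolExecutor(max_workers=2)
_cache: dict[str, list[str]] = {}        # in-memory stand-in for the Redis cache


def ner_detect(text: str) -> list[str]:
    """Hypothetical placeholder for the Presidio + GLiNER call."""
    return ["person"] if "Alice" in text else []


def tier1_rules(text: str) -> list[str]:
    """Hypothetical placeholder for the synchronous Tier-1 rule loop."""
    return ["ssn"] if re.search(r"\b\d{3}-\d{2}-\d{4}\b", text) else []


def cached_ner(text: str, detector) -> list[str]:
    """Run the detector once per distinct input, keyed by a SHA-256 hash."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = detector(text)     # a raised error is not cached
    return _cache[key]


def enforce(text, denied, sync_ner=False, failure_mode="fail_open",
            detector=ner_detect) -> str:
    """Return "block" or "allow" for one input under one policy."""
    future = deadline = None
    if sync_ner:
        # Start NER first so it overlaps the rule loop below.
        deadline = time.monotonic() + NER_BUDGET_S
        future = _pool.submit(cached_ner, text, detector)
    found = set(tier1_rules(text))
    if future is not None:
        try:
            remaining = max(0.0, deadline - time.monotonic())
            found |= set(future.result(timeout=remaining))
        except Exception:                # timeout or detector failure
            if failure_mode == "fail_closed":
                return "block"
    return "block" if found & set(denied) else "allow"
```

In this sketch a detector that raises or overruns the budget leads to "block" under fail_closed and falls back to the Tier-1 result otherwise, so the NER pass never extends the call beyond the rule loop plus the timeout.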

§ 03

False-positive guards

Type | Guard
credit_card | Luhn checksum: a 16-digit run that fails the check does not match.
ip_address | Every octet must be 0–255, so version strings like 1.2.3.400 are ignored.
aadhaar / iban / … | Checksum-validated before counting as a detection.
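
The octet guard listed above can be sketched in a few lines of Python. The pattern and the `find_ipv4` helper are assumptions for illustration, not the product's actual recognizer: a dotted-quad candidate is matched first and kept only when every octet is in range.

```python
import re

# Four dot-separated groups of 1 to 3 digits that are not part of a longer
# dotted number on either side.
_IPV4_CANDIDATE = re.compile(
    r"(?<!\d)(?<!\d\.)(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?!\d|\.\d)"
)


def find_ipv4(text: str) -> list[str]:
    """Return dotted quads in `text` whose octets are all within 0..255."""
    hits = []
    for match in _IPV4_CANDIDATE.finditer(text):
        if all(0 <= int(octet) <= 255 for octet in match.groups()):
            hits.append(match.group(0))
    return hits
```

Here `find_ipv4("build 1.2.3.400 talks to 10.0.0.7.")` returns `["10.0.0.7"]`: the version string matches the shape but fails the range check.
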
§ 04

Creating a pii-access policy

Structured IDs in denied_pii_types block synchronously. Add sync_ner: true to also deny names/addresses via Tier-2 NER. Pair with failureMode: "fail_closed" for a compliance-grade control that blocks if a detector is unavailable.
curl -X POST https://api.execlave.com/api/v1/policies \
  -H "Authorization: Bearer $EXECLAVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Block National IDs + Names",
    "policyType": "pii_access",
    "enforcementMode": "block",
    "failureMode": "fail_closed",
    "ruleDefinition": {
      "denied_pii_types": ["ssn", "credit_card", "aadhaar", "iban", "person", "address"],
      "allowed_pii_types": ["email"],
      "mask_output": true,
      "log_access": true,
      "sync_ner": true
    }
  }'
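
For callers scripting this from Python, the same request could be built with the standard library as sketched below. The URL, headers, and JSON fields are copied from the curl example; the helper names `build_policy_request` and `send` are hypothetical, and the response format is not specified here, so the sketch returns the raw body.

```python
import json
import urllib.request

API_URL = "https://api.execlave.com/api/v1/policies"  # endpoint from the curl example


def build_policy_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates the policy."""
    payload = {
        "name": "Block National IDs + Names",
        "policyType": "pii_access",
        "enforcementMode": "block",
        "failureMode": "fail_closed",
        "ruleDefinition": {
            "denied_pii_types": ["ssn", "credit_card", "aadhaar", "iban",
                                 "person", "address"],
            "allowed_pii_types": ["email"],
            "mask_output": True,
            "log_access": True,
            "sync_ner": True,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_policy_request(os.environ["EXECLAVE_API_KEY"]))` after `import os` would issue the same call as the curl command.
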
§ 05

Frequently asked questions

Which national IDs does Execlave detect?
US Social Security Numbers and (Luhn-checked) credit cards; India Aadhaar (Verhoeff checksum), PAN, and GSTIN; EU IBAN (mod-97), UK National Insurance Number, Spanish DNI/NIE, Italian Codice Fiscale, German IdNr, and French INSEE. Email, phone, IP address, and date of birth are also detected. Each ID type with a checksum is validated before it counts as a match, keeping false positives low.
Can Execlave detect names and physical addresses, not just structured IDs?
Yes, via opt-in Tier-2 NER. When a pii_access policy sets sync_ner: true and denies a name/address type (person, address, organization), Execlave runs Microsoft Presidio with a GLiNER recognizer to detect PERSON, LOCATION, and ORGANIZATION entities. It is latency-bounded: the model is warm-loaded once, the call is cached by input hash and fired in parallel with the rule engine, and the result is governed by the policy’s failure_mode.
Does the broad credit-card / IP pattern cause false positives?
No. Credit-card candidates must pass the Luhn checksum, and IP candidates must have every octet in range (0–255), so version strings like 1.2.3.400 are not flagged as IPs. Checksum-gated national IDs (Aadhaar, IBAN, Codice Fiscale, etc.) only count when a matched candidate actually validates.
Is name/address NER on the hot path by default?
No. Structured, checksum-validated IDs are detected synchronously and always. Name/address NER is off by default and only runs when a policy explicitly opts in with sync_ner: true — and even then it overlaps the rule loop, is hard-timeout-bounded, and follows the policy’s failure_mode on any error, so the synchronous enforce budget is preserved.