Detect and de-identify personal data offline - validated recognizers, reversible encryption, structured JSON/CSV redaction, and optional multilingual NER. PII never leaves your process.
dotnet add package TasmanianDevil
Quick Start
One facade wires the analyzer, anonymizer, and deanonymizer from a single options object - or compose the engines directly for full control.
1. The facade - one call to de-identify
using TasmanianDevil;
var engine = new PiiEngine();
var result = engine.Deidentify("Email jane@contoso.com or call +1 425 555 0100.");
Console.WriteLine(result.AnonymizedText);
// Email <EMAIL_ADDRESS> or call <PHONE_NUMBER>.
2. Per-entity operators + opt-in country pack
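using TasmanianDevil;
// `key` (the encryption key) and `text` (the input string) are assumed to be defined by the caller.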
var options = new PiiOptions
{
Countries = [PiiCountries.De],
Operators = new Dictionary<string, OperatorConfig>
{
["EMAIL_ADDRESS"] = new("mask", new() { [OperatorParams.CharsToMask] = 6 }),
["CREDIT_CARD"] = new("redact"),
["DEFAULT"] = new("encrypt", new() { [OperatorParams.Key] = key }),
},
};
var engine = new PiiEngine(options);
var deid = engine.Deidentify(text); // deid.IsReversible == true
3. Reversible round-trip - encrypt out, restore exactly
// hand the opaque text to a third party, then decrypt it back byte-for-byte
var decrypt = new Dictionary<string, OperatorConfig>
{
["DEFAULT"] = new("decrypt", new() { [OperatorParams.Key] = key }),
};
var restored = engine.Reidentify(deid, decrypt);
// restored.Text == original text
Why TasmanianDevil
Architecture inspired by Microsoft Presidio (MIT; not affiliated or endorsed), rebuilt from scratch as idiomatic, dependency-light C#.
A 16-digit number counts as a credit card only if it passes the Luhn check, and a candidate IBAN counts only if it passes mod-97. Real validation (Luhn, mod-97, Verhoeff, ISO-7064, ICAO, bech32) removes false positives that a naive regex cannot.
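To make the checksum idea concrete, here is a minimal standalone sketch of the two checks named above. It is not TasmanianDevil's internal code; the helper names `PassesLuhn` and `PassesIbanMod97` are hypothetical, and it skips the length and country-format rules a real recognizer would also apply.

```csharp
using System;
using System.Linq;

// Luhn: walking from the right, double every second digit (subtract 9 if the
// result exceeds 9), sum everything, and require the total to be divisible by 10.
bool PassesLuhn(string digits)
{
    if (digits.Length == 0 || !digits.All(c => c >= '0' && c <= '9')) return false;
    int sum = 0;
    bool doubleIt = false;
    for (int i = digits.Length - 1; i >= 0; i--)
    {
        int d = digits[i] - '0';
        if (doubleIt)
        {
            d *= 2;
            if (d > 9) d -= 9;
        }
        sum += d;
        doubleIt = !doubleIt;
    }
    return sum % 10 == 0;
}

// IBAN mod-97: move the first four characters to the end, map A..Z to 10..35,
// read the result as one large number, and require it to equal 1 modulo 97.
// The remainder is folded in character by character, so no big integers are needed.
bool PassesIbanMod97(string iban)
{
    string s = iban.Replace(" ", "").ToUpperInvariant();
    if (s.Length < 5) return false;
    if (!s.All(c => (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z'))) return false;
    string rearranged = s.Substring(4) + s.Substring(0, 4);
    int remainder = 0;
    foreach (char c in rearranged)
    {
        remainder = c <= '9'
            ? (remainder * 10 + (c - '0')) % 97          // one decimal digit
            : (remainder * 100 + (c - 'A' + 10)) % 97;   // letter = two decimal digits
    }
    return remainder == 1;
}

Console.WriteLine(PassesLuhn("4111111111111111"));                  // True
Console.WriteLine(PassesLuhn("4111111111111112"));                  // False
Console.WriteLine(PassesIbanMod97("GB82 WEST 1234 5698 7654 32"));  // True
```

A regex can only say "this looks like sixteen digits"; the checksum is what rejects the order number that happens to have the same shape.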
A bare token scores low and is dropped; nearby words ("card", "IBAN", "postcode") lift it over the threshold via a dependency-free Porter-stemmer lemma matcher, so detection adapts to the surrounding text.
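The context mechanism can be pictured roughly as follows. This is a simplified sketch, not the library's scorer: the threshold, boost, and window size are made-up illustrative values, and `Normalize` is a crude stand-in (lowercase, strip punctuation and a trailing "s") for the Porter stemmer mentioned above.

```csharp
using System;
using System.Linq;

// Illustrative values only; the library's real scores and window are not documented here.
const double Threshold = 0.5;
const double ContextBoost = 0.35;
const int WindowChars = 30;

// Crude stand-in for stemming: lowercase, trim punctuation, drop a trailing "s".
string Normalize(string word)
{
    string w = word.ToLowerInvariant().Trim('.', ',', ':', ';', '!', '?', '(', ')');
    return w.Length > 3 && w.EndsWith('s') ? w.Substring(0, w.Length - 1) : w;
}

// Look at the words within WindowChars on either side of the span [start, end)
// and raise the score (capped at 1.0) if any of them is a known context word.
double ScoreWithContext(string text, int start, int end, double baseScore, string[] contextWords)
{
    int from = Math.Max(0, start - WindowChars);
    int to = Math.Min(text.Length, end + WindowChars);
    string window = text.Substring(from, start - from) + " " + text.Substring(end, to - end);
    bool hit = window
        .Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .Select(Normalize)
        .Any(w => contextWords.Contains(w));
    return hit ? Math.Min(1.0, baseScore + ContextBoost) : baseScore;
}

string[] cardContext = { "card", "credit", "visa" };
string number = "4111111111111111";

string withContext = "Pay with card 4111111111111111 today";
int a = withContext.IndexOf(number, StringComparison.Ordinal);
Console.WriteLine(ScoreWithContext(withContext, a, a + number.Length, 0.3, cardContext) >= Threshold); // True

string bare = "Order 4111111111111111 shipped";
int b = bare.IndexOf(number, StringComparison.Ordinal);
Console.WriteLine(ScoreWithContext(bare, b, b + number.Length, 0.3, cardContext) >= Threshold); // False
```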
Encrypt PII spans (AES), hand the opaque text to a third party, and decrypt the exact original back. Lossy operators (mask, hash, redact) report themselves non-reversible; a wrong key fails loudly.
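For readers who want to see why a reversible operator can also fail loudly, here is a minimal sketch using AES-GCM from the .NET base class library (.NET 8 or later). The source only says "AES"; the GCM mode, the nonce-tag-ciphertext layout, and the helper names `EncryptSpan` / `DecryptSpan` are assumptions made for illustration, not the library's actual format.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

const int NonceSize = 12; // bytes, the standard AES-GCM nonce length
const int TagSize = 16;   // bytes, the full-length authentication tag

// Encrypt one PII span and pack nonce + tag + ciphertext into a Base64 token.
string EncryptSpan(string plain, byte[] key)
{
    byte[] nonce = RandomNumberGenerator.GetBytes(NonceSize);
    byte[] data = Encoding.UTF8.GetBytes(plain);
    byte[] cipher = new byte[data.Length];
    byte[] tag = new byte[TagSize];
    using var aes = new AesGcm(key, TagSize);
    aes.Encrypt(nonce, data, cipher, tag);

    byte[] packed = new byte[NonceSize + TagSize + cipher.Length];
    nonce.CopyTo(packed, 0);
    tag.CopyTo(packed, NonceSize);
    cipher.CopyTo(packed, NonceSize + TagSize);
    return Convert.ToBase64String(packed);
}

// Reverse the packing and decrypt. A wrong key or a tampered token fails the
// tag check and throws a CryptographicException instead of returning garbage.
string DecryptSpan(string token, byte[] key)
{
    byte[] packed = Convert.FromBase64String(token);
    byte[] plain = new byte[packed.Length - NonceSize - TagSize];
    using var aes = new AesGcm(key, TagSize);
    aes.Decrypt(
        packed.AsSpan(0, NonceSize),
        packed.AsSpan(NonceSize + TagSize),
        packed.AsSpan(NonceSize, TagSize),
        plain);
    return Encoding.UTF8.GetString(plain);
}

byte[] key = RandomNumberGenerator.GetBytes(32);       // 256-bit key
byte[] wrongKey = RandomNumberGenerator.GetBytes(32);

string token = EncryptSpan("jane@contoso.com", key);
Console.WriteLine(DecryptSpan(token, key) == "jane@contoso.com"); // True

try
{
    DecryptSpan(token, wrongKey);
}
catch (CryptographicException)
{
    Console.WriteLine("wrong key rejected");
}
```

An authenticated mode is what makes "fails loudly" possible: unauthenticated modes can silently decrypt to nonsense under the wrong key.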
Redact JSON by key path (allow / deny) preserving shape and non-string types, or infer PII columns in CSV/TSV and redact them consistently - reusing the same recognizers and operators.
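The key-path idea can be sketched with `System.Text.Json.Nodes` alone. This is a simplified illustration rather than the library's implementation: the helper `RedactPaths` is hypothetical, it replaces the whole string value with a fixed `[REDACTED]` placeholder instead of running recognizers and operators over it, and it leaves bare strings inside arrays untouched.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json.Nodes;

// Walk the tree, building dotted paths such as "user.email". Only string values
// whose path is listed are replaced; numbers, booleans, and structure are untouched.
void RedactPaths(JsonNode? node, string path, HashSet<string> includePaths)
{
    if (node is JsonObject obj)
    {
        // Snapshot the keys first, because the object is modified during the walk.
        foreach (string key in obj.Select(kv => kv.Key).ToList())
        {
            string childPath = path.Length == 0 ? key : path + "." + key;
            JsonNode? child = obj[key];
            bool isString = child is JsonValue v && v.TryGetValue<string>(out _);
            if (isString && includePaths.Contains(childPath))
                obj[key] = "[REDACTED]";
            else
                RedactPaths(child, childPath, includePaths);
        }
    }
    else if (node is JsonArray arr)
    {
        // Array elements share the path of the array itself.
        foreach (JsonNode? item in arr)
            RedactPaths(item, path, includePaths);
    }
}

JsonNode root = JsonNode.Parse(
    "{\"user\":{\"email\":\"jane@contoso.com\",\"age\":42},\"note\":\"hi\"}")!;

RedactPaths(root, "", new HashSet<string> { "user.email" });

Console.WriteLine(root["user"]!["email"]!.GetValue<string>()); // [REDACTED]
Console.WriteLine(root["user"]!["age"]!.GetValue<int>());      // 42
Console.WriteLine(root["note"]!.GetValue<string>());           // hi
```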
Analyze and anonymize lists or keyed record dictionaries in one call, results aligned to the input. Allocation-light and safe to share across threads.
The whole engine runs with zero models. When you want more, TasmanianDevil.Onnx plugs a real multilingual span-NER model into the same pipeline - the hard part, shipped.
Detection Coverage
Generic and US recognizers are always on. Opt-in country packs add national identifiers, validated with real checksums and off by default to keep false positives low.
8 generic recognizers (email, phone via libphonenumber, credit card / Luhn, IBAN / mod-97, crypto, IP, URL, MAC) and a
9-entity US pack are always on. Opt-in country packs add national IDs, tax numbers, and passports for the
UK, Germany, India, Italy, and Spain. An optional offline ONNX add-on layers in multilingual named-entity detection
(PERSON, LOCATION, ORGANIZATION, DATE_TIME), all resolved in one pass.
Operators
Operators are assigned per entity or via a DEFAULT fallback. Compose the analyzer and anonymizer directly,
or drive everything through the PiiEngine facade.
using TasmanianDevil.Analyzer;
using TasmanianDevil.Anonymizer;
var analyzer = new AnalyzerEngine(PiiRecognizers.CreateDefaultRegistry("en"));
var anonymizer = new AnonymizerEngine();
// replace (default) · redact · mask · hash (salted) · encrypt ↔ decrypt · keep · custom
var results = analyzer.Analyze(text, "en");
var anonymized = anonymizer.Anonymize(text, results, operators);
// structured + batch reuse the very same recognizers and operators
engine.AnonymizeJson(json, new JsonRedactionScope { IncludePaths = ["user.email"] });
engine.AnonymizeCsv(header, rows); // infers which columns are PII
engine.AnonymizeBatch(keyedRecords); // aligned results per key
Optional ONNX NER
TasmanianDevil.Onnx adds PERSON / LOCATION / ORGANIZATION / DATE_TIME detection
via a zero-shot, multilingual GLiNER model - run through Kyoto.
It registers as an ordinary recognizer, so its spans flow through the same overlap resolution and
anonymization as the regex/checksum entities.
using TasmanianDevil.Onnx;
// the ONNX export is published on Hugging Face (filip-w/gliner-multi-pii-onnx)
var ner = new GlinerNerRecognizer(new GlinerNerOptions
{
ModelPath = modelPath, TokenizerPath = spmPath, ConfigPath = configPath,
});
registry.AddRecognizer(ner); // PERSON/LOCATION/... now join the same analyzer pass
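Because NER spans and regex/checksum spans can cover the same characters, some overlap resolution has to pick a winner. The sketch below shows one common greedy strategy (highest score first, then longest span); it is an assumption for illustration, not a description of TasmanianDevil's actual rules, and the tuple shape and `Resolve` helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keep the best-scoring spans and drop anything that overlaps a span already kept.
// Ties on score are broken in favor of the longer span. Spans are half-open [Start, End).
List<(int Start, int End, string Entity, double Score)> Resolve(
    IEnumerable<(int Start, int End, string Entity, double Score)> spans)
{
    var kept = new List<(int Start, int End, string Entity, double Score)>();
    foreach (var s in spans
        .OrderByDescending(x => x.Score)
        .ThenByDescending(x => x.End - x.Start))
    {
        bool overlaps = kept.Any(k => s.Start < k.End && k.Start < s.End);
        if (!overlaps) kept.Add(s);
    }
    return kept.OrderBy(k => k.Start).ToList();
}

// Text: "Jane Doe jane@contoso.com"
var candidates = new List<(int Start, int End, string Entity, double Score)>
{
    (0, 8, "PERSON", 0.85),        // "Jane Doe" from the NER model
    (9, 25, "EMAIL_ADDRESS", 1.0), // full address from the regex recognizer
    (9, 13, "PERSON", 0.60),       // "jane" inside the address, a weaker NER guess
};

foreach (var s in Resolve(candidates))
    Console.WriteLine($"{s.Entity} [{s.Start}, {s.End})");
// PERSON [0, 8)
// EMAIL_ADDRESS [9, 25)
```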