Advanced Detection

Detection Methods

Prompt Guard combines pattern matching with entropy analysis to detect sensitive data, both in known formats and in formats it has not seen before.

Pattern-Based Detection

Uses regular expressions to match known patterns like email addresses, phone numbers, credit card numbers, and API keys with known prefixes.

Personal Information (PII)

Email Addresses: john.doe@example.com
Phone Numbers: (555) 123-4567, +1-555-0192
Social Security Numbers: 123-45-6789
Credit Cards: 4111 1111 1111 1111
IP Addresses: 192.168.1.1
National IDs: Various formats globally
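
As a rough illustration of pattern-based matching for a few of the PII types listed above, here is a minimal Python sketch. The regular expressions, the PII_PATTERNS table, and the find_pii helper are all assumptions made for this example; they are deliberately simple and are not Prompt Guard's actual rules.

```python
import re

# Illustrative patterns only (assumptions, not Prompt Guard's real rules).
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_pii(text):
    """Return a list of (label, matched_text) pairs found in text."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


sample = "Mail john.doe@example.com, SSN 123-45-6789, host 192.168.1.1"
for label, value in find_pii(sample):
    print(label, value)
```

Patterns this loose will produce false positives (the IPv4 pattern accepts octets above 255, for instance), so a real detector would likely add validation on top of the match.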

API Keys & Tokens

Provider        Pattern
OpenAI          sk-... or sk-proj-...
Anthropic       sk-ant-...
Google          AIza...
AWS             AKIA...
GitHub          ghp_..., gho_..., ghs_...
Stripe          sk_live_..., pk_live_...
Azure           86-char base64 keys
Discord         Bot token format
Hugging Face    hf_...
JWT Tokens      eyJ...
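
The prefixes in the table above lend themselves to a simple ordered lookup. The sketch below is one hypothetical way to do it in Python. Only the prefixes come from the table; the character classes, the minimum length of 8, and the identify_provider helper are assumptions for illustration.

```python
import re

# Prefixes come from the table above. The character classes and the
# minimum body length of 8 are assumptions, not documented key formats.
# Order matters: "sk-ant-" must be tried before the broader "sk-".
KEY_PREFIXES = [
    ("Anthropic", r"sk-ant-[A-Za-z0-9_-]{8,}"),
    ("OpenAI", r"sk-(?:proj-)?[A-Za-z0-9_-]{8,}"),
    ("Google", r"AIza[A-Za-z0-9_-]{8,}"),
    ("AWS", r"AKIA[A-Z0-9]{8,}"),
    ("GitHub", r"gh[pos]_[A-Za-z0-9]{8,}"),
    ("Stripe", r"[sp]k_live_[A-Za-z0-9]{8,}"),
    ("Hugging Face", r"hf_[A-Za-z0-9]{8,}"),
]


def identify_provider(text):
    """Return the first provider (in list order) whose pattern occurs in text."""
    for provider, pattern in KEY_PREFIXES:
        # The lookbehind keeps a prefix from matching in the middle of a word.
        if re.search(r"(?<![A-Za-z0-9])" + pattern, text):
            return provider
    return None


print(identify_provider("key = sk-ant-EXAMPLEKEY123"))
```

Listing the more specific prefix first is the main design point here: without that ordering, an Anthropic-style key would be reported as an OpenAI key.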

Secrets & Credentials

  • Password assignments in code
  • Private keys (RSA, SSH, PGP)
  • Database connection strings
  • OAuth tokens and bearer tokens
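
The four categories above can each be approximated with a regular expression. The following Python sketch shows one possible set. The SECRET_PATTERNS table and scan_secrets helper are hypothetical names, and the patterns are simplified assumptions rather than Prompt Guard's actual rules.

```python
import re

# Simplified, illustrative patterns (assumptions for this sketch).
SECRET_PATTERNS = {
    # password = "..." / pwd: '...' style assignments with a quoted value
    "password_assignment": re.compile(
        r"(?i)(?:password|passwd|pwd)\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
    # PEM-style headers for RSA, OpenSSH, and PGP private keys
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |OPENSSH |PGP )?PRIVATE KEY(?: BLOCK)?-----"
    ),
    # scheme://user:password@host, i.e. a URL with embedded credentials
    "connection_string": re.compile(
        r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@]+@[^\s/]+"
    ),
    # "Bearer <token>" as seen in Authorization headers
    "bearer_token": re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/-]+=*"),
}


def scan_secrets(text):
    """Return the labels of every pattern that matches somewhere in text."""
    return [label for label, pattern in SECRET_PATTERNS.items()
            if pattern.search(text)]


print(scan_secrets('db_password = "hunter2"'))
print(scan_secrets("postgres://admin:s3cret@db.example.com:5432/app"))
```

The connection-string pattern only fires when credentials are embedded in the URL, so an ordinary link such as https://example.com/path is not flagged.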

Shannon Entropy Analysis

Uses information theory to detect high-randomness strings that are likely secrets, even if they don't match known patterns.

How It Works

Shannon entropy measures the randomness of a string. Higher entropy means more randomness, which is characteristic of cryptographic secrets, API keys, and generated passwords.
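
For reference, the per-character Shannon entropy of a string is H = -sum(p_i * log2(p_i)), where p_i is the relative frequency of each distinct character. A minimal Python sketch of that calculation follows. The function name shannon_entropy is chosen for this example and is not taken from Prompt Guard.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    # Sum -p * log2(p) over the frequency p of each distinct character.
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


print(shannon_entropy("aaaa"))  # a single repeated character carries no information
print(shannon_entropy("abcd"))  # four equally likely characters
```

One consequence worth keeping in mind: a string of n characters can never score above log2(n) bits per character, so very short strings score low no matter how random they are.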

Context-Aware Detection

Applies a lower threshold (>3.5 bits/char) to strings that appear near keywords such as "password", "key", "token", or "secret". Standalone strings must exceed a higher threshold (>4.0 bits/char).
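
The two-threshold rule can be sketched in a few lines of Python. The thresholds and keywords below come from the text above. Everything else is an assumption for illustration: the is_likely_secret helper, the idea of passing the surrounding text as a context argument, and the plain substring test for keywords.

```python
import math
from collections import Counter

CONTEXT_KEYWORDS = ("password", "key", "token", "secret")
CONTEXT_THRESHOLD = 3.5     # bits/char, applied near a keyword
STANDALONE_THRESHOLD = 4.0  # bits/char, applied otherwise


def shannon_entropy(s):
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def is_likely_secret(candidate, context=""):
    """Flag candidate if its entropy exceeds the threshold for its context.

    context is the text surrounding the candidate (a hypothetical
    parameter; how much surrounding text to inspect is not specified).
    """
    near_keyword = any(k in context.lower() for k in CONTEXT_KEYWORDS)
    threshold = CONTEXT_THRESHOLD if near_keyword else STANDALONE_THRESHOLD
    return shannon_entropy(candidate) > threshold


# Twelve distinct characters give about 3.58 bits/char: above 3.5, below 4.0.
print(is_likely_secret("q7Zp2LxV9mRt", context="api_key = "))  # True
print(is_likely_secret("q7Zp2LxV9mRt", context="see "))        # False
```

A substring test is the simplest possible keyword check and will also fire on words like "monkey", so a stricter implementation might match on word or identifier boundaries instead.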

Encoding Detection

Identifies Base64- and hexadecimal-encoded secrets that might otherwise evade pattern-based detection.
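
One way to recognize these encodings is to test a token against each alphabet and, for Base64, confirm that it actually decodes. The Python sketch below takes that approach. The classify_encoding helper and the minimum length of 16 characters are assumptions for this example.

```python
import base64
import binascii
import re

HEX_RE = re.compile(r"[0-9a-fA-F]+")
BASE64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")
MIN_LENGTH = 16  # assumed cutoff; shorter tokens are ignored


def classify_encoding(token):
    """Return "hex", "base64", or None for a single whitespace-free token."""
    if len(token) < MIN_LENGTH:
        return None
    # Check hex first: every hex string also fits the Base64 alphabet.
    if HEX_RE.fullmatch(token) and len(token) % 2 == 0:
        return "hex"
    if BASE64_RE.fullmatch(token) and len(token) % 4 == 0:
        try:
            base64.b64decode(token, validate=True)
            return "base64"
        except binascii.Error:
            return None
    return None


print(classify_encoding("deadbeef" * 4))
print(classify_encoding(base64.b64encode(b"super-secret-value-123").decode()))
```

Alphabet checks alone over-match: a 16-letter English word also passes the Base64 test. That is one reason to combine encoding detection with an entropy score rather than rely on it in isolation.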

Entropy Reference

Text Type            Entropy (bits/char)    Assessment
English Text         3.0 - 4.0              Not a secret
Variable Names       3.5 - 4.2              Unlikely secret
Random Passwords     4.5 - 6.0              Likely secret
API Keys (Base64)    5.5 - 6.0              Definitely secret

Character Class Analysis

Checks for a mix of uppercase, lowercase, digits, and symbols, a common characteristic of generated secrets.
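
A character class check of this kind can be sketched as follows. The helper names character_classes and class_count are hypothetical, and treating ASCII punctuation as the "symbol" class is an assumption.

```python
import string


def character_classes(s):
    """Report which of the four character classes appear in s."""
    return {
        "upper": any(c.isupper() for c in s),
        "lower": any(c.islower() for c in s),
        "digit": any(c.isdigit() for c in s),
        # Assumption: "symbol" means ASCII punctuation.
        "symbol": any(c in string.punctuation for c in s),
    }


def class_count(s):
    """Return how many of the four classes are present (0 to 4)."""
    return sum(character_classes(s).values())


print(class_count("Tr0ub4dor&3"))  # 4: upper, lower, digit, symbol
print(class_count("password"))     # 1: lowercase only
```

A count like this is a weak signal on its own, since many ordinary identifiers mix three classes, so it would most plausibly be used to adjust confidence alongside the entropy score.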
