What KeyAudit indexes
KeyAudit is a public-facing index of 60M+ leaked wallet records across 36 blockchains, aggregated from 31 publicly credited data sources. Every record is tagged with the chain it was derived from, the source it came from, and a confidence tier that reflects how the leak was discovered.
The index is not a breach corpus in the credential-stuffing sense — most entries are addresses with derivable private keys, not stolen account credentials. The provenance breaks down roughly as: confirmed on-chain theft incidents, OFAC and similar sanctions lists, academic brain-wallet research datasets, community-curated scam address lists, and dictionary-derived theoretical matches.
Confidence tiers
Every record sits on a five-tier confidence ladder. The tier determines how much weight to put on a positive match.
confirmed_stolen: the address is documented as the destination or origin of an on-chain theft (rug pulls, exchange hacks, bridge exploits with public post-mortems).
sanctioned: listed by OFAC, EU, UK, or similar authorities. A match here is a hard legal-risk signal, not just a security one.
academic_dataset: extracted from peer-reviewed brain-wallet or weak-key research (e.g. Vasek et al., Trezor analyses). The address is derived from a key the researchers showed to be guessable.
community_curated: phishing-tracker lists (ScamSniffer, Chainabuse), CryptoScamDB, and similar volunteer-maintained corpora. Higher coverage, lower individual verification.
dict_derived: addresses computed from common wordlists, leaked password dumps, and brain-wallet seed candidates. A hit here is theoretical: it means the input parses to an address an attacker could trivially derive, not that funds were actually stolen.
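For readers consuming match results programmatically, the ladder can be modeled as an ordered enumeration. This is only an illustrative sketch: the `Tier` class, its numeric ranks, and the `strongest` helper are hypothetical and not part of any KeyAudit schema. Only the five tier names come from the text, and the ranking assumes the tiers are listed from strongest to weakest evidence.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Confidence tiers, ranked so that a higher value means stronger evidence.

    The numeric ranks are hypothetical; only the names come from the text.
    """
    DICT_DERIVED = 1
    COMMUNITY_CURATED = 2
    ACADEMIC_DATASET = 3
    SANCTIONED = 4
    CONFIRMED_STOLEN = 5


def strongest(tier_names):
    """Return the highest-ranked tier among the given tier names, or None."""
    tiers = [Tier[name.upper()] for name in tier_names]
    return max(tiers, default=None)


# If an address turned up under two tiers, a caller might surface the stronger.
print(strongest(["dict_derived", "community_curated"]).name.lower())
```

An integer-backed enum keeps comparisons such as `Tier.SANCTIONED > Tier.DICT_DERIVED` readable, which suits a ladder where the tier decides how much weight a match deserves.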
How a query is processed
The leak checker accepts three input shapes: a raw public address, a BIP-39 mnemonic, or a private key (hex or WIF). For mnemonics and keys, the input is hashed in your browser via SubtleCrypto.digest('SHA-256', ...) before any network request leaves your device. The server only sees a 32-byte hash.
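The hashing step can be reproduced offline to see what the server would receive. The sketch below uses Python's `hashlib` as a stand-in for the browser's `SubtleCrypto.digest('SHA-256', ...)`; both compute the same SHA-256 digest over the same bytes. The function name `hash_secret` and the normalization (trimming and collapsing whitespace, UTF-8 encoding) are assumptions, since the exact bytes that are hashed are not specified here.

```python
import hashlib


def hash_secret(secret: str) -> bytes:
    """Return the 32-byte SHA-256 digest of a mnemonic or private key string.

    Assumption: whitespace is trimmed and collapsed, and the result is
    encoded as UTF-8 before hashing. The real normalization may differ.
    """
    normalized = " ".join(secret.split())
    return hashlib.sha256(normalized.encode("utf-8")).digest()


digest = hash_secret("  example   secret phrase ")
print(len(digest))   # 32
print(digest.hex())  # 64 hex characters; the only value that would be sent
```

Normalizing before hashing matters because SHA-256 is sensitive to every byte: a stray trailing space would otherwise produce an unrelated digest and a false "clean" result.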
The hash is checked first against an in-memory Bloom filter, which rejects non-matches in O(1) time. A Bloom filter never returns a false negative but can return a false positive, so any filter hit is then confirmed exactly against a MySQL index on address_hash. No plaintext seed or key is ever transmitted, logged, or persisted server-side.
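The two-stage lookup can be sketched in a few lines. This is a toy model under stated assumptions, not the production code: the `BloomFilter` class, its size and hash count, and the `lookup` function are hypothetical, and a plain dict stands in for the MySQL index on address_hash.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def lookup(digest: bytes, bloom: BloomFilter, index: dict):
    """Two-stage check: cheap Bloom rejection, then exact confirmation."""
    if not bloom.might_contain(digest):
        return None            # definitely not indexed
    return index.get(digest)   # exact match, or None on a false positive


# Toy data: a dict stands in for the database index on address_hash.
known = hashlib.sha256(b"known-entry").digest()
index = {known: "dict_derived"}
bloom = BloomFilter()
bloom.add(known)

print(lookup(known, bloom, index))
print(lookup(hashlib.sha256(b"other").digest(), bloom, index))
```

The filter only ever answers "definitely absent" or "possibly present", which is why the exact index lookup remains the source of truth for a positive result.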
Data sources
Every entry in KeyAudit links back to its public source. The full list is at /en/source with per-source coverage statistics. Major contributors include the CryptoScamDB phishing corpus, OFAC SDN list, ScamSniffer indicator feeds, Vasek et al.'s 2014 brain-wallet study, and SecLists' top-1000 password dump derivations. We index nothing that isn't already public.
Limits and what we do not claim
A dict_derived hit is not evidence of theft. It means the input you queried derives to an address an attacker could trivially recompute from a common wordlist or password dump. If your wallet shows a dict_derived match, the prudent response is to migrate funds to a hardware-wallet-generated BIP-39 seed, even though the match does not necessarily mean any third party knows your specific phrase.
Conversely, a clean lookup is not a guarantee of safety. Targeted attacks (SIM swaps, malware key exfiltration, supply-chain compromise) leave no trace in dictionary or research datasets. KeyAudit catches commodity-grade key compromise, not bespoke attacks.
We do not run on-chain transaction-graph analysis (Chainalysis territory). We do not track wallet activity over time. We do not deanonymize. Every dataset we index is already public.
Update cadence
The index is refreshed on a rolling basis as upstream sources publish updates: sanctions lists weekly, scam-address feeds daily, academic corpora when new research lands. Aggregate statistics on /en/stats are recomputed every six hours from the live database.