What KeyAudit indexes
KeyAudit is a public-facing index of 6.94 million leaked wallet records across 28 blockchains, aggregated from 25 publicly credited data sources. Every record is tagged with the chain it was derived from, the source it came from, and a confidence tier that reflects how the leak was discovered.
The index is not a breach corpus in the credential-stuffing sense; most entries are addresses with derivable private keys, not stolen account credentials. The provenance breaks down roughly as: confirmed on-chain theft incidents, OFAC and similar sanctions lists, academic brain-wallet research datasets, community-curated scam address lists, and dictionary-derived theoretical matches.
Confidence tiers
Every record sits on a five-tier confidence ladder. The tier determines how much weight to put on a positive match.
confirmed_stolen: the address is documented as the destination or origin of an on-chain theft (rug pulls, exchange hacks, bridge exploits with public post-mortems).
sanctioned: listed by OFAC, EU, UK, or similar authorities. A match here is a hard legal-risk signal, not just a security one.
academic_dataset: extracted from peer-reviewed brain-wallet or weak-key research (e.g. Vasek et al., Trezor analyses). The address is derived from a key the researchers showed to be guessable.
community_curated: phishing-tracker lists (ScamSniffer, Chainabuse), CryptoScamDB, and similar volunteer-maintained corpora. Higher coverage, lower individual verification.
dict_derived: addresses computed from common wordlists, leaked password dumps, and brain-wallet seed candidates. A hit here is theoretical: it means the input derives to an address an attacker could trivially compute, not that funds were actually stolen.
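As a rough illustration of the tagging described above, a record and its tier could be modelled as follows. Only the five tier identifiers come from the text. The field names (`addressHash`, `chain`, `source`, `tier`), the helper functions, and the choice to treat the listing order as "strongest evidence first" are assumptions made for this sketch, not KeyAudit's actual schema.

```typescript
// Tier identifiers as named in the text. Treating this listing order as a
// ranking (index 0 = strongest evidence) is an assumption of this sketch.
const CONFIDENCE_TIERS = [
  "confirmed_stolen",
  "sanctioned",
  "academic_dataset",
  "community_curated",
  "dict_derived",
] as const;

type ConfidenceTier = (typeof CONFIDENCE_TIERS)[number];

// Hypothetical record shape: every record carries a chain, a source, and a tier.
interface LeakRecord {
  addressHash: string;
  chain: string;
  source: string;
  tier: ConfidenceTier;
}

// Position on the ladder; lower means stronger evidence (assumed ordering).
function tierRank(tier: ConfidenceTier): number {
  return CONFIDENCE_TIERS.indexOf(tier);
}

// When an address matches several records, report the strongest one.
function strongestMatch(records: LeakRecord[]): LeakRecord | undefined {
  let best: LeakRecord | undefined = undefined;
  for (const record of records) {
    if (best === undefined || tierRank(record.tier) < tierRank(best.tier)) {
      best = record;
    }
  }
  return best;
}

// A dict_derived hit is theoretical: derivable, not known to be stolen.
function isTheoretical(record: LeakRecord): boolean {
  return record.tier === "dict_derived";
}
```

A caller would typically surface `strongestMatch(...)` to the user and use `isTheoretical` to soften the wording for dictionary-only hits.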
How a query is processed
The leak checker accepts three input shapes: a raw public address, a BIP-39 mnemonic, or a private key (hex or WIF). For mnemonics and keys, the input is hashed in your browser via SubtleCrypto.digest('SHA-256', ...) before any network request leaves your device. The server only sees a 32-byte hash.
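The client-side hashing step might look roughly like the sketch below. It uses the standard Web Crypto call `crypto.subtle.digest`, which is available in browsers and as a global in recent Node.js releases. The canonicalization rule (NFKD normalization, trimming, collapsing whitespace) and the function names `canonicalize` and `hashSecret` are assumptions; the text does not say how KeyAudit normalizes input before hashing.

```typescript
// Assumed canonicalization (not specified in the source): Unicode NFKD,
// trim, and collapse internal whitespace runs to single spaces.
function canonicalize(secret: string): string {
  return secret.normalize("NFKD").trim().split(/\s+/).join(" ");
}

// Hash a mnemonic or private key locally and return the 32-byte SHA-256
// digest as 64 hex characters. Only this value would be sent to a server.
async function hashSecret(secret: string): Promise<string> {
  const bytes = new TextEncoder().encode(canonicalize(secret));
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest), (b) =>
    b.toString(16).padStart(2, "0"),
  ).join("");
}
```

Because the digest is computed before any request is built, the plaintext never needs to appear in a request body, URL, or log line.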
The hash is checked first against an in-memory Bloom filter for O(1) rejection of non-matches, then against a MySQL index on address_hash for exact confirmation. No plaintext seed or key is ever transmitted, logged, or persisted server-side.
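The two-stage lookup can be sketched as below. This is a minimal illustration, not KeyAudit's implementation: the `BloomFilter` class, its sizing parameters, the double-hashing scheme, and the `lookup` function are all hypothetical, and a `Map` stands in for the MySQL index on `address_hash`. A Bloom filter never produces false negatives, so a miss can be returned immediately; a hit may be a false positive, which is why the exact index is consulted afterwards.

```typescript
class BloomFilter {
  private readonly bits: Uint8Array;
  private readonly sizeBits: number;
  private readonly hashCount: number;

  constructor(sizeBits: number, hashCount: number) {
    this.sizeBits = sizeBits;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(sizeBits / 8));
  }

  // The key is already a SHA-256 hex digest (at least 16 hex chars), so two
  // of its 32-bit words serve as base hashes for double hashing.
  private positions(hashHex: string): number[] {
    const h1 = parseInt(hashHex.slice(0, 8), 16);
    const h2 = parseInt(hashHex.slice(8, 16), 16);
    return Array.from(
      { length: this.hashCount },
      (_, i) => (h1 + i * h2) % this.sizeBits,
    );
  }

  add(hashHex: string): void {
    for (const p of this.positions(hashHex)) {
      this.bits[p >> 3] |= 1 << (p & 7);
    }
  }

  // false = definitely absent; true = possibly present.
  mightContain(hashHex: string): boolean {
    return this.positions(hashHex).every(
      (p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0,
    );
  }
}

// Stage 1: cheap in-memory rejection. Stage 2: exact confirmation.
// `exactIndex` maps a hash to its tier and stands in for the database.
function lookup(
  hashHex: string,
  bloom: BloomFilter,
  exactIndex: Map<string, string>,
): string | null {
  if (!bloom.mightContain(hashHex)) return null;
  return exactIndex.get(hashHex) ?? null;
}
```

The filter keeps most non-matching queries away from the database, while the exact check ensures that a Bloom false positive is never reported as a match.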
Data sources
Every entry in KeyAudit links back to its public source. The full list is at /en/source with per-source coverage statistics. Major contributors include the CryptoScamDB phishing corpus, the OFAC SDN list, ScamSniffer indicator feeds, Vasek et al.'s 2014 brain-wallet study, and addresses derived from SecLists' top-1000 password dump. We index nothing that isn't already public.
Limits and what we do not claim
A dict_derived hit is not evidence of theft. It means the input you queried derives to an address an attacker could trivially recompute from a common wordlist or password dump. If your wallet shows a dict_derived match, the prudent response is to migrate funds to a hardware-wallet-generated BIP-39 seed, but no third party necessarily knows your specific phrase.
Conversely, a clean lookup is not a guarantee of safety. Targeted attacks (SIM swaps, malware key exfiltration, supply-chain compromise) leave no trace in dictionary or research datasets. KeyAudit catches commodity-grade key compromise, not bespoke attacks.
We do not run on-chain transaction-graph analysis (Chainalysis territory). We do not track wallet activity over time. We do not deanonymize. Every dataset we index is already public.
Update cadence
The index is refreshed on a rolling basis as upstream sources publish updates: sanctions lists weekly, scam-address feeds daily, academic corpora when new research lands. Aggregate statistics on /en/stats are recomputed every six hours from the live database.
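The fixed intervals above could be expressed as a small schedule table. The job names, the `REFRESH_INTERVAL_MS` constant, and the `isDue` helper are hypothetical and only encode the cadences stated in the text; academic corpora are left out because they are refreshed when new research appears rather than on a timer.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Intervals taken from the stated cadence; the job names are illustrative.
const REFRESH_INTERVAL_MS = {
  sanctions_lists: 7 * 24 * HOUR_MS, // weekly
  scam_address_feeds: 24 * HOUR_MS, // daily
  aggregate_stats: 6 * HOUR_MS, // every six hours
} as const;

type RefreshJob = keyof typeof REFRESH_INTERVAL_MS;

// True once at least one full interval has elapsed since the last run.
function isDue(job: RefreshJob, lastRunMs: number, nowMs: number): boolean {
  return nowMs - lastRunMs >= REFRESH_INTERVAL_MS[job];
}
```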