Ransomware leak sites are the public face of double-extortion. They are also one of the richest, most underused open-source intelligence (OSINT) feeds in cybercrime research. Every fresh victim listing on a Tor-hosted “wall of shame” gives investigators a starting thread: a sector, a domain, a country, a sample, a deadline. Pull on it carefully and you can map an entire campaign: affiliates, tooling, initial-access broker pipelines, even the operator’s working hours. Pull on it carelessly and you tip off the threat actor, contaminate evidence, or worse, expose a victim that hasn’t disclosed yet.
This walkthrough is the methodology our research team uses behind Ransomtracker, the leak-site monitoring system that powers the victim feed across this site. It is written for SOC analysts, threat intel teams, journalists, and incident responders who need a repeatable, defensible workflow for ransomware leak-site OSINT in 2026. No bash heroics and no zero-day onion scrapers: just a clean process, the right sources, and the discipline to verify before you publish.
The investigation workflow at a glance
Step 1: spot, your sources matter more than your speed
The first job is detection. In 2026 there are roughly 80 active ransomware data-leak sites tracked across the public threat-intelligence community, and the half-life of any individual site is short: operators rotate onion addresses, mirrors go dark for hours at a time, and the busiest groups (LockBit successors, Akira, BlackCat lineage variants, Play, Qilin, RansomHub-class operators) publish on multiple mirrors simultaneously to defeat takedowns. A serious OSINT pipeline subscribes to at least three independent feeds: your own Tor crawler; a community feed such as ransomware.live or the open mirror lists maintained by independent researchers; and a commercial or curated feed like Ransomtracker that handles the onion plumbing and presents normalised records.
Why three? Because every feed misses something. Crawlers fail, classifiers mis-tag re-listings as new victims, and operators sometimes publish to invite-only side channels (Telegram, XMPP, or, increasingly, custom forum software hosted behind access-controlled darknet markets) before the public wall. If you only have one source, you don’t have triangulation; you have a single point of failure.
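To make the triangulation idea concrete, here is a minimal sketch that merges records from three feeds and shows which listings only one source has seen. The feed names, the record shape (`group`, `victim`), and the list of legal suffixes stripped during normalisation are all assumptions for illustration, not the Ransomtracker schema.

```python
import re
from collections import defaultdict

# Illustrative suffix list only; extend it for the jurisdictions you cover.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "gmbh", "corp"}

def normalise(name: str) -> str:
    """Lower-case, drop punctuation and common legal suffixes so feeds agree on a key."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(t for t in cleaned.split() if t not in LEGAL_SUFFIXES)

def triangulate(feeds: dict) -> dict:
    """Map (group, normalised victim) to the set of feed names that reported it."""
    seen = defaultdict(set)
    for feed_name, records in feeds.items():
        for rec in records:
            key = (rec["group"].lower(), normalise(rec["victim"]))
            seen[key].add(feed_name)
    return dict(seen)

# Hypothetical sample data standing in for three independent feeds.
feeds = {
    "own_crawler": [{"group": "Akira", "victim": "Acme Corp."}],
    "community": [
        {"group": "akira", "victim": "ACME Corp"},
        {"group": "Play", "victim": "Globex Ltd"},
    ],
    "curated": [{"group": "Akira", "victim": "Acme Corp"}],
}

coverage = triangulate(feeds)
# Listings seen by exactly one feed deserve a manual check before you rely on them.
single_source = [key for key, sources in coverage.items() if len(sources) == 1]
```

A listing that only one feed reports is not necessarily wrong, but it is the one to verify by hand first.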
Step 2: capture, evidence first, analysis second
When you spot a new listing, the next move is to preserve it. Threat actors edit posts. They retract victim names after ransoms are paid, switch deadlines, swap samples, and occasionally remove entries because the victim turned out to be a customer of a partner operation. If you analyse without capturing first, your analysis can be invalidated by a refresh.
Capture should include: a full-page screenshot (PNG, not just the visible viewport); the raw HTML; the URL, file name, and stated size of every linked file, plus its SHA-256 hash wherever the operator publishes one, even if you don’t download the file itself; the onion address and the path; an ISO-8601 timestamp from a synchronised source; and a hash of the screenshot bundle so you can prove it hasn’t been edited later. Store all of this in a write-once location. An object store with bucket-level immutability is ideal, but even a Git repository with signed commits is workable. The point is that your “ground truth” must outlive the leak site itself.
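The capture record can be sketched as a small manifest whose own hash makes later edits detectable. This is one possible layout under stated assumptions: the field names, the choice to hash a canonical JSON form of the manifest as the “bundle” hash, and the example onion URL are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def _canonical(body: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte form to hash.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")

def build_manifest(onion_url: str, screenshot: bytes, raw_html: bytes,
                   captured_at: datetime) -> dict:
    """Describe one capture; bundle_sha256 covers every other field."""
    manifest = {
        "onion_url": onion_url,
        "captured_at": captured_at.astimezone(timezone.utc).isoformat(),
        "screenshot_sha256": sha256_hex(screenshot),
        "html_sha256": sha256_hex(raw_html),
    }
    manifest["bundle_sha256"] = sha256_hex(_canonical(manifest))
    return manifest

def verify_manifest(manifest: dict) -> bool:
    """Recompute the bundle hash and compare it with the stored one."""
    body = {k: v for k, v in manifest.items() if k != "bundle_sha256"}
    return sha256_hex(_canonical(body)) == manifest["bundle_sha256"]

# Placeholder bytes stand in for the real PNG and HTML captured over Tor.
manifest = build_manifest(
    "http://exampleonionaddress.onion/post/123",
    b"fake-png-bytes",
    b"<html>listing</html>",
    datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc),
)
```

The manifest hash only proves internal consistency. It is the write-once store that proves the manifest itself was not replaced.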
Operational security for the capture step is its own discipline. Use a dedicated investigative browser profile on a hardened VM. Route through Tor (never through your corporate proxy; referer leaks are real). Disable JavaScript by default and only enable it on a per-page basis if the listing requires it. Never log in to anything; never submit a form; never download a sample directly to your investigative machine. Privacy hygiene on the investigator side is not optional: operators do fingerprint visitors, and a sloppy SOC analyst has leaked organisation details to threat actors more than once.
Step 3: attribute, who is the victim, really?
This is where many investigations go off the rails. The leak post says “Acme Corp”, but is it the global parent, a regional subsidiary, a franchisee, a customer of Acme Corp whose data Acme processed, or a completely different company that shares a similar name? Mis-attribution is the single most common error in public leak-site reporting, and the cost is real: lost trust with sources, lawsuit exposure, and contaminated downstream feeds.
Build the attribution by triangulating these signals: the entity name spelled exactly as on the post; any logos visible in screenshots; sample file paths (employee names in directory trees, internal hostnames in screenshots, document footers); domain mentions in the post body or in sample files; and the size of the dump in bytes (operators often display this, and a 4 TB dump is unlikely to come from a 12-employee subsidiary). Then validate the inferred entity against public records: corporate registries (Companies House in the UK, OpenCorporates internationally, EDGAR for US public filers), domain WHOIS records (where unmasked), and the entity’s own website for branding, sector, and headcount.
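As a rough sketch of how those signals can be combined, the snippet below counts which signals support each candidate entity and ranks the candidates. The candidate data, the `.example` domains, and especially the bytes-per-employee ceiling are hypothetical. The ranking is a prompt for the registry checks, not a substitute for them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    legal_name: str
    domains: frozenset
    headcount: int

# Hypothetical ceiling used only for illustration; tune it against your own case history.
MAX_BYTES_PER_EMPLOYEE = 50 * 10**9

def supporting_signals(c: Candidate, post_name: str, sample_domains: set,
                       dump_bytes: int) -> list:
    """Return the names of the signals that are consistent with this candidate."""
    signals = []
    if post_name.strip().casefold() in c.legal_name.casefold():
        signals.append("name")
    if c.domains & sample_domains:
        signals.append("domain")
    if dump_bytes / max(c.headcount, 1) <= MAX_BYTES_PER_EMPLOYEE:
        signals.append("size_plausible")
    return signals

candidates = [
    Candidate("Acme Manufacturing Ltd", frozenset({"acme-uk.example"}), 12),
    Candidate("Acme Manufacturing GmbH", frozenset({"acme-mfg.example"}), 340),
]

# Domains seen in sample files and the dump size shown on the post (4 TB).
observed_domains = {"acme-mfg.example"}
observed_bytes = 4 * 10**12

ranked = sorted(
    candidates,
    key=lambda c: len(supporting_signals(c, "Acme Manufacturing",
                                         observed_domains, observed_bytes)),
    reverse=True,
)
```

Recording which signals fired, rather than only the final pick, keeps the attribution defensible when someone challenges it later.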
Pay particular attention to victims that are managed service providers or law firms. A single MSP compromise can leak data belonging to dozens of downstream clients, and the leak-site listing will often name only the MSP. Your responsibility as an investigator includes flagging “downstream exposure likely” when the victim profile makes onward impact plausible.
Step 4: verify, sort real victims from re-listings and fakes
Not every entry on a leak site is a fresh, genuine compromise. There are four common classes of listing that every investigator must learn to tell apart: (a) re-listings, where one affiliate re-posts data from a previous breach claimed by another group, sometimes years old; (b) fakes, where a low-skill operator posts samples generated from a customer-side data exposure (e.g. an open S3 bucket) rather than an actual network intrusion; (c) re-victimisation, where the same target is compromised twice and the second crew genuinely holds new data, which makes it a true positive that is easily mistaken for a re-listing; and (d) staging posts, where operators publish a placeholder with no samples to pressure the victim into negotiation.
Verification techniques include hashing the leaked samples and searching prior breach databases for collisions, looking at file modification times inside the dump (a “fresh” breach with files all dated 2021 should ring alarms), checking whether the same archive structure has appeared on other groups’ sites, and corroborating with the victim’s own disclosures or industry reporting. Stealer-log databases can be cross-referenced too: if a victim already had infostealer-derived credentials circulating before the supposed intrusion, the path-of-entry story becomes more plausible, but it also raises the bar for novelty in the leak claim itself.
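Two of these checks, hash collisions against prior breaches and stale file timestamps, are easy to automate as a first pass. The sketch below assumes you already hold sample hashes and file modification times as metadata. The one-year staleness threshold and the flag names are illustrative choices, not an established standard.

```python
from datetime import datetime, timedelta, timezone

def relisting_flags(sample_hashes: set, known_hashes: set, file_mtimes: list,
                    listed_at: datetime, stale_after_days: int = 365) -> list:
    """Return warning flags for a listing; an empty list means nothing obvious."""
    flags = []
    overlap = sample_hashes & known_hashes
    if overlap:
        flags.append(f"hash_collision:{len(overlap)}")
    if not file_mtimes:
        # No samples at all is typical of a staging post.
        flags.append("no_samples")
    elif listed_at - max(file_mtimes) > timedelta(days=stale_after_days):
        # Even the newest file is far older than the listing itself.
        flags.append("stale_timestamps")
    return flags

utc = timezone.utc
listed = datetime(2026, 3, 1, tzinfo=utc)

old_dump = relisting_flags(
    {"aa", "bb"}, {"bb", "cc"},
    [datetime(2021, 5, 1, tzinfo=utc), datetime(2021, 6, 1, tzinfo=utc)],
    listed,
)
fresh_dump = relisting_flags({"dd"}, {"bb", "cc"}, [datetime(2026, 2, 10, tzinfo=utc)], listed)
placeholder = relisting_flags(set(), {"bb", "cc"}, [], listed)
```

A flag is a reason to look closer, not a verdict: genuine re-victimisation can trip the collision check if the second crew reuses older files as filler.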
Step 5: enrich, connect this victim to the broader campaign
One victim is a data point. Multiple victims, grouped by tooling, affiliate ID, ransom-note structure, or initial-access vector, are a campaign, and campaigns are what defenders need to act on. Enrichment is where you stop describing the listing and start contextualising it: which affiliate likely ran the operation, which initial-access broker likely sold the foothold (cross-reference our 2026 IAB supply-chain analysis), which TTPs match prior incidents.
Useful enrichment sources include MISP communities for the IOCs you can extract from any leaked sample (don’t open the file; work from metadata), VirusTotal for any binary fingerprints visible in screenshots, ATT&CK mappings against the operator’s known playbook, and your own historical victim database. Affiliate-ID extraction from ransom notes is increasingly important in 2026 because RaaS programs publish unique IDs in note headers and these survive across operator rebrands. A known affiliate that moved from a defunct group to a successor program is a strong signal for defenders.
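Affiliate-ID clustering can be sketched in a few lines once you know what the header looks like. The `Affiliate-ID:` header format used here is entirely hypothetical. Real ransom notes differ from programme to programme, so each one needs its own pattern.

```python
import re
from collections import defaultdict

# Hypothetical header format for illustration; real notes need per-programme patterns.
AFFILIATE_RE = re.compile(
    r"^affiliate[-_ ]?id\s*[:=]\s*([A-Za-z0-9]{6,32})\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def extract_affiliate_id(note_text: str):
    """Return the lower-cased affiliate ID from a note, or None if no header matches."""
    match = AFFILIATE_RE.search(note_text)
    return match.group(1).lower() if match else None

def group_by_affiliate(notes: dict) -> dict:
    """Cluster case IDs by the affiliate ID found in each ransom note."""
    clusters = defaultdict(list)
    for case_id, text in notes.items():
        affiliate = extract_affiliate_id(text)
        if affiliate:
            clusters[affiliate].append(case_id)
    return dict(clusters)

# Made-up note text; only the header line matters to the pattern.
notes = {
    "case-001": "Your files are encrypted.\nAffiliate-ID: a1b2c3d4\nContact us.",
    "case-002": "AFFILIATE_ID=A1B2C3D4\n",
    "case-003": "A note with no recognisable header.",
}
clusters = group_by_affiliate(notes)
```

Keep the raw note alongside the extracted ID, so a revised pattern can be re-run over old cases.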
Step 6: context, the sector, the trend, the bigger picture
Once you’ve enriched the individual record, zoom out. Is this victim’s sector seeing elevated targeting this quarter? Is this group’s volume up or down compared to the trailing twelve weeks? Are similar victims (size, geography, sector) being hit by the same affiliate? Context is what turns an individual incident report into actionable threat intelligence for stakeholders who don’t share the victim’s sector.
This is where a curated, longitudinal dataset like Ransomtracker pays off. You can answer questions like “how many UK manufacturing victims has this group claimed in the last 90 days” without rebuilding your own corpus from scratch every time. If you’re running your own crawler, build the queryable index from day one: flat JSON dumps in a folder are not a research dataset; they’re a graveyard.
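To show what “queryable from day one” can mean in practice, here is a minimal sketch of the 90-day question as a single indexed query. It uses an in-memory SQLite database as a stand-in for whatever store you run. The table layout, column names, and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE listings (
           group_name TEXT NOT NULL,
           victim     TEXT NOT NULL,
           sector     TEXT,
           country    TEXT,
           first_seen TEXT NOT NULL  -- ISO-8601 date, so string comparison sorts correctly
       )"""
)
conn.execute(
    "CREATE INDEX idx_listings_lookup ON listings (group_name, country, sector, first_seen)"
)

# Hypothetical rows; real records would come from your intake pipeline.
rows = [
    ("examplegroup", "Victim A", "manufacturing", "GB", "2026-03-10"),
    ("examplegroup", "Victim B", "manufacturing", "GB", "2026-01-05"),
    ("examplegroup", "Victim C", "manufacturing", "GB", "2025-11-20"),
    ("examplegroup", "Victim D", "healthcare", "GB", "2026-02-01"),
    ("othergroup", "Victim E", "manufacturing", "GB", "2026-03-01"),
]
conn.executemany("INSERT INTO listings VALUES (?, ?, ?, ?, ?)", rows)

def count_victims(conn, group: str, country: str, sector: str,
                  as_of: str, days: int = 90) -> int:
    """Count listings for one group, country, and sector in the trailing window."""
    query = """SELECT COUNT(*) FROM listings
               WHERE group_name = ? AND country = ? AND sector = ?
                 AND first_seen > date(?, ?) AND first_seen <= ?"""
    params = (group, country, sector, as_of, f"-{days} days", as_of)
    return conn.execute(query, params).fetchone()[0]

uk_manufacturing = count_victims(conn, "examplegroup", "GB", "manufacturing", "2026-03-31")
```

The same query shape carries over to PostgreSQL with its own date arithmetic in place of SQLite’s `date()` modifier.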
Step 7: report, TLP discipline and the responsible disclosure question
Now the hard part: what do you do with what you’ve found? The default for any ransomware leak-site finding is TLP:AMBER until you have explicit reason to broaden it. The reasoning is straightforward: victims often haven’t disclosed publicly yet, and a SOC analyst’s tweet has more than once been the moment a regulator first learned a regulated entity was breached. That is not the analyst’s call to make.
The exceptions are narrow. If the victim has already self-disclosed (press release, 8-K filing, customer notification), the incident is already on the public record and the listing can be discussed as public commentary. If the victim is a public infrastructure operator and the leak claims active exposure to downstream services, the public-interest case for broader sharing is stronger but should still flow through a CSIRT or sectoral ISAC rather than a social-media post. When in doubt, share narrowly, log your reasoning, and let the trust-and-safety conversation play out before going public.
Step 8: monitor, the listing is a living document
Don’t close the case at publication. Monitor the listing on a schedule for as long as it remains live. The events that matter post-publication include: removal of the entry (often a strong signal that a ransom was paid, although operators sometimes scrub data for other reasons); deadline extensions; sample additions (operators escalate when negotiations stall); deletion of preview links (could indicate negotiation, could indicate operator panic); and complete data dumps (the worst-case outcome: at this point the data is public for everyone, including secondary threat actors who will package it for resale).
Each of these post-publication events updates your understanding of the case and, in aggregate, of the operator. Track them. Re-screenshot when the entry changes. Log every transition with a timestamp.
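Logging transitions comes down to diffing two snapshots of the same listing. A minimal sketch follows, assuming each snapshot is a dictionary produced by your crawler and that a missing listing is represented as `None`. The tracked field names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical snapshot fields; align them with whatever your crawler records.
TRACKED_FIELDS = ("deadline", "sample_count", "preview_links", "full_dump_published")

def detect_transitions(previous, current, observed_at: datetime) -> list:
    """Compare two snapshots of one listing and return timestamped events."""
    stamp = observed_at.astimezone(timezone.utc).isoformat()
    if previous is not None and current is None:
        return [{"at": stamp, "event": "entry_removed"}]
    if previous is None or current is None:
        return []  # first sighting, or nothing to compare
    events = []
    for name in TRACKED_FIELDS:
        old, new = previous.get(name), current.get(name)
        if old != new:
            events.append({"at": stamp, "event": f"{name}_changed", "old": old, "new": new})
    return events

before = {"deadline": "2026-02-01", "sample_count": 2,
          "preview_links": 3, "full_dump_published": False}
after = {"deadline": "2026-02-08", "sample_count": 4,
         "preview_links": 3, "full_dump_published": False}
now = datetime(2026, 1, 28, 12, 0, tzinfo=timezone.utc)

changes = detect_transitions(before, after, now)
removal = detect_transitions(before, None, now)
```

Each returned event is also the trigger to re-screenshot and store a fresh capture.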
Common OSINT pitfalls (and how to avoid them)
Tooling: what we actually use
The toolkit for ransomware leak-site OSINT in 2026 is mostly open source and very boring, which is how it should be. A typical investigator setup includes Tor Browser on a hardened Linux VM (we prefer disposable Whonix gateway + workstation pairs); a screenshot pipeline that produces full-page PNGs with a SHA-256 sidecar (a small Playwright script over a Tor SOCKS proxy is enough); a write-once evidence store (S3 with object lock, MinIO with WORM bucket policy, or Git with signed commits); a structured intake form so every record has the same fields (group, victim claimed, sector, country, listing URL, listing first seen, sample present yes/no, sample size, payment deadline, status); and a query layer over the dataset (PostgreSQL is overkill until it isn’t; start there).
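The structured intake form can be as simple as a validated record type. The sketch below mirrors the fields listed above. The status vocabulary, the validation rules, and the example values are assumptions added for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional

# Hypothetical status vocabulary; the field is named above but its values are not.
ALLOWED_STATUS = {"live", "removed", "data_published"}

@dataclass(frozen=True)
class IntakeRecord:
    group: str
    victim_claimed: str
    sector: str
    country: str
    listing_url: str
    listing_first_seen: datetime
    sample_present: bool
    sample_size_bytes: Optional[int]
    payment_deadline: Optional[datetime]
    status: str

    def __post_init__(self) -> None:
        # Reject records that would be ambiguous or inconsistent later.
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.listing_first_seen.tzinfo is None:
            raise ValueError("listing_first_seen must be timezone-aware")
        if not self.sample_present and self.sample_size_bytes is not None:
            raise ValueError("sample_size_bytes given but sample_present is False")

record = IntakeRecord(
    group="examplegroup",
    victim_claimed="Acme Manufacturing",
    sector="manufacturing",
    country="DE",
    listing_url="http://exampleonionaddress.onion/post/123",
    listing_first_seen=datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc),
    sample_present=True,
    sample_size_bytes=150_000_000,
    payment_deadline=datetime(2026, 1, 22, 9, 30, tzinfo=timezone.utc),
    status="live",
)
row = asdict(record)  # plain dict, ready to insert into the query layer
```

Rejecting bad records at intake is cheaper than cleaning them out of the dataset months later.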
For enrichment, MISP for IOC sharing, OpenCTI for graph-style campaign mapping, and your own internal incident dataset for collisions are the workhorses. For monitoring across many onion mirrors at scale, you can either run your own crawler (and accept the operational burden: keeping Tor circuits healthy and mirror lists current is a job in itself) or subscribe to a curated feed. Ransomtracker is our own production answer to the curation problem; it gives you the normalised victim feed without forcing you to maintain the onion plumbing yourself.
Putting it into practice: a worked example shape
Imagine you spot a new entry: “Acme Manufacturing, 2.4 TB, deadline 7 days, sample preview.” Step 1 is done: your crawler tagged it. Step 2: snapshot the page, hash everything, store in WORM. Step 3: there are three companies trading as Acme Manufacturing internationally; the leaked sample filenames mention a German legal entity, so you attribute to the German GmbH and note headcount ~340 from the corporate registry. Step 4: the sample file timestamps are from the last six weeks and the archive structure matches three recent victims of the same group, so this looks genuine; no public disclosure yet. Step 5: ransom-note affiliate ID matches an affiliate seen in two prior cases this quarter; one of those used compromised RDP credentials sourced from a known IAB marketplace (see our RDP attack landscape analysis). Step 6: this group’s manufacturing victim count is up 60% quarter-over-quarter; this is a campaign, not an outlier. Step 7: TLP:AMBER report to your sectoral ISAC; no public commentary yet. Step 8: monitor the listing; a sample swap, a deadline extension, or a removal each merits a case update.
That entire cycle, well-practised, takes a single analyst about two hours. The discipline is keeping every step reproducible, every claim sourced, every artifact preserved. Sloppy speed is worse than careful slowness.
Ethical and legal lines
A few lines never get crossed. Don’t download the dump: possession of stolen personal data carries legal risk under GDPR, CCPA, and equivalent regimes regardless of investigator intent. Don’t negotiate with the operator on behalf of a victim (that is law enforcement and IR-firm work, with chain-of-custody implications you don’t want to inherit). Don’t name a victim publicly before the victim has done so themselves; even if the listing is “public”, your retweet is the moment thousands of people see it. Don’t share data with anyone in a country where its possession is criminal under sanctions or data-protection law.
The ethical posture for ransomware OSINT in 2026 is the same as for any sensitive intelligence: minimise harm, maximise utility for defenders, and never become an unwitting amplifier for the operator’s pressure campaign. That last point is subtle: operators publish to coerce, and republishing without analysis adds to their leverage. Add value before you add visibility.
Frequently asked questions
How accurate are public ransomware leak-site trackers?
Public trackers vary widely. Well-maintained, curated trackers achieve 95%+ accuracy on victim identification because they verify with corporate registries before publishing. Raw scrapers often run lower, closer to 80%, because they take operator claims at face value. The gap between “an entry exists” and “a verified breach” matters; treat any single source as a starting point, not a confirmed finding.
Can I investigate leak sites without using Tor directly?
Yes. Most serious investigators consume normalised feeds from a curated tracker rather than browsing onion services directly. This shifts the operational-security burden to the tracker operator and reduces your exposure. Direct browsing is still useful for verification of specific posts and for capturing screenshots, but it should be the exception, not the routine.
What is the difference between TLP:AMBER and TLP:RED for these reports?
TLP:AMBER means limited disclosure on a need-to-know basis within the recipient’s organisation and its clients (TLP:AMBER+STRICT narrows that to the organisation alone). It is appropriate for most early-stage leak-site analysis where the victim hasn’t disclosed yet. TLP:RED is for named recipients only, with no further disclosure, and is appropriate when the data could materially compromise an active investigation or the victim’s negotiation. Most leak-site OSINT lives at TLP:AMBER and graduates to TLP:GREEN or TLP:CLEAR only after the victim has self-disclosed.
How do I tell if a “new” victim listing is actually a re-listing of an older breach?
Hash the sample files visible in the listing and search prior breach databases for collisions. Check the archive structure; re-listings often preserve folder layouts from the original breach. Look at file modification times; a “fresh” breach with two-year-old timestamps is highly suspicious. Cross-reference the victim’s name against your historical victim database for prior incidents. Genuine re-victimisation does occur, and distinguishing it from a re-listing is exactly the skill that separates good investigators from feed-republishers.
Is it legal to view ransomware leak sites?
In most jurisdictions, browsing publicly accessible onion sites is legal. The legal exposure starts with possession, redistribution, or interaction with criminal services. Investigators should consult their organisation’s legal team before establishing a programme; some regulated sectors require explicit authorisation to access these resources at all. Document your authorisation chain in writing before you start.
Closing thought
Ransomware leak-site OSINT is one of the few areas of cybercrime research where the data is genuinely public, genuinely abundant, and genuinely actionable for defenders. The constraint is not access; it is discipline. The same eight-step workflow, run consistently across years and operators, produces a body of evidence that informs everything from sector-specific defensive guidance to international law-enforcement disruptions. The investigators who matter are the ones who built the workflow and kept the artifacts. The ones who chased the dopamine of a fast tweet did not, on average, contribute to the long arc of disruption.
If you’d rather not build the crawler yourself, the same dataset that backs this methodology is live at Ransomtracker: verified, normalised, queryable. If your job intersects with stealer logs as a precursor to ransomware, Stealercheck covers the exposure side of the same supply chain. And if you want the underlying threat actor catalogue, the cybercrime archive is where the group-level deep dives live.
Slow down. Capture before you analyse. Verify before you publish. The rest is repetition.
