Most people who teach themselves OSINT teach themselves the tools first. Maltego, Yandex, Sherlock, Hunchly, fifteen browser extensions. Then they take on a real investigation, hit a wall around hour three, and discover that they have no idea what to do with the information they’ve collected. The wall is process. The tools are the easy part.
This is the workflow I run. It’s adapted from the SANS SEC487 framework, the Hetherington Group intelligence-cycle teaching, and a decade of cleanup work on cases I or someone else got wrong. Five stages: intake, scope, collect, verify, report. The discipline of running every investigation through the same five stages is what produces consistently defensible work.
Stage 1: Intake
Before you open a tool, write down what’s being asked. Specifically:
- The question. One sentence. "Is the named partner of this consultancy a real person with a real footprint?" or "Where was this video filmed?" or "Has this email address been associated with phishing?"
- The requester. Who is paying for the answer or has commissioned the work. For pro-bono journalism this might be your editor, the desk, or yourself.
- The legitimate purpose. What the answer will be used for. "Pre-contract due diligence," "verifying a source for a published story," "incident response on a phishing report."
- The deadline. Rough order of magnitude: hours, days, or weeks.
- The bar. What confidence level is good enough. "I need to be able to publish this" is a different bar from "I need to know whether to bother investigating further."
If any of those five fields is hard to fill in, the investigation isn’t ready to start. Push back to the requester. The most common cause of a bad finding is starting an investigation whose question you misunderstood.
I keep a one-page intake template. Yours should fit on a single screen. Resist the urge to make it longer; the discipline is in the constraint.
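As a rough illustration, the five intake fields can be held in a small record that refuses to call an investigation ready while any field is blank. This is a sketch, not the template described above: the class name `Intake`, its field names, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Intake:
    """One-screen intake record. Field names are illustrative, not a standard."""
    question: str    # one sentence
    requester: str   # who commissioned the work
    purpose: str     # what the answer will be used for
    deadline: str    # rough order of magnitude: hours, days, or weeks
    bar: str         # the confidence level that counts as good enough

    def missing(self) -> list[str]:
        """Return the names of the fields that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def ready(self) -> bool:
        """The investigation is ready to start only when nothing is missing."""
        return not self.missing()


# Hypothetical intake with the confidence bar not yet agreed.
intake = Intake(
    question="Has this email address been associated with phishing?",
    requester="Security desk",
    purpose="Incident response on a phishing report",
    deadline="days",
    bar="",
)
print(intake.ready(), intake.missing())
```

Keeping the record to five plain strings is deliberate: anything that needs nested structure probably belongs in scope or in the report, not in intake.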
Stage 2: Scope
Scope is the stage where you decide what’s in and what’s out. It’s the protection against scope creep, against accidentally investigating the wrong person, and against the most common failure mode: collecting too much information and drowning in it.
Three sub-questions:
- Who or what is the target? Specific, with identifiers. "John Smith, born approximately 1975, named as a director of Acme Consulting Ltd, UK Companies House number 12345678." Not "the John Smith from that consultancy."
- Who or what is explicitly out of scope? Family members, business partners, prior employers’ staff. Anyone whose information you don’t have a legitimate purpose to investigate, even if they show up in your collection.
- What sources are in scope? Public records, social media, professional registries, court filings, press archives. What’s not in scope: scraping behind login walls you don’t have authorisation for, paid people-search aggregators that source from breach data, anything that requires impersonation or pretexting.
Write this down. The scope statement is what you reach for when, three days into the investigation, you find a relative of the target who looks interesting and you have to decide whether to follow that thread. (The answer is almost always no.)
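A written scope statement can also be made checkable. The sketch below assumes you tag each new thread with the subject's relationship to the target and the kind of source it came from; the class `Scope`, the method `decide`, and the label strings are hypothetical, and anything the scope statement does not mention is treated as a no.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Written scope statement. All names and labels here are illustrative."""
    target: str                                         # specific, with identifiers
    out_of_scope: set[str] = field(default_factory=set)  # relationships to the target
    sources: set[str] = field(default_factory=set)       # source types you may use

    def decide(self, relationship: str, source: str) -> str:
        """Return 'follow' only for the target reached through an in-scope source."""
        if source not in self.sources:
            return "skip: source out of scope"
        if relationship in self.out_of_scope:
            return "skip: subject out of scope"
        if relationship != "target":
            return "skip: subject not covered by scope"
        return "follow"


scope = Scope(
    target="John Smith, director, Acme Consulting Ltd (Companies House 12345678)",
    out_of_scope={"family member", "business partner", "prior employer's staff"},
    sources={"public records", "social media", "professional registries",
             "court filings", "press archives"},
)

# Three days in, an interesting relative turns up on social media.
print(scope.decide("family member", "social media"))
print(scope.decide("target", "court filings"))
```

The value is less in the code than in being forced to name the relationship and the source before deciding whether to follow the thread.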
Stage 3: Collect
This is where the tools earn their keep. The collection stage is straightforward in the abstract: run your queries, capture what you find, log everything. The discipline is in the capture.
A few non-negotiables:
- Use a capture tool. Hunchly is the standard. It runs in your browser and silently archives every page you visit while a case is open, recording a hash and a timestamp for each page and keeping the whole archive searchable locally. If you can’t afford Hunchly, the Wayback Machine browser extension at least gets you a public archive URL for everything you save. Doing this manually with screenshots is for amateurs.
- One case folder per investigation. Date-prefixed, structured: raw captures, exports from automated tools, your notes, your timeline, your draft report. The folder is the audit trail.
- Log every query. What you searched, where, when, and what you got. Future-you (and future-your-editor or future-your-court) will need to retrace the investigation. Make that easy.
- Score your confidence as you go. When you record a finding, write next to it: "high / medium / low confidence" with a one-line reason. Don’t promote a low-confidence lead to a high-confidence finding without a verification step.
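The folder, the query log, and the confidence score can be sketched with the Python standard library alone. This does not replace a capture tool; it only shows the shape of the audit trail. The subfolder names, the CSV columns, the case name, and the date are assumptions made for the example.

```python
import csv
import hashlib
import tempfile
from datetime import date, datetime, timezone
from pathlib import Path

# Illustrative layout: one subfolder per kind of artefact.
SUBFOLDERS = ["raw_captures", "tool_exports", "notes", "timeline", "report"]
LOG_COLUMNS = ["utc_time", "where", "query", "result", "confidence", "reason"]


def create_case(root: Path, name: str, opened: date) -> Path:
    """Create a date-prefixed case folder with the standard subfolders."""
    case = root / f"{opened.isoformat()}_{name}"
    for sub in SUBFOLDERS:
        (case / sub).mkdir(parents=True, exist_ok=True)
    return case


def sha256_of(path: Path) -> str:
    """Hash a saved capture so later changes to the file are detectable."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def log_query(case: Path, where: str, query: str, result: str,
              confidence: str, reason: str) -> None:
    """Append one row: what you searched, where, when, and what you got."""
    if confidence not in {"high", "medium", "low"}:
        raise ValueError(f"unknown confidence level: {confidence}")
    log = case / "notes" / "query_log.csv"
    is_new = not log.exists()
    with log.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(LOG_COLUMNS)
        stamp = datetime.now(timezone.utc).isoformat()
        writer.writerow([stamp, where, query, result, confidence, reason])


# Hypothetical case in a throwaway directory.
case = create_case(Path(tempfile.mkdtemp()), "acme-director", date(2024, 1, 15))
capture = case / "raw_captures" / "officers.html"
capture.write_text("<html>placeholder capture</html>", encoding="utf-8")
log_query(case, "Companies House", "Acme Consulting Ltd officers",
          f"saved {capture.name}, sha256 {sha256_of(capture)}",
          "medium", "single registry entry, not yet cross-checked")
```

An append-only CSV is a deliberately plain choice: it opens anywhere, and each row carries its own timestamp and one-line reason for the confidence level.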
If you’re collecting information about identifiable people, this is also the stage at which the data-protection regime in your jurisdiction starts to apply. In the EU, GDPR Article 6 requires a lawful basis for processing personal data; for investigative journalism the usual route is the journalistic exemption that member states implement under Article 85, or legitimate interests under Article 6(1)(f), but you have to be able to articulate it. The intake-stage paperwork is not just bureaucracy; it’s the paper trail that lets you show you operated lawfully.
Stage 4: Verify
Verification is what separates an OSINT analyst from a researcher with a Pinterest board of suspicious-looking facts. Every claim that’s going to make it into the final report has to clear a verification bar.
The bar I use:
- Two-source rule for any factual claim that names a person. Two independent sources, not one source quoted twice. "LinkedIn says X and the company website confirms X" is one source: the same person controls both surfaces.
- Provenance for every image, video, and document. Where did it come from, when, who first published it, has it been modified. Run images through reverse image search. Run videos through InVID. Run documents through whatever metadata extractor your toolchain favours.
- Hostile review. Take fifteen minutes at the end of collection and try to disprove your own finding. What’s the strongest counter-argument? What evidence would you expect to see if you were wrong, that you haven’t looked for? Cases die at this stage when they should die.
- Second analyst on important findings. Have someone else repeat the critical step independently, without telling them the answer first. If they reach a different conclusion you need to understand why.
Verification is the stage that’s most often skipped by analysts under deadline pressure. It’s also the stage whose absence is most obvious in retrospect when an investigation goes wrong. If you have to cut a corner, cut it somewhere else.
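The two-source rule is easy to get wrong by counting URLs instead of independent origins. A minimal sketch, assuming independence can be approximated by who controls each source: the `Source` class, the `controller` field, and the sample sources are hypothetical, and real independence still needs an analyst's judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str        # for example "LinkedIn profile"
    controller: str  # who decides what this source says


def independent_count(sources: list[Source]) -> int:
    """Count distinct controllers, not the number of pages or URLs."""
    return len({s.controller for s in sources})


def passes_two_source_rule(sources: list[Source]) -> bool:
    """A claim that names a person needs at least two independent sources."""
    return independent_count(sources) >= 2


# Both surfaces are controlled by the same person, so this is one source.
self_published = [
    Source("LinkedIn profile", controller="subject"),
    Source("Company website", controller="subject"),
]
print(passes_two_source_rule(self_published))

# Adding a source the subject does not control clears the bar.
with_court_record = self_published + [Source("Court filing", controller="court")]
print(passes_two_source_rule(with_court_record))
```

Recording the controller next to each source at collection time makes this check nearly free when verification comes round.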
Stage 5: Report
The report is the deliverable. Its job is to communicate the answer, the confidence level, the evidence, and the limitations to an audience that wasn’t in the investigation with you. A good report can be read six months later by a stranger, who should be able to reach the same conclusion you did and identify the same weaknesses.
A working template:
- Question. One sentence. The same one from intake.
- Bottom line. The answer, with confidence level. "Yes, with high confidence." "No, with medium confidence; the strongest counter-evidence is X."
- Key findings. Three to seven, each with: the finding, the supporting evidence, the confidence level, the limitations.
- Methodology. What sources you used, what you didn’t, why.
- Timeline. What was done when. (Hunchly will produce this for you.)
- Limitations. What you couldn’t find out, where the evidence runs thin, what assumptions you’ve made.
- Annexes. Captures, exports, raw data.
The report should not contain new information that wasn’t in the investigation. If you find yourself adding a finding while writing the report, it goes back to verification first.
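The rule that nothing unverified reaches the report can be enforced at the point where the report is assembled. The sketch below renders part of the template above as plain text and refuses any finding that has not cleared verification. The names `Finding` and `render_report` and the `verified` flag are assumptions, and the timeline and annexes are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    statement: str
    evidence: str
    confidence: str   # "high", "medium", or "low"
    limitations: str
    verified: bool    # has this finding cleared the verification stage?


def render_report(question: str, bottom_line: str, findings: list[Finding],
                  methodology: str, limitations: str) -> str:
    """Build a plain-text report, rejecting any unverified finding."""
    unverified = [f.statement for f in findings if not f.verified]
    if unverified:
        raise ValueError(f"send back to verification first: {unverified}")
    lines = [f"Question: {question}", f"Bottom line: {bottom_line}", "Key findings:"]
    for number, f in enumerate(findings, start=1):
        lines.append(f"  {number}. {f.statement} [{f.confidence} confidence]")
        lines.append(f"     Evidence: {f.evidence}")
        lines.append(f"     Limitations: {f.limitations}")
    lines.append(f"Methodology: {methodology}")
    lines.append(f"Limitations: {limitations}")
    return "\n".join(lines)


# Hypothetical finding for illustration only.
report = render_report(
    question="Is the named partner of this consultancy a real person?",
    bottom_line="Yes, with high confidence.",
    findings=[Finding("Named as a director in the company registry",
                      "Registry capture in annex A", "high",
                      "Registry entries are self-declared", verified=True)],
    methodology="Public records and press archives; no paid aggregators.",
    limitations="Date of birth is approximate.",
)
print(report)
```

Raising an error rather than silently dropping the finding matters: a late addition should send you back to verification, not disappear from the record.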
The trap
The trap is that none of this is glamorous. The five-stage workflow doesn’t produce TikTok-worthy moments. It produces investigations that are defensible, repeatable, and useful to the people who paid for them. The analysts who skip the workflow can move faster on easy cases and fall apart on hard ones. The analysts who run the workflow look slow at first and turn out to be the ones still working in the field five years later.
Pick one investigation in the next month. Run it through the five stages, with the paperwork, even if it’s a tiny case. The discipline transfers. By the third investigation it’ll feel automatic. By the tenth, you’ll wonder how anyone works any other way.
Further reading
- SANS SEC487 syllabus
- The Intelligence Cycle (Office of the Director of National Intelligence)
- First Draft / Shorenstein verification handbook
- Bellingcat’s case studies: read three and reverse-engineer the workflow
