Resources
This is a list of links, tools and resources that can be useful for doing investigative and other stories. They have been sourced from the Global Investigative Journalism Network, Google, OSINT, International Consortium of Investigative Journalists, Knight Foundation, GitHub, Organized Crime and Corruption Reporting Project, Polis at London School of Economics, JournalismUK, Investigative Reporters and Editors and other sites.
If you have other resources you think could be useful or if any of these links do not work, please let me know: Bill Birnbauer
Google Tools
Various tools that empower journalists to do their work more efficiently, creatively, and securely: https://journaliststudio.google.com
Digital journalism tools: https://newsinitiative.withgoogle.com/resources/strengthen-digital-journalism/
Notebook
NotebookLM is great if you have information in many different formats, such as PDFs, audio, videos, slides, images and websites. Upload them and NotebookLM will produce useful summaries of the data in writing, audio or video - you choose. I do not support publishing stories written by AI, but NotebookLM is useful for researching and understanding vast numbers of documents that otherwise may take weeks to digest.
Getting started: https://blog.google/innovation-and-ai/products/notebooklm-beginner-tips/
Pinpoint
https://journaliststudio.google.com/pinpoint/about/
Pinpoint is designed specifically for journalists and requires you to complete a form stating that you are a journalist. It is a research tool to help explore and analyse large collections of documents. Using Pinpoint you can upload and search hundreds of thousands of documents, images, emails, hand-written notes, and audio files for specific words or phrases, locations, organisations, and people. It's great for transcribing audio and video into searchable text. It can cope with 200,000 documents, videos and audio files, and summarise and analyse them.
Getting started: https://www.youtube.com/watch?v=Hn9xgSMxawg&t=17s
Factcheck Explorer
https://toolbox.google.com/factcheck/explorer/search/list:recent;hl=en
This is a search engine designed for journalists, researchers, and the public to verify information by finding fact checks from independent organisations worldwide. It allows searching by keywords, topics, or image uploads to quickly identify debunked claims, providing ratings and source links for recent and accurate information.
Gemini
Click on 'Tools' then 'Deep research' – great place to start an investigation.
Dorking
Google Dorking, or Google Hacking, refers to advanced searching on Google to reach information that standard searches do not surface. It involves writing precise search queries to locate information that is hidden or not properly secured, and to gather the detailed information that is crucial for investigative reporting. Dorking is not illegal.
Google Dorking involves using commands to make your search precise, for example:
site: site:example.com "confidential" finds pages on example.com that contain the word "confidential".
filetype: filetype:pdf "financial report" finds PDF files that include the phrase "financial report".
inurl: inurl:admin finds pages with "admin" in the URL, often related to administrative sections of websites.
intitle: intitle:"index of" finds directories and lists of files.
AROUND(X): produces a proximity search between two terms. Searching for "Goldman Sachs" AROUND(5) fraud shows all of the times "Goldman Sachs" appeared within five words of "fraud" in a Google search.
Adding a "-" character before a search operator will exclude the indicated operator from the results. For example, use -filetype:pdf if you'd rather Google not return direct links to PDF files.
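The operators above can be combined into a single query. The following is a minimal Python sketch of how that might be done, not an official Google tool: the helper names (exact, around, build_query, search_url) are invented for illustration, and the only outside fact assumed is that Google accepts a query in the q parameter of https://www.google.com/search.

```python
from urllib.parse import urlencode


def exact(phrase: str) -> str:
    """Wrap a phrase in double quotes so it is searched as an exact match."""
    return f'"{phrase}"'


def around(left: str, right: str, distance: int) -> str:
    """Proximity search: `left` within `distance` words of `right`."""
    return f"{left} AROUND({distance}) {right}"


def build_query(*terms, site=None, filetype=None, inurl=None,
                intitle=None, exclude=()):
    """Join operators and free-text terms into one search string."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if intitle:
        parts.append(f"intitle:{intitle}")
    parts.extend(terms)
    # A leading "-" excludes whatever follows it, e.g. -filetype:pdf
    parts.extend(f"-{item}" for item in exclude)
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a Google search URL with the query percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# The examples from the list above, rebuilt with the helpers.
print(build_query(exact("confidential"), site="example.com"))
print(build_query(exact("financial report"), filetype="pdf"))
print(build_query(intitle=exact("index of")))
print(around(exact("Goldman Sachs"), "fraud", 5))
print(build_query("budget", site="example.com", exclude=["filetype:pdf"]))
```

Keeping the query as plain text until the last step means the same string can be pasted into the search box or turned into a shareable link with search_url.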
Tips: https://www.imperva.com/learn/application-security/google-dorking-hacking/
SynthID
https://deepmind.google/models/synthid/
SynthID adds an invisible digital watermark to an AI-generated image (or video segment). Gemini will check for a SynthID watermark and let you know if it finds one.
OSINT
https://github.com/jivoi/awesome-osint
Huge number of resources and tools for open-source intelligence: things like searching the hidden web, social media, blogs, forums and heaps more specialised searches including username checks, image searches & analysis, geospatial research etc.
Columbia Journalism Review article on OSINT with more tools and how to use them: https://www.cjr.org/tow_center_reports/guide-to-osint-and-hostile-communities.php
Bellingcat tools
Toolkit includes satellite and mapping services, tools for verifying photos and videos, websites to archive web pages, and much more. Most of the tools that are included can be used for free.
https://bellingcat.gitbook.io/toolkit
Toolkit in CSV format:
https://github.com/bellingcat/toolkit/releases/tag/csv
More tools available, some free others paid for, on this site: https://indicator.media
Who posted what? Check who posted on Facebook: it lets you search for keywords on specific dates. https://whopostedwhat.com
AI
AI reporting and tips from GIJN: https://gijn.org/resource/tech-focus-project/?mc_cid=6c44e1b7cf&mc_eid=b2e3804566
More on AI: https://gijn.org/resource/tech-focus-project-investigate-power-using-ai-tech/
Using AI in journalism:
https://gijn.org/stories/new-ai-large-language-model-tools-journalists/
https://gijn.org/stories/top-investigative-journalism-tools-2024/
AI-produced formulas for Excel: https://gptexcel.uk
Creating AI content (see links):
https://docs.google.com/document/d/1gNC2MGTfg7e6pHYPwyY-DvISIwMY7zlW3ehW0Oh9eHo/edit?tab=t.0
Issues with AI: https://docs.google.com/document/d/12S0FN1hrZeWdVE5ekR_OolVZNVuyHRvzDELtrUh8fCw/edit?tab=t.0#heading=h.2f5o6wrqcwhm
Newsroom and AI uses: https://generative-ai-newsroom.com/the-role-of-ai-powered-search-tools-in-modern-newsrooms-fb9639785801
Ten newsletters about AI for journalists:
https://www.journalism.co.uk/ai-newsletters-for-journalists/
Journalism AI Starter Pack: a guide designed to help news organisations learn about the opportunities offered by artificial intelligence (AI) to support their journalism. This guide will be of use to any news organisation approaching AI technologies but it is especially targeted at small and local publishers.
https://docs.google.com/document/d/1pWwbqPERg0bUbMHMbYYDWmFQmWJYvK8N2Dmbenp4Qu0/edit?tab=t.0
Hands-on machine learning for journalists: https://www.journalismai.info/resources/training
Investigating AI images: https://gijn.org/academy/investigate-image-misinformation-artificial-intelligence/?mc_cid=29bfa50bb3&mc_eid=3cae7a9df5
Pangram detects the use of AI and plagiarism and allows four checks a day for free. A monthly subscription applies if you need to do more checks.
Website And Username Checks
Whatsmyname: https://whatsmyname.app
This app searches for a username across nearly 600 different online platforms. It can be useful for backgrounding and finding connections.
Wayback Machine: The nonprofit Internet Archive created the Wayback Machine as a way to build a digital library of internet sites. Started in 1996, the website archives more than a billion web pages every day. This is useful for journalists when searching for information about a subject that may have been deleted.
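Checking for an archived copy can also be scripted. The sketch below is one possible approach, assuming the Internet Archive's public availability endpoint at https://archive.org/wayback/available and its JSON reply containing archived_snapshots and closest. The function names are invented for illustration, and the reply format should be confirmed against the Internet Archive's own documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the Wayback Machine availability check.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the lookup URL; timestamp is optional, formatted YYYYMMDD."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's URL from a JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def fetch_closest_snapshot(page_url, timestamp=None):
    """Ask the service over the network for the closest archived copy."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))


# Example call (needs an internet connection):
# fetch_closest_snapshot("example.com", "20150101")
```

Separating the URL building and reply parsing from the network call makes it possible to check the logic offline before running it against a long list of pages.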
Images
Reverse image searching
To find out whether an image is new or has been published on the internet before, you can upload it to perform a "reverse image search" on Google Images and Bing.
Also try: Yandex; I like TinEye but the image should be a maximum of 10MB; more available here: https://www.clearvoice.com/resources/reverse-image-search-tools/
Use a service such as WolframAlpha to find the weather for the location and time the user claims the picture was taken. If it's January in Times Square and everyone in the Instagram photo is wearing shorts, but the historical weather data says it was 4 degrees Fahrenheit, be suspicious.
RevEye, a reverse image search extension for Chrome: https://chromewebstore.google.com/detail/reveye-reverse-image-sear/keaaclcjhehbbapnphnmpiklalfhelgf
Investigating AI images: https://gijn.org/academy/investigate-image-misinformation-artificial-intelligence/?mc_cid=29bfa50bb3&mc_eid=3cae7a9df5
Is it authentic, photoshopped, or an AI product?
With current technology, any image can be manipulated or simply be AI-generated. One example is the image of the Eiffel Tower on fire that went viral.
To check whether an image has been manipulated, upload it to this website: https://fotoforensics.com/ or use the fake image detector website: https://www.fakeimagedetector.com/#google_vignette.
Facial Recognition
When dealing with faces, you may need to verify that the photo you have really shows the person it is claimed to show. For this purpose, you can perform a biometric facial search using tools like Pimeyes.com. It will search the internet and show the results, but if you want details such as where the photos were published you will have to pay.
How to use Pimeyes: https://pimeyes.com/en/blog/how-to-use-pimeyes-a-brief-guide
Searching the internet by face: Facecheck.id
Facial Comparison
In some investigations, you may need to compare two faces to determine if they are of the same person. For this, you can use tools like Amazon Rekognition, FacePlusPlus, or MXFace, though these services may involve fees.
Verify video
You can add this to your internet browser: InVID Verification Plugin. This plugin can run reverse searches on videos and provide more information, such as the metadata of the video, and it can also be used for images.
Another method is to take screenshots of key frames from the video and use Google Images to perform a reverse image search. This can help you determine when the video first surfaced and whether it is linked to different locations or incidents.
Search for keywords used in the video's description to compare them to other versions, when applicable. Be sure to search outside of YouTube on websites like Vimeo and also fringe platforms like BitChute, PewTube, and DTube.
Unblurring photos, text, geolocation:
Deep/ Dark Web/ Searching
Twelve search engines for the invisible web:
https://www.makeuseof.com/tag/10-search-engines-explore-deep-invisible-web/
Boardreader.com is a search website that indexes forums. The advanced search tool will give you results from specific websites, including Reddit, Voat, 4chan, 8chan, and others.
DuckDuckGo does not store your search history, personal information, or IP address, providing a truly private, non-personalised search experience.
https://monstercrawler.com/#gsc.tab=0
Monster Crawler is a metasearch engine: it combines results from the leading search engines in one search box.
https://www.bing.com - Microsoft search engine
Data Journalism
Books aimed at beginners in data journalism: https://infogram.com/blog/9-must-read-books-for-beginners-in-data-journalism/#comments
Data.gov.au is a central source of open data published by federal, state and local government agencies. It has many thousands of datasets created by more than 1100 different organisations. https://data.gov.au
Public data sources in other countries: https://id.occrp.org/databases/
Columbia University also makes available a summary of data journalism resources. The Knight Center for Journalism in the Americas, Datajournalism.com, and IRE offer courses and resources that help understand how to use some of the tools and programming languages.
The Data Journalism Handbook, volumes one and two, from the European Journalism Centre and Google News Initiative.
Some background and 'how to' tips from Brant Houston:
Let the Sheet Do the Math so You Can Focus on the Story by Brant Houston.
To master spreadsheets, check out the Basics of Google Sheets. Coursera or edX, meanwhile, offer free video courses in Excel.
If you are more advanced, tools for programming, statistics, data cleaning and formatting:
Python
R
SQL
OpenRefine
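To give a flavour of what Python and SQL make possible together, here is a small sketch that loads a CSV into an in-memory SQLite database and totals spending by supplier. The data, column names and supplier names are entirely made up for illustration, and the function name totals_by_supplier is hypothetical. A real dataset would need cleaning first.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for a downloaded contracts file.
SAMPLE_CSV = """agency,supplier,amount
Health,Acme Pty Ltd,120000
Health,Beta Services,45000
Transport,Acme Pty Ltd,300000
Transport,Gamma Group,80000
"""


def totals_by_supplier(csv_text):
    """Return (supplier, total) pairs, largest total first."""
    # Read the CSV and convert the amount column from text to integers.
    rows = [
        (row["agency"], row["supplier"], int(row["amount"]))
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contracts (agency TEXT, supplier TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO contracts VALUES (?, ?, ?)", rows)
    # SQL does the grouping and sorting that would take many spreadsheet steps.
    result = conn.execute(
        "SELECT supplier, SUM(amount) AS total "
        "FROM contracts GROUP BY supplier ORDER BY total DESC"
    ).fetchall()
    conn.close()
    return result


for supplier, total in totals_by_supplier(SAMPLE_CSV):
    print(f"{supplier}: {total}")
```

The same GROUP BY pattern answers many reporting questions, such as which supplier, donor or postcode accounts for the largest share of a total.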
PDF processing tools:
Tabula
pdfplumber/pdfminer
File processing, exploration, and collaboration tools:
Aleph (apply to use, see: https://gijn.org/resource/using-aleph/?utm_campaign=linkinbio&utm_medium=referral&utm_source=later-linkinbio)
Datashare: https://datashare.icij.org/
Using the command line: MIT's Missing Semester course material covers the foundations of the command line.
Transcription Tools
Seven transcription tools for journalists: https://www.journalism.co.uk/7-automated-transcription-tools-for-journalists/
Doing Investigative Journalism
Mark Lee Hunter: Story-based inquiry: a manual for investigative journalists: UNESCO
https://unesdoc.unesco.org/ark:/48223/pf0000396955
Introduction to investigative journalism: Brant Houston & Emilia Diaz-Struck
https://gijn.org/resource/introduction-investigative-journalism/
Training videos, masterclasses: https://gijn.org/academy/?mc_cid=c458a74d7c
Freedom Of Information
'How to Uncover Secrets': New FOI guide for Australian journalists: William Summers.
Whistleblowing
This guide is aimed at tech workers but can apply more broadly to those thinking of blowing the whistle in the public interest: https://techworkerhandbook.org
Off the record, background and deep background tips: http://blog.chrislkeller.com/aps-guidelines-for-off-the-record-background/
http://www.theguardian.com/media/2008/mar/17/pressandpublishing1
Metadata and what you need to know:
http://www.abc.net.au/news/2015-03-17/metadata-data-retention-what-is-it/6324962