This week’s blog post is the first part of a two-part series on how to use Google for OSINT. I will talk about three things, namely:
- How to set up your browser for basic OSINT
- How Google Search works
- Some tips & tricks for searching effectively and efficiently
Setting up your web browser for basic OSINT
There are many web browsers out there, including ones that respect your privacy, such as Qwant. For OSINT, however, I generally recommend Google Chrome or Firefox, mainly because of their web developer tools, amongst other things. Of course, both also have disadvantages, but I will not discuss those here. If you don’t know anything about dev tools for OSINT, check out our OSINTCurious 10-minute tip on browser developer tools by Micah.
Once you’ve selected your web browser (Google Chrome, in my case), let’s quickly look at useful browser extensions for searching more effectively and efficiently. These mini-applications help you in many ways and should be part of any basic OSINT set-up. You should also check out Dutch OSINT Guy’s guide on Basic OPSEC tips & tricks for OSINT researchers, because OPSEC should be part of any investigation regardless of the threat level. In Dutch’s post, you will also find useful extensions to improve your OPSEC.
Top 5 Chrome Extensions for OSINT
Google Translate is a must-have. With just one click, you can translate the whole page or parts of the text. But always keep in mind that these translations aren’t perfect. If you’ve found something crucial to your investigation, double-check with a native speaker because they understand not only the language but also the cultural context. And don’t forget that Google Translate can’t help with text embedded in images. For this you’d need a tool with OCR features, such as Yandex. For more tips and tricks around OSINT in foreign languages, check out @MW-OSINT’s latest blog post.
How many times have you clicked on a link and got an error message, saying the page is not available or some other issue? With the Internet Archive Wayback Machine, you can forget about these problems and focus on what’s more important – finding information. Once installed, this extension automatically detects error codes (4XX and 5XX) and checks if an archived version of the requested URL is available. If so, it will ask you if you’d like to view the archived version (Fig. 1).
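If you’d rather script this check than rely on the extension, here is a minimal Python sketch of the same idea. It assumes the Internet Archive’s public availability endpoint (https://archive.org/wayback/available) and the JSON shape it returns. The function names and the sample response are my own, for illustration only, and this is not a description of how the extension works internally.

```python
import json
from urllib.parse import urlencode

# Assumption: the Internet Archive's public availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def is_error_status(status_code: int) -> bool:
    """True for the 4XX and 5XX codes mentioned above."""
    return 400 <= status_code <= 599


def availability_url(page_url: str) -> str:
    """Build the request URL that asks whether page_url has been archived."""
    return f"{WAYBACK_API}?{urlencode({'url': page_url})}"


def closest_snapshot(api_response: str):
    """Return the URL of the closest archived snapshot, or None if there is none."""
    data = json.loads(api_response)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Illustrative response in the shape the endpoint returns (values are made up).
sample_response = json.dumps({
    "url": "example.com/old-page",
    "archived_snapshots": {
        "closest": {
            "status": "200",
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/http://example.com/old-page",
            "timestamp": "20200101000000",
        }
    },
})

if is_error_status(404):
    print("Ask the archive:", availability_url("example.com/old-page"))
    print("Archived copy:", closest_snapshot(sample_response))
```

To run it against a live page, you’d fetch availability_url(...) with your HTTP client of choice and pass the response body to closest_snapshot.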
Go Back in Time
Like the Wayback Machine extension, this one lets you check whether an archived copy of your target website is available. To do so, right-click somewhere on the page and select one of the options. Unlike the Wayback Machine extension, it also lets you search for cached copies. Really useful and a must-have when working with websites (Fig. 2).
This extension has been designed for journalists, fact-checkers, and others as a verification “Swiss Army knife” for time-saving and workflow-efficiency purposes (Fig.3). There are so many functionalities and things to point out that I’ll save this for another post. It is such a fantastic tool that it deserves more attention than just a paragraph. If you work a lot with images and videos, you should definitely check it out! For more information, visit their website.
Last but not least, I want to show you Extensions Manager. This extension is a brilliant tool with which you can easily enable and disable your extensions. With just a few clicks, you can tidy up your browser and finally see the full URL again. If you’re a big fan of extensions, I highly recommend this tool (Fig.4).
How Google Search works
Now that we’ve set up our browser, it’s time to talk about basic principles for effective and efficient Google-ing. The first basic principle is to understand how Google Search actually works. It’s crucial if you want to leverage Google Search for OSINT because it can either work for or against you, but I assume you prefer the former. So, let’s have a look at how it works.
For the official explanation, click here.
There are three main components to Google Search:
Crawling and Indexing
The first component refers to crawling and indexing publicly available web pages. While you’re reading this blog post, web crawlers are organising information from web pages and other publicly available content in the Search index. In essence, Google’s web crawlers visit known web pages and use the outgoing links to discover other pages – amongst other techniques. Google focuses on finding new sites, changes to existing pages and dead links. What’s important here is that site owners can either allow or disallow Google to crawl their website. This can be quickly done by changing the so-called robots.txt file. If you want to learn how you can leverage this file for OSINT, check out our 10-minute OSINT tip here.
What I want to point out in this context is that you could easily overlook crucial information on a website just because the website owner hasn’t allowed the Google bot to index this particular page. So, keep that in mind when you use search engines as the restrictions in the robots.txt files can be applied to some, all, or none of the internet’s search engine crawlers! This is one of the reasons why you should use multiple search engines and visit the website of interest, but I’ll come back to this later.
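To make this more concrete, here is a small Python sketch using the standard library’s urllib.robotparser. The robots.txt content, the example.com URLs, and the disallowed_paths helper are made up for illustration. The point is simply that the same page can be blocked for one crawler and open to another, and that the Disallow entries themselves are worth reading as leads.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is kept out of /press-archive/,
# while every other crawler is only kept out of /admin/.
SAMPLE_ROBOTS = """\
User-agent: Googlebot
Disallow: /press-archive/

User-agent: *
Disallow: /admin/
"""


def disallowed_paths(robots_txt: str) -> list:
    """Collect every Disallow path in the file, regardless of crawler."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path and path not in paths:
                paths.append(path)
    return paths


parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS.splitlines())

page = "https://example.com/press-archive/2019.html"
print("Googlebot may crawl:", parser.can_fetch("Googlebot", page))
print("Bingbot may crawl:", parser.can_fetch("Bingbot", page))
print("Paths worth a manual visit:", disallowed_paths(SAMPLE_ROBOTS))
```

In this made-up example, the press archive would be open to Bing’s crawler but closed to Google’s, which is exactly the kind of gap that using several search engines and browsing the site yourself helps to close.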
The second component of Google Search is the search algorithms. Without these, Google would not be able to serve you the most useful information. In other words, it’d present you with results that are relevant neither to your query nor, most likely, to your questions. These algorithms take several factors into account, such as relevance, usability and the expertise of sources but, most importantly, also your location and settings. This is very important to understand if you want to find information faster and more easily.
For example, if you’re looking for relevant information about a person and his business associates in Germany, it wouldn’t make sense to use google.co.uk and have a London-based IP address. Don’t get me wrong, you’d still be able to find information. However, it’d take much longer and you would most likely miss the important stuff! You might think that you can achieve the same thing by changing Google’s settings from UK to Germany, but it doesn’t work as well. I ran a quick experiment and found that both the total number of results and the ranking were different (see Fig. 5).
The final component of Google Search refers to the development of useful responses. Google is continuously improving its search algorithms so that we can find answers more quickly and easily. To that end, it has been enhancing Search by introducing new response types, which can take many forms. One such response type is the Knowledge Graph, a huge database comprised of “more than one billion real-world people, places and things and over 50 billion facts as well as connections among them”. The Knowledge Graph is just one of several response types that are immensely valuable for OSINT. For example, let’s assume you’re doing some chronolocation and you want to know what time sunset was in London on a particular date. It sounds like a lot of work, but with the Knowledge Graph you can simply type that question into Google, even without quotation marks (see Fig. 6).
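If you want to repeat this kind of comparison yourself, a small helper for building the search URLs can save some typing. The Python sketch below is only an illustration: google_search_url is a name I made up, and gl (country) and hl (interface language) are commonly used Google URL parameters whose effect I’m assuming rather than guaranteeing. As described above, neither the domain nor these parameters replaces an IP address in the target country.

```python
from urllib.parse import urlencode


def google_search_url(query, domain="google.com", country=None, language=None):
    """Build a Google search URL for a given country domain.

    Assumption: 'gl' hints at the country and 'hl' at the interface language.
    Your IP address still influences the results you get.
    """
    params = {"q": query}
    if country:
        params["gl"] = country
    if language:
        params["hl"] = language
    return f"https://www.{domain}/search?{urlencode(params)}"


# Same (hypothetical) query, two set-ups to compare side by side.
query = "Max Mustermann GmbH"
print(google_search_url(query, domain="google.co.uk", country="uk", language="en"))
print(google_search_url(query, domain="google.de", country="de", language="de"))
```

Open both URLs in separate incognito windows, ideally once with and once without a VPN server in the target country, and compare the number of results and the ranking.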
The bottom line is, if you want to leverage Google Search, you need to understand how it works. Most importantly, you need to be very familiar with all its features and limitations. Don’t get me wrong, you can still use Google without that knowledge, but your searches will not be as efficient and effective as they could be.
Tips and Tricks for searching more effectively and efficiently
Now that we know how Google Search actually works, it’s time for some tips and tricks to make your searches more effective and efficient:
- Always use Incognito Mode when searching online (Cmd+Shift+N on a Mac). This prevents Google from presenting us with results that are based on previous searches, our preferences, and similar signals. It is important to highlight that incognito mode does not make you anonymous! To hide your IP address, you’d need to use a VPN, for example.
- Use multiple search engines. Google is just one of many search engines. Even though it’s excellent, its search algorithms differ from those of Bing, DuckDuckGo, and Yandex. As there are billions of web pages, it could be that the information you’re looking for ranks higher on Yandex than on Google. In other words, the result you’re after appears on page 1 on Yandex while Google displays the same result on page 5.
- Use a VPN and country-specific domains. As I showed earlier, if you search for information in Germany, you should use google.de and not google.co.uk or something else. However, the domain alone is not enough: as we know, your location is one of the factors that influence the search algorithms. So, if your location is London, you will probably not get the most relevant results unless you also connect through a VPN server in Germany.
- Keep refining your keywords. Keywords form the basis for any initial data collection. You start with a couple of keywords and end up with a long list, and that is how it should be. Revising your questions, changing the keywords and adding new info as it comes in is a natural process. I recommend, if your OPSEC allows, keeping a Google Sheet with all your keywords. This is important for three reasons. First, it is good practice as part of a structured methodology. Second, it helps you revise your keywords and create new ones as well as variations (e.g. spellings or languages). Finally, you can do cool stuff with Google Sheets, such as translating 100 keywords from English to Russian, Arabic, German and a bunch of other languages that are supported by Google, all within seconds! (Fig. 7). If you want to learn how to do that, click here. For all other Google Sheets functions, click here.
- Use the language of your target. If you look into Salafi-jihadist groups in northern Syria, use Arabic. If you look for Russian business owners, use keywords in Russian. If you are mapping out a neo-Nazi cell in Hungary, use Hungarian keywords. If you don’t change the language, your searches will not be effective and efficient. In fact, you won’t find much relevant stuff.
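The keyword-sheet and target-language tips above can be combined. Here is a rough Python sketch that writes a CSV in which every keyword gets one translation column per target language, using the GOOGLETRANSLATE function in Google Sheets. The keyword_sheet helper and the example keywords are hypothetical, and I’m assuming that Sheets evaluates the formulas when you import the file. Treat the machine translations as a starting point and double-check the important ones with a native speaker.

```python
import csv
import io


def keyword_sheet(keywords, targets, source="en"):
    """Return CSV text with one GOOGLETRANSLATE formula per keyword and language."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["keyword", *targets])
    # Row 1 is the header, so the first keyword lands in row 2 of the sheet.
    for row_number, keyword in enumerate(keywords, start=2):
        formulas = [
            f'=GOOGLETRANSLATE($A{row_number},"{source}","{target}")'
            for target in targets
        ]
        writer.writerow([keyword, *formulas])
    return buffer.getvalue()


# Hypothetical starter keywords and the language codes for Russian, Arabic and German.
sheet = keyword_sheet(["business owner", "shell company"], ["ru", "ar", "de"])
print(sheet)
```

Save the output as a .csv file and import it into a new Google Sheet. Adding a keyword or a language then only means extending one of the two lists.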
I hope you enjoyed this blog post and found some of the principles, tips & tricks useful. If you have additional recommendations that you’d like to share with us, please comment below or use the hashtag #OSINTCurious on social media. Next week’s post will focus on more tips & tricks and how you can leverage Google for finding information about people, businesses and more!