Python, Your Friendly OSINT Helper

Ho ho ho and…Merry Christmas! It’s #OSINTuesday, and here’s our first blog post: Python, Your Friendly OSINT Helper!

On this week’s blog, we have three smaller blog posts, each of which discusses Python in relation to OSINT.

In the first one, Lorand briefly highlights some advantages and disadvantages of automating OSINT with Python and provides useful online learning resources.

In the second blog post, @WebBreacher talks about his experience when learning and using Python. If you’re interested in learning Python, make sure to check out his recommendations!

Last but not least, @Sector035 explains how to download and run scripts. If you haven’t used any Python scripts before, check out his brief how-to!

We hope you enjoy this week’s blog post and join our discussion on Twitter! #OSINTuesday #OSINTCurious #OSINT #Python

Automating OSINT with Python (by Lorand Bodo)

Let me introduce your friendly OSINT helper, Python – a powerful, fast and easy-to-learn programming language, thanks to its elegant syntax, dynamic typing and interpreted nature. It is, therefore, no surprise that it’s widely used across many domains. Python’s standard library supports many Internet protocols and data formats, such as HTTP, HTML, XML and JSON, making it the ideal OSINT helper. And that’s not all. There are numerous useful modules, packages and libraries that you can use for different purposes, such as data scraping, network analysis, natural language processing and many others – all for free!

There are three main reasons why I learned Python and why I’d recommend it for OSINT-related purposes. First, you can automate tasks that are either labour-intensive, tedious or even both. Automating certain processes during your research would save you valuable time, which in turn could be spent on data analysis or report writing. But to be clear, automation is no silver bullet. It also has its limitations and being aware of this is very important.

Second, you don’t want to pay for tools, especially those that often look better than they actually are. But more importantly, when you start learning Python and make your first HTTP request or API call, you will start to understand how these “OSINT tools” actually work and, above all, where their limitations lie. This will provide you with a solid understanding of automating OSINT. For example, the fact that a Python script hasn’t found the piece of information you were looking for doesn’t necessarily mean it’s not there. Automating OSINT can aid the analyst, but it should not be a replacement! Finding the needle in the haystack is still a human task.
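To make that “first HTTP request” a bit more concrete, here is a minimal sketch that uses only Python’s standard library. The function name fetch_json and the User-Agent value are made up for illustration, and the sketch assumes that the URL you pass in returns JSON. It is not how any particular OSINT tool works, just the general idea behind many of them.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Send one HTTP GET request and decode the JSON body.

    Returns the decoded data, or None if the request or decoding failed.
    """
    # The User-Agent value is an arbitrary example string.
    request = urllib.request.Request(url, headers={"User-Agent": "osint-example"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as error:
        # Network problems raise OSError subclasses, bad JSON raises ValueError.
        # A failed request is not proof that the information does not exist.
        print(f"Request failed: {error}")
        return None
```

Calling fetch_json with the address of a JSON API gives you a normal Python dictionary or list to work with. Notice that a failure simply returns None: the script found nothing, which is not the same as there being nothing to find.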

Lastly, having a good understanding of Python will also allow you to customise scripts, powerful scripts that were developed by amazing Python gurus. This could be in the form of re-using snippets of code or even contributing entire new modules that others could use. And if something goes wrong and you can’t find the bug, you can either use Google or ask for help on Stack Overflow.

When it comes to “automating OSINT”, you don’t have to reinvent the wheel. In most cases, someone has already written a Python script for what you want to do. So, do the research first before you start developing something from scratch.

One last thing I want to point out is, don’t get intimidated by a bunch of lines of code. I don’t have a computer science or tech background, so learning how to code was something completely new. But I also want to say that it was lots of fun and still is! There’s no downside to it. In fact, you will only benefit from it. So, if you’re interested in learning Python, here are a couple of resources that I highly recommend (in no particular order) that will get you quickly up to speed. Happy coding!

Free Python courses (general purpose)

  1. https://wiki.python.org/moin/BeginnersGuide (the official beginner’s guide)
  2. https://www.python-course.eu/ (excellent, free course)
  3. https://www.learnpython.org/ (learn the basics for free)

Automation with Python

  1. https://automatetheboringstuff.com/ (Excellent book by @AlSweigart – for free!)
  2. http://www.automatingosint.com/blog/python-for-beginners/ (excellent Python for OSINT course, offered by Justin Seitz)

How I Use Python (By: WebBreacher)

I remember when I learned how to code in Python in 2012. I did all the tutorials I could find. I read books and listened to podcasts. But when I tried to write my own scripts, I found that I really needed a reason to code. That reason was OSINT.

My Code

Over the years since those days, I’ve written a bunch of modules for the Recon-ng tool, written my own scripts (https://github.com/webbreacher), and modified others’ work. Originally, I used the 2.7.x branch of Python but have since upgraded most of my work to 3.7.x.

I use Python for two reasons:

  1. To scrape data from remote websites
  2. To manipulate data and files that I already have

When I began learning to code, I tried to figure out the overall action I needed to perform (e.g. visit a web site, grab data, store it on my system) and then I built the final project in steps (1. figure out how to visit a web site; 2. how do I grab the data I want from it; 3. how do I write that to a file). Then, I’d put those pieces together into my uber script.
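The three steps above can be sketched as three small functions that are then chained together. This is only an illustration using the standard library, not one of my actual scripts; the function names and the choice to collect link targets as “the data” are assumptions made for the example.

```python
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def visit(url):
    """Step 1: visit a web site and return its HTML as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def grab_links(html):
    """Step 2: grab the data we want (here: all link targets)."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def store(lines, path):
    """Step 3: write the results to a file, one item per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


def run(url, path):
    """The combined script: the three steps chained together."""
    links = grab_links(visit(url))
    store(links, path)
    return links
```

Building and testing each step on its own makes it much easier to see which part is broken when the combined script misbehaves.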

A while back, I made a couple of basic scripts at https://github.com/WebBreacher/pythonbasics that illustrate these concepts. When coding, I rely heavily on https://docs.python.org/3/ and https://stackoverflow.com/ for suggestions, tips, and code snippets.

My Suggestions for You

  • Start coding with Python 3.7.x, not the 2.7.x branch
  • Find something that interests you (crypto, data science, web hacking, web scraping, file manipulation, creating a Twitter bot…whatever). Python can do ALL of that and so much more!!
  • Treat learning Python just like you would learning a foreign language.
  • You need to take time to learn the vocabulary (functions, methods, and variables) and how it works together (the syntax and grammar).
  • Finally, you have to practice regularly if you want to be able to remember it. Think back to when you were in school and maybe were forced to take a language. I’m betting that, if you haven’t been using it, you’ve probably forgotten much of what you learned. Same goes for coding. Practicing it makes it easier to remember.

Useful Scripts (by Sector035)

Once you have learned the basics of Python and followed WebBreacher’s advice, you are ready for some testing. Of course you need Python itself, which is included in nearly all Linux distributions. If you are running Windows, there are different ways to run Python 3 on it, the easiest being to install Python for Windows.

To start with a simple script that does not need any extra configuration, setup or installing of libraries, we will have a look at GitHub-OSINT. You don’t even have to pull the whole GitHub repository to your own computer, since downloading the file “github-osint.py” is enough to get you started. To do that, open the page on GitHub, click the button that says “raw” and download the file to the directory of your choice. Then open a terminal session, go to the directory and start the script by typing:

python github-osint.py vulnbe

The first part says Python, which tells the operating system we want to run a script that needs to be interpreted by Python. The second part is the script itself that we want to run. And the very last part is what we input into the script, namely a GitHub user account. In this case it is the maker of this script, named vulnbe. The script queries some API endpoints on GitHub and requests information about the user that is being investigated. The result is shown here:

Running the OSINT-Github script
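To show how a script receives that last part of the command line, here is a small sketch. It is not the code of github-osint.py; the function names are made up, and it only builds the address of GitHub’s public user endpoint (https://api.github.com/users/ followed by the account name) without sending the request.

```python
import sys


def build_user_url(username):
    """Return the GitHub REST API URL for a user's public profile."""
    return f"https://api.github.com/users/{username}"


def main(argv):
    """argv[0] is the script name, argv[1] is the input typed after it."""
    if len(argv) != 2:
        print(f"usage: python {argv[0]} <github-username>")
        return 1
    # A real script would now request this URL and print the answer.
    print(f"Would query: {build_user_url(argv[1])}")
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Saved under a name of your choice and started with a username as its argument, Python hands the script name and the username to the script in sys.argv, in that order.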

Cloning From GitHub

Scripts that consist of multiple files will have to be ‘cloned’ from GitHub before you can use them. Simply copying a single file won’t be enough in that case. We need to use a “git clone” command to create a local copy of the complete tool and all of its files. For that I recommend that you first choose a directory where you want your custom programs to land. A lot of people choose a directory under /usr/local, but you can of course also run the “git clone” command in your Documents folder. The advantage of a location that is listed in the $PATH environment variable, such as /usr/local/bin on most systems, is that tools placed there can be run right away, no matter where you are. A freshly cloned subdirectory is not on $PATH by itself, so you either run the tool from its own directory or add that directory to $PATH.
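If you are not sure whether a directory is searched for commands, you can check it against $PATH yourself. The sketch below is one way to do that in Python; the function names are hypothetical, and /usr/local/bin is only an example directory.

```python
import os
import shutil


def path_directories(path_value=None):
    """Split a PATH-style string into its directories."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [entry for entry in path_value.split(os.pathsep) if entry]


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the PATH entries."""
    wanted = os.path.normpath(directory)
    return any(os.path.normpath(entry) == wanted
               for entry in path_directories(path_value))


if __name__ == "__main__":
    # Example directory; the answer depends on your own system.
    print(is_on_path("/usr/local/bin"))
    # Full path of the git command if it can be found, otherwise None.
    print(shutil.which("git"))
```

Only the directories that are listed count, not their parents or subdirectories, which is why a tool cloned into a new subdirectory is not found automatically.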

To clone a tool from GitHub, go to the repository and look for the “Clone or download” button. The script I am going to mention here also needs access to the Twitter API, so do bear in mind that after following all the steps you also need to apply for a Twitter API key.

Copy the URL that is displayed, go to your shell and run the following command:

git clone {URL copied from GitHub}

Cloning a tool from GitHub; it shows up in a subdirectory with the same name

After that you will find the tool in the new directory “tweets_analyzer”. Most scripts, like this one, need some extra libraries before they can work. For that Python has a little tool called “pip”, which stands for “Pip Installs Packages”. When you come across a script that has certain dependencies, you will usually find them listed in a file called “requirements.txt” in the GitHub repository, and now also in your local directory. You don’t have to open the file to look inside; you can simply tell pip to install all the needed libraries by running:

pip install -r requirements.txt

With this command you instruct pip to look at the file and install all the libraries mentioned there. To make sure that this script works, we need to run this command before using it.

Installing libraries that are needed
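If you are curious what the command is going to install, or you want to run it from a script of your own, something like the following sketch could help. The function names are invented for this example, and the parsing is deliberately simple: it only skips empty lines and lines that start with “#”. Running pip through “sys.executable -m pip” is a common way to make sure the libraries end up in the same Python installation that runs your script.

```python
import subprocess
import sys


def list_requirements(text):
    """Return the entries of a requirements file, ignoring blank lines and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def pip_install_command(requirements_path="requirements.txt"):
    """Build the pip command for the Python interpreter that is currently running."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_path]


def install(requirements_path="requirements.txt"):
    """Run pip and raise an error if the installation fails."""
    subprocess.run(pip_install_command(requirements_path), check=True)
```

Nothing is installed until install() is called, so you can read and print the list first and decide afterwards.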

After “pip” is done, this script is ready to be used. There are scripts that require you to run a “setup.py” or a similar script to install and configure the script before use. The steps needed to get it ready are usually described in the “README.md” file in the repository.

But this script is now ready to use. So you can start investigating tweets from users!

An analysis into the tweets of Donald Trump

Resources

As for resources, a lot has been mentioned already. For absolute beginners I would like to point people towards Codecademy. It covers the bare basics and is interactive and fun. After completing the course, you should know enough of the basics of Python to understand most scripts.

After that, one can look at another free course, this time at Cybrary. The course is meant for security professionals, but can be a very useful basis for some readers.

  1. Codecademy: https://www.codecademy.com/learn/learn-python-3
  2. Cybrary: https://www.cybrary.it/course/python/


Blog written by: Lorand Bodo, Sector035, Micah Hoffman.