The World Wide Web is a deep space overflowing with data, and a lot of it is public and freely available to use. All you need to know is where to look.
In the cyberworld of hacking, OSINT is like doing your homework about the target.
OSINT, or Open Source Intelligence, is the process of collecting information that is freely available for public use. It’s what you do when you have to research a topic for a school project: you find all the sources (online and offline) from which you can dig up information, and then make some sense of it.
In hacking, OSINT is one of the most important and basic aspects of intelligence gathering. You can use websites, search engines, social networks, blogs, videos, podcasts, or even newspapers to gather crucial information.
The wonderful part about this technique is that it is so common and obvious that everyone already uses it; they just don’t know it yet. Don’t we all search for information online, all the damn time?
At the slightest doubt, we Google whether Tom Hanks was in that movie, check Wikipedia for when that company was founded, or watch a YouTube video about how to fix that issue on our computer.
We are constantly using free and public information on the internet for our personal use. Even right now, when you searched for ‘OSINT’ and got to this article, you used OSINT!
But for the purpose of hacking, it would be impractical and stupid to sift through the internet by hand for bits of information that, as you would expect, are not so easy to find. It would be like searching for a needle in a haystack. That’s where these tools come in handy.
1. Google Dorks
Aim: Searching Web Pages
Did you know that if you ask Google the right questions, you will be surprised by what it can tell you?
Use Google dork queries to run smart, advanced searches and find:
- Information that would be otherwise hard to find
- Information that was not meant for public viewing, but wasn’t well protected
- Sensitive information like usernames and passwords, email lists, personally identifiable financial information (PIFI)
- Vulnerable websites/systems
Understanding the naming scheme of a website:
List of advanced search operators:
- See web pages stored in Google cache
- See web pages linked to a specific web page
- See web pages related to a specific web page
- See what Google knows about a specific web page
- See web pages in a specific web domain
- Restrict results to those with specific keywords in the title
- Restrict results to those having all the mentioned keywords in the title
- Restrict search results to those with a specific keyword in the URL
- Restrict results to those having all mentioned keywords in their URL
- See information about a specific location
- Search for a specific type of file on a website
To see login pages of Indian websites, use
Pro tip: Use the Google Hacking Database (GHDB) for a massive collection of Google dorks.
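If you would rather build dork queries in a script than type them by hand, here is a minimal sketch. The operator names in OPERATORS are our own mapping of commonly documented Google operators onto the descriptions listed above; they are an assumption, not something taken from this article, and Google has retired some of them over time. The build_dork and search_url helpers are hypothetical names used only for illustration.

```python
from urllib.parse import quote_plus

# Assumption: commonly documented Google operators, matched by us to the
# descriptions in the list above. Some may no longer be supported by Google.
OPERATORS = {
    "cache": "web pages stored in Google cache",
    "link": "web pages linked to a specific web page",
    "related": "web pages related to a specific web page",
    "info": "what Google knows about a specific web page",
    "site": "web pages in a specific web domain",
    "intitle": "specific keywords in the title",
    "allintitle": "all the mentioned keywords in the title",
    "inurl": "a specific keyword in the URL",
    "allinurl": "all mentioned keywords in the URL",
    "location": "information about a specific location",
    "filetype": "a specific type of file",
}


def build_dork(terms="", **filters):
    """Join operator:value pairs (plus optional free text) into one query."""
    parts = []
    for operator, value in filters.items():
        if operator not in OPERATORS:
            raise ValueError(f"unknown operator: {operator}")
        # Quote multi-word values so they stay attached to the operator.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{operator}:{value}")
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """Turn a dork query into a Google search URL you can open in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)
```

For example, build_dork(site=".in", inurl="login") produces site:.in inurl:login, which is one plausible way to look for login pages on Indian domains (a guess on our part, since other queries would work too).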
2. LinkedIn
Aim: Searching employee information
Get an inside view of a company using LinkedIn. It holds enough information about employees (full names, job roles, software used) to carry out social engineering attacks like impersonation.
You can even see the technologies a company uses by deep diving into its employees’ profiles.
3. Wappalyzer Plugin
Aim: Finding Technologies used on a website
Many companies use vulnerable technologies that provide an easy entry for hackers. While targeting companies or institutions, you can use this tool to understand their website framework and look for vulnerabilities in the technologies they use.
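To get a feel for what technology detection involves, here is a rough sketch that reads a few obvious signals from a page: the Server and X-Powered-By response headers and the generator meta tag. This is not how the plugin itself is implemented; it only approximates the kind of clues such tools look at. The fetch and fingerprint function names are hypothetical.

```python
import re
from urllib.request import urlopen

# Matches <meta name="generator" content="..."> (name before content only).
GENERATOR_RE = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def fetch(url):
    """Download a page and return (headers as a dict, body as text)."""
    with urlopen(url, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return dict(resp.headers.items()), body


def fingerprint(headers, html):
    """Collect technology hints from response headers and page markup."""
    found = {}
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in ("server", "x-powered-by"):
        if name in lowered:
            found[name] = lowered[name]
    match = GENERATOR_RE.search(html)
    if match:
        found["generator"] = match.group(1)
    return found
```

Calling fingerprint(*fetch("https://example.com")) would return whatever hints that site exposes. Many sites strip these headers, so an empty result does not mean there is nothing to find.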
4. CT and Sublist3r
Aim: Enumerating Subdomains
A website can have many subdomains. For example, blog.website.com and shop.website.com are subdomains of website.com. These subdomains could be vulnerable to many cyberattacks. Using these two tools, you can see all the subdomains of your target website.
a. Certificate Transparency (CT)
Certificate Authorities record the SSL/TLS certificates they issue in public, append-only logs that anyone can search. This is known as Certificate Transparency (CT).
Using CT logs, you can search for all such certificates issued for your target company. Every certificate names the hosts it covers, so the logs reveal subdomains you can then check for vulnerabilities.
Example: search for %.stackoverflow.com on https://crt.sh to list their subdomains.
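The same lookup can be scripted. The sketch below assumes that crt.sh returns JSON when output=json is added to the query, and that each entry carries its hostnames in a newline-separated name_value field; both are assumptions about the service, not something this article documents. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def crtsh_url(domain):
    """Build the crt.sh search URL for every name under a domain."""
    # Assumption: crt.sh accepts q=%.domain and output=json.
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})


def extract_subdomains(entries, domain):
    """Pull unique hostnames for the domain out of crt.sh JSON entries."""
    names = set()
    for entry in entries:
        # Assumption: name_value holds one hostname per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


def fetch_subdomains(domain):
    """Query crt.sh over the network and return the hostnames found."""
    with urlopen(crtsh_url(domain), timeout=30) as resp:
        return extract_subdomains(json.load(resp), domain)
```

Running fetch_subdomains("stackoverflow.com") would repeat the example above from a script. Names that appear in old certificates may no longer resolve, so treat the result as leads to verify.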
b. Sublist3r
A Python tool specially designed for this OSINT technique is Sublist3r. It uses a number of search engines and other websites like Google, Yahoo, Bing, VirusTotal, DNSdumpster, etc. to churn out subdomains of websites. Hackers can use this to find vulnerable domains. Subbrute has been integrated into Sublist3r, so the tool can also brute force subdomains from a wordlist on top of what it finds through search.
5. theHarvester
Aim: Finding employee emails
One of the most used methods to target an organisation via social engineering is phishing. You can find the email addresses of its employees using the OSINT tool theHarvester. It comes pre-installed in Kali Linux and uses multiple data sources (public, obviously) to gather emails, subdomains, URLs, IPs, etc.
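The core idea behind harvesting emails is simple: scan publicly available text for addresses that belong to the target domain. The sketch below shows that one step on text you already have. It is not theHarvester's code, and the regular expression is a deliberately loose assumption about what an email address looks like.

```python
import re

# Loose pattern: good enough for spotting addresses in page text,
# not a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_domain(text, domain):
    """Return unique, lowercased addresses at the domain or its subdomains."""
    domain = domain.lower()
    found = set()
    for address in EMAIL_RE.findall(text):
        address = address.lower()
        host = address.split("@", 1)[1]
        if host == domain or host.endswith("." + domain):
            found.add(address)
    return sorted(found)
```

Feed it the text of search results, public documents, or pages you have already collected, and it returns only the addresses that matter for your target.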
6. WHOIS
Aim: Getting domain information
If you know how to use it, any information about a company could be valuable. Every domain has a registration record that contains particulars like the creation date, expiration date, last updated date, name servers, admin contact, registrant email, organisation name, addresses, phone number and other technical information. Use the WHOIS tool to get domain information about your target.
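Under the hood, a WHOIS lookup is just a line of text sent to a server on TCP port 43. Here is a minimal sketch of that exchange plus a parser that turns the reply into key/value pairs. The default server whois.iana.org is our assumption for a starting point; it usually answers with a referral to the registry's own WHOIS server, which you would query next. Field names differ between registries, so the parser makes no promises about which keys you get.

```python
import socket


def whois_query(domain, server="whois.iana.org", port=43):
    """Send a raw WHOIS query and return the server's text reply."""
    with socket.create_connection((server, port), timeout=10) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Group 'Key: value' lines into a dict of lists, skipping comments."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, value = line.split(":", 1)
        value = value.strip()
        if value:
            record.setdefault(key.strip().lower(), []).append(value)
    return record
```

For example, parse_whois(whois_query("example.com")) gives you a dictionary you can search for name servers, dates and contact fields. Many registrars now redact contact details, so expect gaps.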
You can use this information to build a picture of the target and try to find a way inside. For example, you could take the registrant email (which is usually the developer’s) and try to break into the website server using something like 12345678 as the password. Can you guess why this could work? Comment below and we’ll tell you if you are right.
7. DNSRecon
Aim: Finding DNS information
DNS information gathering is a basic requirement for pentesting because it helps so much in mapping a network’s infrastructure. A useful tool for this OSINT technique is DNSRecon (a Python script) that can enumerate general DNS records like MX, SOA, SRV, DNSSEC, SPF, TXT, etc.
It can also do Google lookups, check for zone transfers, brute force subdomains, do reverse lookups, and perform cache snooping.
Try brute forcing subdomains with the following command:
dnsrecon -D /usr/share/wordlists/subdomains-top1mil-5000.txt -d website.in -t brt
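What the brute force mode boils down to is trying each word in the list as a subdomain and keeping the names that resolve. A bare-bones sketch of that idea follows. It is not DNSRecon's implementation, and the resolve parameter is a hypothetical hook we added so the lookup can be swapped out.

```python
import socket


def brute_force_subdomains(domain, words, resolve=socket.gethostbyname):
    """Try word.domain for each word; return {hostname: ip} for hits."""
    found = {}
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank lines from the wordlist
        host = f"{word}.{domain}"
        try:
            found[host] = resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return found


def load_wordlist(path):
    """Read one candidate label per line from a wordlist file."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

One caveat: if the domain uses wildcard DNS, every candidate will resolve, so compare the results against a random name that should not exist before trusting them.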
8. Wayback Machine
Aim: Finding old webpages
Websites keep getting updated, but their old web pages might still be running on the backend. If you could see every web page your target website has ever published, you might find some that use outdated technologies, which renders them vulnerable.
So, how do you go back in time and access the old webpages?
We know what you are thinking. But no, you don’t need a time machine for this (although it would come in quite handy). The Wayback Machine has a huge database of web pages saved over time. Check it out!
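You can also ask the archive from a script. This sketch assumes the Wayback Machine's availability endpoint at archive.org/wayback/available and the response layout shown in the comments; treat both as assumptions to check against the archive's own documentation. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def availability_url(page, timestamp=None):
    """Build a lookup URL; timestamp is YYYYMMDD (or longer) if given."""
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response):
    """Return (snapshot_url, timestamp) from a parsed reply, or None."""
    # Assumed layout: {"archived_snapshots": {"closest": {"available": true,
    #                  "url": "...", "timestamp": "..."}}}
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None


def lookup(page, timestamp=None):
    """Query the archive over the network for the nearest saved copy."""
    with urlopen(availability_url(page, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling lookup("website.com", "20100101") would return the saved copy closest to January 2010, if one exists.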
9. Pipl
Aim: Digging up a person
It would require painstaking effort to dig up all the information about one person on the internet. Thanks to people search engines, you can collect every bit of data you legally can about someone to sketch out a profile of them, which could further be used to create a potential password list against their email account.
Pipl.com is one such tool (paid, but worth every penny) that lets you get your hands on the entire online presence of a person: their phone numbers, usernames, emails, deep web results... basically everything.
10. Shodan
Aim: Accessing every device (feel like God)
Forget websites. You could dig up intelligence from all the things that are connected to the internet: webcams, smart TVs, buildings, power plants, refrigerators, security systems... we could keep going, but you get the point.
Use Shodan.io to search the Internet of Things. Go!
That’s it, peeps! Go start getting some leads, then come back and tell us what worked best for you. If there is any tool or hacking technique you would like us to cover next, let us know (with a ‘why’, of course!).
Read more. Know more. Grow more.