Q&A: Google hacking

Robert Abela is a Technical Manager at Acunetix. In this interview he discusses the importance of Google for security research, offers tips on using Google for information gathering, and more.

Based on your experience, how important is Google for security research?
Everyone who uses the internet knows that Google is the answer to every question. Google is a powerful search engine, and also a tool. Like almost everything else on the internet today, its capabilities are unfortunately being misused. To keep up to date, a security researcher typically turns to Google to learn more about new hacking trends, tools, and previous hacking incidents. Unfortunately, unless you are a hacker yourself, you cannot imagine what a hacker can be up to, since a typical security researcher can be quite naive and innocent compared to a real hacker. Therefore, by searching for and learning more about previous incidents, researchers can increase their knowledge and do a better job of securing websites and web applications.

Apart from that, when securing a web application, one should look at the whole picture. Fixing all the vulnerabilities in the web application itself and implementing perimeter network security is not enough. One should also use the tools hackers typically use, such as Google, to find out what they can learn about your web application and infrastructure.

Let’s say you’re doing a penetration test. What kind of information about a target can you find out by using Google?
Anything connected to the web can be indexed by Google. Even the administration portals of devices connected to the web, such as printers and webcams, are crawled and discovered by Google. You’ll be surprised by how many unprotected webcams are connected to the internet, streaming live video from people’s living rooms or university dormitories.

By using Google, one can find out more about the configuration or version of a web server, a web technology such as PHP or .NET, or a well-known web application such as WordPress. Having access to the configuration of specific software, or its version, can be enough to help me start an attack. Unfortunately, when web and network administrators encounter specific application problems, they seek support on public forums, where they tend to post extra configuration and setup information. Such information exposure can be enough to help a hacker learn more about the web application being targeted.

One can also find unhandled error messages from specific web applications or web technologies, which typically expose a lot of debug information. Other information we have typically found with Google searches includes contact lists, plain text files containing usernames and passwords, database dumps, login portals, installation and configuration files of installed web applications, vulnerable servers, pages containing network and vulnerability data, and much more.
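
To make these kinds of searches concrete, here is a minimal sketch of how such queries might be composed for a single domain. The `site:`, `filetype:`, `intitle:` and `inurl:` operators are standard Google search operators. The function name `build_dorks`, the category labels and the specific patterns are illustrative assumptions, not a list taken from this interview or from any particular tool. The code only builds query strings; it does not send anything to Google.

```python
def build_dorks(domain: str) -> dict[str, str]:
    """Return illustrative Google queries restricted to one domain.

    The patterns are hypothetical examples; adjust them to the
    technologies actually used by the site being assessed.
    """
    scope = f"site:{domain}"
    return {
        # Directory listings left enabled on the web server
        "directory_listing": f'{scope} intitle:"index of"',
        # Database dumps published by mistake
        "database_dump": f"{scope} filetype:sql",
        # Log files that may contain usernames or passwords
        "log_file": f"{scope} filetype:log",
        # Login portals and administration pages
        "login_page": f"{scope} inurl:login",
        # Unhandled error messages exposing debug information
        "error_message": f'{scope} "Fatal error"',
    }


if __name__ == "__main__":
    for name, query in build_dorks("example.com").items():
        print(f"{name}: {query}")
```

Restricting every query with `site:` keeps the results limited to the domain you are authorised to test.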

What tips would you offer to those that want to use Google for information gathering?
There are both commercial and free tools on the market, such as Acunetix WVS, that can automate most of these Google hacking checks. I would recommend taking advantage of them. Although such tools ease the process of running these tests and save you time, I would still recommend that everyone working in the web application security business learn more about Google hacking techniques.

It is very important to know the details of Google hacking and what automated tools do in the background. This will help you fine-tune and customize Google hacking checks to suit your needs and web applications. There is no software that can replace human intelligence, yet. There are also a number of books, websites and YouTube videos from which one can learn about Google hacking techniques. Use them wisely.

What advice would you give to those that wish to protect themselves from leaking information via a search engine like Google?
As doctors say, prevention is better than cure. The problem of information exposure via Google hacking techniques must therefore be tackled at the source. If you do not want a hacker to find sensitive information about your website via Google hacking techniques, make sure that you follow these rules:
1. Testing and development of web applications should ALWAYS be done in a testing environment and NOT on production servers. This helps avoid leaving unwanted files on the server that may contain sensitive information and be accessible from Google.
2. Where possible, obfuscate software versions. If the software is not up to date with security patches, obfuscating its version might delay an attack, but it will not save you. I would still recommend always installing the latest security updates provided by software vendors.
3. Any kind of sensitive information, such as database files, configuration files and log files, should never be published or accessible via the web. If remote users or applications need access to such files, there are many other ways to provide it.
4. When installing a known web application, such as WordPress or phpBB, make sure to follow the installation instructions properly to the end. Most of the time, the most important security-related information is at the bottom of the installation documentation. For example, make sure that all files and folders used for installation purposes only are deleted after the installation.
5. After installing a particular web application, check whether any security guidelines are available for hardening that software. If so, make sure that you follow them and apply the required tweaks. Following such guidelines usually only takes a couple of hours.
6. Error and exception handling in custom web applications has always been one of the major sources of problems, because of the amount of debug information unhandled errors expose. As always, make sure all errors and exceptions are handled properly. Always expect the unexpected.
7. Scan and secure your website. As we all know, there are a number of commercial and open source tools available which can make your life easier.
8. When posting on a forum or blog seeking support for a specific web application, do not post configuration and setup details that expose information about your actual web application.
9. Run Google hacking checks every now and then to make sure Google is not indexing what you don’t want it to index. If you do not have the time to learn or launch such checks yourself, use an automated tool or hire someone you trust to do it for you.
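
Several of the rules above (leftover test files, exposed database, configuration and log files, installation folders) can be checked locally before a search engine ever sees them. The following is a minimal sketch, assuming the site's files live in a single directory on disk. The function name `find_risky_files`, the suffix list and the folder names are illustrative assumptions to adapt, not an exhaustive or authoritative list.

```python
from pathlib import Path

# Hypothetical patterns; extend them for your own stack.
RISKY_SUFFIXES = {".sql", ".log", ".bak", ".old", ".ini", ".conf"}
RISKY_DIR_NAMES = {"install", "setup", "backup", "test"}


def find_risky_files(web_root: str) -> list[Path]:
    """List files under web_root whose suffix or parent folder looks sensitive."""
    root = Path(web_root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # Folder names between the web root and the file itself
        folders = {part.lower() for part in relative.parts[:-1]}
        if path.suffix.lower() in RISKY_SUFFIXES or folders & RISKY_DIR_NAMES:
            hits.append(relative)
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Scan the directory given on the command line, or the current one.
    for hit in find_risky_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(hit)
```

A hit is not necessarily a problem, since some of these files may be blocked by the web server configuration, but each one deserves a look.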
