The “Google Hacking Database (GHDB)” is a categorised index of Internet search engine queries designed to uncover interesting, and usually sensitive, information made publicly available on the Internet. In most cases, this information was never meant to be made public, but for any number of reasons it was linked from a web document that a search engine crawled; the crawler then followed that link and indexed the sensitive information.
The breadth of information that can be quickly and easily uncovered is astounding and almost limitless, including all manner of “Personally Identifiable Information (PII)”, secret documents, passwords, network data and much, much more. All of this can be unearthed without complex tools. All that is required is a web search engine (such as Google), the use of basic search operators, a healthy dose of creativity and most often the recursive use of advanced search operators to narrow down search results to, for example, target and isolate specific web sites or domains, search for specific file types, search the URL space or search for specific sensitive phrases.
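As a rough illustration of how such operators combine, the sketch below composes a query string and narrows it step by step. The operators `site:`, `filetype:` and `inurl:` are standard Google search operators; the helper names `build_query` and `search_url`, the domain `example.com` and the sample phrase are hypothetical and chosen only for this example. The code only builds strings and sends no requests.

```python
from urllib.parse import quote_plus


def build_query(phrase=None, site=None, filetype=None, inurl=None):
    """Compose a search query from optional operator values (hypothetical helper)."""
    parts = []
    if site:
        parts.append(f"site:{site}")          # restrict results to one site or domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    if inurl:
        parts.append(f"inurl:{inurl}")        # require a term in the URL
    if phrase:
        parts.append(f'"{phrase}"')           # exact-phrase match
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Turn a query string into a search URL, percent-encoding the query."""
    return f"{base}?q={quote_plus(query)}"


# Narrowing a search by adding one operator at a time.
# example.com is a placeholder domain, not a real target.
broad = build_query(phrase="index of")
narrower = build_query(phrase="index of", site="example.com")
narrowest = build_query(phrase="index of", site="example.com", filetype="log")

for q in (broad, narrower, narrowest):
    print(q, "->", search_url(q))
```

Each step reuses the previous query and adds one constraint, which mirrors the recursive narrowing described above. Queries like these should only be run against sites you are authorised to assess.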
This process, known as “Google Hacking”, was popularised in 2000 by Johnny Long, a professional hacker, who began cataloguing these queries in a database known as the Google Hacking Database. His initial efforts were amplified by countless hours of community member effort, documented in the book Google Hacking For Penetration Testers and popularised by a barrage of media attention and Johnny’s talks on the subject such as this early talk recorded at DEFCON 13. Johnny coined the term “Googledork” to refer to “a foolish or inept person as revealed by Google”. This was meant to draw attention to the fact that this was not a “Google problem” but rather the result of an often unintentional misconfiguration on the part of a user or a program installed by the user. Over time, the term “dork” became shorthand for a search query that located sensitive information, and “dorks” were included with many web application vulnerability releases to show examples of vulnerable web sites.
As the GHDB grew, hundreds upon hundreds of queries were added to the database, providing one-click access to sensitive information. The rise of this activity spurred wave after wave of creativity that resulted in the discovery of new search techniques that uncovered powerful, often undocumented search engine capabilities. The end result was that search engine companies began to take a more active hand in protecting against search engine query “attacks” and stepped in to find ways to avoid revealing sensitive crawl data. However, the Internet has grown exponentially in the past decade, and “Google hacking” is still a valuable, surprisingly effective and fun technique. It remains an integral part of a mature security assessment process, even though some operators have been dropped (including some old favourites) and others have been added.
After nearly a decade of hard work by the community, Johnny turned the GHDB over to Offensive Security in November 2010, and it is now maintained as an extension of the Exploit-Database. Today, the GHDB includes searches for other online search engines such as Bing, and other online repositories like GitHub, producing different, yet equally valuable results.