GHH is the reaction to a new type of malicious web traffic: search engine hackers. GHH is a “Google Hack” honeypot. It is designed to provide reconaissance against attackers that use search engines as a hacking tool against your resources. GHH implements honeypot theory to provide additional security to your web presence.
Google has developed a powerful tool. The search engine that Google has implemented allows for searching on an immense amount of information. The Google index has swelled past 8 billion pages [February 2005] and continues to grow daily. Mirroring the growth of the Google index, the spread of web-based applications such as message boards and remote administrative tools has resulted in an increase in the number of misconfigured and vulnerable web apps available on the Internet.
These insecure tools, when combined with the power of a search engine and index which Google provides, results in a convenient attack vector for malicious users. GHH is a tool to combat this threat.
GHH emulates a vulnerable web application by allowing itself to be indexed by search engines. It's hidden from casual page viewers, but is found through the use of a crawler or search engine. It does this through the use of a transparent link which isn't detected by casual browsing but is found when a search engine crawler indexes a site. The transparent link (when well crafted) will reduce false positives and avoid a fingerprint of the honeypot.
The honeypot connects to a configuration file, and the configuration file writes to a log file which is chosen during configuration. The log file contains information about the host, including IP address, referral information, and user agent.
Using the information gathered in the log file, an administrator can learn more about attackers doing reconnaissance against their site. An administrator can cross reference logs and view a better picture of specific attackers.
GHH can be a minimum of 3 files, including the honeypot, the configuration file, and the log. The transparent link is made into a pre-existing webpage. GHH is written in PHP and is issued under the GNU Public License.
SF Project Page