Once again, a search engine exposes private data

August 21st, 2007 by mb

I almost feel embarrassed writing a post like this because it is such old news. Google hacking really shouldn’t be that interesting anymore. But it still is.

Although Google Code Search hacking has been mentioned in the news many times already, the power of its regex searches and the fact that it indexes files inside zip files and other archives still make it quite a gold mine.

Today I was playing around with filename regex patterns, adding this to my queries:

file:\.(log|csv|xls)
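To get a feel for what that filename pattern catches, here is a minimal Python sketch. It assumes the regex behaves the same way under Python's `re` module as it does in the search engine, which seems reasonable for a pattern this simple but is still an assumption. The paths are made up for illustration.

```python
import re

# The filename pattern from the query above, without the "file:" operator.
FILE_PATTERN = re.compile(r"\.(log|csv|xls)")

# Hypothetical paths, invented for this example.
paths = [
    "site-backup/logs/access.log",
    "export/customers.csv",
    "reports/payroll.xls",
    "reports/payroll.xlsx",
    "src/main.c",
]

# search() matches anywhere in the path, so "payroll.xlsx" is caught too,
# because it contains ".xls".
matches = [p for p in paths if FILE_PATTERN.search(p)]

for path in matches:
    print(path)
```

Because the pattern is not anchored to the end of the name, it is a little broader than it looks. In Python, adding `$` after the closing parenthesis would limit it to names that end in one of the three extensions.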

It took me a few tries to find the right search terms, but once I did, I was disappointed by the huge amount of personal information that even big companies expose to search engines. Then, to make it more interesting, I ran those same searches again, this time adding:

package:\.gov

So there you have it. Free personal info on a bunch of Americans.

So the lesson is this:

  1. You really should take the time to Google your own name, your business name (or government office), your web sites, and other key information to see what turns up.
  2. Use search engine alerts to subscribe to those results.
  3. Don’t forget to check Google Code Search.
  4. Use a robots.txt file to limit what is indexed.
  5. Don’t keep junk like backups just lying around on your web server.
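The last item is easy to check for yourself. Here is a rough Python sketch that walks a web root and lists files with extensions that probably shouldn't be there. The function name `find_risky_files` and the extension list are my own assumptions: the first three extensions come from the query above, and the rest are just common backup and dump formats, so adjust them to your setup.

```python
import os

# .log, .csv and .xls come from the search query above; the others are
# assumed examples of backup and dump files, not an exhaustive list.
RISKY_EXTENSIONS = {".log", ".csv", ".xls", ".bak", ".sql", ".zip"}


def find_risky_files(web_root):
    """Return paths (relative to web_root) of files with a risky extension."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for name in filenames:
            # Compare the extension case-insensitively (e.g. "DUMP.SQL").
            extension = os.path.splitext(name)[1].lower()
            if extension in RISKY_EXTENSIONS:
                full_path = os.path.join(dirpath, name)
                found.append(os.path.relpath(full_path, web_root))
    return sorted(found)
```

You would call it with your own document root, for example `find_risky_files("/var/www/html")` (a hypothetical path), and then delete or move whatever it turns up.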

Seems easy enough, doesn’t it?
