Weird 404 error email my site sent me yesterday (URL hidden to prevent linking to an adult domain):
HTTP_REFERER: [blank]
HTTP_HOST: www.domain.com
PHP_SELF: /fgdfgfert4534.html
REQUEST_URI: /NONEXISTENTURL.html
REMOTE_ADDR: 66.249.65.69
TIMESTAMP: 5/24/2006 9:15 PM
Quick explanation: I rigged my dynamic pages, so a request to retrieve “maroon-widget.html” 404s and triggers an email if I don’t have “maroon widget” in my database.
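For anyone curious, the rig looks roughly like the sketch below. It is a minimal Python illustration of the idea, not my actual code (the variables in the email are PHP's). `KNOWN_ITEMS`, `slug_to_name`, `handle_request`, and the `send_alert` callback are made-up names standing in for the database lookup and the mail call.

```python
from datetime import datetime

# Hypothetical stand-in for the product database.
KNOWN_ITEMS = {"maroon widget", "blue widget"}


def slug_to_name(request_uri):
    """Turn '/maroon-widget.html' into 'maroon widget'."""
    slug = request_uri.strip("/")
    if slug.endswith(".html"):
        slug = slug[: -len(".html")]
    return slug.replace("-", " ").lower()


def handle_request(request_uri, headers, remote_addr, send_alert):
    """Return an HTTP status; on a miss, report the request via send_alert."""
    if slug_to_name(request_uri) in KNOWN_ITEMS:
        return 200
    # Unknown item: build a report with the same fields as the email above.
    report = "\n".join([
        "HTTP_REFERER: " + (headers.get("Referer") or "[blank]"),
        "HTTP_HOST: " + headers.get("Host", ""),
        "REQUEST_URI: " + request_uri,
        "REMOTE_ADDR: " + remote_addr,
        "TIMESTAMP: " + datetime.now().strftime("%m/%d/%Y %I:%M %p"),
    ])
    send_alert(report)
    return 404
```

In a real setup `send_alert` would wrap whatever mail function the server provides; passing it in as a callback just keeps the lookup logic easy to test without sending mail.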
The REQUEST_URI above is linked from nowhere; it exists solely in the supplemental index. I’ve seen Yahoo do this kind of thing, but this week I’m starting to see Google do it too. I guess Google is basically crawling my site from its own database instead of following links. Is this common behavior and part of a normal crawl, or is Google trying to clean up supplementals?
Searching Google for 66.249.65.69 returns 208,000 gibberish results (mostly pages that display your IP address back to you). So I guess it’s just the regular bot, not a supplemental bot?
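A more reliable check than searching for the IP is the reverse-then-forward DNS lookup that Google suggests for verifying Googlebot. The sketch below is a rough Python version. The function name `looks_like_googlebot` and the injectable `reverse_lookup` / `forward_lookup` parameters are my own illustrative choices. It only tells you whether the address belongs to Google, not which of Google's crawlers made the request.

```python
import socket


def looks_like_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the domain, then confirm with a forward lookup."""
    # Default to real DNS; the parameters exist so the check can be tested offline.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # The hostname must resolve back to the same IP, otherwise the
        # reverse record could have been spoofed.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror / socket.gaierror when a lookup fails.
        return False
```

Calling `looks_like_googlebot("66.249.65.69")` with no extra arguments performs live DNS queries, so the answer depends on the DNS records at the time you run it.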
Update: After cleaning out my inbox, I found similar emails going back to May 20.