Weirdness in the referrer log
The Googlebot is usually fairly well-behaved, but today I saw this:
/pycs_search/htdig-pycs-snapshot-20030402.tar.gz - 245 hits (32276480 bytes)
23: 64.68.84.42; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
20: 64.68.84.51; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
18: 64.68.84.31; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
17: 64.68.85.13; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
15: 64.68.84.76; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
15: 64.68.84.39; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
15: 64.68.84.143; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
13: 64.68.84.149; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
12: 64.68.84.16; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
12: 64.68.84.137; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
11: 64.68.85.6; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
11: 64.68.84.15; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
9: 64.68.84.6; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
9: 64.68.84.153; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
9: 64.68.84.144; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
7: 64.68.84.46; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
7: 64.68.84.132; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
7: 64.68.84.131; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
5: 64.68.85.9; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
4: 64.68.84.49; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
4: 64.68.84.134; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
2: 64.68.84.43; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
It looks like a whole heap of different Googlebot instances have been downloading the file in 128K chunks. The file is about 5 MB long and they’ve pulled roughly 32 MB between them, so I guess they’ve got six different copies of it in the cache right now. I didn’t realise they indexed .tar.gz files ;-)
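
The chunk size and the "six copies" guess can be sanity-checked from the two totals in the log summary. This is only a rough sketch: the hit count and byte count come straight from the log, but the file size is an assumption (the post only says "about 5 MB", so exactly 5 MiB is used here), and the variable names are made up for illustration.

```python
# Figures copied from the log summary line.
total_bytes = 32276480
hits = 245

# Assumption: "about 5 MB" is taken as exactly 5 MiB; the real size is not given.
assumed_file_size = 5 * 1024 * 1024

# Average bytes served per request, expressed in KiB.
avg_chunk_kib = total_bytes / hits / 1024

# How many times the whole file would fit into the bytes transferred.
copies = total_bytes / assumed_file_size

print(f"average request: {avg_chunk_kib:.1f} KiB")
print(f"full copies transferred: {copies:.2f}")
```

Under that assumption the average request comes out just under 129 KiB, which fits the 128K chunk guess, and the total is a little over six full copies of the file.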