I was tracking template hits on my website and noticed what I thought were unusually high counts for my 404 (page not found) template. So I installed an extension that sends me an email from the 404 template.
First, internal redirects don’t pass along the original URL that wasn’t found. However, I figured out the cause and have corrected the problem.
What’s left are a few unexplained URLs, which do show up because they are not part of the EE template hierarchy. Here’s the prototype for the email body:
Page Not Found: {uri_string} by {httpagent}
‘uri_string’ is whatever was sent following my root website (e.g. example.com/uri_string). ‘httpagent’ supposedly identifies the sender’s browser.
Here are a few of the unexplained URLs:
1. Page Not Found: admin/module-builtin.xml by Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
2. Page Not Found: vtigercrm/modules/com_vtiger_workflow/sortfieldsjson.php by
3. Page Not Found: sitemap.php by msnbot/2.0b (+http://search.msn.com/msnbot.htm)
4. Page Not Found: recordings/index.php by Mozilla/5.0 (Windows NT 5.1) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.112 Safari/534.30
5. Page Not Found: phpmyadmin/translators.html by Mozilla/4.0 (compatible;MSIE 6.0; MSIE 5.5; Windows NT 5.1) Opera 7.01 [en]
6. Page Not Found: admin/config.php by Python-urllib/2.4
Some of these might be search bots (3, for example, is almost certainly MSN). But others look like hacking attempts. Note that 2 is anonymous: it sent no user agent at all.
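To sort reports like these, here is a minimal Python sketch that parses one email body line and makes a rough guess at its source. The function name `classify` and the two hint lists are my own assumptions for illustration, not part of EE or the extension, and the lists are far from complete. A user agent is also easy to forge, so a "crawler" result is only a hint.

```python
import re

# Assumed pattern for the email body format "Page Not Found: <uri> by <agent>".
REPORT_RE = re.compile(r"^Page Not Found: (?P<uri>\S+) by ?(?P<agent>.*)$")

# Hypothetical hint lists, for illustration only; extend them to suit your site.
CRAWLER_HINTS = ("msnbot", "googlebot", "bingbot")
PROBE_HINTS = ("phpmyadmin", "vtigercrm", "admin/", "recordings/")


def classify(report):
    """Return 'crawler', 'probe', 'unknown', or 'unparsed' for one report line."""
    match = REPORT_RE.match(report.strip())
    if match is None:
        return "unparsed"
    uri = match.group("uri").lower()
    agent = match.group("agent").strip().lower()
    # A self-identified crawler is checked first, since bots also request missing pages.
    if any(hint in agent for hint in CRAWLER_HINTS):
        return "crawler"
    # A blank user agent, or a path typical of third-party admin tools, suggests a scan.
    if not agent or any(hint in uri for hint in PROBE_HINTS):
        return "probe"
    return "unknown"


if __name__ == "__main__":
    print(classify("Page Not Found: admin/config.php by Python-urllib/2.4"))
```

With these assumed lists, number 3 comes out as a crawler and the other five as probes, which matches my reading of them.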
I’d be interested in your opinions as to what these might be, and what, if anything, should be done.
Thanks! —Jim