ExpressionEngine CMS
Open, Free, Amazing

This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Stopping access to a site - How do people do it?

October 15, 2008 9:33am

  • #1 / Oct 15, 2008 9:33am

    Mark Bowen

    12637 posts

    Hiya,

    Just a very quick question this one. Was wondering if there is some way that people can make their sites not available to sites such as http://builtwith.com?

    It seems that some sites that I perform checks on come back with either 404 or 403 forbidden messages although the sites are perfectly on-line and can be viewed.

    Is there something you can do server-side, perhaps with some .htaccess goodness, that stops these sites from finding out information, or is this something that certain hosts or servers do inherently?

    Thanks for any information on this.

    Best wishes,

    Mark

  • #2 / Oct 15, 2008 12:19pm

    If it has a fixed IP you could block it with .htaccess. I don’t really know what the site is, though. Why would you want to block it? Is it doing something bad?

    Best Regards

    Emily

  • #3 / Oct 15, 2008 3:50pm

    Derek Jones

    7561 posts

  • #4 / Oct 15, 2008 3:56pm

    Mark Bowen

    12637 posts

    ::cough::

    That sounds nasty although through the germs I see what you mean 😉

    Silly me!! I just thought that maybe there was something else that people might be doing but I suppose that would do it too!! 😊

    Thanks Derek.

    Best wishes,

    Mark

  • #5 / Oct 15, 2008 4:01pm

    Derek Jones

    7561 posts

    Yes, and if you utilize the feature to allow it to write to your .htaccess you can block domains from any type of access to your site whatsoever.

  • #6 / Oct 15, 2008 4:07pm

    Mark Bowen

    12637 posts

    Hmm,

    Having a slight problem with this. I tried adding the URL in the control panel after downloading the ExpressionEngine blacklist, and the list wouldn’t update. It’s coming up with a precondition error, which I guess means we have some mod_security setting enabled somewhere on the server.

    To that end I tried adding in the URL into the database table following the same format as the others in there but it doesn’t seem to work as http://builtwith.com is still able to access the site.

    Is there something that I might be doing wrong somewhere?

    Best wishes,

    Mark

  • #7 / Oct 15, 2008 4:14pm

    Derek Jones

    7561 posts

    Did you write to your .htaccess, and verify that it’s there?  The site in question may be making the requests via a different domain or an IP address.  You’ll want to check your logs to see the details of the actual request(s) that a search on builtwith.com triggers.
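
    As a rough illustration of the log check suggested above, here is a minimal Python sketch that counts which hosts and user agents appear in an access log. It assumes the server writes the common "combined" log format; the regular expression and the function name `summarize_clients` are hypothetical helpers for this example, not part of ExpressionEngine or Apache.

```python
import re
from collections import Counter

# Assumption: "combined" access log format, i.e.
# host ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_clients(lines, needle=""):
    """Count (host, user agent) pairs seen in the log lines.

    If `needle` is given, only user agents containing it
    (case-insensitively) are counted. Lines that do not match the
    assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        agent = match.group("agent")
        if needle.lower() in agent.lower():
            counts[(match.group("host"), agent)] += 1
    return counts
```

    Running it over the log right after triggering a lookup on the third-party site should show which host and user agent actually made the request, which is the detail a blacklist entry would need to match.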

  • #8 / Oct 15, 2008 4:24pm

    Mark Bowen

    12637 posts

    Yep, wrote it to the .htaccess file but they are still getting through. Will have to download the server logs and take a look to find out what exactly is making the request.

    Thanks again.

    Best wishes,

    Mark

  • #9 / Oct 15, 2008 6:55pm

    Pascal Kriete

    2589 posts

    They save requests - at least for a little while.  So if you put them in the blacklist, you may still see the saved information.

    Also, the request doesn’t come from their primary domain.  My python webserver logs it as such:

    caesium.lon.periodicnetwork.com - - [15/Oct/2008 17:09:56] "GET / HTTP/1.1" 200 -

    The easiest way to block them is the custom user agent:

    'HTTP_USER_AGENT': 'Mozilla/5.0 (compatible; BuiltWith/0.1; +http://builtwith.com/bot.html)'
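
    A minimal sketch of that user-agent approach, using Python's standard `http.server` since that is what logged the request above. The token list, `is_blocked`, `FilteringHandler`, and `run` are hypothetical names for this example; the only detail taken from the thread is the `BuiltWith` string in the user agent. The header is supplied by the client, so this only stops bots that identify themselves honestly.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: lowercase substrings identifying unwanted clients.
# "builtwith" comes from the user agent quoted above.
BLOCKED_AGENT_TOKENS = ("builtwith",)


def is_blocked(user_agent):
    """Return True if the User-Agent header contains a blocked token."""
    if not user_agent:
        return False
    lowered = user_agent.lower()
    return any(token in lowered for token in BLOCKED_AGENT_TOKENS)


class FilteringHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Refuse matching clients with 403 before serving any content.
        if is_blocked(self.headers.get("User-Agent")):
            self.send_error(403, "Forbidden")
            return
        body = b"Hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Serve on localhost until interrupted (for local experiments only)."""
    HTTPServer(("127.0.0.1", port), FilteringHandler).serve_forever()
```

    Calling `run()` starts the test server; on a production site the same substring check would more likely live in the web server configuration than in application code.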

  • #10 / Oct 16, 2008 12:57am

    trif3cta

    148 posts

    If you can spot their bot, you could also try to block it via robots.txt with a disallow.

    User-agent: bot_name
    Disallow: /

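
    To check how a well-behaved crawler would read those two lines, here is a small sketch using Python's standard `urllib.robotparser`. The helper `build_parser` is hypothetical, and the name passed in stands in for `bot_name`; whether a particular bot honours robots.txt at all, and which name it matches against, is an assumption to verify, because the file is advisory only.

```python
from urllib.robotparser import RobotFileParser


def build_parser(bot_name):
    """Parse a two-line robots.txt that disallows everything for bot_name."""
    rules = [
        f"User-agent: {bot_name}",
        "Disallow: /",
    ]
    parser = RobotFileParser()
    parser.parse(rules)  # parse() takes an iterable of robots.txt lines
    return parser


# Hypothetical bot name, used only to exercise the rules.
parser = build_parser("ExampleBot")
example_bot_allowed = parser.can_fetch("ExampleBot", "/")
other_bot_allowed = parser.can_fetch("SomeOtherBot", "/")
```

    With these rules `example_bot_allowed` is false and `other_bot_allowed` is true, since no other record applies to the second name.
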
  • #11 / Oct 16, 2008 11:31am

    Mark Bowen

    12637 posts

    Hiya,

    Thanks for the info. Periodic Network, yep that came up when I did a ping on their site but wasn’t too sure what I was seeing there. Not been able to download the site logs yet as the server is now playing up for some reason but I’m sure I’ll get at them soon though.

    Thanks for the help.

    Best wishes,

    Mark
