Date:      Thu, 14 Aug 2003 11:30:14 -0400
From:      Michael Conlen <meconlen@obfuscated.net>
To:        "Jack L. Stone" <jackstone@sage-one.net>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Script help needed please
Message-ID:  <3F3BAB06.1060109@obfuscated.net>
In-Reply-To: <3.0.5.32.20030814084949.012f40e8@sage-one.net>
References:  <3.0.5.32.20030814084949.012f40e8@sage-one.net>

Jack,

You can set up Apache to deny access to clients that identify themselves
as that browser. The catch is that this is easy to work around by
changing the browser (User-Agent) string. If they are desperate enough to
do that after you deny access to HTTRACK and similar clients, you can
place a link on the site that no human would follow, pointing at a CGI
that adds a firewall rule denying the requesting address. You probably
want the CGI to return some data and wait a bit, so the user can't easily
figure out which URL is killing their access.
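
To make the trap idea concrete, here is a rough Python sketch of what such a CGI could do. It is only an illustration under assumptions of mine: the function names (build_deny_command, handle_trap_request) and rule number 500 are made up, and the sketch only builds the ipfw command rather than running it, since the web server user normally cannot run ipfw and the command would have to be handed to something privileged.

```python
import ipaddress
import time

# Assumption: rule number 500 is unused and sits before your allow rules.
RULE_NUMBER = 500


def build_deny_command(remote_addr, rule_number=RULE_NUMBER):
    """Return the ipfw argv that would block remote_addr, or None if invalid."""
    try:
        # Parsing the address first keeps shell metacharacters out of the rule.
        ip = ipaddress.ip_address(remote_addr.strip())
    except ValueError:
        return None
    return ["ipfw", "add", str(rule_number), "deny", "ip",
            "from", str(ip), "to", "any"]


def handle_trap_request(environ, delay_seconds=0.0):
    """Handle a hit on the hidden link; return (argv_or_None, cgi_response)."""
    argv = build_deny_command(environ.get("REMOTE_ADDR", ""))
    # Stall, then send a harmless page, so the block is hard to tie to this URL.
    time.sleep(delay_seconds)
    response = ("Content-Type: text/html\r\n\r\n"
                "<html><body><p>Loading...</p></body></html>\n")
    return argv, response
```

In a real CGI you would call handle_trap_request(os.environ, delay_seconds=5) or similar, print the response, and pass the returned argv to whatever privileged helper you trust to apply it.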

You can also state on your website that users are not allowed to access
the site with non-interactive browsers. Then, when you find one, you send
a nastygram to their ISP, notify them that continued abuse could be a
crime under the Computer Fraud and Abuse Act (if you and they are in the
US), and let the ISP take care of it.

--
Michael Conlen

Jack L. Stone wrote:

>Server Version: Apache/1.3.27 (Unix) FrontPage/5.0.2.2510 PHP/4.3.1
>The above is typical of the servers in use, and with csh shells employed,
>plus IPFW.
>
>My apologies for the length of this question, but the background seems
>necessary for the question to make sense. I have kept it as brief as I can.
>
>The problem:
>We have several servers that provide online reading of technical articles,
>and each has several hundred MB to a GB of content.
>
>When we started providing the articles 6-7 years ago, folks used browsers
>to read the articles. Now, the trend has become a more lazy approach and
>there is an increasing use of those download utilities which can be left
>unattended to download entire web sites taking several hours to do so.
>Multiply this by a number of similar downloads and there goes the
>bandwidth, denying those other normal online readers the speed needed for
>loading and browsing in the manner intended. Several hundred will be
>reading at a time and several 1000 daily.
>
>Further, those download utilities do not discriminate on the files
>downloaded unless the user sets them to exclude certain types of files they
>don't need for the articles. Most, if not all, don't bother to set the
>parameters. They just turn them loose and go about their day. The result is
>essentially a DoS for normal readers, who notice the slowdown, though there
>is no malice behind it.
>
>This method downloads a tremendous amount of unnecessary content. Some
>downloaders have been contacted to stop (if we spot an email address from a
>login) and in response they simply weren't aware of the problems they were
>making and agreed to at least spread downloads over longer periods of time.
>I can live with that.
>
>A possible solution?
>Now, my question: Is it possible to write a script that can constantly scan
>the Apache logs to look for certain footprints of those downloaders,
>perhaps the names, like "HTTRACK", being one I see a lot. Whenever I see
>one of those sessions, I have been able to abort them by adding a rule to
>the firewall to deny the IP address access to the server. This aborts the
>downloading, but I have seen the attempts continue constantly for a day or
>two, confirming unattended downloads.
>
>Thus, if the script could spot an "offender" and then perhaps make use of
>the firewall to add a rule containing the offender's IP address and then
>flush to reset the firewall, this would at least abort the download and
>free up the bandwidth (I already have a script that restarts the firewall).
>
>Is this possible and how would I go about it....???
>
>Many thanks for any ideas on this!
>
>Best regards,
>Jack L. Stone,
>Administrator
>
>SageOne Net
>http://www.sage-one.net
>jackstone@sage-one.net
>
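
The log-scanning idea in the quoted message is workable too. Below is a rough Python sketch, with several assumptions of mine: the access log is in Apache's "combined" format (the plain "common" format does not record the User-Agent), the function names and the starting rule number 500 are made up, and the script only prints ipfw commands rather than running them. Adding a rule with "ipfw add" takes effect as soon as the rule is in place, so a flush and reload should not be needed for this.

```python
import ipaddress
import re

# Combined format: host ident user [time] "request" status bytes "referer" "agent"
# Assumption: requests contain no escaped double quotes; such lines are skipped.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: lowercase substrings that mark offline downloaders.
BAD_AGENTS = ("httrack",)


def find_offenders(lines, bad_agents=BAD_AGENTS):
    """Return the set of client hosts whose User-Agent matches a bad substring."""
    offenders = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        agent = match.group("agent").lower()
        if any(bad in agent for bad in bad_agents):
            offenders.add(match.group("host"))
    return offenders


def deny_commands(hosts, first_rule=500):
    """Build one ipfw deny command per valid IP address, in sorted order."""
    commands = []
    for host in sorted(hosts):
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            continue  # hostname (HostnameLookups on) or junk: skip it
        rule = first_rule + len(commands)
        commands.append(f"ipfw add {rule} deny ip from {ip} to any")
    return commands


if __name__ == "__main__":
    import sys
    for command in deny_commands(find_offenders(sys.stdin)):
        print(command)
```

Something like "tail -f access_log | python scan.py" is not quite right for this sketch, since it reads all input before printing. Running it from cron against the current log, and remembering which addresses were already blocked, would be a simpler starting point. The same caveat applies as with the Apache deny: anyone who changes the browser string slips past it.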