Apache Lounge

Topic: Question about apache, found a way to ban Bad Bot, crawlers
Author
nightfly13



Joined: 07 Dec 2019
Posts: 2
Location: Portugal

Posted: Sun 08 Dec '19 1:41    Post subject: Question about apache, found a way to ban Bad Bot, crawlers

Dear all...

My system is Ubuntu Server 14.04 with Apache 2.4 and ISPConfig 3.

Over the last month I noticed a large amount of traffic coming from bad bots such as DotBot, SeznamBot, AhrefsBot and other fellows...

After searching Google, I found that we can block them with the following code...

Go to /etc/apache2/apache2.conf and add:
Code:

# Flag requests whose User-Agent starts with a known bad-bot name
SetEnvIfNoCase User-Agent "^DotBot" bad_user
SetEnvIfNoCase User-Agent "^AhrefsBot" bad_user

<Directory />
       <RequireAll>
              # Allow everyone except requests flagged above
              Require all granted
              Require not env bad_user
       </RequireAll>
</Directory>


But I think the problem is that ISPConfig 3 sets up a rewrite from HTTP to HTTPS, and this seems to be the main reason the above code does nothing...
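If that is the case, one workaround might be to express the ban with mod_rewrite itself and place it before the redirect rule, so the 403 is sent before the request is bounced to HTTPS. This is only an untested sketch (the bot names come from my logs; the commented redirect line is just an example of what ISPConfig might generate):

```apache
RewriteEngine On

# Answer flagged bots with 403 Forbidden *before* any redirect fires
RewriteCond %{HTTP_USER_AGENT} (DotBot|AhrefsBot|SeznamBot) [NC]
RewriteRule ^ - [F,L]

# ISPConfig's HTTP-to-HTTPS redirect would come after, e.g.:
# RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```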

One of the experiments I made is the following...

I have, for example, 2 sites plus the default Apache page in /var/www:

/var/www/html - default Apache page
/var/www/site1.com - site1
/var/www/site2.com - site2

I tried a small experiment to block all sites on my server; for that I placed the following in the default apache2.conf in /etc/apache2/:
Code:

<Directory /var/www>
        Options Indexes FollowSymLinks
        AllowOverride All
        Require all denied
</Directory>


With this I expected to block all sites on my server, but only the default Apache page is blocked: when I try to access it by typing the IP in the browser I get the message

Forbidden
You don't have permission to access / on this server.

But the other sites (site1.com and site2.com) I can still access with no issues...
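My guess (just a guess; the paths below are illustrative, not copied from my server) is that ISPConfig generates a more specific <Directory> section inside each site's vhost, and Apache merges the more specific section last, so its access rules win over the global ones:

```apache
# Global denial in apache2.conf:
<Directory /var/www>
    Require all denied
</Directory>

# A more specific section in the site's vhost (e.g. generated by
# ISPConfig) is merged later and overrides the denial above:
<Directory /var/www/site1.com>
    Require all granted
</Directory>
```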

Because of this, I believe the problem may be the HTTP to HTTPS rewrite set up by ISPConfig...

How can I make the above code work for the sites with the HTTPS rewrite?

I appreciate any help...

Best Regards
João Carrolo
Steffen
Moderator


Joined: 15 Oct 2005
Posts: 2768
Location: Hilversum, NL, EU

Posted: Mon 09 Dec '19 14:47    Post subject:

See other methods at https://www.apachelounge.com/viewtopic.php?t=5438
nightfly13



Joined: 07 Dec 2019
Posts: 2
Location: Portugal

Posted: Mon 09 Dec '19 23:00    Post subject: Not Working...

Thanks for your reply...

I tried all the examples in the link and none of them seems to work...

If I type:
Code:

curl -A "DotBot" example.com


I always get a 301 Moved Permanently, and this is not what I'm expecting... I expect a 403 Forbidden response...

I think something is wrong and it is related to HTTPS: the User-Agent header arrives with the plain HTTP request, but ISPConfig 3's rewrite answers that request with the redirect to HTTPS, so it looks like the redirect is sent before my ban rules ever fire...
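If the ban rules are only effective on the plain-HTTP side, then the :80 vhost answers the bot with the 301 and the :443 vhost, where the real content lives, never denies it. Something like this in the HTTPS vhost might help (untested sketch; the ServerName and SSL lines are placeholders, not my real ISPConfig files):

```apache
<VirtualHost *:443>
    ServerName site1.com
    # ... SSL directives generated by ISPConfig ...

    # Flag the bad bots on the HTTPS side as well
    SetEnvIfNoCase User-Agent "^DotBot"    bad_user
    SetEnvIfNoCase User-Agent "^AhrefsBot" bad_user

    <Location "/">
        <RequireAll>
            Require all granted
            Require not env bad_user
        </RequireAll>
    </Location>
</VirtualHost>
```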

This is one of the lines in one of my logfiles:
Code:

xx.xx.xxx.xxx - - [09/Dec/2019:20:39:04 +0000] "GET /search/sShowAs,gallery/category,130/sOrder,dt_pub_date/iOrderType,desc HTTP/1.1" 301 577 "-" "Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)"
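As a side note, this is how I count the bot hits in a log (a quick sketch; the sample log written to /tmp here just stands in for the real access log):

```shell
# Build a tiny sample access log (stand-in for the real one)
log=/tmp/sample_access.log
cat > "$log" <<'EOF'
1.2.3.4 - - [09/Dec/2019:20:39:04 +0000] "GET / HTTP/1.1" 301 577 "-" "Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)"
5.6.7.8 - - [09/Dec/2019:20:40:11 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; DotBot/1.1)"
EOF

# Extract each bad-bot name and count occurrences
grep -oiE 'AhrefsBot|DotBot|SeznamBot' "$log" | sort | uniq -c
```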




As I said, one of the experiments I made is...

I have, for example, 2 sites plus the default Apache page, all inside /var/www:

/var/www/html - default Apache page
/var/www/site1.com - site1
/var/www/site2.com - site2

If I try to block all sites with:
Code:

<Directory /var/www>
        Options Indexes FollowSymLinks
        AllowOverride All
        Require all denied
</Directory>


only the default Apache site is blocked with the message 403 Forbidden, but site1 and site2 remain accessible...

The code above should block all sites inside /var/www.

A small detail... I'm behind a NAT...

Best Regards
João Carrolo
James Blond
Moderator


Joined: 19 Jan 2006
Posts: 6684
Location: Germany, Next to Hamburg

Posted: Wed 11 Dec '19 12:18    Post subject: Re: Not Working...

nightfly13 wrote:

If i type:
Code:

curl -A "DotBot" example.com




Correct domain name? Is the www missing? Does it maybe change to HTTPS?

