logo
Apache Lounge
Webmasters

 

About Forum Index Downloads Search Register Log in RSS X


Keep Server Online

If you find the Apache Lounge, the downloads and overall help useful, please express your satisfaction with a donation.

or

Bitcoin

A donation makes a contribution towards the costs, the time and effort that's going in this site and building.

Thank You! Steffen

Your donations will help to keep this site alive and well, and continuing building binaries. Apache Lounge is not sponsored.
Post new topic   Forum Index -> Apache View previous topic :: View next topic
Reply to topic   Topic: bingbot+apache end of my rope
Author
captainsubt3xt



Joined: 19 Jun 2014
Posts: 1
Location: United States

PostPosted: Thu 19 Jun '14 17:40    Post subject: bingbot+apache end of my rope Reply with quote

Hey folks,

I have a shared cPanel (Easy Apache 3.18.17) box running Apache/2.2.24

For the life of me, BingBot is unable to crawl my site. It keeps getting 403'd. Google is fine. CSF and LFD are employed, but the problem persists when they are disabled. Could this be some cPanel eccentricity of which I'm not aware? Can anyone shed some light on my stupidity?

.htaccess only contains a few redirects

robots.txt
Code:
# Default .htaccess file for all new customers.

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Allow: /

User-agent: bingbot
Allow: /

# The lines below are for specific bots and possible Malware
# distribution User-agents.  You should leave them listed in this
# .htaccess file.  Feel free to append to it, but removing these
# lines is NOT recommended.

User-agent: Mediapartners-Google*
Disallow:

User-agent: BotRightHere
Disallow: /

User-agent: larbin
Disallow: /

User-agent: b2w/0.1
Disallow: /

User-agent: Copernic
Disallow: /

User-agent: psbot
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: NetMechanic
Disallow: /

User-agent: URL_Spider_Pro
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: ProWebWalker
Disallow: /

User-agent: CheeseBot
Disallow: /

User-agent: LNSpiderguy
Disallow: /

User-agent: Alexibot
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: MIIxpc
Disallow: /

User-agent: Telesoft
Disallow: /

User-agent: Website Quester
Disallow: /

User-agent: WebZip
Disallow: /

User-agent: moget/2.1
Disallow: /

User-agent: WebZip/4.0
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: Mister PiX
Disallow: /

User-agent: WebAuto
Disallow: /

User-agent: TheNomad
Disallow: /

User-agent: WWW-Collector-E
Disallow: /

User-agent: RMA
Disallow: /

User-agent: libWeb/clsHTTP
Disallow: /

User-agent: asterias
Disallow: /

User-agent: httplib
Disallow: /

User-agent: turingos
Disallow: /

User-agent: spanner
Disallow: /

User-agent: InfoNaviRobot
Disallow: /

User-agent: Harvest/1.5
Disallow: /

User-agent: Bullseye/1.0
Disallow: /

User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
Disallow: /

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /

User-agent: CherryPickerSE/1.0
Disallow: /

User-agent: CherryPickerElite/1.0
Disallow: /

User-agent: WebBandit/3.50
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Microsoft URL Control - 5.01.4511
Disallow: /

User-agent: DittoSpyder
Disallow: /

User-agent: Foobot
Disallow: /

User-agent: SpankBot
Disallow: /

User-agent: BotALot
Disallow: /

User-agent: lwp-trivial/1.34
Disallow: /

User-agent: lwp-trivial
Disallow: /

User-agent: BunnySlippers
Disallow: /

User-agent: Microsoft URL Control - 6.00.8169
Disallow: /

User-agent: URLy Warning
Disallow: /

User-agent: Wget/1.6
Disallow: /

User-agent: Wget/1.5.3
Disallow: /

User-agent: Wget
Disallow: /

User-agent: LinkWalker
Disallow: /

User-agent: cosmos
Disallow: /

User-agent: moget
Disallow: /

User-agent: hloader
Disallow: /

User-agent: humanlinks
Disallow: /

User-agent: LinkextractorPro
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Mata Hari
Disallow: /

User-agent: LexiBot
Disallow: /

User-agent: Web Image Collector
Disallow: /

User-agent: The Intraformant
Disallow: /

User-agent: True_Robot/1.0
Disallow: /

User-agent: True_Robot
Disallow: /

User-agent: BlowFish/1.0
Disallow: /

User-agent: JennyBot
Disallow: /

User-agent: MIIxpc/4.2
Disallow: /

User-agent: BuiltBotTough
Disallow: /

User-agent: ProPowerBot/2.14
Disallow: /

User-agent: BackDoorBot/1.0
Disallow: /

User-agent: toCrawl/UrlDispatcher
Disallow: /

User-agent: suzuran
Disallow: /

User-agent: TightTwatBot
Disallow: /

User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /

User-agent: VCI
Disallow: /

User-agent: Szukacz/1.4
Disallow: /

User-agent: Openfind data gatherer
Disallow: /

User-agent: Openfind
Disallow: /

User-agent: Xenu's Link Sleuth 1.1c
Disallow: /

User-agent: Xenu's
Disallow: /

User-agent: Zeus
Disallow: /

User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /

User-agent: RepoMonkey
Disallow: /

User-agent: Microsoft URL Control
Disallow: /

User-agent: Openbot
Disallow: /

User-agent: URL Control
Disallow: /

User-agent: Zeus Link Scout
Disallow: /

User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /

User-agent: Webster Pro
Disallow: /

User-agent: EroCrawler
Disallow: /

User-agent: LinkScan/8.1a Unix
Disallow: /

User-agent: Keyword Density/0.9
Disallow: /

User-agent: Kenjin Spider
Disallow: /

User-agent: Iron33/1.0.2
Disallow: /

User-agent: Bookmark search tool
Disallow: /

User-agent: GetRight/4.2
Disallow: /

User-agent: FairAd Client
Disallow: /

User-agent: Gaisbot
Disallow: /

User-agent: Aqua_Products
Disallow: /

User-agent: Radiation Retriever 1.1
Disallow: /

User-agent: Flaming AttackBot
Disallow: /


Access log from home directory in question:
Code:
157.55.33.38 - - [19/Jun/2014:10:27:36 -0500] "GET / HTTP/1.1" 403 - "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"



Apache error_log:
Code:
[Thu Jun 19 10:27:36 2014] [error] [client 157.55.33.38] client denied by server configuration: /home/downtown/public_html/
[Thu Jun 19 10:27:36 2014] [error] [client 157.55.33.38] client denied by server configuration: /home/downtown/public_html/403.shtml



httpd.conf for that virtual host

Code:
   
    UserDir disabled
    UserDir enabled downtown
    <Directory /home/downtown/public_html>
      Order allow,deny
      Allow from all
    </Directory>
    <IfModule mod_suphp.c>
        suPHP_UserGroup downtown downtown
    </IfModule>
    <IfModule concurrent_php.c>
        php4_admin_value open_basedir "/home/downtown:/usr/lib/php:/usr/php4/lib/php:/usr/local/lib/php:/usr/local/php4/lib/php:/tmp"
        php5_admin_value open_basedir "/home/downtown:/usr/lib/php:/usr/local/lib/php:/tmp"
    </IfModule>
    <IfModule !concurrent_php.c>
        <IfModule mod_php4.c>
            php_admin_value open_basedir "/home/downtown:/usr/lib/php:/usr/php4/lib/php:/usr/local/lib/php:/usr/local/php4/lib/php:/tmp"
        </IfModule>
        <IfModule mod_php5.c>
            php_admin_value open_basedir "/home/downtown:/usr/lib/php:/usr/local/lib/php:/tmp"
        </IfModule>
        <IfModule sapi_apache2.c>
            php_admin_value open_basedir "/home/downtown:/usr/lib/php:/usr/php4/lib/php:/usr/local/lib/php:/usr/local/php4/lib/php:/tmp"
        </IfModule>
    </IfModule>
    <IfModule !mod_disable_suexec.c>
        <IfModule !mod_ruid2.c>
            SuexecUserGroup downtown downtown
        </IfModule>
    </IfModule>
    <IfModule mod_ruid2.c>
        RMode config
        RUidGid downtown downtown
    </IfModule>
    <IfModule itk.c>
        # For more information on MPM ITK, please read:
        #   http://mpm-itk.sesse.net/
        AssignUserID downtown downtown
    </IfModule>
                   


modsec.conf
Code:
<IfModule mod_security2.c>
SecRuleEngine On
# See http://www.modsecurity.org/documentation/ModSecurity-Migration-Matrix.pdf
#  "Add the rules that will do exactly the same as the directives"
# SecFilterCheckURLEncoding On
# SecFilterForceByteRange 0 255
SecAuditEngine RelevantOnly
SecAuditLog logs/modsec_audit.log
SecDebugLog logs/modsec_debug_log
SecDebugLogLevel 0
SecDefaultAction "phase:2,deny,log,status:406"
SecRule MULTIPART_STRICT_ERROR "!@eq 0" "phase:2,t:none,log,deny,status:44,msg:'Multipart request body failed strict validation: PE %{REQBODY_PROCESSOR_ERROR}, BQ %{MULTIPART_BOUNDARY_QUOTED}, BW %{MULTIPART_BOUNDARY_WHITESPACE}, DB %{MULTIPART_DATA_BEFORE}, DA %{MULTIPART_DATA_AFTER}, HF %{MULTIPART_HEADER_FOLDING}, LF %{MULTIPART_LF_LINE}, SM %{MULTIPART_MISSING_SEMICOLON}, IQ %{MULTIPART_INVALID_QUOTING}, IP %{MULTIPART_INVALID_PART}, IH %{MULTIPART_INVALID_HEADER_FOLDING}, FL %{MULTIPART_FILE_LIMIT_EXCEEDED}',id:1234123456"
SecRule REMOTE_ADDR "^127.0.0.1$" nolog,allow,id:1234123455
Include "/usr/local/apache/conf/modsec2.user.conf"
</IfModule>



Any assistance would be appreciated.

Thanks in advance,

captainsubt3xt[/b]
Back to top
James Blond
Moderator


Joined: 19 Jan 2006
Posts: 7288
Location: Germany, Next to Hamburg

PostPosted: Fri 20 Jun '14 12:22    Post subject: Reply with quote

Maybe mod security it is blocking it. Check your audit log.
Back to top


Reply to topic   Topic: bingbot+apache end of my rope View previous topic :: View next topic
Post new topic   Forum Index -> Apache