
Friday, June 5, 2009

How to block robots... before they hit robots.txt - a la mod_security

As many of you know, robots (in their many forms) can be quite pesky when it comes to crawling your site, indexing things that you don't want indexed. Yes, there is the standard of putting a robots.txt in your webroot, but that is often not very effective. This is due to a number of factors... not the least of which is that robots tend to be poorly written to begin with and thus simply ignore the robots.txt anyway.

This comes up because a friend of mine who runs a big e-com site recently asked me... "J, how can I block everything from these robots? I simply don't want them crawling our site." My typical response to this was "you know that you will then block the search engines and keep them from indexing your site"... to wit: "yes, none of our sales are organic, they all come from referring partners and affiliate programs"... That's all that I needed to know... as long as it doesn't break anything that they need, heh.

After putting some thought into it, and deciding that there was no really easy way to do this on a firewall, I decided that the best way to do it was to create some mod_security rules that looked for known robots and returned a 404 whenever any such monster hit the site. This made the most sense because they are running an Apache reverse proxy in front of their web application servers with mod_security (and some other fun).

A quick search on the internet turned up the robotstxt.org site, which contains a listing (http://www.robotstxt.org/db/all.txt) of quite a few common robots. Looking through this file, all that I really cared about was the robot-useragent value. As such, I quickly whipped up the following Perl that automatically creates a file named modsecurity_crs_36_all_robots.conf. Simply place this file in the appropriate path (for me /usr/local/etc/apache/Includes/mod_security2/) and restart your Apache... voila... now (for the most part) only real users can browse your webserver. I'll not get into other complex setups, but you could also do this on a per-directory level from your httpd.conf and mimic robots.txt (except the robots can't ignore the 404, muahahaha).
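For reference, a rough, untested sketch of the Apache side of things; the include path is the one from my setup above, and the /members/ path in the <Location> example is purely hypothetical, so adjust to your own layout:

#####################Begin Apache Config#######################
# Load the generated rules globally (only if mod_security2 is actually loaded)
<IfModule security2_module>
    Include /usr/local/etc/apache/Includes/mod_security2/modsecurity_crs_36_all_robots.conf
</IfModule>

# ...or scope the same rules to a single path, mimicking a robots.txt Disallow
<Location /members/>
    Include /usr/local/etc/apache/Includes/mod_security2/modsecurity_crs_36_all_robots.conf
</Location>
#####################End Apache Config#######################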

#####################Begin Perl#######################
#!/usr/bin/perl

##
## Quick little routine to pull the user-agent string out of the
## all.txt file from the robots project, with the intention of creating
## regular expression block rules so that they can no longer crawl
## against the rules!
## Copyright JJ Cummings 2009
## cummingsj@gmail.com
##

use strict;
use warnings;
use File::Path;

my ($line,$orig);
my $c = 1000000;
my $file = "all.txt";
my $write = "modsecurity_crs_36_all_robots.conf";
open (DATA,"<$file") or die "Unable to open $file: $!\n";
my @lines = <DATA>;
close (DATA);

open (WRITE,">$write") or die "Unable to open $write for writing: $!\n";
print WRITE "#\n#\tQuick list of known robots that are parsable via http://www.robotstxt.org/db/all.txt\n";
print WRITE "#\tgenerated by robots.pl written by JJ Cummings \n\n";
foreach $line (@lines){
	if ($line=~/robot-useragent:/i){
		# strip the field name and surrounding whitespace
		$line=~s/robot-useragent://;
		$line=~s/^\s+//;
		$line=~s/\s+$//;
		$orig=$line;
		# escape regex metacharacters so the user-agent string can be used as a pattern
		$line=~s/\//\\\//g;
		#$line=~s/\s/\\ /g;
		$line=~s/\./\\\./g;
		$line=~s/\!/\\\!/g;
		$line=~s/\?/\\\?/g;
		$line=~s/\$/\\\$/g;
		$line=~s/\+/\\\+/g;
		$line=~s/\|/\\\|/g;
		$line=~s/\{/\\\{/g;
		$line=~s/\}/\\\}/g;
		$line=~s/\(/\\\(/g;
		$line=~s/\)/\\\)/g;
		$line=~s/\*/\\\*/g;
		# replace literal capital 'X' characters with a regex wildcard ('.')
		$line=~s/X/\./g;
		$line=lc($line);
		chomp($line);
		# skip empty entries and placeholder values such as "no" / "none"
		if (($line ne "") && ($line !~ "no") && ($line !~ /none/i)) {
			$c++;
			$orig=~s/'//g;
			$orig=~s/`//g;
			chomp($orig);
			print WRITE "SecRule REQUEST_HEADERS:User-Agent \"$line\" \\\n";
			print WRITE "\t\"phase:2,t:none,t:lowercase,deny,log,auditlog,status:404,msg:'Automated Web Crawler Block Activity',id:'$c',tag:'AUTOMATION/BOTS',severity:'2'\"\n";
		}
	}
}
close (WRITE);
$c=$c-1000000;
print "$c total robots\n";


#####################End Perl#######################

To use the above, you have to save the all.txt file to the same directory as the perl... and of course have +w permissions so that the perl can create the new file. This is a pretty basic routine... I wrote it in about 5 minutes (with a few extra minutes for tweaking the ruleset output format, displayed below). So please, feel free to modify / enhance / whatever to fit your own needs as you best see fit. **yes, I did shrink it so that it would format correctly here**
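In case it helps, the end-to-end run looks roughly like this on my FreeBSD box (I saved the script as robots.pl; swap fetch for wget/curl and adjust paths as needed):

#####################Begin Shell#######################
fetch http://www.robotstxt.org/db/all.txt
perl robots.pl
cp modsecurity_crs_36_all_robots.conf /usr/local/etc/apache/Includes/mod_security2/
apachectl graceful
#####################End Shell#######################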

#####################Begin Example Output#######################
SecRule REQUEST_HEADERS:User-Agent "abcdatos botlink\/1\.0\.2 \(test links\)" \
"phase:2,t:none,t:lowercase,deny,log,auditlog,status:404,msg:'Automated Web Crawler Block Activity',id:'1000001',tag:'AUTOMATION/BOTS',severity:'2'"
SecRule REQUEST_HEADERS:User-Agent "'ahoy\! the homepage finder'" \
"phase:2,t:none,t:lowercase,deny,log,auditlog,status:404,msg:'Automated Web Crawler Block Activity',id:'1000002',tag:'AUTOMATION/BOTS',severity:'2'"
SecRule REQUEST_HEADERS:User-Agent "alkalinebot" \
"phase:2,t:none,t:lowercase,deny,log,auditlog,status:404,msg:'Automated Web Crawler Block Activity',id:'1000003',tag:'AUTOMATION/BOTS',severity:'2'"
SecRule REQUEST_HEADERS:User-Agent "anthillv1\.1" \
"phase:2,t:none,t:lowercase,deny,log,auditlog,status:404,msg:'Automated Web Crawler Block Activity',id:'1000004',tag:'AUTOMATION/BOTS',severity:'2'"
SecRule REQUEST_HEADERS:User-Agent "appie\/1\.1" \
"phase:2,t:none,t:lowercase,deny,log,auditlog,status:404,msg:'Automated Web Crawler Block Activity',id:'1000005',tag:'AUTOMATION/BOTS',severity:'2'"

#####################End Example Output#######################

And that, folks, is how you destroy robots that you don't like... you can modify the status code that is returned to fit whatever suits you best... 403, 404...
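For instance, returning a 403 instead of a 404 is just a one-token change to the status action in the generated rule:

#####################Begin Example#######################
SecRule REQUEST_HEADERS:User-Agent "alkalinebot" \
"phase:2,t:none,t:lowercase,deny,log,auditlog,status:403,msg:'Automated Web Crawler Block Activity',id:'1000003',tag:'AUTOMATION/BOTS',severity:'2'"
#####################End Example#######################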

Cheers,
JJC

Thursday, October 18, 2007

HeX Live 1.0 Release

After 6 months of heavy development and debugging, I am pleased to announce the release of the HeX LiveCD 1.0. What is HeX Live? HeX Live is the world's first and foremost Network Security Monitoring & Network Based Forensics liveCD. The intent is to provide a wide array of highly usable tools in a pre-packaged format that the analyst can use to investigate and monitor real-time network activity, whether it is security related or in the course of reviewing traffic to determine sources of bandwidth over-utilization and so on...

This will be the final major release of the HeX LiveCD until the release of FreeBSD 7.0 Rel, pending of course that no major bugs are located in HeX 1.0R. If any major bugs are found, then a bug-fixed HeX will be released prior to FreeBSD 7.0 Rel.

For a detailed list of the applications that can be found on HeX Live 1.0R, check out the actual project at rawpacket.org.

I have also included in this posting the CD covers that were created by vickz... fantastic work, man! You can download the HeX LiveCD 1.0R from the following locations:

  1. US Server (East Coast) | MD5 | SHA256 | User Guide
  2. Malaysia Server | MD5 | SHA256 | User Guide

I will try to get some decent screenshots posted soon so that everyone can see just how slick the HeX LiveCD 1.0R really is. I would also suggest that you download it and play with it. There are a good number of tools on here for packet monkeys of all ages and skill levels to have a good old time!

I'll leave it at that for now, and again would like to thank the community for their support and feedback throughout the development process of this tool.

Shout to Geek00l for organizing everything and kicking some a$$!
Shout to ch4flgs_ and zarul for everything!
Shout to all others involved in this project (esp for putting up with me)

Cheers,
JJC

Tuesday, July 24, 2007

COX Communications HiJacking DNS

Recently, while perusing the interweb, I came across the following article: "ISP Seen Breaking Internet Protocol to Fight Zombie Computers". The short of this article is that Cox Communications is attempting to remove bots from customers' PCs by redirecting infected systems (by way of hijacked DNS records) to a c&c server that they control, from which they issue standard bot uninstall commands to said bots. While I think that this is conceptually a good idea, I foresee several issues with it.

By design, bots are built with some level of security concerning who can issue commands to them, as noted in my previous blog post about the disassembly of the RxBot, not to mention the differing command sets that are built into them. Couple this with the new Fast-Flux Service Networks that we are starting to see, and this method that Cox is attempting becomes an all but futile effort.

I am also curious where they are obtaining their list of c&c servers. Perhaps from the c&c list that Shadowserver.org maintains, or from another source? How do they filter out good IRC traffic from bad IRC traffic on public IRC servers that may have been listed as a c&c in addition to being legitimate IRC servers? From the looks of the article, they don't, and this poses a problem by way of blocking legitimate IRC traffic for those who connect to those servers.

A brief list of commands issued:
[INFO] Channel view for “#martian_” opened.
-->| YOU (Drew) have joined #martian_
=-= Mode #martian_ +nt by localhost.localdomain
=-= Topic for #martian_ is “.bot.remove”
=-= Topic for #martian_ was set by Marvin_ on Monday, July 23, 2007 9:50:03 AM
=-= Topic for #martian_ is “.remove”
=-= Topic for #martian_ was set by Marvin_ on Monday, July 23, 2007 9:50:03 AM
=-= Topic for #martian_ is “.uninstall”
=-= Topic for #martian_ was set by Marvin_ on Monday, July 23, 2007 9:50:03 AM
=-= Topic for #martian_ is “!bot.remove”
=-= Topic for #martian_ was set by Marvin_ on Monday, July 23, 2007 9:50:03 AM
=-= Topic for #martian_ is “!remove”
=-= Topic for #martian_ was set by Marvin_ on Monday, July 23, 2007 9:50:03 AM
=-= Topic for #martian_ is “!uninstall”
=-= Topic for #martian_ was set by Marvin_ on Monday, July 23, 2007 9:50:03 AM

.bot.remove
.remove
.uninstall
!bot.remove
!remove

I would also like to review their customer agreement and see if it indeed gives them the authorization to remove files / uninstall things from the end user's computer. Granted, the goal is to remove malware; but what if I have been infected by just such malware and need to glean some information, such as what exactly was exfiltrated from my system? What if I am a business owner and my system contains information that is sensitive to myself, my business, or my clients, and I need to know what data was exfiltrated from my network so that I know what corrective or legal measures need to be taken?


All of this said, they also did not notify anyone that they were effectively hijacking DNS records. This gets back to my second point concerning legitimate IRC traffic, which was obviously interrupted enough to prompt an investigation into the matter. That investigation is what led to the discovery of the hijacking; more here: http://www.exstatica.net/hijacked/

To my mind, the concept was an interesting one, albeit ineffective, but the execution was absurd, from unauthorized software removal down to DNS hijacking. It makes you wonder what else they are doing that has not yet been discovered.

Cheers,
JJC

Wednesday, July 18, 2007

RxBot

Recently one of my clients became infected with the RxBot. I was able to detect it using Snort 2.6.1.4 on a FreeBSD 6.2 system running the latest rules from bleedingthreats.net. That being said, the issue did not originally manifest as bot or c&c traffic, but as a TCP:3306 / MySQL worm scan and propagation attempt.

Specifically, it was sid 1:2001689 and sid 1:2404003 that first alerted us to the issue, using the aforementioned system with BASE and Sguil. Further research down the line revealed IRC commands on non-standard ports... as found in the bleeding-attack_response.rules.

Without getting into the nitty gritty of the whole thing, disassembly of the bot revealed it to be an RxBot with the following characteristics.
Some of the bot commands and other findings:

auth, logout, wget, port, stop, stats, threads, procs, open, godie, reboot, nick, join, part, http, tftp, rndnick, secure, unsecre, httpstop, logstop, ftfpstop, procsstop, securestop, reconnect, disconnect, quit, status, botid, aliases, clearlog, testdlls, getclip, flusharp, flushdns, crash, killthreads, prefix, server, killproc, killid, delete, list, mirc, read, gethost, addalias, action, cycle, mode, repeat, delay, execute, rename, httpcon, upload, pstore.

Once the bot has found a vulnerable MySQL server, it creates a database called 'clown' and dumps a base64-encoded file into it. The file is then extracted to clown.dll in c:\windows\system32.

This means it's a self-contained spreader that doesn't need to create additional network connections to spread.

If that fails, it will also use SQL xp_cmdshell commands to tftp or ftp the binary from another host.

Over 200 passwords are hardcoded into the binary, which it uses when connecting to both SQL servers and SMB shares. Some of those passwords:

staff, teacher, student, intranet, main, winpass, blank, office, control, nokia, siemens, compaq, dell, cisco, oracle, orainstall, sqlpassoainstall, db1234, databasepassword, data, databasepass, dbpassword, dbpass, access, database, domainpassword, domain, domainpass, hello, hell, backup, technical, loginpass, login, mary, kate, george, eric......etc.

Channels it sends traffic to:

#nBot-udf pass
#infected
#patch
##sniff##
##keylog##
#cracked
#vnc
#lan
##full##
#dbot
#1
#2
#3
#4
#5
#rose
##dns
#edoo
#dns
#miBot
#MYSQL#
#moh
#sql
#db0t
#nbot-3306
#dbot
##asn
#psyBNC
##final
#final#
#stable
#gecko
#mbot
##mBot
#own#
#vBot
#vCal
##yb
#nBot
#yahoo
#miBot
#rx#
#x1
#x2
#sqltest

Some file drops:
c:\cmd.exe
cdmd.exe
dbot.exe
fileWin.exe
nig.exe
windowsVNC.exe
C:\ffd.exe
nrose.exe
c:\pp.exe
C:\pk.exe
C:\OG.exe
C:\ud2.exe
C:\120.exe
C:\lol.exe
C:\ne.exe
C:\fg.exe
c:\dump.exe
C:\ucla.exe
C:\eggdrop.exe
c:\210.exe
C:\faa.exe
C:\full.exe
C:\sql.exe
C:\setps.exe
sgffg.exe
C:\S.exe
C:\vsyncadi.exe
C:\g.exe
C:\npk.exe
C:\Print.exe
C:\MSDEVS.exe
MSD.exe
mswin.exe
C:\bbv.exe
C:\sql.exe
C:\bbnc.exe
C:\pBNC.exe
C:\bot.exe
C:\UD_PI.exe
C:\vbot.exe
yang.exe
qb.exe
ucla.exe
C:\secret.exe
C:\seddcret.exe
C:\S.exe
c:\l0l.exe
c:\MSDEVs3.exe
bbv.exe
C:\h1ggd3n.exe
C:\H9de.exe
C:\xx1.exe
hhiden.exe
C:\setups.exe
C:\n.bat
nwsz.exe
C:\ne.exe

The bot then joins https.easypwn.net with the password s3cr3t.
The bot administrator must have the user host "symtec.us" to issue commands.
The bot has anti-debugger and anti-VMware code, and is packed with ASPack.
The bot registers as version 2; however, we've seen evidence that a version 3 exists as well.

I would like to thank Nicholas, Jason and Jamaal for their invaluable assistance in the disassembly and work on this fun.

Aside from detecting IRC commands on non-standard ports and portscans, here are a few rules (more to follow) that should help detect this specific bot:

alert udp $HOME_NET any -> $DNS_SERVERS 53 (msg:"RxBot Trojan Client Lookup of easypwn.net"; content:"easypwn.net"; nocase; classtype:trojan-activity; reference:url,global-security.blogspot.com/2007/07/rxbot.html; sid:3000005; rev:2;)
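As a rough, untested companion sketch (the sid is arbitrary and the direction assumes the infected host lives on $HOME_NET), a rule keying on the 'clown' database that the MySQL spreader creates might look something like this:

# hypothetical example - tune the content and flow options to your environment before deploying
alert tcp $HOME_NET any -> $EXTERNAL_NET 3306 (msg:"RxBot MySQL spreader - clown database activity"; flow:to_server,established; content:"clown"; nocase; classtype:trojan-activity; reference:url,global-security.blogspot.com/2007/07/rxbot.html; sid:3000006; rev:1;)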


JJC