ShopBox Breach Forensics: A Case Study in Missed Detection

I need to tell you about the worst week of my professional life. Not for sympathy — because understanding how I failed to catch this thing in our own lab environment might keep you from making the same mistakes when it matters.

This happened eighteen months ago, back when ShopBox was still running on a single MySQL instance at 192.0.2.20 with what I thought was adequate hardening. I was wrong. The attacker — a red team we had hired and then essentially forgotten about after the contract ended — found a gap I had left wide open. What follows is my reconstruction, built from the fragments we recovered.

The Discovery: Performance, Not Alerts

We didn't find this because our WAF (Web Application Firewall, a filter that inspects HTTP requests) screamed. We found it because Maria at the NOC noticed the query cache hit ratio on 192.0.2.20 had dropped from 87% to 12% over a weekend. The database was working harder. Much harder.

I pulled the slow query log first. That's when I saw it: the same SELECT * FROM products WHERE name LIKE '%...%' pattern, but with these monstrous UNION clauses appended, running hundreds of times per hour. Someone had been in our search box for days.

Here's the timeline I reconstructed, with timestamps in UTC:

Timestamp	Event	Source
2023-04-14T09:23:17Z	Initial reconnaissance: `' OR '1'='1` blocked by WAF rule `SQLI-001`	WAF access log (partial)
2023-04-14T11:45:33Z	Pivot to Unicode-encoded payload: `%u0027%u0020%u004F%u0052%u0020%u0031%u003D%u0031`	Proxy gap — no log
2023-04-14T14:08:19Z	Successful injection via search parameter; first `UNION SELECT` to enumerate columns	MySQL general log (rotated, recovered)
2023-04-15T03:17:44Z	Schema enumeration complete; `information_schema.tables` queried	MySQL general log
2023-04-15T08:52:11Z	Data extraction begins: `users` table, 50-row chunks	MySQL general log
2023-04-16T19:33:27Z	Extraction shifts to `payment_methods` table	MySQL general log
2023-04-17T06:14:58Z	Database performance anomaly flagged by NOC monitoring	Internal ticket #INC-2023-0442

Three days. Almost seventy-two hours of continuous extraction. And I only knew about the first two hours of failed attempts because the WAF had logged those.

The Forensic Artifacts: What Survived, What Didn't

I want to show you exactly what we had to work with, because the gaps are as instructive as the data.

Recovered: Rotated MySQL General Log

The general log had rotated three times during the incident. We recovered the second rotation from backup tapes — the first had already aged out, and the third was still active. Here's the fragment that showed the pivot from reconnaissance to extraction. I've redacted actual table names with [REDACTED]:

# Recovered from /var/lib/mysql/shopbox-general.log.2.gz
# Hash: sha256:a3f7c...e2d9 (verification reference) 2023-04-14T14:08:19.004321Z 42 Query SELECT * FROM products WHERE name LIKE '%'
UNION SELECT null,null,null,null,null,null FROM information_schema.tables-- -%'
2023-04-14T14:08:19.447892Z 42 Query SELECT * FROM products WHERE name LIKE '%'
UNION SELECT table_name,null,null,null,null,null FROM information_schema.tables
WHERE table_schema='shopbox_production' LIMIT 50 OFFSET 0-- -%'
2023-04-14T14:12:55.883104Z 42 Query SELECT * FROM products WHERE name LIKE '%'
UNION SELECT column_name,null,null,null,null,null FROM information_schema.columns
WHERE table_name='[REDACTED]'-- -%'

That 42 is the connection ID. Same session, same source IP (the load balancer's internal address, naturally), running for hours. The attacker was patient — offset increments of 50, never greedy enough to trigger our row-count thresholds.

Lost: Application Logs

The ShopBox application logs on 192.0.2.10 were truncated. Not maliciously — just rotated aggressively because we hadn't sized the partition properly. We had forty-eight hours of application logs on a system that had been compromised for seventy-two. The search parameter values, which would have shown the raw payloads before any encoding, were gone.

This is a classic mistake: logging at the wrong layer. We were capturing at the application, but the database was where the actual malicious queries executed. As covered earlier in this guide, detection architecture needs depth. I had depth in theory. In practice, I had one layer with a forty-eight-hour window.

The Proxy Gap: TLS Termination Blindness

Here's where I really screwed up. Our architecture had TLS terminating at the AWS Network Load Balancer — what some call SSL offloading, where encryption ends at the load balancer so backend servers don't burn CPU on session negotiation. Traffic between the load balancer and 192.0.2.10 was unencrypted HTTP.

The load balancer access logs went to S3, as AWS documentation describes. But the WAF — a separate appliance we had deployed behind the load balancer — only saw the already-decoded HTTP traffic. The Unicode-encoded payload %u0027%u0020%u004F%u0052... arrived at the WAF as the literal characters ' OR '1'='1, but by then, the WAF's signature matching had already run on the encoded form at the load balancer layer.

Wait — that's wrong. Let me be precise: the WAF did see the decoded form. The problem was that our rule SQLI-001 was written to match the ASCII string ' OR '1'='1 with word-boundary assertions. The decoded Unicode arrived with different byte representations that didn't match the signature's assumptions about character encoding. The rule expected %27 or a straight apostrophe; it didn't account for %u0027 being decoded to the same visual character but different bytes in memory.

In plain terms: The WAF was looking for a specific pattern of bytes that meant "SQL injection attempt." The attacker sent the same meaning but dressed it in a different encoding, like writing a forbidden word in a different alphabet that looks identical. The WAF's eyes were trained for one alphabet.

I verified this later in the lab. The rule blocked ' OR '1'='1 every time. It never fired on %u0027%u0020%u004F%u0052%u0020%u0031%u003D%u0031 even once.

Root Cause: The Partial Deployment

I need to explain how this was my fault, specifically.

Six months before the incident, I had hardened the ShopBox login form. I remember the ticket: #SEC-112, "Parameterize all authentication queries." I converted login.php to use prepared statements — where the SQL query structure is fixed and user input is bound as data, not concatenated into the string. I tested it. I moved on.

The search function in search.php? Same codebase. Same patterns. I looked at it. I even wrote a note: "Refactor to parameterized query — low priority, no direct auth bypass." Then I got pulled into another project. The note stayed in the backlog.

The attacker found it in hour three of their reconnaissance. The login was stone. The search was a welcome mat.

This is what OWASP's Code Review Guide warns about: secure code review must be systematic, not heroically targeted. I had reviewed the scary parts and assumed the rest was fine. The methodology I should have followed — integrating review into the SDLC with checklists covering every query that consumes URL parameters — would have caught this. I didn't follow it. I was a senior practitioner who thought he knew where the risks were.

The Exfiltration Estimate

From the recovered general log, I could count the UNION SELECT queries and their LIMIT/OFFSET progression. The attacker pulled:

users table: ~4,200 rows in 84 queries (50-row chunks)
payment_methods table: ~1,100 rows in 22 queries
orders table: ~8,900 rows in 178 queries (abandoned partway, likely got what they needed)

Total estimated exfiltration: approximately 14,200 records across 72 hours. The query log showed consistent timing — roughly one extraction query every 4-7 minutes during waking hours, with longer gaps overnight. The attacker was avoiding rate-based detection by being boring.

Post-Incident: What I Actually Did

I won't give you the full remediation checklist here — that's covered later in this guide, in the "Defense in Depth" chapter. But I want to show you the code review findings, because they illustrate how partial hardening creates a false sense of security.

I ran a grep across the entire ShopBox codebase for string concatenation in SQL:

# Run from /var/www/shopbox on 192.0.2.10
# Safer variant: run on a copy of the codebase, not production grep -rn "SELECT.*\\$" --include="*.php" . > /tmp/potential-concatenation.txt
# illustrative output — verify on your target
# ./search.php:47: $query = "SELECT * FROM products WHERE name LIKE '%" . $_GET['q'] . "%'";
# ./legacy/reporting.php:112: $sql = "SELECT * FROM orders WHERE date > '" . $start . "'";
# ./admin/export.php:203: $cmd = "SELECT * FROM " . $_POST['table']; # oh no

Three hits. Three more injection points. The attacker had only needed one.

⚠️ Authorized, defensive use only. This grep pattern is for code review of systems you own or have explicit permission to assess. Never run against third-party systems.

The export.php finding was particularly ugly — dynamic table names can't be parameterized in most SQL APIs, so that pattern needs architectural redesign, not just a prepared statement wrapper.

What I Learned: Detection Gaps That Matter

I want to leave you with the specific lessons, earned the hard way:

First: WAF rules without encoding normalization are incomplete. We now run all payloads through canonicalization before signature matching. The WAF still won't catch everything, but it won't miss the obvious encoding tricks.

Second: Log at the layer where the attack executes. Application logs are for debugging. Database logs are for forensics. Size them appropriately — my forty-eight-hour rotation was a choice I made to save disk space. The cost was three days of blind exfiltration.

Third: Partial hardening is worse than no hardening, psychologically. I thought I had done the work. The login was safe. That confidence made me slower to suspect SQLi when the performance anomaly appeared. If the whole app had been vulnerable, I might have recognized the pattern faster.

Fourth: The proxy gap created by TLS termination isn't just a security architecture question — it's a detection architecture question. When your WAF and your application logs see different things because something in between transforms the traffic, you need explicit verification that your detection spans the gap.

I keep a printout of that recovered log fragment taped above my monitor. Not as punishment. As a reminder that the gap between "I secured the important stuff" and "I secured everything that matters" is where attackers live.