I have long advocated to clients a "layered response" to cyberintrusions. The question is not "if" but "when", so it is important to have more than just preventative measures in place.

Prevention, of course, is important. A robust security configuration and good coding practices are a must. But sometimes that's not enough. And that's where my story begins: the other day, I had an unwelcome opportunity to put my thoughts into practice.

Monitoring

Before an intrusion can be managed, it has to be detected. And in order for it to be detected, regular monitoring of the affected systems is essential. I have had a variety of monitoring tools in place for a great many years, among them regularly generated plots of key system activity metrics. I receive regular e-mails from all the systems under my care, informing me of the systems' current state using both tabulated data and some graphical plots. Usually, it's just a matter of a quick glance before I discard the e-mail; one plot is like any other, and there's usually little day-to-day change in system activity.
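
What a human eye picks out of such a plot can also be approximated in code. The following is a minimal sketch only, in Python, and not the monitoring tooling described above: the function name, the threshold, the sampling interval and the sample values are all made up for illustration. It flags stretches where traffic stays above a threshold for several consecutive samples, ignoring isolated spikes.

```python
from typing import List, Tuple

def sustained_runs(samples: List[float], threshold: float, min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of runs in which
    every sample exceeds `threshold` for at least `min_len` samples."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i          # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None           # run (if any) has ended
    # A run may still be open when the data ends.
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples)))
    return runs

# Hypothetical traffic samples in kB/s: one brief spike, then a long plateau.
traffic = [12, 15, 900, 14, 11, 850, 870, 910, 880, 860, 13]
print(sustained_runs(traffic, threshold=500, min_len=4))  # [(5, 10)]
```

The single spike at index 2 is ignored; only the five-sample plateau is reported. A check like this complements, rather than replaces, actually looking at the plots.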

Detection

Getting regular e-mails is one thing; knowing what you are looking for is another. A few days ago, in one of the regular system e-mails I noticed something unusual: a plot showing sustained network activity.

Usually, this turns out to be something completely benign, such as someone downloading several calculator manual PDFs from my calculator museum, or a longer-than-usual backup. Indeed, I was ready to dismiss the e-mail but... well, I had the nagging feeling that this might have been something else so I decided to check. Sure enough, in the indicated time frame I found a very large number of purposefully malformed requests directed at one of my Web servers.

Analysis

What were these requests? It became clear that they were SQL injection attempts. The query strings had the peculiar property that they contained characters that were encoded twice using standard URL encoding: an apostrophe, for instance, became %27, which then became %2527. Here is one such request (the actual string was much longer):

GET /CMS/exhibit-hall/?view=article&id=7&manufacturer=-8948%2527%2520UNION%2520ALL%2520SELECT%2520NULL...
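
To see what such a string hides, it can be percent-decoded repeatedly until it stops changing. This is a small Python sketch for illustration; the helper name decode_layers is made up, and only the visible part of the parameter value above is used.

```python
from urllib.parse import unquote

def decode_layers(value: str, max_rounds: int = 5) -> list:
    """Percent-decode repeatedly until the text stops changing.
    Returns every intermediate form, starting with the input itself."""
    layers = [value]
    for _ in range(max_rounds):
        decoded = unquote(layers[-1])
        if decoded == layers[-1]:
            break                  # nothing left to decode
        layers.append(decoded)
    return layers

# Visible part of the logged parameter value (the real one was much longer).
sample = "-8948%2527%2520UNION%2520ALL%2520SELECT%2520NULL"
for depth, text in enumerate(decode_layers(sample)):
    print(depth, text)
# 0 -8948%2527%2520UNION%2520ALL%2520SELECT%2520NULL
# 1 -8948%27%20UNION%20ALL%20SELECT%20NULL
# 2 -8948' UNION ALL SELECT NULL
```

Only after the second round of decoding does the apostrophe, and with it the familiar shape of a UNION-based injection, become visible.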

All these requests were targeting a very specific page, a page on my calculator museum that normally serves up details about specific calculators, identified by the manufacturer name and model number. The page was written roughly 20 years ago and it underwent some minor revisions to properly handle accented characters that are present in the names of some (non-US) manufacturers, like Hungary's Híradástechnika.

This was a strong hint. Why the double encoding?

Looking at the PHP source code of the page revealed the problem. These two lines, in particular:

    $title = mysqli_real_escape_string($conn, $_GET['manufacturer']);
    $man = urldecode($title);

First, the $title variable is assigned a value received from the user, and sanitized: special characters such as apostrophes are escaped so that they cannot break out of a SQL string.

Then, this variable is further decoded (it was automatically decoded once already when the query string was processed by PHP) to form another variable that is later used in a SQL query string.

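The problem: The variable is not sanitized again after the second decode. A doubly-encoded apostrophe arrives as the harmless-looking text %27, passes through the sanitizing step untouched, and only then is decoded into a real apostrophe. As a result, doubly-encoded characters survive.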
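
The effect of doing these two steps in the wrong order can be reproduced outside PHP. The sketch below is Python, and toy_escape is a deliberately simplistic, hypothetical stand-in for mysqli_real_escape_string (it only handles backslashes and apostrophes); it is there to show the ordering problem, not to model the real function.

```python
from urllib.parse import unquote

def toy_escape(value: str) -> str:
    """Hypothetical stand-in for a SQL string escaper: backslash-escapes
    backslashes and apostrophes only. Not suitable for real use."""
    return value.replace("\\", "\\\\").replace("'", "\\'")

# What the script receives after the automatic first decode of
# "-8948%2527%2520UNION": one layer of encoding is still in place.
received = unquote("-8948%2527%2520UNION")   # "-8948%27%20UNION"

# Vulnerable order, as in the two PHP lines quoted earlier: escape, then decode.
vulnerable = unquote(toy_escape(received))

# Safer order: decode first, escape last.
fixed = toy_escape(unquote(received))

print(vulnerable)  # -8948' UNION    (bare apostrophe reaches the query)
print(fixed)       # -8948\' UNION   (apostrophe is escaped)
```

The escaper never sees an apostrophe in the vulnerable order, so it has nothing to escape.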

This structure might also explain how the attacker (or perhaps malicious AI?) was able to stumble upon this exploit: The value of the $title variable is barfed back to the user in the form of the title of the resulting HTML page.

Correction

Once spotted, the error was easy to fix: Just sanitize the darn string again! A better long-term fix would be to do away with building SQL query strings in this manner, but that will require a more extensive rewrite of old, otherwise perfectly functioning, software code. (There's a bit of a twist in the code logic here, which is why the rewrite is not trivial.)
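
For what the longer-term fix looks like in principle, here is a minimal sketch of a parameterized query. It is an illustration under stated assumptions, not the museum's code: it uses Python's built-in sqlite3 module in place of PHP and MySQL, and the table, column and model names are invented. The point is only that the user-supplied value is bound as a parameter and never becomes part of the SQL text.

```python
import sqlite3

# In-memory database with a made-up schema, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calculators (manufacturer TEXT, model TEXT)")
conn.executemany(
    "INSERT INTO calculators VALUES (?, ?)",
    [("Híradástechnika", "model-a"), ("Example Corp", "model-b")],
)

def models_for(manufacturer: str) -> list:
    """Look up models using a bound parameter; the value is passed
    separately from the SQL text and is never spliced into it."""
    rows = conn.execute(
        "SELECT model FROM calculators WHERE manufacturer = ? ORDER BY model",
        (manufacturer,),
    )
    return [row[0] for row in rows]

print(models_for("Híradástechnika"))               # ['model-a']
print(models_for("-8948' UNION ALL SELECT NULL"))  # []
```

With bound parameters, the injected text is simply compared against the manufacturer column as an odd-looking name that matches nothing, however many times it was encoded on the way in.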

Mitigation

What did the attacker accomplish? Fortunately, I have detailed logs. From the logs I ascertained that they managed to obtain over 100,000 records from a selection of tables in my SQL database. Most of that data are worthless, e.g., tens of thousands of records containing vehicle positions from a traffic simulation. The privileges under which the calculator museum code can access the database are rather limited. Still, some sensitive information was stolen, notably some API keys to paid services. These keys had to be invalidated and replaced.

Intriguingly, the attacker made no attempt to change anything on the system, only download information. Considering the quality of the attack vs. the very low value of the data they obtained, I strongly suspect that my system was not targeted in particular; rather, it was likely part of a broader effort to find vulnerable systems that could be exploited later for some nefarious purpose. Curiously, they also made no attempt to cover their tracks.

Recovery

Once I had a full understanding of the problem, it was time to return things to normal. The coding error was corrected. The offending IP addresses were blocked, though of course this does not prevent the same attackers from accessing my systems from a different IP block. I am performing some additional monitoring, just in case. I saved a full dump of the relevant logs. And I keep looking at it, also requesting the help of non-malicious AI (Claude, from Anthropic), which has already helped me greatly by analyzing the attack through log samples and writing quick-and-dirty code for me to filter and decode the attempts.
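
To give a flavor of what such filter-and-decode code might look like, here is a short Python sketch. It is not the code that was actually used: the log layout (client address first, request line in double quotes) is an assumption, and the two log lines, including their addresses and timestamps, are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import unquote

# "%25" followed by two hex digits: a percent sign that itself encodes an escape.
DOUBLE_ENCODED = re.compile(r"%25[0-9A-Fa-f]{2}")
# Assumed layout: client address first, request line in double quotes.
LOG_LINE = re.compile(r'^(?P<client>\S+) .*?"(?:GET|POST) (?P<target>\S+)')

def decode_attempts(lines):
    """Yield (client, twice-decoded request target) for double-encoded requests."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or not DOUBLE_ENCODED.search(match["target"]):
            continue
        yield match["client"], unquote(unquote(match["target"]))

# Made-up log lines; 203.0.113.x and 198.51.100.x are documentation address ranges.
sample_log = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /CMS/exhibit-hall/'
    '?view=article&id=7&manufacturer=-8948%2527%2520UNION%2520ALL%2520SELECT%2520NULL'
    ' HTTP/1.1" 200 512',
    '198.51.100.4 - - [01/Jan/2025:00:00:02 +0000] "GET /CMS/exhibit-hall/'
    '?view=article&id=7 HTTP/1.1" 200 2048',
]

attempts = list(decode_attempts(sample_log))
for client, target in attempts:
    print(client, target)
# Requests per client address, useful when deciding what to block.
print(Counter(client for client, _ in attempts))
```

Only the first line is reported; the ordinary page request in the second line contains no double-encoded characters and is skipped.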

⋆ ⋆ ⋆

How I felt about this whole incident is best captured by this Midjourney AI art of a gentlecat berating the hacker.

The takeaway, which I hope may benefit those reading this, is that incident response should involve (at the very least) these seven layers:

  • prevention
  • monitoring
  • detection
  • analysis
  • correction
  • mitigation
  • recovery

Not to be followed slavishly, of course. For instance, you may want to begin with mitigation (to prevent further damage) before correcting a complex problem. And there may be iterative aspects, depending on the depth and complexity of the incident.

In the end, this intrusion mostly just hurt my pride. On the other hand, I take pride in the fact that I practice what I preach: the intrusion was detected within a few hours and I acted exactly in accordance with what I've been advocating to my clients for many years.

Of course, it does leave me wondering: were there other intrusions in the past, either through this or through some other, still unpatched vulnerability, that I never even noticed? As the AI Claude remarked, when I solicited its help as part of my analysis, "Too often intrusions go undetected for months or years."