Tracking unusual traffic on your WordPress website using Splunk Community Edition – Part 2

Welcome back to the second part of my three-part series on using Splunk Community Edition to track unusual activity on your Apache-based WordPress site. Or, as I am now beginning to call it – me versus China! (Again, nothing against China – but, man, you trap one bot and another just starts up, it seems!).

Anyhow, in the first part I set the scene about the problems I was experiencing with some unusual traffic in regard to script executions. I wanted to be able to track the source of these executions, determine what they were and then hopefully put in place some countermeasures to prevent them from reoccurring (which, as I was about to find out, was a little more difficult than I thought).

So, given the steps in the first part, I now had some log file data within Splunk – so now I needed to analyse it.

From the Splunk Home screen (as you log into it) select “Search & Reporting” – if you do not see this option, click on the Splunk logo at the top left-hand side of the screen, which should give you access to the menu option (see below):

This will take you to the search screen (yes, that’s Iron Man in my browser – I’m an unashamed Avengers fan) – there are a number of options and ways to search the data in the indexes, but for the purposes of this article I will go through the simplest way to query the data which has been uploaded.

From the Search screen click on the “Data Summary” button (highlighted in red below).

The data summary window will open, select the host which you imported your data into (in my case, this is highlighted in red below):

You will be presented with a view which contains all of the imported events from the Apache logs. You will also see a number of options, such as a timeline of events and a really good option entitled “Patterns” – which does what you might expect: it intelligently picks out patterns in your data, which you can then view for any telemetry of interest.

I will be honest: while the Patterns function revealed some interesting stuff, it wasn’t helpful in finding what I was ultimately looking for – so that is why this article will focus on the manual way.

From the left hand side of the screen you will see two areas:

  • Selected Fields
  • Interesting Fields

The default selected fields didn’t provide me with the information I was looking for, so I decided to add more to the selected area. I wanted to add “clientIP” and “File” (highlighted in red below) as they seemed to be of interest, with over 100 events associated with each.

Clicking on “clientIP” revealed more details, which were interesting to say the least. Rather helpfully, when you select one of the “Interesting Fields” you get an overview of the top 10 values for that field, the number of times each has appeared within the source data, and the percentage of the results that each value accounts for.

This was very useful as I could see that the number one source IP Address had hit my site over 28 thousand times, accounting for around 7.3% of all hits.
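For anyone who prefers typing a search to clicking through the sidebar, a query along the following lines should reproduce that top-ten view from the search bar (the host value and the clientIP field name here are assumptions based on my import – substitute whatever names Splunk assigned to yours):

host=yourwordpresshost | top limit=10 clientIP

The top command returns each value with its count and percentage, which is essentially the same summary the field sidebar shows.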

Aside from this just looking strange generally, if you remember from part one of this series I had initially used AWStats and the cPanel dashboard to give me some very high level information – none of which could equate to a single IP Address hitting the site 28 thousand times.

So this represented something scanning the site, causing a script execution – but was not being reflected clearly in the logs (I say clearly, as the hits were probably there – just not as clear as I had needed).

Interestingly enough, I also noticed that the second highest address in the list is from an IP address range used by my hosting provider (SiteGround).

I’m not overly concerned about this specific IP address, but I will be asking them what it is doing, as it’s adding to the script executions that count against my allowance – which doesn’t seem fair, given it does not appear to be something that I have configured.

To add the clientIP address as a specific field in the report layout (rather than just being part of the wider data returned) click on the “Yes” button from the “Selected” section (highlighted in red below).

Anyhow, so I now had the who – now I needed the what!

As mentioned previously, another field that was available was “File” (as this also had 100+ values). The File field in this context denotes the page (or file) that was accessed by the source IP address. This would be useful as it would give me an idea as to what service or content was being repeatedly accessed.

With the clientIP (highlighted in red below) and the File field (highlighted in green below) I could now see that the source IP addresses were accessing the /xmlrpc.php file on my website. So, combined with the information that I had from my SiteGround dashboard (see part 1) plus the information in the logs, I now knew that my site was under some form of attack from a source using an unusual user agent (highlighted in yellow below).
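If you would rather express that pairing as a search than build it up through the field sidebar, something like the following should surface which files each source address was hammering (the host value and the clientIP/File field names are assumptions from my own import, so adjust them to match yours):

host=yourwordpresshost | stats count by clientIP, File | sort -count

Sorting by count in descending order puts the noisiest address/file combinations at the top of the results.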

Why did I suspect this?

Well, analysing the main offending IP Address using Ultra Tools I discovered the country of origin was China (see below highlighted in red):

Combine that with the fact that the file being accessed 28 thousand times was XMLRPC.php using the POST method from an uncommon user agent.
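To isolate exactly this pattern in Splunk, a query along these lines should pull out the POST requests to xmlrpc.php together with the user agents making them (the method and useragent field names are assumptions based on how Splunk extracted my Apache logs – check the Interesting Fields list to see what yours are called):

host=yourwordpresshost File="/xmlrpc.php" method=POST | stats count by clientIP, useragent | sort -count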

XMLRPC.php was quite popular in the early days of WordPress as it provided a mechanism to post content from remote sources and editors to your WordPress site (anyone who used Microsoft Live Writer or similar back in the day will be familiar with this). Whilst it is still in use within WordPress, it will eventually be deprecated and replaced with a newer API.

However in the context of this article – this kind of access to my site was highly unusual as:

  1. There is no reason why anyone or anything from a Chinese origin should be trying to use remote code to publish anything to my site.
  2. I have no use for xmlrpc.

So, what did I do?

I decided that the best course of action was to disable access from anywhere to the xmlrpc.php file via .htaccess – this would mean that any requests to the file would be denied before they were passed to WordPress for execution or processing.

To do this I added the following lines of code to the .htaccess file:

<Files xmlrpc.php>
order deny,allow
deny from all
</Files>

The above denies access to the file from any source.

Furthermore I also added the following lines to .htaccess:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (.*Apache-HttpClient/4.5.2.*) [NC]
RewriteRule .* - [F,L]

The above should also block any source whose user agent matches the regular expression (a case-insensitive match on Apache-HttpClient/4.5.2) from being able to execute anything on the website.
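As a sanity check once the rules were in place, a search similar to the one below should show the denied requests now coming back as 403s rather than reaching WordPress (the status field name is an assumption based on Splunk’s standard Apache access log extraction):

host=yourwordpresshost File="/xmlrpc.php" status=403 | stats count by clientIP

If the rules are working, the same offending addresses should start accumulating 403 counts here while the script execution count on the hosting dashboard stops climbing.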

In the next part

Now that I had found the main culprit for the high level of executions on my site, and indeed – hopefully – put a stop to it for the foreseeable future, in the next part of this series I would like to go through how you can have (and I have had) some fun with Splunk’s visualisation tools in the context of Dashboards. I will cover how you can build a simple Dashboard that visually charts unusual activity on your site and updates as you add in further log information.
