Working Through Splunk's Boss of the SOC - Part 2

June 27, 2020 Chris Long

13 minute read

I hope you enjoyed part 1 of this series and learned a few things along the way. I’m going to jump right into questions for part 2.

Question 11 (214)

What is the short hostname of the only Frothly endpoint to actually mine Monero cryptocurrency? (Example: ahamilton instead of ahamilton.mycompany.com)

NOTE: This answer writeup contains spoilers for question 9!

Wow, that’s a broad question. To me, the question implies that the attempt to mine cryptocurrency was actually successful and proof of work may have been submitted over the network. However, it’s not totally clear how the word “actually” is being used here.

When I think about identifying crypto-miner behavior on networks based on logs, a few search options come to mind:

Identifying miner traffic by looking for terms like “coin” and “pool” in DNS queries
Identifying miner processes by looking for protocols (e.g. http/https/stratum/etc.) in commandline arguments
Using yara rules to search binaries for strings commonly found in cryptominers
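
As a rough illustration of the first idea, here is a minimal Python sketch that flags DNS query names containing miner-related keywords. The keyword list and the sample hostnames are assumptions made up for this example (they are not pulled from the BOTS data), and a plain substring match like this will happily produce false positives.

```python
# Keywords loosely associated with coin miners (illustrative, not exhaustive).
MINER_KEYWORDS = ("coin", "pool", "monero", "xmr")


def suspicious_queries(queries):
    """Return the DNS query names that contain any miner-related keyword."""
    hits = []
    for name in queries:
        lowered = name.lower()
        if any(keyword in lowered for keyword in MINER_KEYWORDS):
            hits.append(name)
    return hits


# Hypothetical query names, made up for this example.
sample = ["www.example.com", "coinhive.com", "pool.example.net", "mail.example.org"]
print(suspicious_queries(sample))
```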

I’m still a bit irritated with question 9 - nothing about that Chrome process screamed “crypto-miner” to me, unless that process wasn’t actually what it said it was. I decided to dig back into question 9 and see if I could verify that the process was what it says it was.

index=botsv3 earliest=0 BSTOLL-L chrome sysmon

Sure enough, there are a bunch of chrome processes launching during that time with SHA256 of 268A0463D7CB907D45E1C2AB91703E71734116F08B2C090E34C2D506183F9BCA, which VirusTotal says is signed by Google:

This leads me to believe we’re dealing with a browser-based coin miner, and the most common one that comes to mind is (the now defunct) CoinHive.

A wildcard search for index=botsv3 earliest=0 *coin* OR *monero* brings up a lot of noise from the lambda:DNS source (not 100% sure what that is yet), but we also see some alerts from Symantec Endpoint Manager:

So, to me, the fact that there are AV alerts for BTUN-L means that this was probably NOT a successful mining attempt. However, back in question 9, we saw mining from <redacted - question #11 answer>, so I think that might be the answer by process of elimination.

checks answer

And it is!

Question 11 answer:

BSTOLL-L

Question 12 (215)

How many cryptocurrency mining destinations are visited by Frothly endpoints?

I already don’t like this question. What is a “cryptocurrency mining destination”? Is a website hosting coinhive code a destination? Is a mining pool a destination? Is a download of a mining tool from a website a destination? Leave it to CTFs to find joy in the vagaries. I get it though - it’s hard to craft questions to not be too specific, otherwise they’d be too easy to solve.

We now know BTUN-L has successful detections from Symantec EP, but maybe <redacted - question #11 answer> doesn’t have AV installed.

I’m assuming this question is only referring to the intrusions we know about, and is asking us to count them.

I’m gonna be honest here, I don’t understand what the question is asking exactly or how they crafted the answer. I got close by listing all of the Intrusion_URLs from the Symantec EP, stripping off the URL parameters using regex, and counting the different endpoints. My query was:

index=botsv3 earliest=0 jscoinminer 
| stats count by Intrusion_URL 
| rex field=Intrusion_URL "(?<endpoint>[^\?]+)" 
| dedup endpoint

Explaining my regex a bit more, I wanted to take a URL like www.brewertalk.com/attachment.php?thumbnail=2 and strip off the arguments to make it look like www.brewertalk.com/attachment.php. To do that, the regex basically says:

We’re using the Intrusion_URL field
We’re going to make a new field called endpoint, and that field will consist of all characters that are not a question mark. Inside square brackets, a leading ^ negates the character class, so [^\?] matches any character except a question mark. The + means we want 1 or more of those characters, so the match runs up to (but not including) the first question mark.
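
The same pattern can be tried outside of Splunk. This is a minimal Python sketch of the strip-and-count step, not the query itself; the first URL is the one from the example above, and the other two are made up for illustration.

```python
import re

# Same pattern as the rex command: capture everything before the first "?".
ENDPOINT_RE = re.compile(r"(?P<endpoint>[^\?]+)")


def strip_query(url):
    """Return the URL with any query string removed."""
    match = ENDPOINT_RE.search(url)
    return match.group("endpoint") if match else url


urls = [
    "www.brewertalk.com/attachment.php?thumbnail=2",
    "www.brewertalk.com/attachment.php?thumbnail=3",  # hypothetical
    "www.brewertalk.com/index.php",  # hypothetical
]
# A set plays the role of dedup: one entry per distinct endpoint.
endpoints = {strip_query(u) for u in urls}
print(sorted(endpoints))
```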

So, ultimately I came up with 5. That is not the answer. I don’t know how the creators derived the answer, but the answer is posted below. If anyone knows what I missed here, post it on Twitter!

Question 12 answer:

Question 13 (212)

Using Splunk’s event order functions, what is the first seen signature ID of the coin miner threat according to Frothly’s Symantec Endpoint Protection (SEP) data?

Now this is a question I can get behind. We already looked at some of this data in the symantec:ep:security:file sourcetype.

I always love making tables to visualize the data:

index=botsv3 earliest=0 sourcetype=symantec:ep:security:file 
| table _time, Application_Name, CIDS_Signature_ID, CIDS_Signature_String, Event_Description
| sort + _time

In the meantime, I figured out a quick way to check my answers using bash without revealing the actual answer:

# Store the answer to question 13 in a variable (line 1 is a header, so question 13 is on line 14)
answer=$(sed -n 14p ctf_answers.csv | cut -d ',' -f 2)
if [ "$answer" = "my_guess" ]; then
  echo "Correct"
else
  echo "Nope. Try again."
fi

I checked what I thought was the right answer, but it wasn’t correct. Looking back at the data, many events had the same timestamp!

Looking deeper into the events, they actually contain a field called Begin_Time. Let’s add that field to our table and sort by that.

index=botsv3 earliest=0 sourcetype=symantec:ep:security:file 
| table _time, Application_Name, CIDS_Signature_ID, CIDS_Signature_String, Event_Description, Begin_Time
| sort + Begin_Time

However, once again I find myself scratching my head. There’s an event with a begin time of 2018-08-18 20:51:13 and signatureID of 30356. I test 30356 as the answer, and it’s also wrong.

The next event has a Begin_Time of 2018-08-18 20:51:14 (which last time I checked, was 1 second after 2018-08-18 20:51:13) and signatureID of <redacted - question 13 answer>. I test this signatureID as the answer, and it is correct. I’m not sure why that is right and 30356 is wrong, but ¯\_(ツ)_/¯. Once again, feel free to post corrections!

Question 13 answer:

30358

Question 14 (213)

According to Symantec’s website, what is the severity of this specific coin miner threat?

A little bit of Google-fu is the only thing needed here. Plus, when it comes to severity, there’s usually only 3-4 options, right? ;)

Question 14 answer:

Medium

Question 15 (214)

What is the short hostname of the only Frothly endpoint to show evidence of defeating the cryptocurrency threat? (Example: ahamilton instead of ahamilton.mycompany.com)

Just take a look back at the Symantec EP sourcetype and the Host_Name field for this one.

Question 15 answer:

BTUN-L

Question 16 (215)

What is the FQDN of the endpoint that is running a different Windows operating system edition than the others?

Maybe another question solvable with osquery! Maybe not. I just like osquery :)

However, looking at the osquery queries doesn’t provide much hope that the operating system version is being captured:

index=botsv3 earliest=0 sourcetype=osquery:results *windows* 
| stats count by name

Let’s see which sourcetypes contain the string “windows”:

index=botsv3 earliest=0 windows 
| stats count by sourcetype

The WinHostMon sourcetype looks promising! Let’s expand on that and see if there’s any Windows 10 or Windows 7 in place: index=botsv3 earliest=0 sourcetype=winhostmon "windows 10" OR "windows 7"

Bingo! Let’s use stats to find the outlier:

index=botsv3 earliest=0 sourcetype=winhostmon "windows 10" OR "windows 7" 
| stats count by OS, host 
| stats values(host) by OS
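
The two stats stages above boil down to "group the distinct hosts under each OS string, then look for the group with a single member". Here is a minimal Python sketch of that logic; the hostnames and OS strings are placeholders made up for this example, not values from the dataset.

```python
from collections import defaultdict


def hosts_by_os(records):
    """Group distinct hostnames under each OS string, like `stats values(host) by OS`."""
    groups = defaultdict(set)
    for host, os_name in records:
        groups[os_name].add(host)
    return groups


def outliers(groups):
    """Return hosts whose OS string is shared with no other host."""
    return sorted(h for hosts in groups.values() if len(hosts) == 1 for h in hosts)


# Hypothetical (host, OS) pairs, made up for this example.
records = [
    ("HOST-A", "Windows 10 Pro"),
    ("HOST-B", "Windows 10 Pro"),
    ("HOST-B", "Windows 10 Pro"),  # repeated events collapse into one host
    ("HOST-C", "Windows 10 Home"),
]
print(outliers(hosts_by_os(records)))
```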

I try the resulting outlier hostname, but wait! The question is asking for the FQDN, not just the hostname. I’ll just search the hostname and see if I can find it elsewhere, like in the WindowsEventLogs or sysmon data for example. I find the FQDN in the sysmon data.

Question 16 answer:

BSTOLL-L.froth.ly

Question 17 (216)

According to the Cisco NVM flow logs, for how many seconds does the endpoint generate Monero cryptocurrency? Answer guidance: Round to the nearest second without the unit of measure.

By “the endpoint”, I’m guessing they mean the one from question 11. Now let’s find the NVM flow logs. I check out the Cisco NVM app which is installed, and it’s looking in sourcetype=cisco:nvm:sysdata. However, that sourcetype doesn’t exist in our index :[

I see a reference to a pap field in the app dashboard saved searches, so I decide to search the whole BOTS index for the presence of the pap field: index=botsv3 earliest=0 pap=* | head 100

Bingo! There is a cisconvmflowdata source in the syslog sourcetype, and plenty of entries contain “BudStoll”.

Let’s take a look at all the domains accessed by Bud:

index=botsv3 earliest=0 source=cisconvmflowdata pap=BudStoll | stats count by dh

I see some entries to coinhive.com domains and also entries to www.brewertalk.com (which was the domain Symantec EP flagged on). Let’s assume that’s the “malicious” domain in question. Here’s where things get complicated: look at the times that BSTOLL-L had Chrome pegged at 100% - there are two distinct instances; one at 6:37-6:38am and one at 7:59am.

There are also 3 distinct accesses to a brewertalk.com domain:

But I’m guessing the JS miner gets loaded through an iFrame or external JS resource, so the requests would actually hit a coinhive.com subdomain:

So let’s calculate the total time captured by each flow event to coinhive.com domains using Splunk. There’s a fst and fet field (I’m guessing this means flow start time and flow end time) that we can use to calculate the number of seconds spent in each flow. We’ll convert timestamps to epoch time, do the subtraction, and sum all the differences together.

index=botsv3 earliest=0 source=cisconvmflowdata pap=BudStoll dh=*.coinhive.com 
| table fst, fet 
| eval start_epoch=strptime('fst', "%c") 
| eval end_epoch=strptime('fet', "%c") 
| eval difference=(end_epoch-start_epoch)
| stats sum(difference)

Result: 1634 seconds

The answer I came up with using this query is fairly close to the actual answer, but it’s not quite right. I’m not sure if my query is off, or if I’m somehow missing something that would lead to the right answer. Once again, I’ll leave it up to the readers to let me know what I’m missing here!
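
For anyone who wants to sanity-check the arithmetic outside of Splunk, here is a minimal Python sketch of the same parse, subtract, and sum steps. The timestamp layout is an assumption (it spells out what "%c" produces in the default C locale), and the two flows are made up for this example.

```python
from datetime import datetime

# Explicit spelling of "%c" in the C locale, e.g. "Mon Aug 20 13:37:50 2018".
# Whether fst/fet really use this layout is an assumption.
FLOW_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def flow_seconds(fst, fet):
    """Seconds between a flow's start and end timestamps."""
    start = datetime.strptime(fst, FLOW_TIME_FORMAT)
    end = datetime.strptime(fet, FLOW_TIME_FORMAT)
    return (end - start).total_seconds()


def total_flow_seconds(flows):
    """Sum the per-flow durations, like `stats sum(difference)`."""
    return sum(flow_seconds(fst, fet) for fst, fet in flows)


# Hypothetical flows, made up for this example.
flows = [
    ("Mon Aug 20 13:37:50 2018", "Mon Aug 20 13:38:20 2018"),
    ("Mon Aug 20 13:59:00 2018", "Mon Aug 20 13:59:45 2018"),
]
print(total_flow_seconds(flows))
```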

Question 17 answer:

1666

Question 18 (217)

What kind of Splunk visualization was in the first file attachment that Bud emails to Frothly employees to illustrate the coin miner issue? Answer guidance: Two words. (Example: choropleth map)

When I think of email, I think of SMTP/IMAP logs. I know we’re looking for some sort of an attachment - maybe an image or document. Let’s find sourcetypes that contain email-related data and then drill down into potential attachments.

The first part of the query lists all the sourcetypes quickly and efficiently, then I’m just doing a wildcard string search on that set of results to look for mail related terms:

| tstats count where index=botsv3 by sourcetype 
| search sourcetype=*mail* 
OR sourcetype=*smtp* 
OR sourcetype=*imap* 
OR sourcetype=*pop3* 
OR sourcetype=*o365* 
OR sourcetype=*gmail* 
OR sourcetype=*outlook*

Normally a query with a bunch of wildcards like this is a terrible idea, but we’re only searching the output of the first query, which is 107 lines. That’s a piece of cake for Splunk!
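
The "cheap list first, wildcards second" idea looks like this as a minimal Python sketch. The sourcetype names are ones mentioned in this post; the pattern list mirrors the query but is trimmed for brevity, and fnmatch stands in for Splunk's wildcard matching as an approximation.

```python
from fnmatch import fnmatch

MAIL_PATTERNS = ["*mail*", "*smtp*", "*imap*", "*pop3*", "*o365*", "*outlook*"]


def mail_sourcetypes(sourcetypes):
    """Keep sourcetypes matching any mail-related wildcard pattern (case-insensitive)."""
    return [
        st for st in sourcetypes
        if any(fnmatch(st.lower(), pattern) for pattern in MAIL_PATTERNS)
    ]


# A short stand-in for the tstats output (the real list is much longer).
sourcetypes = ["stream:smtp", "ms:o365:reporting:messagetrace", "aws:cloudtrail", "osquery:results"]
print(mail_sourcetypes(sourcetypes))
```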

The ms:o365:reporting:messagetrace and stream:smtp sourcetypes look interesting! Let’s start with those and see if either of them have “attachment” data:

index=botsv3 earliest=0 (sourcetype=stream:smtp OR sourcetype=ms:o365:reporting:messagetrace) attachment

3 events returned. We know Bud sent an email from his account to Frothly employees, so let’s use his email address as a raw text filter. There seem to be a lot of “sender” field addresses, and I’m a little concerned that choosing the wrong one will exclude certain events, so I’m not going to use a field to search here.

index=botsv3 earliest=0 (sourcetype=stream:smtp OR sourcetype=ms:o365:reporting:messagetrace) bstoll@froth.ly

We know the email is related to malware, so let’s look at all the email subjects:

index=botsv3 earliest=0 (sourcetype=stream:smtp OR sourcetype=ms:o365:reporting:messagetrace) 
bstoll@froth.ly 
| stats count by Subject

We know brewertalk was one of the domains hosting CoinHive miner code, so Postmortem on our issue with brewertalk seems like a likely email subject chain to contain a chart.

Using Subject="Postmortem on our issue with brewertalk" as a search filter shows fewer events (12) than just a raw string search (13), and that one missing event appears to contain attachment data. That’s because the event with the attachment data is in the stream:smtp source, which uses a lowercase subject field, while the others are in the ms_o365_message_trace source, which uses a Subject field. Field names in Splunk are case-sensitive, so the filter silently skips it. Tricky! Be careful using field searches for this reason!
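
To make the pitfall concrete, here is a minimal Python sketch in which an exact field-name match misses an event that a case-insensitive lookup finds. The two events are hypothetical stand-ins for the two sources, reduced to just the subject field.

```python
def get_field(event, name):
    """Look up a field by name, ignoring case; return None if absent."""
    wanted = name.lower()
    for key, value in event.items():
        if key.lower() == wanted:
            return value
    return None


target = "Postmortem on our issue with brewertalk"

# Hypothetical events: one source capitalizes the field name, the other does not.
events = [
    {"sourcetype": "ms:o365:reporting:messagetrace", "Subject": target},
    {"sourcetype": "stream:smtp", "subject": target},
]

exact = [e for e in events if e.get("Subject") == target]
relaxed = [e for e in events if get_field(e, "subject") == target]
print(len(exact), len(relaxed))
```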

Looking at the single event in the stream:smtp sourcetype shows allhands@froth.ly as a recipient and an attachment name of image002.jpg. We have the entire content body in the content field, so I’m just gonna copy and paste this data into a .eml file and open it.

Hmm, nope, guess I can’t just do that, but the email does say Content-Transfer-Encoding: base64! Let’s take that giant blob of base64 text and try decoding it.

To do this, start copying the text at /9j/4AAQSkZJRgAB and end with RV1X/rzl/9BNFFAH//2Q== into a text file. Using the command line on macOS, you can simply use cat base64.txt | base64 -d > ~/Desktop/image.jpg.

Yahtzee! The resulting image contains the answer.
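
If you’d rather not rely on a command-line tool, the same decode works in a few lines of Python. This is a minimal sketch that only uses the first 16 characters of the blob quoted above, which is enough to show that the data starts with a JPEG signature; the function names are made up for this example.

```python
import base64

JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files start with these bytes


def decode_attachment(b64_text):
    """Decode a base64 blob copied from an email body, ignoring line breaks and spaces."""
    compact = "".join(b64_text.split())
    return base64.b64decode(compact)


def looks_like_jpeg(data):
    """True if the decoded bytes start with the JPEG signature."""
    return data.startswith(JPEG_MAGIC)


# Only the start of the blob, split across two lines the way email bodies wrap it.
prefix = "/9j/4AAQ\nSkZJRgAB"
decoded = decode_attachment(prefix)
print(looks_like_jpeg(decoded), decoded[6:10])
```

To recover the full image, pass the whole copied blob to decode_attachment and write the returned bytes to a file opened in binary mode.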

Question 18 answer:

Column Chart

Question 19 (218)

What IAM user access key generates the most distinct errors when attempting to access IAM resources?

Reading this question closely, I notice it says distinct errors, not just “errors”. That will be an important thing to keep in mind.

Let’s start broad again: index=botsv3 earliest=0 sourcetype="aws:cloudtrail" error

Oddly enough, this doesn’t return any results. Maybe “error” isn’t the right term to be looking for. I try again, but use wildcards this time in case the word “error” is combined with another word like “errorMessage”.

index=botsv3 earliest=0 sourcetype="aws:cloudtrail" *error*

Yay, 1,100 events this time. The access key field is userIdentity.accessKeyId but I’m not sure if “distinct errors” means distinct error codes, error messages, or if those fields have a 1:1 mapping. Let’s do some stats and find out.

index=botsv3 earliest=0 sourcetype="aws:cloudtrail" *error* 
| stats count by errorCode, errorMessage, userIdentity.accessKeyId

Let’s calculate based on errorCode and errorMessage separately and see if either of those answers ends up being correct. Note: the dc() function here is shorthand for distinct_count().

errorCode

index=botsv3 earliest=0 sourcetype="aws:cloudtrail" *error* 
| stats dc(errorCode) as distinct_error_codes by userIdentity.accessKeyId 
| sort - distinct_error_codes

This gets us a 3-way tie, so I suspect this isn’t the right answer.

errorMessage

index=botsv3 earliest=0 sourcetype="aws:cloudtrail" *error* 
| stats dc(errorMessage) as distinct_error_messages by userIdentity.accessKeyId 
| sort - distinct_error_messages

This set of results has a CLEAR outlier, but it’s not the correct answer. What gives?! I re-read the question and immediately facepalm. I was so busy noticing the “distinct count” that I missed the “attempting to access IAM resources” part! Let’s see if there’s a field we can use to narrow down the destination service. I spend some time looking at the eventName field, but soon after notice the eventSource field contains the service endpoint (e.g. iam, s3, kms, etc.). Let’s try the same query above, but using this as a filter!

errorMessage (IAM only)

index=botsv3 earliest=0 sourcetype="aws:cloudtrail" *error* eventSource=iam.amazonaws.com
| stats dc(errorMessage) as distinct_error_messages by userIdentity.accessKeyId 
| sort - distinct_error_messages

That’s it!
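
The reason the eventSource filter matters is easier to see on toy data. Below is a minimal Python sketch of "distinct error messages per access key, restricted to one service"; the access keys, messages, and flattened event layout are all made up for this example.

```python
from collections import defaultdict


def distinct_errors_by_key(events, event_source=None):
    """Count distinct errorMessage values per access key, optionally for one eventSource."""
    messages = defaultdict(set)
    for event in events:
        if event_source is not None and event["eventSource"] != event_source:
            continue
        messages[event["accessKeyId"]].add(event["errorMessage"])
    counts = {key: len(values) for key, values in messages.items()}
    # Highest distinct count first, like `sort - distinct_error_messages`.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, flattened events.
events = [
    {"accessKeyId": "KEY_A", "eventSource": "iam.amazonaws.com", "errorMessage": "denied: action 1"},
    {"accessKeyId": "KEY_A", "eventSource": "iam.amazonaws.com", "errorMessage": "denied: action 2"},
    {"accessKeyId": "KEY_A", "eventSource": "iam.amazonaws.com", "errorMessage": "denied: action 2"},
    {"accessKeyId": "KEY_B", "eventSource": "iam.amazonaws.com", "errorMessage": "denied: action 1"},
    {"accessKeyId": "KEY_B", "eventSource": "s3.amazonaws.com", "errorMessage": "denied: bucket 1"},
    {"accessKeyId": "KEY_B", "eventSource": "s3.amazonaws.com", "errorMessage": "denied: bucket 2"},
    {"accessKeyId": "KEY_B", "eventSource": "s3.amazonaws.com", "errorMessage": "denied: bucket 3"},
]
print(distinct_errors_by_key(events))
print(distinct_errors_by_key(events, "iam.amazonaws.com"))
```

Without the filter KEY_B comes out on top, and with it KEY_A does, which is the same kind of flip I ran into above.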

Question 19 answer:

AKIAJOGCDXJ5NW5PXUPA

Question 20 (219)

Bud accidentally commits AWS access keys to an external code repository. Shortly after, he receives a notification from AWS that the account had been compromised. What is the support case ID that Amazon opens on his behalf?

Yikes, Bud. Time to spend some time learning about .gitignore files. Not only was Amazon kind enough to notify Bud, but they even have a playbook published on what to do if you inadvertently expose your keys!

It seems to me that the most likely way Amazon would have contacted Bud would be through email, so it’s back to our index=botsv3 earliest=0 (sourcetype=stream:smtp OR sourcetype=ms:o365:reporting:messagetrace) query.

The following query gets us our answer in the content_body field: index=botsv3 earliest=0 (sourcetype=stream:smtp OR sourcetype=ms:o365:reporting:messagetrace) "access key"

Question 20 answer:

5244329601
