
ExtraBlu: Checking Flash Content to see if it is from a malicious source using ExtraHop and OctoBlu

The Ransomware epidemic has spread on the internet like a plague in the last 18 months. In fact, Ransomware netted $200 million in Q1 of this year. One might think with numbers like that they would get acquired or start an IPO in the next year!! I predict “Ransomware” makes it into Webster’s dictionary by 2017! (You read it here first.) As many (well…to the extent that I could call my readers “many”) of you have read in my earlier posts, I believe that the lack of surveillance, due to budget, staffing or just good old-fashioned apathy, is the primary reason for most security breaches. However, in the case of Ransomware, I believe that wire data offers the only way to truly combat this. Ransomware plays out in the blind spot behind the perimeter, in a day and age when your credit report can keep you from renewing your clearance or even getting a job. Couple that with the COMPLETE AND UTTER LACK of advocacy for the consumer, or accountability when the information is wrong, and an email stating that you are delinquent on a bill is always taken VERY seriously; to think that people will just not open it isn’t necessarily practical. That said, the phishing attempts will continue to evolve and as we block one, they will program another. This is okay! When you use ExtraHop’s wire data analytics, you can pivot too. I know that threat intelligence is still developing but with the availability of RESTful APIs and an open platform, you have a great shot at keeping your malware/Ransomware exposure to a minimum.

In this post, I want to demonstrate one of those methods. Today I am going to walk through how I set up ExtraHop to integrate with the OctoBlu platform from one of our partners (Citrix), just to give you an example of how two open platforms can integrate with one another and provide unparalleled visibility as well as access to automatic workflows that can be used to corral infected systems and decrease exposure. I have been studying attack vectors for Cryptowall and in many cases, a user is redirected to a .SWF URI that contains the malicious software or directs them to download/install it. I want to demonstrate how you can leverage ExtraHop and OctoBlu to audit access to these files and ensure that they are, in fact, not from malicious sources.

Materials: (and VERY special thanks to both VirusTotal and who perform an utterly invaluable service to the community with their sites!!!)

  • 3 PCAPs from Malware Analysis showing Angler Exploit Kit delivering Ransomware
  • A VirusTotal API key
  • An ExtraHop VM
  • An Ubuntu box running TCPReplay

There is some overlapping nomenclature with OctoBlu and ExtraHop so I want to take the time to define it.

  • ExtraHop Triggers: ExtraHop’s triggers are programmable (JavaScript) objects that allow you to interact directly with your network and use your network as a data source. They allow you to set conditions and initiate outcomes. In this case, we are initiating an OctoBlu trigger as well as a precision PCAP to perform a packet capture.
  • OctoBlu Trigger: (I am still learning about OctoBlu but…) An OctoBlu trigger is the item within an OctoBlu flow that initiates the actual workflow that is being performed.


On the ExtraHop System:
We set up a trigger that looks for externally accessed SWF files.

  • On line 14 we are looking for any URI that has .swf in it.
  • On line 15 we indicate that we are looking for non-RFC1918 addresses (no 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16 networks, just external) serving up the .swf file.
  • Then on lines 23 – 26 we access the OctoBlu trigger URI location to kick off the OctoBlu Flow.
  • Line 24 calls out my specific OctoBlu URI. (to avoid pranksters who will send me 10,000 emails)
  • And Finally, on lines 29 – 41 we are initiating what we call a “Precision Packet Capture” that will also create a PCAP file that I can download and evaluate as digital evidence.
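
The condition those first two bullets describe can be sketched outside of ExtraHop as well. The Python below is only an illustration of the logic, not the trigger itself (the real trigger is JavaScript running on the appliance). The helper names `is_rfc1918` and `should_alert` and the sample addresses are assumptions made for this example.

```python
import ipaddress

# The three RFC 1918 private ranges that the trigger is meant to ignore.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """Return True if the address falls inside one of the RFC 1918 ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918_NETWORKS)


def should_alert(uri: str, server_ip: str) -> bool:
    """Hypothetical helper: flag .swf content served from an external address."""
    return ".swf" in uri.lower() and not is_rfc1918(server_ip)


# Sample values (assumed): an external server and an internal one.
print(should_alert("/ads/banner.swf", "203.0.113.7"))
print(should_alert("/ads/banner.swf", "192.168.1.10"))
```

Checking against the three ranges explicitly, rather than a broader "private address" test, keeps the sketch aligned with what the bullet actually says: only RFC1918 servers are excluded.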


On the OctoBlu system:
The OctoBlu flow has been set up to leverage four tools and one thing. You will see them defined below as follows:

  • Trigger (Tool): The Trigger is what initiates the flow, it has a specific URI assigned to it (called out in Line 24 above) and begins the workflow.
    • It receives the JSON payload from ExtraHop (Line 26) and sends the Server IP delivering the SWF file to the HTTP GET tool, which then queries VirusTotal’s API, passing my API key and the Server IP.
    • It also passes the Client and Server IPs as well as a URL to the “Compose” tool.
  • HTTP Get (Tool): This is the actual query of the VirusTotal API that checks to see if the IP is malicious or not.
  • Compose (Tool): Labeled “Consolidate JSON messages” below, this takes the JSON objects from both the initial trigger and the HTTP Get tools and creates a single set of metrics to be passed to the Template Tool.
  • Template (Tool): Labeled “Prepare Message” below, this takes all of the JSON metrics created in the previous tool and sends them to the “Send Email” thing.
  • Send Email (Thing): This is the actual act of sending me an email warning me that I have had a user access a .SWF file from a malicious source.
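
To make the Compose and Template steps a little more concrete, here is a rough Python sketch of the same idea: merge the trigger payload with the lookup result, then render one message. This is not OctoBlu code. The field names (`client_ip`, `server_ip`, `uri`, `detections`) and the shape of the lookup result are assumptions for illustration only, and no real API is called.

```python
import json
from string import Template


def compose(trigger_payload: dict, lookup_result: dict) -> dict:
    """Merge the trigger payload and the reputation lookup into one record."""
    return {
        "client_ip": trigger_payload["client_ip"],
        "server_ip": trigger_payload["server_ip"],
        "uri": trigger_payload["uri"],
        # Assumed field: how many detections the lookup reported for the IP.
        "detections": lookup_result.get("detections", 0),
    }


# Assumed wording for the warning email body.
EMAIL_TEMPLATE = Template(
    "Client $client_ip requested $uri from $server_ip "
    "($detections detections reported)."
)


def prepare_message(record: dict) -> str:
    """Render the consolidated record into the email text."""
    return EMAIL_TEMPLATE.substitute(record)


# Hypothetical payload, standing in for what the trigger would send.
payload = json.loads(
    '{"client_ip": "10.1.1.5", "server_ip": "203.0.113.7", "uri": "/player.swf"}'
)
record = compose(payload, {"detections": 3})
print(prepare_message(record))
```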

The Warning from OctoBlu:
Below is a copy of the email I received when I replayed the PCAP file. As you can see, it includes the Server, the Client and a link to the Server IP’s VirusTotal dossier.


The Digital Evidence:
As noted on lines 29 – 41 in the ExtraHop trigger, we have also kicked off a Precision Packet Capture. This makes the actual transaction readily available to download and look at in Wireshark, both to determine if there is an actual issue and to leverage the PCAP itself as digital evidence. As you see below, you have a PCAP named “External SWF File Accessed”.

So, the question was asked by @Brianmadden on Twitter as he remarked that OctoBlu was now “True Blue Citrix”: “What can you do with it?” With the right integration platform I believe that there is quite a bit that can be done with OctoBlu, both with ExtraHop and with their own portfolio of tools. What I love about the two platforms is their openness and their ability to increase the aperture of Security, Dev and Network Operations teams, allowing them to have the kind of agility needed to fight in today’s “hand-to-hand combat” world where breaches and vectors pivot, stick and move on a monthly, weekly and daily basis.

Using ExtraHop I have been able to deliver the following integration solutions: (with more to come from an ExtraHop “Blu Bundle”).

  • Get warnings about users who are experiencing high latency
  • Get warnings about long logon times that fall outside an SLA
  • Now the post above, where I am warned when an end user accesses known malicious external Flash content.

Other scenarios could include using the NetScaler HTTP Callout feature to warn you when a user launches an ICAPROXY session from outside the US (a breach that actually happened), or when a known malicious actor accesses the company website hosted on a NetScaler VIP. You could also, potentially, use OctoBlu and MeshBlu to shut off NetBIOS on a system that we see encrypting file shares with Cryptowall.

My comment back to Brian, “the real question is, what CAN’T you do with it”, was not meant to be snarky or pithy; it was borne from enthusiasm for open architectures. Sadly, it has been removed from the site but I spoke about this a few years ago at Geek Speak during a session called “Return of the Generalist”. APIs are your friend. Those who do not embrace them run the risk of being irrelevant in the new world and may fall prey to “Digital Darwinism”. Embrace Python, JavaScript, Go, etc. and watch your value to this industry increase as well as your effectiveness. Be it INFOSEC, Citrix, SOA, SDN or Database, APIs will have a major role in tomorrow’s IT.

Thanks for reading!!!

Please watch the Video!!




The “Arm-chair Architect”: Healthcare Dot Gov

Making the news the last few weeks has been the problems associated with the Healthcare Dot Gov website. Being interested in the Healthcare Exchanges and seeing what might be available, I decided to sign up. While doing so I decided to connect to the site from my lab so that I could see the results in ExtraHop and try to get some wire data of the experience.

My Experience:
There were definitely some areas that were slower than others but luckily I signed in during the AM on the east coast so I am guessing the site was not extensively busy at the time. Currently I am waiting to hear back on my eligibility. They sent me a document to download but I am unable to download it as it times out repeatedly. Outside of that, the initial sign up took about 15 minutes and while there were some slowdowns, it was not so bad compared to other municipal sites (save those I hosted as a Federal Employee at the CDC which snapped smartly to all requests :) )

While several pundits are having a great time making fun of the Federal Government and the issues with the site, as a former Federal Employee and Contractor for over ten years I can tell you that I worked 50+ hours a week routinely, and while there are well noted inefficiencies in the Federal Government, some of the smartest people I ever worked with were feds. I am NOT dancing on the sorrows of anyone and I have NO DOUBT that people are busting their asses to make the end user experience as productive as possible. Regardless of how you feel about ObamaCare, we paid for this site to be up and it needs to work as well as possible. If you are involved with this project, I feel ya.

That said, it’s no secret that I am a big fan of wire data and ExtraHop, and this article is an attempt to promote it. With that, I will go into detail on how I used ExtraHop to gain wire data (no agents installed on my workstation, all data was taken directly from the wire) and I will provide info on what could be done if ExtraHop were located behind their firewalls and aggregated.

My lab setup: I have a span set up on a Cisco 3550 switch (it’s a lab but it should work on your SDN, Nexus or 6500) that grabs data from the uplink to my firewall. ExtraHop has the ability to handle 20GB/s on an appliance, and several appliances can be clustered if you want to aggregate them and manage them from a single console. For my test, I launched a published desktop within my Citrix farm and signed up for the Obamacare site from behind the span.

While signing up to look at the Healthcare Exchanges I did two things. First, I let ExtraHop grab the Layer 7 visibility and provide performance metrics on all non-encrypted URI stems. Second, I used ExtraHop Triggers to send the following events to a Splunk Syslog Server for parsing and reporting.



I will try to keep my suggestions and opinions to a minimum but I do have some suggestions that I will include later. For the most part I just want to report on what Extrahop was able to grab from the wire and either report in the Extrahop Console or Report to Splunk.

Data in the Extrahop Console:
From the ExtraHop console I drilled into the Citrix server that I signed up on. From there I am given a menu of options.

Layer 4 TCP:
When I first start to troubleshoot an issue, after verifying Layer 1 and Layer 2 integrity (in this case, my lab is in a two post rack two feet from me so I can verify that) I dive into Layer 4 where ExtraHop really starts to give you a solid holistic view of your environment. There are two views within the Layer 4 environment; the first is the L4 TCP node. The L4 TCP node provides a quick holistic view of the Layer 4 bandwidth data for open, established and closed sessions, as well as reporting on the Aborted sessions. You also see a graph of the Round Trip Time.

If an ExtraHop appliance were on a spanned port inside the network, similar metrics could be provided for web farms, back end Database connections and SOAP/RESTful API calls. For this test, I only had the ability to grab the wire data from the client perspective. There would be a much larger breadth of data were the ExtraHop appliance located on the inside. Also included in the L4 TCP node view are graphs on inbound/outbound congestion and inbound/outbound throttling.

Layer 4 TCP > Details:
The second node is the Layer 4 Details node.  I am a Systems Admin first and a Network Engineer a distant second. While I pride myself on being a generalist, I usually ask for help when looking at Layer 4 details just to make sure I know what I am looking at. I will give you my best effort observation of the L4 TCP > Details node.

Looking below, you see drill-down options on Accepted, Connected, Closed, Expired and Established Sessions. On the In and Out grids I generally look at Resets, Retransmission Timeouts (RTOs), Zero Windows, Aborts and Dropped Segments. A more experienced Network Engineer may focus on other metrics. Again, if we had an ExtraHop Appliance on the inside, we would see the wire data for the actual web server.

As you can see below, when we look at the L4 Detail data we see a much higher number of outbound (from my client to the site) Aborts, Dropped Segments, Resets and Retransmissions. If you had a web farm, you could trigger on this data and find a problem node in the group. You can click on any of the linked metrics to drill in and see which hosts are dropping segments, aborting connections, etc.

L7 Protocol Node:
The L7 Protocols node provides a holistic view of the protocol utilization during the specified time period as well as the peer devices. From the client perspective, you can see the sites that are providing data, either in iframes or because the client is sent to them directly as a result of redirecting. You see two charts of incoming and outgoing protocol usage broken down by L7 technology. Below that you see a list of peer devices. I generally look here as well to see if a CRL or OCSP service is not responding fast enough and delaying my site, or if I have an infected iframe that is sending a user to a rogue site. We will get more into peer performance in the trigger section.

From here you can drill into actual Packets and Throughput per Protocol as well as take an interesting look at Turn Timing (also discussed in the trigger section) where you can see the performance of specific protocols.

Layer 7 Protocols > Turn Timing:
Within the turn timing you can see the Network In (Client to Server), Processing Time (Server Performance or time waiting to respond) and Network Out (Server Response back to the Client).

If the Appliance were on the inside, this could be very valuable to see if there were back end systems that were not responding. From the Client perspective, looking at the information below, it appears the servers themselves (processing time) actually performed relatively well on average (we will get into more detail on the triggers) and we seemed to have issues with the web servers responding back to the client. Keep in mind that this data is JUST from the client to the site. It would be considerably more valuable to have information on the performance inside.

Layer 7 Protocols > Details:
The details page can also be very valuable, if you are using a specific server for all of your images in your web farm you can take a look at the bandwidth and find out if a specific server has larger images than you have planned on. Also, you can ensure that all of the peer communications are with appropriate IP Addresses. If you are integrating with outside partners, you are only as secure as they are. Sometimes it’s best to periodically verify who your peer nodes are.

DNS: *Disclosure: I forgot to use an external DNS Server, so in the initial test my DNS Server was local and therefore did not traverse the span and was not logged in ExtraHop. I went back, added an external DNS Server and returned to the site to do some browsing to get these metrics.

Few people fully realize the extent to which slow DNS resolution can wreck an application. In the DNS Node you can quickly get a glance at the number of requests and the performance of your DNS Servers as well as drill down into errors and timeouts. If your web server is consuming a RESTful API, the DNS resolution takes 200ms and the API is called several thousand times a minute, you could see a lot of waiting while using the web app. As previously stated, if I had an ExtraHop Appliance inside the network we could see if the web front end were having trouble resolving any names of the tiered APIs they are consuming.
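
To put a rough number on that 200ms example, here is a back-of-the-envelope calculation in Python. It assumes the worst case, where every API call pays for its own uncached DNS lookup, and it takes "several thousand" to mean 3,000 calls per minute. Both are assumptions for illustration, not measurements from the site.

```python
def cumulative_dns_wait_seconds(lookup_ms: float, calls_per_minute: int) -> float:
    """Total time spent waiting on DNS per minute, summed across all calls."""
    return lookup_ms * calls_per_minute / 1000.0


# 200 ms per lookup, 3,000 API calls per minute (assumed figure).
wait = cumulative_dns_wait_seconds(200, 3000)
print(f"{wait:.0f} seconds of cumulative DNS wait per minute")
```

Under those assumptions the application accumulates 600 seconds of DNS wait for every minute of wall-clock time, spread across its requests. Caching would lower the figure, which is part of why the DNS node is worth a look before blaming the web tier.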

While the majority of the site is delivered via SSL there are a few actions that are delivered by HTTP, and the HTTP Node provides a holistic view of the overall environment. In my case, I was the Client so I would set my Metric Type to Client and look at the data. From there I have drilldowns for Errors, URIs and Referrers. If I were looking at a web server I would select the Server Metric type and look at the same data. You see below I have Status Codes, methods, transaction metrics and transaction rates readily available. If you put the SSL Keys on the ExtraHop Appliance (like you would in Wireshark) you can also get the Layer 7 performance of every URI stem that is being delivered via SSL. This could then be used to alert you to slow web applications or downstream API farms where you are consuming web servers from 3rd party partners. I understand that exporting SSL keys is EXTREMELY taboo in the Federal space but I believe you can remove the keys once you have finished troubleshooting.

Due to the data being encrypted there isn’t as much SSL Data as there is with other protocols. When you click on the SSL Node you can ensure that web servers have been configured with FIPS Compliant Ciphers and you can double check key lengths by Clicking on the “Certificates” link. From this menu you will see the Session details, if sessions are being aborted, which versions are being used (root out non-compliant SSLv2/v3 Certs). If I had an appliance on the inside, I would look at the “Aborted” Metric within the “Session Details” area.

Extrahop Trigger Data with Splunk Integration:
My favorite feature of ExtraHop is its trigger function. ExtraHop has the ability to fire off a syslog message, custom metrics or even a pcap-compliant capture based on a set of criteria you give it. In this case, they could set a trigger that states “only syslog REST transactions that take longer than 200ms” or “alert me when a database transaction occurs for specific tables in a database.” Because I am looking at this from a client perspective I can only provide triggers on the client end, but if I had an appliance inside, I could see not only the client interaction with the site but I could also trigger on downstream performance of Databases, IBM Queueing, SOAP/REST calls and slow DNS lookups.
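
The “only syslog transactions that take longer than 200ms” rule is simple enough to sketch. The Python below is not ExtraHop trigger code (real triggers are JavaScript on the appliance). It only illustrates the threshold logic, and the event fields and the key=value output layout are assumptions.

```python
from typing import Optional

THRESHOLD_MS = 200  # threshold taken from the example in the text


def syslog_line(event: dict, threshold_ms: float = THRESHOLD_MS) -> Optional[str]:
    """Return a key=value line for slow transactions, or None to stay quiet."""
    if event["tprocess"] <= threshold_ms:
        return None
    return " ".join(f"{key}={value}" for key, value in event.items())


# Hypothetical events: one fast, one slow.
fast = {"eh_event": "HTTP_RESPONSE", "ServerIP": "203.0.113.7", "tprocess": 120}
slow = {"eh_event": "HTTP_RESPONSE", "ServerIP": "203.0.113.7", "tprocess": 350}
print(syslog_line(fast))
print(syslog_line(slow))
```

Filtering at the source like this keeps the syslog server from being flooded with healthy transactions, so what lands in Splunk is already the interesting subset.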

As I stated previously, I have triggers on the following. In this case I may also look at the performance of the DNS Servers. Currently, I am only reporting on the DNS failures and not the performance. This can be added in less than a minute.


Let’s examine a few of these triggers and see what we can glean from the information.

FLOW data:
Within the TCP Flow Triggers I want to look at the FLOW_TURN data as this gives us a good indication of where potential bottlenecks are and how long a client waited for a response from a server. In the FLOW_TURN trigger I am going to grab the following metrics and map them by average to ServerIP. When I make a request there are a number of potential bottlenecks that need to be monitored on the wire: DNS performance, the client request, the server processing time and then finally the server response. I can wait on any one or all four of them. Within this first Splunk query I am going to look at the client request performance, server processing time and server response. The triggers I use for FLOW_TURN can be found in the Triggers/Downloads section of the blog.

What am I doing in this Query:
The query below uses a regular expression to extract the ServerIP into a field named “ip”. The “ip” field can be passed to a reverse lookup function within Splunk that will give me the hostname of the ServerIP field. If you are not a big RegEx person there is plenty of RegEx material you can research. I have always found that if you just hit <Shift> and the number keys enough times, eventually what you are looking for will come up on the screen… :) If you are not interested in learning RegEx then you can simply copy the query below.

Splunk Query:
FLOW_TURN | search ClientIP="" | rex field=_raw "ServerIP=(?<ip>.[^:]+)\sServerPort" | stats count(_time) as Instances avg(TurnReqXfer) avg(TurnRespXfer) avg(tprocess) by ip | lookup dnsLookup ip
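
For anyone who wants to see what the rex and stats steps are doing without a Splunk instance handy, here is a small Python sketch of the same idea. The sample log lines are made up for illustration (the real events carry more fields), and only tprocess is averaged here to keep it short. Note that Python spells the named group `(?P<ip>...)` where Splunk uses `(?<ip>...)`.

```python
import re
from collections import defaultdict
from statistics import mean

# Same pattern as the rex command, in Python's named-group syntax.
SERVER_IP = re.compile(r"ServerIP=(?P<ip>.[^:]+)\sServerPort")
FIELD = re.compile(r"(\w+)=(\S+)")


def average_by_server(lines):
    """Return {ip: (instances, average tprocess)} for the given log lines."""
    samples = defaultdict(list)
    for line in lines:
        match = SERVER_IP.search(line)
        if not match:
            continue  # not an event that carries a ServerIP
        fields = dict(FIELD.findall(line))
        samples[match.group("ip")].append(float(fields["tprocess"]))
    return {ip: (len(values), mean(values)) for ip, values in samples.items()}


# Hypothetical FLOW_TURN events in an assumed key=value layout.
sample = [
    "eh_event=FLOW_TURN ClientIP=10.1.1.5 ServerIP=203.0.113.7 ServerPort=443 tprocess=180",
    "eh_event=FLOW_TURN ClientIP=10.1.1.5 ServerIP=203.0.113.7 ServerPort=443 tprocess=200",
    "eh_event=FLOW_TURN ClientIP=10.1.1.5 ServerIP=198.51.100.9 ServerPort=80 tprocess=40",
]
print(average_by_server(sample))
```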

Below you see the results of the query above. I have taken the time to sort by the slowest return time, and you see that one server had the slowest average response turn time, at more than 3 seconds per response. The saving grace is that there were only three instances of it. Far more concerning is the 191ms tprocess metric that occurred 676 times. (Please note that it is hitting the Akamai front end. I AM NOT saying Akamai is the bottleneck, but there may be a back end server that is causing the slowdown. Again, if I had an appliance on the inside, I could get this metric for each server in the web farm.) That said, the nearly 200ms tprocess time is the time that ExtraHop observed before the server sent a response packet. This can give you an indication of how long the server took to respond, either due to DNS resolution (how many DNS Suffixes are in YOUR IP Configuration!?) or just time processing the information.

Once you have the data in Splunk it becomes like “Baseball Stats” where you can get the average Response Transfer time of west coast customers between midnight and six AM on weekdays. The amount of stats you can query is dizzying. From the data below, I can see the average performance by Server.

One other point I want to make about the FLOW_TURN trigger is that it is very valuable in SSL Environments where you cannot get performance metrics because the data is encrypted. While I do not have URI Stems and SOAP/REST calls I do have the basic Layer 4 performance data which can be very valuable in instances where using the Private Key on the appliance is not possible.

*Please note that the data below is what was collected while I was signing up on the site. While I was not knowingly doing anything outside of that, some connections may have been made to other sites that are not affiliated with it and may show up in the stats below. There is NO HIDING from wire data and without knowing the application I am not sure who to exclude. Please make a note of it.

Due to the site conducting all but a few packets in SSL, the HTTP data is actually quite light. I did want to point out what you can get from the HTTP data and show how you can correlate it with a big data back end like Splunk. You can essentially trigger on any HTTP Header value as well as the performance of web applications using the HTTP_RESPONSE trigger. Below you are seeing the performance of URI stems for the site as well as the performance of the CRLs. In instances where environments are “Locked Down” the lack of access to CRLs and OCSP can have a negative impact on Web Applications. Here we note that there are no performance issues with CRLs or OCSP sites. The only URI that we see is the initial site. Everything thereafter was encrypted.
Note: With the keys installed on the Extrahop Appliance, you can see the performance of each URI stem and quickly identify web services that are not performing properly.
The Splunk Query:
Note I am removing some “noise” from the results. I had Bing as my search engine and I had to go to Gmail to verify my login.

HTTP_* | search ClientIP="" Host!="" Host!="" Host!="" | table _time eh_event ServerIP Host HTTP_uri tprocess

Most of the relevant data was from the SSL_OPEN trigger. The only unique item I was triggering from SSL_RECORD was Key Size. I am not sure you can even get a 1024 bit key anymore, but all keys were 2048 bit, so I will not include SSL_RECORD in this article.

While it does not say much in terms of performance, it is sometimes nice to just make sure those Certificates that people are using are what you expect, especially when you are working with 3rd parties and partners. This will allow you to ensure that all SOAP/RESTful web services are meeting the FIPS encryption standards.

The Splunk Query: Note we are, once again, using REGEX to parse out the “ip” so that we can perform a reverse lookup.
eh_event="SSL_OPEN" | rex field=_raw "ServerIP=(?<ip>.[^:]+)\sServerPort" | eval SSL_EXP=strftime(SSL_EXP,"%+") | stats count(_time) as Instances by ip SSL_VERSION SSL_CIPHER SSL_SUBJECT SSL_EXP | lookup dnsLookup ip
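
The eval and stats portion of that query boils down to "make the expiry readable, then count events per unique combination". A minimal Python sketch of that grouping follows. The event dictionaries, the cipher and version strings, and the epoch-seconds expiry value are all assumptions for illustration, and SSL_SUBJECT is left out to keep the example short.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize_ssl_open(events):
    """Count events per (ip, version, cipher, expiry date) combination."""
    counts = Counter()
    for event in events:
        # Convert the epoch expiry into a readable UTC date, like strftime in the query.
        expiry = datetime.fromtimestamp(event["SSL_EXP"], tz=timezone.utc).strftime("%Y-%m-%d")
        key = (event["ip"], event["SSL_VERSION"], event["SSL_CIPHER"], expiry)
        counts[key] += 1
    return counts


# Hypothetical SSL_OPEN events; SSL_EXP is assumed to be epoch seconds.
events = [
    {"ip": "203.0.113.7", "SSL_VERSION": "TLSv1.2", "SSL_CIPHER": "AES256-SHA", "SSL_EXP": 1700000000},
    {"ip": "203.0.113.7", "SSL_VERSION": "TLSv1.2", "SSL_CIPHER": "AES256-SHA", "SSL_EXP": 1700000000},
    {"ip": "198.51.100.9", "SSL_VERSION": "SSLv3", "SSL_CIPHER": "RC4-MD5", "SSL_EXP": 1700000000},
]
for key, instances in summarize_ssl_open(events).items():
    print(instances, key)
```

A summary like this makes an out-of-place version or cipher stand out quickly, which is the point of the compliance check described above.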

I am certain that there is no end of “Arm Chair” architects offering DHHS advice. Like I said, I am NOT dancing on anyone’s sorrows here. As a former DHHS (CDC) employee myself I know that they are working around the clock to fix any issues users are having. I feel like the first step in that process is to start gathering Operational Intelligence, and ExtraHop can do that without any impact on the existing Server architecture. As I stated throughout the post, the data I have is Client side and the data they could collect inside the network would be orders of magnitude more valuable.

There are a few VERY important things to note for Feds/Govies (or anyone else) who want to leverage ExtraHop’s Wire Data:

  • It does NOT require any agents, there will be minimal, if any, changes to the incumbent C & A framework and you should only have to get the Appliance approved. This means that you can call them today, get the appliance rack mounted and tie it into your span and start looking at data without doing ANY configuration changes to the servers.
  • They have a free Discovery Edition that you can use to perform your own Proof of Concept that is a VM.
  • As I stated previously, they can handle up to 20GB/s of data per appliance and they can be clustered so that they are centrally managed as well as aggregated.
  • It will integrate with your existing Splunk environment or any other Syslog server that you have in place. I have used it with both KIWI and Splunk.
  • It augments existing INFOSEC strategies by allowing real-time access to wire data to find Malware, DNS Cache Poisoning (Pharming) and Session hijacking within seconds.

You can check out the Discovery Edition here at: (Please tell them I sent you :) )
Thanks for Reading