Making Incident Response a Security Program Enabler

Sunday, April 26, 2015 Posted by Corey Harrell
Incident response is frequently viewed as a reactive process: the process is activated only after something bad happens. This view is similar to insurance. Every month we spend money buying insurance so it is available when we need it. It doesn’t matter whether the insurance gets used once in a year or not at all; money is still spent every month to buy it. In a way, it’s easy to see the similarity to the incident response process. Resources - such as staffing and technology - are invested in the incident response process. In some organizations there is a sizable investment while in others very little. The hope is that something is available when the organization needs it. How can one change an organization’s view of incident response? How can you take a traditional reactive process and make it into a proactive process that enables the organization’s information security program? This post discusses one approach to making incident response a security enabler by addressing: continuous incident response, incident response metrics, root cause analysis, and data analytics.

 

Continuous Incident Response


Traditional incident response models resemble the incident response lifecycle illustrated below, taken from the NIST Computer Security Incident Handling Guide.


Image obtained from http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf 

The first phase involves the organization preparing for future incidents. The next phase is when an incident is detected and analyzed. This is followed by containment, eradication, and recovery activities. At times, remediation activities cycle back to detection and analysis to determine whether the incident was actually resolved. After the incident is eradicated and the organization returns to normal operations, a post-incident activity is performed to see what did and didn't work out as planned. The lifecycle represents the traditional approach to incident response: incident detected -> organization responds -> incident eradicated -> organization returns to normal operations. This is the traditional reactive incident response process, where the assumption is that nothing is going on until an incident is detected, and once the incident is resolved the organization goes back to assuming nothing is going on.

Taking a traditional reactive incident response process and making it into a proactive process requires incident response to be seen in a different light. Organizations are under constant attack: daily malware infections, daily probing, daily exploit attempts, and daily potential unauthorized access attempts. The model is no longer linear, where an organization waits to detect an incident and then returns to normal operations. The new normal is being under constant attack and being at different stages of the incident response process concurrently. Richard Bejtlich stated the following in his book The Practice of Network Security Monitoring (page 188) regarding the model below:

“the workflow in Figure 9-2 appears orderly and linear, but that’s typically not the case in real life. In fact, all phases of the detection and response processes may occur at the same time. Sometimes, multiple incidents are occurring; other times, the same incident occupies all four stages at once.”

Image obtained from http://www.nostarch.com/nsm

Richard expressed that incident response is not a linear process with a start and an end; on the contrary, the process can be at different phases at the same time dealing with different incidents. Anton Chuvakin also touched on the non-linear incident response process in his post Incident Response: The Death of a Straight Line. Not only did he say that “normal -> incident -> back to normal” is no more, but he summed up the situation organizations find themselves in:

“While some will try to draw a clear line between monitoring (before/after the incident) and incident response (during the incident), the line is getting much blurrier than many think.  Ongoing indicator scans (based on external and internal sources), malware and artifact reversing, network forensics “hunting”, etc all blur the line and become continuous incident response activities.”

The light in which incident response needs to be seen is as a continuous process instead of a linear one. Incident response is not something that starts and ends; it is an ongoing cyclical process where an organization is constantly detecting and responding to incidents. A process similar to David Bianco's Intel-Driven Operations Cycle model, shown below, which was obtained from his presentation The Pyramid of Pain: Intel-Driven Detection and Response to Increase Your Adversary's Cost of Operations.

Image obtained from https://speakerdeck.com/davidjbianco/the-pyramid-of-pain-intel-driven-detection-and-response-to-increase-your-adversarys-cost-of-operations

Incident response as a continuous process is something everyone must see, from security practitioners to incident responders to management. Changing people’s perspectives on incident response will take time, and every opportunity to sell it will need to be seized (don’t sell FUD; lay out the actual threat environment we find ourselves in). In time the conversation will go from viewing incident response as insurance that may or may not be needed to viewing incident response as continuous, where people are detecting and responding to the daily security incidents. The conversation will go from “do we really need to invest in this since we only had a few incidents last year” to “we keep seeing these incidents due to this security weakness, so how can we address it since it’s an area of concern.”

Operationalize Incident Response Information


Changing the view of incident response from a linear process to a continuous one is not enough to make it a security program enabler. To be a security program enabler, incident response needs to contribute to the organization’s security strategy and help influence where security resources are focused. Too often incident response tries to influence the security strategy in a reactive manner. The reactive process resembles the following: incident detected -> organization responds -> incident eradicated -> organization returns to normal operations -> incident response recommendations provided. The attempts to influence the security strategy are based on the most recent incident. In essence, recommendations are being made based on a single event instead of being made based on trends from numerous events. Don’t get me wrong, there are times when recommendations from a single event do influence the security strategy, but to make incident response a security program enabler there needs to be more.

To reinforce this point, a story about a local credit union from years ago may help. The credit union happened to be located at a busy intersection; its location was very accessible by bus, car, bike, or on foot. One day a person walked into the credit union, handed the teller a note, and then walked out with money. As an outsider looking at this single event, there was nothing drastic implemented from any recommendations based on this single robbery. The next week a similar event occurred: again someone handed the teller a note and walked out with cash. This occurred a few more times, and each robbery was very similar. Each involved a person handing a note to the teller without any visible weapons shown. The credit union looked at all the robberies and must have seen this pattern. In response, the credit union implemented a compensating control: double doors designed to trap an individual as they try to exit the bank. After this control was implemented the robberies stopped. This story shows how incident response can become a security program enabler. The first robbery was a single event, and the recommendation may have been to install trap doors. However, installing trap doors takes essential resources from other areas, and this may not be in the best interest of the organization. As more data is collected from different events, a pattern emerges. Now taking essential resources from other areas is an easier decision, since the data analysis shows installing trap doors addresses not a single event but a recurring issue.

The continuous incident response process needs to move from only providing reactive recommendations to producing intelligence by operationalizing the information produced by enterprise incident response and detection processes. To accomplish this, data and information need to be captured from the ongoing detection and response activities. This data and information is then analyzed to produce intelligence to be used by the security program. Some intelligence is used by the response and detection processes themselves, but other intelligence (especially intelligence developed through trend analysis) is reported to appropriate parties to influence the organization’s security strategy. Operationalizing incident response information results in creating intelligence at various levels of the intelligence pyramid.


The book Building an Intelligence-Led Security Program authored by Allan Liska describes the pyramid levels as follows:

"Strategic intelligence is concerned with long-term trends surrounding threats, or potential threats to an organization. Strategic intelligence is forward thinking and relies heavily on estimation – anticipating future behavior based on past actions or expected capabilities."

"Tactical intelligence is an assessment of the immediate capabilities of an adversary. It focuses on the weaknesses, strengths, and the intentions of an enemy. An honest tactical assessment of an adversary allows those in the field to allocate resources in the most effective manner and engage the adversary at the appropriate time and with the right battle plan."

"Operational intelligence is real time, or near real-time intelligence, often derived from technical means, and delivered to ground troops engaged in activity against the adversary. Operational intelligence is immediate, and has a short time to live (TTL). The immediacy of operational intelligence requires that analysts have instant access to the collection systems and be able to put together FINTEL in a high-pressure environment."

As it relates to making the incident response process a security program enabler, the focus needs to be on making the process contribute to the organization’s security strategy by producing tactical and strategic intelligence. Tactical intelligence can highlight the organization’s weaknesses and strengths and show where security resources can be used more effectively. Strategic intelligence can influence the direction of the organization’s long-term security strategy. Incident response starts to move from being viewed as a reactive process to a proactive one once it starts adding value to other areas of an organization’s security program.

Improve Root Cause Analysis Capabilities


Before one can operationalize incident response information to produce intelligence at various levels of the intelligence pyramid, one must first improve root cause analysis capabilities. Root cause analysis is determining how an attacker went after a system or network by identifying and understanding the remnants they left on the systems involved in the attack. This is a necessary activity for discovering information during a security incident that can be operationalized. The Verizon Data Breach Investigations Report is an excellent example of the type of information one can discover by performing root cause analysis. The report highlights trends from “time to incident discovery” to “time to compromise” to exploited vulnerabilities to frequency of attack types to hacking actions. None of this data would have been available for analysis if root cause analysis hadn’t been completed on these incidents.

Take the hypothetical scenario of a malware-infected system. Root cause analysis discovered that the attacker compromised the system using a phishing email containing a malicious Word document. At this point there is various data one can turn into intelligence. At the operational level, the email’s subject line, content, from address, and Word document attachment name can all be documented and turned into intelligence for response and detection activities. The same can occur for the URL inside the Word document and the malware it downloads. Doing root cause analysis on all infections then makes data available for trend analysis. Is it a pattern for the organization's employees to be socially engineered through Word documents? Can resources be applied in other areas, such as security awareness training, to combat this threat? In time, more and more data can be collected to reveal other trends to help drive security. Performing root cause analysis on each incident is needed to operationalize incident response information and produce intelligence in this manner. The Compromised Root Cause Analysis Model is one model to use, and it is described in the post Compromised Root Cause Analysis Model Revisited.
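To make the idea of operationalizing these observables a bit more concrete, here is a minimal sketch in Python. The field names and values are purely illustrative (they are not part of VERIS or any particular tool); the point is simply that each remnant found during root cause analysis becomes a structured piece of data that detection content and trend analysis can be built from.

    # Minimal sketch: observables from the hypothetical phishing case captured as a
    # structured record. Field names and values are illustrative only.
    phishing_case = {
        'vector': 'phishing email with malicious Word attachment',
        'email_subject': 'Invoice past due',            # subject line of the email
        'email_from': 'billing@example.com',            # from address
        'attachment_name': 'invoice.doc',               # Word document attachment
        'embedded_url': 'hxxp://example.net/payload',   # URL inside the document
        'payload_hash': '<hash of the downloaded malware>',
    }

    # Each observable can feed a different control - mail filtering, proxy block
    # lists, IDS signatures, SIEM watchlists - and the record as a whole feeds the
    # trend analysis discussed below.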


Incident Response Metrics


The outcome of performing root cause analysis on each incident is discoverable information. It’s not enough to consistently do root cause analysis to discover information; the information needs to be documented and analyzed to turn it into intelligence. Different options are available for documenting security incident information, but in my opinion the best available schema is the VERIS Framework. The “Vocabulary for Event Recording and Incident Sharing (VERIS) is a set of metrics designed to provide a common language for describing security incidents in a structured and repeatable manner.” The VERIS Framework is open and can be modified to meet an organization’s needs.

The schema is well designed, but to support an internal incident response process some modifications may be needed. This post won’t go into great detail about the needed modifications, but I will mention a few that make the schema better support internal incident response. In the Incident Tracking section, to make it easier to track security incidents the following can be added: Incident Severity (to match the incident response process’s severity ratings), Hostname (of the targeted system), IP Address (of the targeted system), Username (involved in the incident), and Source IP Address (of the attacker’s system). The Victim Demographics section may or may not apply to an internal incident response process; personally, I don’t see the need for tracking this information if the incident response process supports the same entity. In the Incident Description section, the biggest change is outlining the expected values for the vectors and vulnerabilities. For example, for the vulnerabilities, list out each possible vulnerable application - such as a Java vulnerability - instead of allowing for specific CVEs. This reduces the amount of work needed for root cause analysis without losing too much on the metrics side. The last changes I’ll discuss are for the Discovery and Response section. In this section, make sure to account for the various discovery methods the organization may use to detect incidents as well as the intelligence sources behind those methods. This slight change enables an organization to measure how it is detecting security incidents and to evaluate the return on investment for different intelligence sources.
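To illustrate what a modified record might look like, here is a rough sketch of a single incident captured in a VERIS-style structure. The section and field names below are a loose approximation rather than the authoritative schema (see the VERIS documentation for that), and the extra tracking fields suggested above are shown in an organization-specific extension section; all values are made up.

    # Rough sketch of an internal, VERIS-style incident record. Section names are
    # approximations of the schema; the custom tracking fields live in an
    # organization-specific extension ("plus") section. All values are made up.
    incident = {
        'incident_id': 'IR-2015-0042',
        'timeline': {'incident': {'year': 2015, 'month': 4, 'day': 20}},
        'action': {'social': {'vector': ['Email'], 'variety': ['Phishing']}},
        'discovery_method': 'internal security monitoring',   # how it was detected
        'plus': {                                # custom fields for internal tracking
            'severity': 'High',                  # matches the IR process severity
            'hostname': 'WKSTN-1234',            # targeted system
            'ip_address': '10.10.5.23',          # targeted system
            'username': 'jdoe',                  # account involved
            'source_ip': '203.0.113.50',         # attacker's system
            'intel_source': 'SIEM correlation rule',  # source behind the detection
        },
    }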

Data Analysis


Information that is documented is only data; it does not become intelligence until it is analyzed and refined so it is useful to others. There are different options available for organizations to produce intelligence from the information discovered during root cause analysis. The book Data-Driven Security: Analysis, Visualization and Dashboards goes into detail about how one can do data analysis with free and/or open source tools. The route I initially took was one that let me focus on the incident response process without getting bogged down trying to create visualizations to identify trends. At my company (this is the only item in this post directly tied to my employer, and I only mention it in hopes it helps my readers) we went with a license for Tableau Desktop, and I bought a personal copy of the book Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software. The combination of Tableau Desktop and the VERIS Framework is very effective at producing strategic and tactical intelligence that can be consumed by the security program. In minutes, you can create visualizations to highlight which departments in an organization are most susceptible to phishing attacks or to quickly identify the trends explaining how malware is entering the organization. The answers and intelligence one can gain from the incident response data are only limited by one’s creativity and the ability of those consuming the intelligence.
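Tableau Desktop was the route taken here, but the same trend questions can be asked with free tooling as well. A minimal sketch using Python and pandas is shown below; the file name and column names are assumptions about how the incident records might be exported, not part of VERIS or Tableau.

    # Minimal pandas sketch of the trend questions mentioned above. The CSV file
    # name and column names are assumptions about an export of the incident records.
    import pandas as pd

    incidents = pd.read_csv('incident-metrics.csv')

    # Which departments are most susceptible to phishing?
    phishing = incidents[incidents['action_variety'] == 'Phishing']
    print(phishing.groupby('victim_department').size().sort_values(ascending=False))

    # How is malware entering the organization? Count incidents by vector.
    print(incidents.groupby('malware_vector').size().sort_values(ascending=False))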

Making Incident Response a Security Program Enabler


The approach an organization can take to move incident response from a reactive process to a proactive one involves the following steps:

      - Improving an organization's incident response capabilities
      - Improving an organization's root cause analysis capabilities
      - Improving an organization’s security monitoring capabilities
      - Influencing others to see incident response as a continuous process
      - Operationalizing incident response information
      - Collecting and documenting data for the organization’s incident response metrics
      - Analyzing the organization’s incident response metrics to produce intelligence
      - Presenting the intelligence to appropriate stakeholders

Making incident response a security program enabler is a gradual process requiring organizational buy-in and resources to make it happen. As DFIR practitioners, we can only be the voice in the wilderness telling others that incident response can be more than a reactive process. It can be more than an insurance policy. It can be a continuous process enabling an organization’s security strategy and helping guide how security resources are used. A voice hoping to influence others to make the right decision to better protect their organization.

Python: print “Hello DFIR World!”

Wednesday, April 8, 2015 Posted by Corey Harrell
Coursera's mission is to "provide universal access to the world's best education." Judging by their extensive course listing, it appears they are delivering on that mission, since the courses are free for anyone to take. I had known about Coursera for some time, but only recently did I take one of their courses (Python Programming for Everybody). In this post I'm sharing some thoughts about my Coursera experience, the course I took, and how I immediately used what I learned.

Why Python? Why Coursera?


Python is a language used often in information security and DFIR. Its usage varies from simple scripts to extensive programs. My interest in Python was modest; I wanted to be able to modify (if needed) the Python tools I use and to write automation scripts to make my job easier. Despite the wealth of resources available for learning Python, I wanted a more structured environment to learn the basics in - an environment that leverages lectures, weekly readings, and weekly assignments to explore the topic. My plan was to learn the basics and then proceed to exploring how Python applies to information security using the books Black Hat Python and Violent Python. Browsing through the Coursera offerings I found the course Programming for Everybody (Python). The course “aims to teach everyone to learn the basics of programming computers using Python. The course has no pre-requisites and avoids all but the simplest mathematics.” It teaches the basics over a span of 10 weeks without the traditional approach of learning to code through mathematics; the course was exactly what I was looking for.

Programming for Everybody (Python)


I’m not providing a full-fledged course review, but I did want to provide some thoughts on this course. The course itself is “designed to be a first programming course using the popular Python programming language.” This is important and worth repeating: the course is designed to be someone’s first programming course. If you already know how to code in a different language then this course isn’t for you. I didn’t necessarily fit the target audience since I know how to script in both batch and Perl. However, I knew this was a beginner’s course going in, so I expected things would move slowly. I could easily overlook this one aspect since my interest was to build a foundation in Python. The course leveraged some pretty cool technology for an online format. The recorded lectures used a split screen between the professor, his slides, and his ability to write on the slides as he taught. The assignments had an auto-grader where students complete assignments by executing their programs and the grader confirms whether the program was written correctly. The textbook is Python for Informatics: Exploring Information, which focuses more on solving data analysis problems than the math problems found in traditional programming texts. The basics covered include: variables, conditional code, functions, loops/iteration, strings, files, lists, dictionaries, tuples, and regular expressions.

Overall, spending the past 10 weeks completing this course was time well spent. Sure, at times I wished things moved faster, but I did achieve what I wanted: exploring the basics of the Python language so I have a foundation before exploring how the language applies to security work. The last thing I want to mention about the course is something I highly respect: the entire course, from the textbook to the lecture videos, is licensed under a Creative Commons Attribution license, making it available for pretty much anyone to use.

Applying What I Learned


The way I tend to judge courses, trainings, and books is by how much of the content can be applied to my work. If the curriculum is not relevant to one’s work, then what is the point in wasting time completing it? It’s just my opinion, but judging courses and trainings in this manner has proven to be effective. To illustrate this point as it applies to the Python Programming for Everybody course, I’m showing how the basics I learned solved a recent issue. One issue I was facing was how to automate parsing online content and consuming it in a SIEM. This is a typical issue for those wishing to use open source threat intelligence feeds. One approach is to manually parse the content into a machine-readable form that your SIEM and tools can use. Another, better approach is to automate as much as possible through scripting. I took the latter approach by creating a simple script to automate this process. Those interested in Python usage in DFIR should check out David Cowen's Automating DFIR series or Tom Yarrish's Year of Python series.

There are various open source threat intelligence feeds one can incorporate into their enterprise detection program. Kyle Maxwell’s presentation Open Source Threat Intelligence touched on some of them. For this post, I’m only discussing one, and it was something I was interested in knowing how to do. Tor is an anonymity service that enables people to hide where they are coming from as they surf the Internet. Tor has a lot of legitimate uses, and just because someone is using it does not mean they are doing something wrong. Being able to flag users connecting to your network from Tor can add context to other activity. Is the SQL injection IDS alert a false positive? Is the SQL injection IDS alert coming from someone who is also using Tor a false positive? See what I mean by adding context. This was an issue that needed a Python solution (or at least a solution where I could apply what I learned).

To add Tor context to activity in my SIEM, I first had to identify the IP addresses of the Tor exit nodes. Users of the service will appear to have the IP address of the exit node they are going through. The Tor Project FAQ provides an answer to the question "I want to ban the Tor network from my service." After trying to discourage people from blocking, two options are presented: using either the Tor exit relay list or a DNS-based list. The Tor exit relay list webpage has a link to the current list of exit addresses. The screenshot below shows how this information is presented:
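In case the screenshot does not come through, the exit address list is plain text; at the time of writing it looked roughly like the following (the fingerprint, addresses, and timestamps here are illustrative, not real entries):

    ExitNode 5A2DDE4A0963D09B4A1A4EEA4A5F3F9EABF43210
    Published 2015-04-25 08:08:59
    LastStatus 2015-04-25 09:03:24
    ExitAddress 198.51.100.23 2015-04-25 09:06:19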


Now we’ll explore the script I wrote to parse the Tor exit node IP addresses into a form my SIEM can consume, which is a text file with one IP address per line (a consolidated sketch of the full script follows the walkthrough below). The first part - as shown in the image below - imports the urllib2 module, which is used to open URLs. This part wasn’t covered in the course but wasn’t too difficult to figure out by Googling. The last line in the image creates a dictionary called urls. A dictionary associates a key with a value; in this case the key is tor-exit and the value is the URL to the Tor exit relay list. Leveraging a dictionary allows the script to be extended to support other feeds without significant changes to the script.


The next portion of the script, shown below, is where the first for loop occurs. The for loop processes each entry (key and value pair) in the urls dictionary. The try and except is a method to account for errors such as a URL not working. Inside the try section, the URL is opened into a variable named file and then read into a variable named data using the urllib2 readlines() option. Lastly, a file is created to store the output using the key value, and the file handle is named output.


The next part of the script - image below - is specific to each threat feed being parsed. This accounts for the differences in the way threat feeds present data. The if statement checks whether the key matches “tor-exit”, and if it does, the second for loop executes. This for loop reads each line in the data variable (hence the data listed at the URL). As each line is read, additional actions are performed, such as skipping blank lines and any line that doesn’t start with the string “ExitAddress.” For the lines that do start with this string, the line is broken up into a list named words. Basically, it breaks the line up into different values using the space as a separator. The IP address is the second value, so it is contained in the second index location in the words list (words[1]). The IP address is then written to the output file, and after the lines are processed a message is displayed saying processing completed.
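Since the script itself only appears in the screenshots, here is a consolidated sketch of the whole thing as described above. It follows the original's Python 2 style (urllib2 and print statements); the exact variable names, the output file extension, and the error handling details are best guesses rather than a copy of the original.

    # Consolidated sketch of the Tor exit node parser described above (Python 2).
    # Variable names and minor details are reconstructed, not copied verbatim.
    import urllib2

    # Dictionary of feeds; additional feeds can be added without restructuring.
    urls = {'tor-exit': 'https://check.torproject.org/exit-addresses'}

    for key, value in urls.items():
        try:
            file = urllib2.urlopen(value)       # open the feed URL
            data = file.readlines()             # read the listing line by line
            output = open(key + '.txt', 'w')    # output file named after the key

            if key == 'tor-exit':
                for line in data:
                    # skip blank lines and anything that isn't an ExitAddress entry
                    if not line.strip() or not line.startswith('ExitAddress'):
                        continue
                    words = line.split()           # split on whitespace
                    output.write(words[1] + '\n')  # the IP address is the second field

            output.close()
            print 'processing completed for ' + key
        except urllib2.URLError:
            print 'unable to retrieve ' + value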


The screenshot below shows the script running.


The end result is a text file containing the Tor exit IP addresses, one address per line. This text file can then be automatically consumed by my SIEM, or I can use it when analyzing web logs to flag any activity involving Tor.
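As a quick illustration of the web log use case (the log file name and the assumption that the client IP is the first field of each entry are mine, not from the original script), the text file can be loaded into a set and used to flag matching addresses:

    # Quick sketch of the web log use case, in the same Python 2 style. The log
    # format (client IP as the first field) and file names are assumptions.
    tor_exits = set(line.strip() for line in open('tor-exit.txt') if line.strip())

    for entry in open('access.log'):
        client_ip = entry.split()[0]   # first field of a common log format entry
        if client_ip in tor_exits:
            print 'Tor exit node involved: ' + entry.strip()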


It’s Basic but Works


Harlan recently said in his Blogging post, “it doesn't matter how new you are to the industry, or if you've been in the industry for 15 years...there's always something new that can be shared, whether it's data, or even just a perspective.” My hope with this post is that it will be useful to others who are not programmers but want to learn Python. Coursera is a good option that can teach you the basics, and even just learning the basics can extend your DFIR capabilities, as demonstrated by my simple script.