Friday, September 7, 2012

Integrating Org Data in ELSA

Using Big Data is a necessity in securing an enterprise today, but it is only as useful as its relevance to the specific, local security challenges at hand.  To be effective, security analysts need to be able to use org-specific data to provide context.  This is not a new concept, as the idea has been around in products like ArcSight, NetWitness, and Sourcefire's RNA which use both external data sources as well as extrapolation techniques to map out key details such as IP-to-user relationships.

ELSA (and Splunk, to a slightly lesser degree) takes this a step further.  Any database in the org can be queried in the exact same search syntax as normal log searches, and these results can be stored, sent to dashboards, charted, compared, alerted on, and exported just like any other ELSA result.  Let's take an example of an HR database that has names, emails, and departments in it.  Suppose you want to see all of the emails sent from a non-US email server sent to anyone in the accounting department.  An ELSA search using Bro's SMTP logging can find this for you.

First, we setup the HR database for ELSA.  Open the /etc/elsa_web.conf file and add a new datasource to the datasources config section like this (documentation):
"datasources": {                 
  "database": { 
    "hr_database": { 
      "alias": "hr",
      "dsn": "dbi:Oracle:Oracle_HR_database", 
      "username": "scott", 
      "password": "tiger", 
      "query_template": "SELECT %s FROM (SELECT person AS name, dept AS department, email_address AS email) derived WHERE %s %s ORDER BY %s LIMIT %d,%d", 
      "fields": [ 
        { "name": "name" }, 
        { "name": "department" },
        { "name": "email" }
      ] 

Restart Apache, and now you can use the "hr" datasource just like it were native ELSA data.

The first part of the query is to find everyone in accounting:

datasource:hr department:accounting groupby:email_address

This will return a result that looks like this:

suzy@example.com
joe@example.com
dave@example.com

We will pass this "reduced" (in the map/reduce world) data to a subsearch of Bro SMTP logs which reduce the data to distinct source IP addresses:

class:bro_smtp groupby:srcip

Then, we apply the whois (or GeoIP) transform to find the origin country of that IP address and filter US addresses:

whois | filter(cc,us)

And finally, we only want to take a look at the subject of the email to get an idea of what it says:

sum(subject)

 The full query looks like:

datasource:hr department:accounting groupby:email_address | subsearch(class:bro_smtp groupby:srcip,srcip) | whois | filter(cc,us) | sum(subject)

This will yield the distinct subjects of every email sent to the accounting department from a non-US IP.  You can add this to a dashboard in two clicks, or have an alert setup.  Or, maybe you want to use the StreamDB connector to auto-extract the email and save off any attachments, perhaps to stream into a PDF sandbox.

There are unlimited possibilities for combining datasets.  You can cross-reference any log type available in ELSA, as with the HR data.  If you're using a desktop management suite in your enterprise, such as SCCM, you could find all IDS alerts by department:

+classification class:snort groupby:srcip | subsearch(datasource:sccm groupby:user,ip) | subsearch(datasource:hr groupby:department,name)

The fun doesn't have to stop here.  The database datasource is a plugin, and writing plugins is fairly easy.  Other possibilities for plugins could be LDAP lookups, generic file system lookups, Twitter (as in the example I put out on the mailing list today), or even a Splunk adapter for directly querying a Splunk instance over its web API.

To get data that graphs properly on time charts, you can specify which column is the "timestamp" for the row, like this:

{ "name": "created", "type": "timestamp", "alias": "timestamp" }

And to have a numeric value provide the value used in summation, you can alias it as "count:"

{ "name": "errors", "type": "int", "alias": "count" }

ELSA makes use of this for its new stats pages by hard-coding the internal ELSA databases as "system" datasources available to admins.  This allows the standard database entries to produce the same rich dashboards that standard ELSA data fuels.


The ability to mix ELSA data with non-ELSA data on the same chart can make for some very informative dashboards.  Possibilities include mixing IDS data with incident response ticket data, Windows errors with helpdesk tickets, etc.

Don't forget that sharing dashboards is easy by exporting and importing them, so if you have one you find useful, please share it!