Tue, 20 Feb 2018

We’re big users of Datadog at Trek10, and particularly Datadog’s AWS integration. It’s an incredibly powerful service and an integral part of our 24/7 CloudOps offering. And like any responsible power user, we’re always on the lookout for ways to improve how Datadog works with AWS, and we love to share things we find.

Datadog’s Event Stream centralizes events from a variety of sources into a single thread. You can easily search and filter your stream, graph or dashboard events, overlay events on time series graphs, and create alerts off of events. You can insert your own custom events into the stream from several sources, including the API, emails, Amazon SNS topics, or a variety of other pre-built integrations. The applications are almost endless. For example, you might want to overlay your deployment events on your system performance metrics, or put every auto-scaling failure event into the stream and alert if you get more than a few in some time window.
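To make the API route concrete, here is a minimal Python sketch of building and posting a custom event. It assumes the Datadog v1 events endpoint and a `DD-API-KEY` header; the helper names (`build_event_payload`, `post_event`) and the sample deployment event are made up for illustration and are not part of our tooling.

```python
import json
import urllib.request

# Assumed endpoint for the Datadog v1 events API (US site).
DATADOG_EVENTS_URL = "https://api.datadoghq.com/api/v1/events"


def build_event_payload(title, text, tags=None, alert_type="info"):
    """Return the JSON-serializable body for a Datadog event (hypothetical helper)."""
    return {
        "title": title,
        "text": text,
        "tags": list(tags or []),
        "alert_type": alert_type,
    }


def post_event(api_key, payload):
    """POST the event to Datadog. This performs a real network call."""
    request = urllib.request.Request(
        DATADOG_EVENTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "DD-API-KEY": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Illustrative deployment event; the tags are what make it filterable later.
deploy_event = build_event_payload(
    title="Deployed web-frontend",
    text="Release 1.4.2 rolled out to production.",
    tags=["eventType:deploy", "service:web-frontend"],
)
```

Calling `post_event(api_key, deploy_event)` with a valid key would send the event; the sketch stops short of that so it can run without credentials.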

In an AWS-heavy environment, events via SNS are particularly useful. There are a bunch of events that AWS can send to an SNS topic that might be useful to have in the Datadog Event Stream, including:

But Wait!

There’s a problem with this. Here’s how a raw SNS message looks when it comes into the Datadog Event Stream:

Just an ugly pile of JSON. Yuck! Readability is important, but there’s a second and bigger problem. Datadog’s Event Stream supports tagging; this is really critical for effectively filtering your stream for graph overlays and alerting. These events have no tags with useful information for us to filter/alert off of. (Datadog also supports full text search, but we’ve found that it’s never quite enough to reliably filter what you need.)

Our Solution

We’ve developed a pretty simple and elegant solution to this:

  • All AWS services, in any number of accounts, that have useful messages for the Datadog Event Stream are configured to publish to a single centralized SNS topic.
  • A Lambda function is triggered from this topic.
  • This function is driven by a set of mapping templates. Each template has a matching condition: a key/value pair from the incoming JSON that defines when that mapping template should be used.
  • The templates use the mustache.js template system to represent a prettified Datadog event constructed from data in the raw event JSON… critically, including Datadog Event Stream tags.
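The matching step described above can be sketched roughly as follows. This is an illustrative Python sketch, not our actual function: it assumes the standard SNS-to-Lambda event shape (`Records[].Sns.Message` holding a JSON string) and that `matchKey` refers to a top-level key of the message. The function names and the sample data are hypothetical.

```python
import json


def extract_messages(lambda_event):
    """Yield the decoded JSON body of each SNS record in a Lambda event."""
    for record in lambda_event.get("Records", []):
        yield json.loads(record["Sns"]["Message"])


def select_template(message, templates):
    """Return the first template whose matchKey/matchValue pair fits the message."""
    if not isinstance(message, dict):
        return None
    for template in templates:
        # Assumption: matchKey is looked up at the top level of the message.
        if message.get(template["matchKey"]) == template["matchValue"]:
            return template
    return None


# Hypothetical template list with a single entry for AWS Health events.
TEMPLATES = [
    {"matchKey": "source", "matchValue": "aws.health"},
]

# Made-up SNS-triggered Lambda event for illustration.
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps({"source": "aws.health",
                                        "detail-type": "AWS Health Event"})}}
    ]
}

matched = [select_template(m, TEMPLATES) for m in extract_messages(sample_event)]
```

Messages that match no template return `None` here; what to do with them (drop them, or forward them raw) is a design choice this sketch leaves open.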

(Side note: If you’re as AWS-obsessed as we are you might be wondering, why not use the awesome new Amazon SNS Message Filtering feature that was released recently!?!? Believe me, we wish we could. But the publisher of the message has to set the message attributes. These are messages that AWS services like CloudWatch Events and Config are sending and those services don’t yet offer that. So if you’re reading this, AWS Product Team Member, hook us up! Please.)

An example mapping template that transforms JSON from an AWS Personal Health Dashboard (AWS Health) event:

matchKey: source
matchValue: aws.health
title: 'Personal Health Dashboard Event: {{ detail-type }}'
text: |
  %%%
  Affected Service: {{ detail.service }}
  Event Category: {{ detail.eventTypeCategory }}
  Time: {{ detail.startTime }} to {{ detail.endTime }}
  {{ detail.eventDescription.0.latestDescription }}
  [an example link](https://www.trek10.com/)
  %%%
tags:
  - 'eventType:awsPHD'
  - 'PHDservice:{{ detail.service }}'
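To show how a template like this turns raw JSON into an event, here is a small Python sketch of the rendering step. It is a stand-in for mustache.js, not the real library: it only handles plain `{{ path }}` tags with dotted paths (including list indexes such as `.0.`), renders missing values as empty strings, and does no HTML escaping. The function names and the sample AWS Health message are invented for illustration.

```python
import re

# Matches simple tags such as {{ detail.service }} or {{ detail-type }}.
TAG_PATTERN = re.compile(r"\{\{\s*([^{}\s]+)\s*\}\}")


def lookup(data, dotted_path):
    """Resolve a dotted path, e.g. 'detail.eventDescription.0.latestDescription'."""
    current = data
    for part in dotted_path.split("."):
        if isinstance(current, list):
            if not part.isdigit() or int(part) >= len(current):
                return ""
            current = current[int(part)]
        elif isinstance(current, dict):
            if part not in current:
                return ""
            current = current[part]
        else:
            return ""
    return current


def render(template_text, data):
    """Replace each {{ path }} tag with its value from data (empty if missing)."""
    return TAG_PATTERN.sub(lambda m: str(lookup(data, m.group(1))), template_text)


def build_event(template, message):
    """Render the title, text, and tags of a mapping template against a message."""
    return {
        "title": render(template["title"], message),
        "text": render(template["text"], message),
        "tags": [render(tag, message) for tag in template["tags"]],
    }


# Shortened version of the template above, as an already-parsed dict.
template = {
    "title": "Personal Health Dashboard Event: {{ detail-type }}",
    "text": "Affected Service: {{ detail.service }}\n"
            "{{ detail.eventDescription.0.latestDescription }}",
    "tags": ["eventType:awsPHD", "PHDservice:{{ detail.service }}"],
}

# Made-up message; real AWS Health events carry more fields.
message = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {
        "service": "EC2",
        "eventDescription": [{"latestDescription": "Sample description."}],
    },
}

event = build_event(template, message)
```

Because the tags are rendered the same way as the title and text, values from the raw JSON (here the affected service) end up as filterable tags on the event.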

Now, you get a nicely formatted and easily readable Datadog event with just the information you need and just the tags you need. Like this:

Nice, huh? And the system scales very easily. Create one SNS topic and one Lambda function for all possible events, then add a new mapping template for each new type of event you want to handle.

Let us know if you have interest in this. We’re thinking of generalizing and open sourcing this little utility if there is interest from others.

Questions/comments? Feel free to reach us at serverless@trek10.com.

Andy Warzon Trek10

Founder & CTO, Andy has been building on AWS for over a decade and is an AWS Certified Solutions Architect - Professional.