Caylent Acquires Trek10 to Create the Most Comprehensive Dedicated AWS Services Partner Press Release →

Services
Focus Areas

Areas of Expertise
Engagements

Discover

Build

Support
Areas of Expertise

App Modernization

Public Sector

Serverless

IoT

DevOps

Migration

Data and Machine Learning (ML)

Enterprise Architecture

24/7 Monitoring

Team Support

Datadog

Overview

Are you taking advantage of modernizing your AWS apps to protect your cloud investments?

Overview

Our mission is to accelerate high-quality cloud adoption across the Public Sector.

Overview

Whether you are new to serverless or looking to scale, Trek10 allows you to focus on building applications, not managing servers.

Related Content

AWS Lambda

With AWS Lambda, you can run code without the need for managing servers in a cost-effective manner.

Blog

What is Serverless and Why Does it Matter?

Overview

Whether you’re looking to gain visibility into plant floor machinery or seeking to enhance process efficiency, Trek10 can help.

Related Content

Blog

Serverless Architectures: IoT

Blog

Is IoT Device Shadow Right for You?

or should you build-your-own with DynamoDB?

Overview

Shorten the development lifecycle, increase reliability, and release software faster.

Related Content

AWS CloudFormation

AWS CloudFormation helps you save time and money by configuring and managing resources for you.

Containers on AWS

Containers on AWS makes managing container registries easy, autonomous, reliable, and safe from anywhere.

Overview

At Trek10, we rapidly migrate your applications with a focus on cost-effectiveness

Related Content

Amazon WorkSpaces

Amazon WorkSpaces allows you to quickly scale according to your virtual desktop needs.

Containers on AWS

Containers on AWS makes managing container registries easy, autonomous, reliable, and safe from anywhere.

Overview

Uncover insights from your data no matter where you are in your analytics journey.

Related Content

Machine Learning Ops

MLOps constitute best practices for developing, deploying, and monitoring high precision Machine Learning models.

Amazon SageMaker

Amazon SageMaker enables developers and data scientists to easily build ML models.

Overview

Enterprise Architecture (EA) combines business and technology in a proven industry recognized framework to deliver business focused results based on your industry, environment, competition and the ever increasing capabilities of cloud technologies.

Related Content

Developer Acceleration

A series of in-person architect-led training modules designed to help your team develop the necessary skills and best practices to modernize your applications.

Overview

Maximize the uptime and security of your most critical applications.

Related Content

Amazon CloudWatch

Amazon CloudWatch makes performance monitoring simple for you and your business.

Disaster Recovery

Prevent downtime, strengthen resilience, and avoid unanticipated costs with a comprehensive Disaster Recovery Plan.

Overview

Experienced solutions architects and developers at your service, on-demand.

Related Content

Amazon CloudWatch

Amazon CloudWatch makes performance monitoring simple for you and your business.

Disaster Recovery

Prevent downtime, strengthen resilience, and avoid unanticipated costs with a comprehensive Disaster Recovery Plan.

Overview

Let Trek10 help you hit the ground running with Datadog.

Related Content

AWS Premier Partner

Discover

Cloud-Native Immersion Day

Developer Acceleration

Retail | Industry Overview

SaaS on AWS

Serverless Workshop

Overview

Trek10's Cloud-Native Immersion Days are focused, high impact training sessions that will drench your teams in knowledge of the latest tech and best-practices.

Overview

Trek10’s expert-led Developer Acceleration workshops help enterprise teams quickly and safely jump-start their serverless journey.

Overview

Leveraging the vast capabilities of the AWS ecosystem, Trek10 provides retail businesses with solutions tailored to their unique needs, enabling them to innovate at speed and scale.

Overview

Trek10 helps companies migrate and build their SaaS offering on AWS with a cloud-native approach.

Overview

Whether it’s a greenfield project or re-architecting legacy, Trek10 is your guide to adopting cloud native architectures.

Build

DevOps Transformation

Internet of Things (IoT) Applications

Security

Overview

At Trek10, we leverage the best AWS native and third party tools for code-defined infrastructure, continuous integration, and automated deployment pipelines.

Overview

Trek10 helps you deliver on the promise of IoT by guiding you through the process of connecting your devices to AWS and by designing, implementing, and fully supporting your AWS cloud infrastructure.

Overview

Trek10’s security solutions and services will secure your AWS APIs and infrastructure. Schedule a meeting today to see if you qualify for a free security scan and report.

Support

CloudOps 24/7 Monitoring & Support

CloudOps Team Support

Overview

Trek10 brings managed services to the cloud. Our team works hard to reduce noise and maximize uptime in every AWS environment we manage.

Overview

Trek10 Team Support augments your team’s skills with access to a team of experienced and focused AWS solutions architects and cloud developers that specialize in leveraging AWS to the fullest.

Overview

Everyone who moves to AWS wants to secure their environment, but knowing where to start is hard. That is where Trek10 can help.
Case Studies
About
Careers
AWS Premier Partner
Community
CloudProse Blog

Spotlight

Serverless

Cost and Pricing Analysis

Cloud Native

Developer Experience

Databases

News

IoT

Monitoring, Ops & DevOps

Containers

Security and IAM

Generative AI and Machine Learning (ML)

Search Trek10

Serverless

Operationalizing Serverless: Five Key Considerations

Andy Warzon | Jan 03 2018

Wed, 03 Jan 2018

So you’ve built your first Serverless app (if you haven’t, start here) and are ready to go into production. Given a lot of the rhetoric about Serverless these days (or for our purposes, more clearly… cloud-native apps built with AWS platform services), you might be forgiven for thinking that there is zero operational effort. Sadly, that nirvana has not yet appeared. While there is WAY less operational effort, it’s not zero; there are some important things to keep in mind when planning to run Serverless apps in production. Below are the top five that Trek10 has identified over years of managing Serverless in production for our clients.

Application Errors

This is probably the most obvious and most common. Apps have errors and you have to watch for them… not exactly groundbreaking. What is a bit different is figuring out the right threshold for alerting. For a low-volume system it may make sense to alert on every error, but for any sufficiently high-volume system you need to do some work to weed out the noise that comes from typical transient errors. Typically some rate of errors under 0.1% is always going to occur. With asynchronous Lambda invocations (for example from an S3 object event), we recommend using dead letter queues and only alerting off of those, so you can ignore all transient errors knowing that AWS will usually retry and get success.

You may also want to consider tools that improve your visibility beyond the built-in basics of CloudWatch metrics and Lambda logs in CloudWatch Logs. Error tracking services like Sentry or Rollbar work just as well as in traditional architectures in helping to track errors. When it comes to tracing, though, you’ll need to look at a new generation of tools: AWS X-Ray and IOPipe are two of the more popular options.

Watch the Dials Where Dials Exist

While scaling with AWS platform services is mostly transparent, that’s not 100% the case. There are a few dials in the system; it is important to know where they are and how to monitor them to optimize scalability and costs. Some are obvious and easily visible like DynamoDB Provisioned Throughput (which also has auto-scaling now, by the way) or Kinesis shards, others are slightly more hidden like Lambda Concurrency Limits, and still others like S3 pre-partitioning are completely hidden and can only be monitored by observing symptoms like S3 error rate or PUT latency. Carefully review each part of the system to identify all of the relevant dials.

Security

Security is never a solved problem… the risks just shift. With no long-running VM and often no network to manage, Serverless greatly reduces the attack surface area of many traditional threats. This doesn’t mean security is solved, though: it just allows you to shift your focus to other threat areas.

AWS IAM: Are your policies least-privilege? Are access keys and passwords safe? Is MFA being used everywhere?
Web Application: Of course often you still have a public-facing web application and need to continue to be careful about your typical web app security concerns such as injection attacks and inadvertently exposed data.
Dependencies: In modern open source web applications there is a huge reliance on open source modules from a broad community. A small but emerging ecosystem of tools is focusing on analyzing your project’s dependencies to validate that they are both coded securely and that they are not compromised by attackers. Snyk and PureSec are two interesting ones to keep an eye on.

You should look at automating security scanning against all of these threats in your production infrastructure as well as integrating security analysis into your CI pipeline to avoid deploying vulnerabilities in the first place.

Costs

This is sort of the flipside of a great benefit of Serverless: cost is truly usage-based… but cost is truly usage-based. If you get unwanted or unexpected traffic, costs could spike quickly. So it is important to monitor costs on a daily basis so you can quickly detect any cost spikes and block the offending traffic or optimize your application to minimize costs.

And finally, everyone’s favorite…

AWS Outages!!

All of the AWS platform services are by default running in multiple AWS Availability Zones (AZs, which are one or more data centers with independent power and network within a given AWS Region), so in theory 2-3 AWS data centers would need to go down simultaneously to cause an outage, which is a very uncommon (i.e. much less often than once a year) scenario. But the reality has been that these services in fact have cross-AZ dependencies and have region-wide outages. In the past 15 months there have been multiple outages to services like DynamoDB, S3, and Lambda. So this is a real thing that you need to plan for. Trek10’s very own Jared Short had a great talk at Serverlessconf NYC about this if you’d like to learn more.

The first step is determining the extent to which you can build for multi-region failover or possibly even multi-region active-active. Two new features, API Gateway Regional Endpoints and DynamoDB Global Tables, have made this significantly more realistic, but you still need to conduct an RTO/RPO cost sensitivity analysis to decide if it makes sense for you.

Next, you need to build your operational response plan for these outages. While your ops team may not be able to fix AWS’s issue, it still has a key role to play: Identify as early as possible that there is a problem, trace root cause to the AWS services, look for confirmation from AWS that the problem is on their side (usually, AWS Support initially and then with some lag the AWS Status Page), and then effectively communicate to end users, initiate failover plans as appropriate, and monitor status on the AWS side. In other words… they’ll be busy!

Wrapping Up

So as you can see, Serverless Ops is not entirely new. There are some new things, but it’s mostly just a shift in focus. The overall effort should be significantly less than Ops for an equivalently-sized infrastructure, but it is really important to tackle and master if you are going to be successful in adopting Serverless in your organization.

At Trek10, we are experts in building and operating Serverless infrastructure and enabling others to do so, so let us know if you’d like some help.

Questions/comments? Feel free to reach us at serverless@trek10.com.

Author

Andy Warzon

Go to Stories by Andy

Founder & CTO, Andy has been building on AWS for over a decade and is an AWS Certified Solutions Architect - Professional.

Similar Blog

Spotlight

How to Use IPv6 With AWS Services That Don't Support It

Build an IPv6-to-IPv4 proxy using CloudFront to enable connectivity with IPv4-only AWS services.

Michael Barney | Feb 12 2025
6 min read

Spotlight

AWS Lambda Functions: Return Response and Continue Executing

A how-to guide using the Node.js Lambda runtime.

Joel Haubold | Dec 07 2023
5 min read

Serverless

Replacing Amazon S3 Events with Amazon S3 Data Events

How to synthesize an (almost) identical payload using Amazon EventBridge rules.

Joel Haubold | Nov 02 2023
5 min read

Overview

Overview

Overview

Related Content

AWS Lambda

Blog

What is Serverless and Why Does it Matter?

Overview

Related Content

Blog

Serverless Architectures: IoT

Blog

Is IoT Device Shadow Right for You?

Overview

Related Content

AWS CloudFormation

Containers on AWS

Overview

Related Content

Amazon WorkSpaces

Containers on AWS

Overview

Related Content

Machine Learning Ops

Amazon SageMaker

Overview

Related Content

Developer Acceleration

Overview

Related Content

Amazon CloudWatch

Disaster Recovery

Overview

Related Content

Amazon CloudWatch

Disaster Recovery

Overview

Related Content

AWS Premier Partner

Overview

Overview

Overview

Overview

Overview

Overview

Overview

Overview

Overview

Overview

Overview

Serverless

Operationalizing Serverless: Five Key Considerations

Application Errors

Watch the Dials Where Dials Exist

Security

Costs

AWS Outages!!

Wrapping Up

Author

Andy Warzon

Similar Blog

Spotlight

How to Use IPv6 With AWS Services That Don't Support It

Spotlight

AWS Lambda Functions: Return Response and Continue Executing

Serverless

Replacing Amazon S3 Events with Amazon S3 Data Events