Serverless Best Practices: Operations - Think FaaS Podcast

You've also heard some of the buzzwords in this area like structured logging, tracing, observability, anomaly detection... let's talk ops.

Forrest Brazeal | Jul 17 2018

Thu, 19 Jul 2018

Transcript

And we’re back again, I’m Jared Short at Trek10, and this is ‘Think FaaS’, where we learn about the world of serverless computing in less time than it takes to run an AWS Lambda function. So put five minutes on the clock - it’s time to ‘Think FaaS’.

We are continuing our Serverless Best Practices series again this week, focusing on operating serverless systems. This is a huge area with evolving best practices. I’m sure you’ve also heard some of the buzzwords in this area, like structured logging, tracing, observability, anomaly detection, etc. There’s a lot to get into, so let’s go!

Application Errors

This is probably the most obvious and most common. Apps have errors and you have to watch for them… not exactly groundbreaking. What is a bit different is figuring out the right threshold for alerting. For a low-volume system it may make sense to alert on every error, but for any sufficiently high-volume system you need to do some work to weed out the noise that comes from typical transient errors. Typically, some background rate of errors, under 0.1%, is always going to occur. With asynchronous Lambda invocations (for example, from an S3 object event), use dead-letter queues to capture the events that ultimately fail; you can then usually ignore transient errors, knowing that AWS will retry and usually succeed.
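
As a rough illustration of that thresholding idea, here is a minimal Python sketch. The `ErrorWindow` type, the `should_alert` function, and the `min_invocations` cutoff are hypothetical names and values chosen for this example; only the 0.1% figure comes from the discussion above. In practice the counts would come from your metrics source for a fixed time window.

```python
from dataclasses import dataclass


@dataclass
class ErrorWindow:
    """Invocation and error counts for one monitoring window (hypothetical type)."""
    invocations: int
    errors: int


def should_alert(window, max_error_rate=0.001, min_invocations=1000):
    """Decide whether a window of traffic deserves an alert.

    Low-volume windows alert on any error; high-volume windows alert only
    when the error rate exceeds max_error_rate (0.1% by default).
    """
    if window.invocations == 0:
        return False
    if window.invocations < min_invocations:
        # Low volume: every error is worth a look.
        return window.errors > 0
    # High volume: tolerate a background rate of transient errors.
    return window.errors / window.invocations > max_error_rate
```

With these assumed defaults, one error in 50 invocations alerts, while 50 errors in 100,000 invocations (0.05%) does not.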

You may also want to consider tools that improve your visibility beyond the built-in basics of CloudWatch metrics and Lambda logs in CloudWatch Logs. Error tracking services like Sentry or Rollbar work just as well for tracking errors here as they do in traditional architectures. When it comes to tracing, though, you’ll need to look at a new generation of tools: AWS X-Ray and IOpipe are two of the more popular options.

Watch the Dials Where Dials Exist

While scaling with AWS platform services is mostly transparent, that’s not 100% the case. There are a few dials in the system; it is important to know where they are and how to monitor them to optimize scalability and costs. Some are obvious and easily visible, like DynamoDB Provisioned Throughput (which also has auto-scaling now, by the way) or Kinesis shards. Others are slightly more hidden, like Lambda Concurrency Limits. Still others, like S3 pre-partitioning, are completely hidden and can only be monitored by observing symptoms like S3 error rate or PUT latency. Carefully review each part of the system to identify all of the relevant dials.
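
One way to keep an eye on the visible dials is to compare what you are consuming against each limit and flag anything close to the ceiling. The sketch below assumes you have already collected a (consumed, limit) pair per dial from your metrics; the dial names, the `dials_near_limit` function, and the 80% warning level are illustrative assumptions, not AWS identifiers or recommendations.

```python
def utilization(consumed, limit):
    """Fraction of a limit currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return consumed / limit


def dials_near_limit(dials, warn_at=0.8):
    """Return the names of dials at or above the warning threshold.

    dials maps a dial name to a (consumed, limit) pair, for example
    consumed vs. provisioned write capacity, or concurrent executions
    vs. the account concurrency limit.
    """
    return sorted(
        name
        for name, (consumed, limit) in dials.items()
        if utilization(consumed, limit) >= warn_at
    )
```

For example, a table consuming 450 of 500 provisioned write units would be flagged, while a function using 200 of 1,000 concurrent executions would not. Hidden dials such as S3 partitioning cannot be checked this way and still need symptom-based monitoring.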

Security

We had a whole episode on this from Forrest earlier in this series, so just remember that security is never a solved problem… the risks just shift. With no long-running VM and often no network to manage, Serverless greatly reduces the attack surface for many traditional threats, which lets you shift your focus to other threat areas. Focus on IAM and tighten down those policies. For web applications, you still need to address the usual OWASP Top 10 risks. Finally, your application dependencies are a key factor. A small but emerging ecosystem of tools focuses on analyzing your project’s dependencies to validate that they are coded securely and have not been compromised by attackers. Snyk and PureSec are two interesting ones to keep an eye on.
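
To make "tightening down those policies" a little more concrete, here is a small Python sketch that scans an IAM policy document (as a parsed JSON dictionary) for `Allow` statements with wildcard actions or a bare `*` resource. The function name `wildcard_statements` and the two checks are assumptions made for this example; it is a simple lint, not a substitute for a proper policy review.

```python
def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    return [value] if isinstance(value, (str, dict)) else list(value)


def wildcard_statements(policy):
    """Return (statement_index, finding) pairs for overly broad Allow statements."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        # "*" or "service:*" grants every action (in that service).
        if any(action == "*" or action.endswith(":*") for action in actions):
            findings.append((index, "wildcard action"))
        # A bare "*" resource applies the actions to everything.
        if "*" in resources:
            findings.append((index, "wildcard resource"))
    return findings
```

A statement allowing `dynamodb:*` on `*` would produce both findings, while one allowing only `s3:GetObject` on a specific bucket prefix would produce none.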

Costs

This is sort of the flipside of a great benefit of Serverless: cost is truly usage-based… but cost is truly usage-based. If you get unwanted or unexpected traffic, costs could spike quickly. So it is important to monitor costs on a daily basis so you can quickly detect any cost spikes and block the offending traffic or optimize your application to minimize costs.
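
A daily cost check can be as simple as comparing today's spend with a recent baseline. The sketch below is one possible heuristic: the `is_cost_spike` name, the 2x multiplier, and the minimum dollar delta are all assumptions for illustration, and the daily figures would come from whatever billing data you already export.

```python
from statistics import mean


def is_cost_spike(previous_daily_costs, today_cost, factor=2.0, min_delta=5.0):
    """Flag today's cost if it is well above the recent daily average.

    A spike requires both a relative jump (factor times the baseline) and an
    absolute jump (min_delta, in your billing currency) so that tiny accounts
    do not alert on pennies.
    """
    if not previous_daily_costs:
        return False  # No baseline yet, so nothing to compare against.
    baseline = mean(previous_daily_costs)
    return today_cost > baseline * factor and (today_cost - baseline) > min_delta
```

With a week averaging 11 per day, a day at 30 is flagged and a day at 15 is not.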

AWS Outages

All of the AWS platform services are by default running in multiple AWS Availability Zones (AZs, each of which is one or more data centers with independent power and network within a given AWS Region), so in theory 2-3 AWS data centers would need to go down simultaneously to cause an outage, which is a very uncommon scenario (i.e. much less often than once a year). But in reality these services do have cross-AZ dependencies and do suffer region-wide outages. In the past 15 months there have been multiple outages of services like DynamoDB, S3, and Lambda. So this is a real thing that you need to plan for.

The first step is determining the extent to which you can build for multi-region failover or possibly even multi-region active-active. Next, build your operational response plan for these outages. While your ops team may not be able to fix AWS’s issue, it still has a key role to play: identify as early as possible that there is a problem, trace the root cause to the AWS services, look for confirmation from AWS that the problem is on their side (usually from AWS Support first and then, with some lag, the AWS Status Page), and then communicate effectively to end users, initiate failover plans as appropriate, and monitor status on the AWS side.
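
If you do build for multi-region failover, part of the response plan is deciding when to actually switch. The following Python sketch shows one conservative rule: fail over only after several consecutive failed health checks in the primary region, and only if the secondary currently looks healthy. The function names, the three-failure threshold, and the shape of the `health` data are assumptions for this example; real health signals would come from your own monitoring.

```python
def recent_failures(checks, count):
    """True if the last `count` health checks all failed."""
    return len(checks) >= count and not any(checks[-count:])


def choose_region(health, primary, secondary, failures_to_failover=3):
    """Pick the region that should serve traffic.

    health maps a region name to a list of recent health check results
    (True means healthy), oldest first.
    """
    primary_down = recent_failures(health.get(primary, []), failures_to_failover)
    secondary_checks = health.get(secondary, [])
    secondary_up = bool(secondary_checks) and secondary_checks[-1]
    if primary_down and secondary_up:
        return secondary
    # Stay put on transient blips, or when the secondary is no better.
    return primary
```

Requiring consecutive failures avoids flapping on transient errors, and checking the secondary first avoids failing over into a region that is also affected.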

Whew, we made it! A timely reminder that only one episode of Think FaaS remains until Trek10, my co-host Forrest, and I will be at ServerlessConf San Francisco. We’ll be putting on a couple of sessions of “Think FaaS Live” with a bunch of bright folks throwing out nuggets of gold in rapid-fire fashion. Will you be there? Let us know on Twitter @trek10inc, @shortjared or @forrestbrazeal. Hope to see you there and on the next episode of Think FaaS!
