Cloud Native

Making Harmony with a Step Function Orchestrator

Be the maestro of your step functions with a simple orchestrator state machine.

Jessica Ribeiro | Jun 19 2022
4 min read

You can use AWS Step Functions to run complex serverless workflows on demand. This extends the utility of AWS Lambda, enabling us to build support for batch jobs, long-running processes, pauses for external processes, and more. However, complexity brings overhead in both understanding and modifying a workflow.

Have you ever been asked to modify a design late in development because of a late-emerging requirement? How about being asked to dig up an old project and add just a “small” new feature? In the worst cases, this can be a moment of dread. Assumptions baked into a system can make it difficult to modify and extend in certain ways, and the worst of these spider throughout an entire system architecture.

As with AWS CloudFormation stacks and microservices, it can be important with AWS Step Functions to separate the concerns and responsibilities of a system while still making it possible to orchestrate their deployment and usage. One approach is to break loosely related workflows into separate state machines. To make these workflows function as a unified system, something has to orchestrate the launch of all of the component state machines. In this post, we are going to cover the construction of a simple orchestration state machine that runs multiple other state machines with some advanced flow controls and debug hooks.

Recently, I had a need to simplify the execution of multiple loosely-related tasks under a single trigger. Given that some of these were already built as step functions in the same repository, I could have chosen to combine them into a single state machine, but the resulting machine would have been more difficult to understand and extend in the future. Instead, I chose to keep them separate and introduced an orchestration state machine that requires no additional AWS Lambda functions to orchestrate all of the target state machines, allowing it to run from a single trigger as required. Here was the list of requirements for the orchestrator:

The orchestrator needed to have the option to delay on a per-state-machine basis. To make development with this feature easier, a debug flag to skip delays was also needed.
It needed to allow for reconfiguration or running the same state machine in parallel with different configurations.
For ease of development, it had to allow individual state machines to be disabled easily.

Given those requirements, we can dive into the solution. For reference, this solution was built with v1.36.0 of AWS SAM CLI. The following YAML definition and state diagram represent the orchestrator state machine that is explained throughout the rest of this post.

orchestrator.asl.yaml

Comment: An example of combining workflows using a Step Functions StartExecution task
  state with various integration patterns.
StartAt: Inject target state machine data
States:
  Inject target state machine data:
    Comment: Injects ARNs of target state machines
    Type: Pass
    Next: Start in parallel
    Parameters:
      payload.$: "$.payload"
      debug:
        skipDelays: false
      targets:
      - stateMachineArn: "${MyStateMachineArn}"
        disableTarget: false
        delay: 60
        configuration:
          someFeature: true
      - stateMachineArn: "${MyStateMachineArn}"
        disableTarget: false
        configuration:
          someFeature: false
  Start in parallel:
    Comment: Start child state machines in parallel dynamically with map
    Type: Map
    End: true
    ItemsPath: "$.targets"
    MaxConcurrency: 0
    Parameters:
      debug.$: "$.debug"
      payload.$: "$.payload"
      target.$: "$$.Map.Item.Value"
    Iterator:
      StartAt: Skip or Delay or Execute
      States:
        Skip or Delay or Execute:
          Comment: Skip/Delay/Execute
          Type: Choice
          Default: Execute
          Choices:
          - Next: Skip
            And:
            - Variable: "$.target.disableTarget"
              IsPresent: true
            - Variable: "$.target.disableTarget"
              BooleanEquals: true
          - Next: Delay
            And:
            - Variable: "$.target.delay"
              IsPresent: true
            - Variable: "$.target.delay"
              IsNumeric: true
            - Or:
              - Variable: "$.debug.skipDelays"
                IsPresent: false
              - Not:
                  Variable: "$.debug.skipDelays"
                  BooleanEquals: true
        Skip:
          Comment: End
          Type: Pass
          End: true
        Delay:
          Comment: Delay
          Type: Wait
          SecondsPath: "$.target.delay"
          Next: Execute
        Execute:
          Comment: Execute target state machine dynamically from input
          End: true
          Type: Task
          Resource: arn:aws:states:::states:startExecution.sync
          Parameters:
            StateMachineArn.$: "$.target.stateMachineArn"
            Input:
              NeedCallback: false
              AWS_STEP_FUNCTIONS_STARTED_BY_EXECUTION_ID.$: "$$.Execution.Id"
              payload.$: "$.payload"
              configuration.$: "$.target.configuration"

The orchestrator state machine can be broken into a few phases:

Merge input state with target state machine configuration data
Iterate the targets
Skip, delay, or execute each target

The first step uses a Pass state to merge configuration data, including the list of target state machine configurations, into the state. Each state machine configuration is composed of the target state machine’s ARN, the options for disabling or delaying a target state machine’s execution, and configuration specific to the target state machine. A particular state machine ARN, which is supplied via a variable substitution from the SAM template, can be reused in multiple targets with each having its own configuration. In addition to the list of targets, debug settings, such as the option to skip delays, are included in the configuration data. The input to the target state machines is expected to be contained in a payload property on the orchestrator input JSON.
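To make the shape of that data concrete, here is a minimal Python sketch that mimics what the Pass state's Parameters block produces for a given input. It is only an illustration: Step Functions performs this step itself without any code, and `inject_targets`, `EXAMPLE_ARN`, and the sample payload are hypothetical names that do not appear in the project.

```python
import json

# Hypothetical ARN standing in for the value substituted for ${MyStateMachineArn}.
EXAMPLE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:MyStateMachine"


def inject_targets(execution_input: dict) -> dict:
    """Mimic the Pass state: carry over $.payload and add static debug/targets data."""
    return {
        "payload": execution_input["payload"],  # payload.$: "$.payload"
        "debug": {"skipDelays": False},
        "targets": [
            {
                "stateMachineArn": EXAMPLE_ARN,
                "disableTarget": False,
                "delay": 60,
                "configuration": {"someFeature": True},
            },
            {
                # Same state machine, different configuration, no delay.
                "stateMachineArn": EXAMPLE_ARN,
                "disableTarget": False,
                "configuration": {"someFeature": False},
            },
        ],
    }


state = inject_targets({"payload": {"orderId": "example-123"}})
print(json.dumps(state, indent=2))
```

Because the Pass state defines Parameters without a ResultPath, its output is exactly this document, so any top-level input properties other than payload are not carried forward.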

The next step uses a Map state to iterate over the targets in order to decide how to handle each one. The payload, debug settings, and the current target's configuration are passed along to each iteration. Here the MaxConcurrency: 0 property places no limit on concurrency, which assumes that it is OK to run as many of the targets in parallel as AWS limits will permit. Concurrency should be limited if the workflow requires it.
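As a rough illustration of what each iteration receives, the Python sketch below reproduces the effect of the Map state's ItemsPath and Parameters. The function name `map_iteration_inputs` and the sample state are assumptions made for this example, not part of the project.

```python
from __future__ import annotations


def map_iteration_inputs(state: dict) -> list[dict]:
    """Mimic the Map state's Parameters: build one input document per target."""
    return [
        {
            "debug": state["debug"],      # debug.$: "$.debug"
            "payload": state["payload"],  # payload.$: "$.payload"
            "target": target,             # target.$: "$$.Map.Item.Value"
        }
        for target in state["targets"]    # ItemsPath: "$.targets"
    ]


# Hypothetical state, shaped like the output of the first step.
example_state = {
    "payload": {"orderId": "example-123"},
    "debug": {"skipDelays": False},
    "targets": [
        {"stateMachineArn": "arn:example:one", "delay": 60},
        {"stateMachineArn": "arn:example:two"},
    ],
}

iteration_inputs = map_iteration_inputs(example_state)
for item in iteration_inputs:
    print(item["target"]["stateMachineArn"], sorted(item))
```

Every iteration sees the same payload and debug settings but only its own target, which is what lets one state machine ARN appear several times with different configurations.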

Inside the map iteration, each target is run through a Choice state to decide if it needs to be skipped, executed now, or executed after a delay. It is worth noting that additional decision-making logic, including custom AWS Lambda functions, could be added at this point in the state machine to provide additional control. If the target state machine is executed, it is invoked with the payload property, the target state machine’s configuration property, and some AWS-specific flags: one to specify that the target does not use a callback and another to connect the two Step Functions executions by execution ID. Once all of the target state machines have been executed or skipped, the orchestrator completes its run. Now that we have a complete understanding of the orchestrator state machine, we can look at the AWS CloudFormation needed to deploy it with a set of state machines.
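Before moving on, the Choice rules can be restated as ordinary code, which makes them easy to check against a few inputs. This is a minimal sketch rather than anything the orchestrator runs: `decide` is a hypothetical helper, and it assumes the rules are evaluated in the order they are listed, with Execute as the default.

```python
def decide(item: dict) -> str:
    """Return "Skip", "Delay", or "Execute", mirroring the Choice state's rules."""
    target = item.get("target", {})
    debug = item.get("debug", {})

    # Rule 1: disableTarget is present and true.
    if target.get("disableTarget") is True:
        return "Skip"

    # Rule 2: delay is present and numeric, and skipDelays is absent or not true.
    delay = target.get("delay")
    # bool is a subclass of int in Python, so exclude it explicitly.
    delay_is_numeric = isinstance(delay, (int, float)) and not isinstance(delay, bool)
    skip_delays = debug.get("skipDelays") is True
    if delay_is_numeric and not skip_delays:
        return "Delay"

    # Default: no rule matched.
    return "Execute"


print(decide({"target": {"delay": 60}, "debug": {"skipDelays": False}}))
print(decide({"target": {"delay": 60}, "debug": {"skipDelays": True}}))
print(decide({"target": {"disableTarget": True, "delay": 60}, "debug": {}}))
```

The function returns only the first transition. In the state machine, Delay continues on to Execute once the wait is over, while Skip ends the iteration.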

template.yaml snippet

# Resources:
  OrchestrationControlStatemachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: statemachine/orchestrator.asl.yaml
      Policies:
        - Version: 2012-10-17
          Statement:
            - Effect: Allow
              Action:
                - states:DescribeExecution
                - states:StopExecution
              Resource: '*'
            - Effect: Allow
              Action:
                - events:PutTargets
                - events:PutRule
                - events:DescribeRule
              Resource: !Sub arn:${AWS::Partition}:events:${AWS::Region}:${AWS::AccountId}:rule/StepFunctionsGetEventsForStepFunctionsExecutionRule
# add one of these execution policy templates for each target state machine
        - StepFunctionsExecutionPolicy:
            StateMachineName: !GetAtt MyStateMachine.Name # Target Name
      DefinitionSubstitutions:
# add one of these variable substitutions for each target state machine
        MyStateMachineArn: !Ref MyStateMachine # Target ARN, matches ${MyStateMachineArn} in the definition

With the state machine defined, it just needs to be added to a SAM template and deployed to AWS. The two main things worth highlighting here are the policies and substitutions. The orchestrator must be granted the ability to get details about (states:DescribeExecution) and stop (states:StopExecution) any of the target state machine executions. The events permissions are required by the .sync integration, which relies on a managed EventBridge rule to detect when a target execution has finished. The orchestrator also needs to be able to start each target state machine. The SAM built-in policy template StepFunctionsExecutionPolicy grants states:StartExecution against a given state machine by name. Add one of these policies for each target state machine by name (NOT by ARN). In addition, each target state machine ARN referenced in the state machine definition needs to be supplied under the DefinitionSubstitutions property, keyed by the variable name used in the definition.
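Once deployed, the orchestrator can be started from any single trigger. As one possible sketch, the Python helper below starts an execution through a client that behaves like `boto3.client("stepfunctions")`. The helper name `start_orchestrator`, the execution naming scheme, and the sample payload are assumptions for illustration. The client is passed in so the function can be exercised without AWS credentials.

```python
import json
import uuid


def start_orchestrator(sfn_client, orchestrator_arn: str, payload: dict) -> str:
    """Start one orchestrator execution and return its execution ARN.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    """
    response = sfn_client.start_execution(
        stateMachineArn=orchestrator_arn,
        name=f"orchestrator-{uuid.uuid4()}",  # hypothetical naming scheme
        # The orchestrator expects the target input under a "payload" property.
        input=json.dumps({"payload": payload}),
    )
    return response["executionArn"]


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# arn = start_orchestrator(
#     boto3.client("stepfunctions"),
#     "<orchestrator state machine ARN>",
#     {"orderId": "example-123"},
# )
```

Wrapping the caller's data in a payload property matters here, because the first state reads `$.payload` and the execution fails if that path is missing.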

With this technique in your toolbox, you can now design and deploy complex maintainable state machine architectures that are completely code-defined. Modifications and debugging become much easier to reason about and perform. If you are looking for more hands-on help with this, head over to our contact page to talk to Trek10.