As we said five years ago in this post, “Its more formal and slightly less catchy name is Cloudwatch Events with a Scheduled Event Source and a Lambda Target… but we think “Lambda Cron” just rolls off the tongue a bit better.”
This 2022 article revisits that post to determine how “Lambda Cron” reliability has changed over the past five years. See the 2017 post for more background on the purpose of this series.
The Experiment, Revisited
The setup for this experiment mirrors the 2017 test as closely as possible. We deployed a scheduled invocation of a Lambda function that logs both the intended time of its invocation (provided in the “event” object the Lambda is invoked with) and the observed system time to a DynamoDB table. Unlike the 2017 test, which ran for 4 months, this test ran for only 1 month (Jan 11 - Feb 12). In addition, provisioned concurrency was enabled for the Lambda function. Because the function is invoked every minute, this configuration should not affect the results (the function should always be in a “warm” state). However, we took this precaution to rule out cold-start delays affecting the results.
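A minimal sketch of what such a logging function could look like is shown below. It is not the code used in the experiment. It assumes the scheduled event carries its intended time as an ISO 8601 UTC string in the “time” field, and the table name `lambda-cron-lag` and the item attribute names are hypothetical.

```python
from datetime import datetime, timezone
from decimal import Decimal

TABLE_NAME = "lambda-cron-lag"  # hypothetical table name, not from the experiment


def parse_scheduled_time(event):
    """Return the intended invocation time of a scheduled event as an aware UTC datetime."""
    # Assumes a timestamp such as "2022-01-11T12:00:00Z" in the event's "time" field.
    parsed = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def build_record(event, observed):
    """Build the item to store: intended time, observed time, and the lag between them."""
    scheduled = parse_scheduled_time(event)
    lag_seconds = (observed - scheduled).total_seconds()
    return {
        "scheduled_time": scheduled.isoformat(),
        "observed_time": observed.isoformat(),
        # boto3's DynamoDB resource rejects Python floats, so the lag is stored as a Decimal.
        "lag_seconds": Decimal(str(lag_seconds)),
    }


def handler(event, context):
    """Lambda entry point: record the observed system time next to the intended time."""
    import boto3  # imported lazily so the helpers above can be exercised without AWS access

    record = build_record(event, datetime.now(timezone.utc))
    boto3.resource("dynamodb").Table(TABLE_NAME).put_item(Item=record)
    return {"lag_seconds": float(record["lag_seconds"])}
```

Keeping `build_record` free of AWS calls makes the lag calculation easy to check locally with a hand-written event.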
As mentioned in the 2017 article, when an execution is not logged for a given time interval, it is difficult to say whether CloudWatch Events in fact failed to trigger, since other variables are in play, such as issues with Lambda or DynamoDB. Because of this, the 2022 test does not attempt to measure the likelihood of CloudWatch Events failing to invoke the function. Instead, the results are analyzed here only with regard to the delay of the invocations.
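For readers who want to run a similar analysis, the sketch below shows one way to summarize logged lag values by percentile. The article does not state which percentile method was used, so the nearest-rank method here is an assumption, and the function names are illustrative.

```python
import math


def nearest_rank_percentile(sorted_values, pct):
    """Return the pct-th percentile of an ascending list using the nearest-rank method."""
    if not sorted_values:
        raise ValueError("need at least one value")
    # Round before taking the ceiling to absorb floating-point noise in pct * n / 100.
    rank = math.ceil(round(pct * len(sorted_values) / 100, 9))
    return sorted_values[max(rank, 1) - 1]


def summarize_lag(lags, percentiles=(50, 90, 99, 99.9, 99.99)):
    """Summarize lag values (in seconds) by percentile, plus the maximum, as whole seconds."""
    ordered = sorted(lags)
    summary = {f"p{p}": round(nearest_rank_percentile(ordered, p)) for p in percentiles}
    summary["max"] = round(ordered[-1])
    return summary
```

Running `summarize_lag` once per region on that region's recorded lags would yield one row of a percentile table like the ones in this post.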
2022 Execution Time Lag for Lambda Cron, rounded to nearest second, per region, by percentile, with maximum value included
2017 results, rounded to nearest second, per region, by percentile
- Notice that the 99.9th and 99.99th percentiles for us-east-1 have improved tremendously. Whereas in 2017 the values were 585 and 2537 seconds, respectively, the 2022 results are 36 and 65 seconds.
- Otherwise, the overall status of “Lambda Cron” has not changed much. As the column labeled “MAX” shows, delays can still exceed half an hour.
Conclusion: Proceed with Caution
Depending on the use case, the flaws seen in Lambda Cron may be of no consequence. The service triggers Lambda within roughly a minute 99.9% of the time. If the occasional half-hour delay in execution is acceptable, these results should be of little concern. However, as the data show, Lambda Cron cannot be relied upon to always execute Lambda within seconds. Some executions will occur more than a minute late, so systems relying on Lambda Cron must be able to tolerate delays in Lambda execution.
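One way to tolerate such delays, sketched below, is to derive the period a run is responsible for from the scheduled time in the event rather than from the wall clock, so that a late invocation still processes the minute it was meant to. This is an illustrative pattern, not something tested in this experiment. The helper names, the one-minute period, and the one-minute tolerance are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def scheduled_time(event):
    """Parse the intended invocation time from the event's "time" field (ISO 8601, UTC)."""
    parsed = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def work_window(event, period=timedelta(minutes=1)):
    """Return the (start, end) interval this run covers, anchored to the scheduled time."""
    end = scheduled_time(event)
    return end - period, end


def is_late(event, now, tolerance=timedelta(minutes=1)):
    """Report whether the invocation arrived later than the tolerated delay."""
    return now - scheduled_time(event) > tolerance
```

A handler could call `work_window(event)` to decide which data to process and use `is_late(event, datetime.now(timezone.utc))` to emit a metric or alert when an invocation arrives unusually late.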