If you have been following along with part 1 and part 2 of this series, you may be wondering what else you can do to further optimize your Kinesis workflows. Fortunately, in this blog post, we will explore some of the more niche features and help you understand when you may (or may not) want to use them.
The next two sections depend on a piece of infrastructure called an Event Source Mapping. An Event Source Mapping handles the compute needed to route messages to Lambda functions. Within the AWS console, these are created simply by creating a Trigger for a Lambda function. This resource contains all the configuration needed to manage how AWS Lambda will interact with Amazon Kinesis. By default, this resource will run one concurrent Lambda invocation per shard.
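For reference, the same mapping can be created through the Lambda API. Below is a minimal sketch of the request in Python; the ARN and function name are placeholders rather than real resources, and the boto3 call itself is left commented out since it requires live AWS credentials:

```python
# Request body for Lambda's CreateEventSourceMapping API. The stream ARN
# and function name below are placeholders, not resources from this series.
create_mapping_request = {
    "EventSourceArn": "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream",
    "FunctionName": "example-consumer",
    "StartingPosition": "LATEST",  # or TRIM_HORIZON to start at the oldest record
    "BatchSize": 100,              # records handed to each invocation
}

# With credentials configured, this would be sent as:
#   import boto3
#   boto3.client("lambda").create_event_source_mapping(**create_mapping_request)
print(create_mapping_request["FunctionName"])
```

This is the same resource the console builds for you when you add a Kinesis trigger; every setting discussed below lives on this one object.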
Another way to scale up our streams is to provide a parallelization factor. This value does not technically pertain to the Kinesis stream itself; rather, it is a setting on the Event Source Mapping. That means this solution can only work for problems where the consumer is a Lambda function. The goal of this mechanism is to provide a way to scale up your Lambda consumers without needing to increase the number of shards. Depending on your choice of Parallelization Factor, it can run up to 10 concurrent Lambda invocations per shard.
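As a sketch, and assuming an existing mapping whose UUID you have on hand (the value below is a placeholder), setting the factor is a one-field update to the Event Source Mapping:

```python
# Request body for Lambda's UpdateEventSourceMapping API.
# The UUID is a placeholder for a real mapping's identifier.
update_mapping_request = {
    "UUID": "mapping-uuid-placeholder",
    "ParallelizationFactor": 3,  # valid range is 1-10 concurrent batches per shard
}

# With credentials configured, this would be sent as:
#   import boto3
#   boto3.client("lambda").update_event_source_mapping(**update_mapping_request)
print(update_mapping_request["ParallelizationFactor"])
```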
The immediate reaction to this information should be an understandable concern for the ordering of your messages. However, AWS has implemented this in such a way that order is still guaranteed per partition key. How does AWS achieve this?
Looking back to our previous example with our manually split shards, we notice that each shard contains two unique partition keys. During each new poll of records, the Event Source Mapping will batch records by their unique partition keys and then distribute those batches evenly across the concurrent Lambda invocations. The implication here is that, given our example of 2 unique values per shard, a factor greater than 2 would not improve performance: the Event Source Mapping would never create more than 2 unique sets of batches. This ensures that order is maintained within a partition key; interestingly, however, you no longer have a guarantee of order along a shard. Below is a picture to help illustrate what is happening, with a related example and a parallelization factor of 3:
Notice in this diagram that you cannot guarantee the payload with sequence value 3 is processed before 5; however, you can guarantee that the payload with sequence value 1 will be processed before 5. This is another example of why picking an appropriate partition key for ordered messages is a critical choice in your Kinesis implementation.
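To make the batching behavior concrete, here is a toy Python model of the mechanism described above; the `hash()` call merely stands in for whatever internal key-to-batch assignment AWS actually uses:

```python
from collections import defaultdict

def fan_out(records, parallelization_factor):
    """Toy model of how an Event Source Mapping splits one shard's poll
    into per-invocation batches: records are grouped by partition key,
    and every record for a given key always lands in the same batch,
    preserving per-key order. Illustration only, not AWS's actual code."""
    batches = defaultdict(list)
    for record in records:
        # hash() stands in for the internal key-to-batch mapping.
        slot = hash(record["partition_key"]) % parallelization_factor
        batches[slot].append(record)
    return list(batches.values())

# One shard's poll: two interleaved partition keys, sequence numbers ascending.
poll = [
    {"partition_key": "user-a", "seq": 1},
    {"partition_key": "user-b", "seq": 2},
    {"partition_key": "user-a", "seq": 3},
    {"partition_key": "user-b", "seq": 4},
]

# With only 2 unique keys, a factor of 3 still yields at most 2 batches.
for batch in fan_out(poll, parallelization_factor=3):
    print([r["seq"] for r in batch])
```

Notice that within each printed batch the sequence numbers for any single key stay in order, but no single invocation sees the shard's full sequence, which is exactly the shard-level ordering trade-off described above.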
This solution is ideal for cases where processing the data takes long enough to cause a backlog. For example, suppose the records discussed above fall below the maximum write threshold for a Kinesis shard (1 MB/s or 1,000 records/s), but processing the shard's records takes longer than a second. More concretely, say you receive 500 records per second on a shard, but your Lambda function takes 1.2 seconds to process those 500 records. This means your Lambda will not be able to keep up with your throughput. You could resolve this by increasing the shards, but resharding can only be done so often and results in higher cost. Splitting your 500 records into two batches of 250 (once again, assuming evenly distributed records along your partitions) could roughly halve the time each Lambda invocation takes, meaning you can handle your throughput without having to scale up shards.
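The arithmetic in this example can be sketched as a quick back-of-the-envelope check (the function name is ours, and the model assumes records split evenly across concurrent invocations):

```python
def keeps_up(batch_seconds, parallelization_factor=1):
    """Can the consumer drain one second's worth of records in under a
    second? Splitting the work across more concurrent invocations
    divides the per-invocation processing time (assuming records are
    spread evenly across partition keys)."""
    effective_seconds = batch_seconds / parallelization_factor
    return effective_seconds <= 1.0

# 500 records/s processed in 1.2 s: the consumer falls behind...
print(keeps_up(1.2))                            # False
# ...but a parallelization factor of 2 halves each batch: 0.6 s per invocation.
print(keeps_up(1.2, parallelization_factor=2))  # True
```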
Enhanced Fan Out
Enhanced Fan Out (EFO) is a solution to a different kind of scaling problem. One of the advantages of Kinesis is the ability to have multiple consumers read from a single Kinesis stream. However, on streams with sufficiently high throughput and a sufficient number of consumers, we can run into a couple of scaling problems strictly on the consumer side. The GetRecords API call has some important limits to keep in mind. First, each shard supports only 5 GetRecords transactions per second. A Lambda Event Source Mapping performs 1 call every second, which means that once you add a 6th unique Lambda function, you have exceeded your limit. Kinesis Firehose, a favorite pairing with Kinesis, also performs the call once per second, so a Firehose reduces the room for Lambda event source mappings down to 4. In addition, if a shard returns more than 10 MB within a 5-second window, further reads will be throttled for the remainder of that window. Concretely, if a single call returns 10 MB, then all other calls for the 5-second window will be throttled.
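The shared-throughput budget described above boils down to simple arithmetic; here is a small sketch (the function is ours for illustration, not an AWS API):

```python
GET_RECORDS_TPS_PER_SHARD = 5  # GetRecords calls per second, per shard

def remaining_poll_budget(polling_consumers):
    """How many once-per-second pollers a shard can still support.
    Each standard (shared-throughput) consumer that polls once per
    second -- a Lambda Event Source Mapping or a Firehose delivery
    stream -- uses one of the shard's five GetRecords calls per second."""
    return GET_RECORDS_TPS_PER_SHARD - polling_consumers

print(remaining_poll_budget(5))  # 0: a 6th polling consumer exceeds the limit
print(remaining_poll_budget(2))  # 3: e.g. one Firehose + one Lambda leaves 3
```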
The old workaround for this issue was to duplicate your Kinesis stream, either by having a Lambda function simply forward your records to another stream or by having your producer write to multiple streams. AWS has since implemented another solution for us and created a resource called a ‘consumer’. A consumer acts as a dedicated read channel on your Kinesis stream, with its own throughput and an HTTP/2 endpoint. This means that EFO will actually push records to a subscriber. Each consumer may have exactly one subscriber, and each stream can have up to 20 consumers. Calling the SubscribeToShard API returns an event stream object, which produces real-time values from Kinesis pushed to you over the HTTP/2 endpoint.
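As a sketch, the EFO flow is two API calls: register a consumer on the stream, then subscribe to a shard. All identifiers below are placeholders, and the boto3 calls are commented out because they require live AWS resources:

```python
# Request bodies for the two Kinesis EFO calls (placeholder identifiers).
register_request = {
    "StreamARN": "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream",
    "ConsumerName": "example-efo-consumer",
}
subscribe_request = {
    # ConsumerARN is returned by RegisterStreamConsumer; placeholder shown here.
    "ConsumerARN": "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream"
                   "/consumer/example-efo-consumer:1",
    "ShardId": "shardId-000000000000",
    "StartingPosition": {"Type": "LATEST"},
}

# With credentials and a real stream, this would run as:
#   import boto3
#   kinesis = boto3.client("kinesis")
#   kinesis.register_stream_consumer(**register_request)
#   sub = kinesis.subscribe_to_shard(**subscribe_request)
#   for event in sub["EventStream"]:  # records are pushed over HTTP/2
#       handle(event["SubscribeToShardEvent"]["Records"])
print(subscribe_request["ShardId"])
```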
EFO is therefore very powerful for two reasons: it allows you to provide dedicated throughput to a single service, and it reduces the effective latency, since records are pushed as they arrive rather than waiting on a poll.
Kinesis is a very powerful AWS tool that provides the ability to interact with data in real time. When dealing with data that must be ordered, Kinesis can present many unexpected issues that are difficult to resolve without a deeper understanding of the service itself. With Kinesis, you must be very aware of how the partition key values you’ve chosen affect the performance of the service.