Saturday, October 1, 2022

[FIXED] How to guarantee serial processing of a Kinesis event stream when using a parallelization factor?

Issue

The Kinesis stream has only 1 shard, and when the Lambda was created, concurrent batches per shard for the Kinesis stream source was set to 10. When there is a spike in stream data, the concurrency will increase to 10, which means 10 Lambdas will be working in parallel. My question is: how can we guarantee that the event stream is processed serially? It seems impossible to me because we can't control the concurrency. Does anyone have an idea? I can't get my head around it.


Solution

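AWS Lambda keeps records in order per partition key, even when it processes several batches from one shard concurrently. So if every event in the Kinesis stream is written with the same partition key, the events are still processed serially, regardless of the parallelization factor.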

From AWS documentation:

You can also increase concurrency by processing multiple batches from each shard in parallel. Lambda can process up to 10 batches in each shard simultaneously. If you increase the number of concurrent batches per shard, Lambda still ensures in-order processing at the partition-key level.
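To make the partition-key guarantee concrete, here is a minimal Python sketch of the idea. It is a toy model, not Lambda's actual implementation: the names `lane_for` and `split_into_lanes` are hypothetical, and the MD5-modulo assignment is only an assumption used to illustrate that records with the same key always land in the same concurrent batch ("lane") and keep their arrival order there.

```python
import hashlib
from collections import defaultdict

PARALLELIZATION_FACTOR = 10  # concurrent batches per shard, as in the question


def lane_for(partition_key: str, lanes: int = PARALLELIZATION_FACTOR) -> int:
    """Map a partition key to one of `lanes` concurrent batches.

    Assumption for illustration only: Lambda's real assignment is internal.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % lanes


def split_into_lanes(records, lanes: int = PARALLELIZATION_FACTOR):
    """Group (partition_key, payload) records by lane, keeping arrival order per lane."""
    grouped = defaultdict(list)
    for partition_key, payload in records:
        grouped[lane_for(partition_key, lanes)].append((partition_key, payload))
    return dict(grouped)


# Every record shares one key: a single lane, so processing is strictly serial.
same_key = [("orders", i) for i in range(5)]
same_key_lanes = split_into_lanes(same_key)

# Mixed keys: possibly several lanes. Order holds per key, not across keys.
mixed = [(f"customer-{i % 3}", i) for i in range(9)]
mixed_lanes = split_into_lanes(mixed)
```

In this model, a stream whose records all use the same partition key collapses into one lane, which is why the single-key approach gives serial processing. With several keys, only the per-key order is preserved.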

References:

  1. Using AWS Lambda with Amazon Kinesis (AWS)
  2. Partition Key (Amazon Kinesis Data Streams Terminology and Concepts)


Answered By - Andrew Nguonly
Answer Checked By - Mildred Charles (PHPFixing Admin)
