Kinesis Cheat Sheet

Overview

  • Serverless (almost) stream processing
    • “Almost” because you still have to plan, manage and pay for shards
  • Can stream data (Data Streams) and video (Video Streams), store a stream to files (Firehose) and run stream analytics (Analytics)

Data Stream

  • Real-time streaming
  • Up to 1 MB payload per record
  • Stream partitioned by shards
  • Kinesis itself does not guarantee exactly-once delivery: producer and consumer retries can create duplicates, so delivery is at-least-once
    • Use a unique ID in message to handle duplicate messages in your application
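
The unique-ID approach above could look roughly like the sketch below. It is only an illustration: the `message_id` field, the `make_message` helper and the in-memory `seen_ids` set are assumptions made for this example, not part of any Kinesis API. A real consumer would probably need a durable store shared across workers.

```python
import json
import uuid


def make_message(payload: dict) -> bytes:
    """Producer side: attach a unique ID once, before the first send attempt.

    Retries must resend these same bytes so the ID stays identical.
    """
    envelope = {"message_id": str(uuid.uuid4()), "payload": payload}
    return json.dumps(envelope).encode("utf-8")


class DedupingHandler:
    """Consumer side: skip records whose message_id was already processed."""

    def __init__(self):
        # Hypothetical in-memory store, good enough for a sketch only.
        self.seen_ids = set()
        self.processed = []

    def handle(self, data: bytes) -> bool:
        """Process one record; return False if it was a duplicate."""
        envelope = json.loads(data.decode("utf-8"))
        message_id = envelope["message_id"]
        if message_id in self.seen_ids:
            return False  # duplicate caused by a retry; ignore it
        self.seen_ids.add(message_id)
        self.processed.append(envelope["payload"])
        return True
```

The ID is generated once per logical message, so a retried send carries the same ID and the consumer can drop the second copy.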

Firehose

  • Saves stream data to S3 (other destinations such as Redshift and OpenSearch are also supported)
  • Near-real-time by AWS's definition (roughly 60 seconds of latency because of buffering)
  • Can transform data before writing it to the target
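
Transformation is typically done with a Lambda function that receives a batch of base64-encoded records and returns one result per `recordId`. The handler below is a minimal sketch of that contract; the JSON payload and the `name` field it uppercases are assumptions chosen purely for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Uppercase a hypothetical 'name' field in each JSON record."""
    output = []
    for record in event["records"]:
        try:
            doc = json.loads(base64.b64decode(record["data"]))
            doc["name"] = str(doc.get("name", "")).upper()
            # Append a newline so records land on separate lines in the target file.
            transformed = (json.dumps(doc) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(transformed).decode("ascii"),
            })
        except (ValueError, TypeError, AttributeError):
            # Payload was not a JSON object: hand it back marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming `recordId` has to appear in the response, which is why failed records are returned rather than skipped.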

Analytics

  • Use SQL to analyze incoming data with a sliding window
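
To show what a sliding window computes, here is a small sketch in plain Python rather than the SQL that Analytics itself uses. The class name, the 60-second window in the usage note and the (count, average) output are all assumptions made for the example.

```python
from collections import deque


class SlidingWindowCounter:
    """Count and average the values seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp: float, value: float):
        """Add one event and return (count, average) for the current window.

        Timestamps are assumed to arrive in non-decreasing order.
        """
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that fell out of the window ending at `timestamp`.
        while self.events[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return len(self.events), self.total / len(self.events)
```

With `SlidingWindowCounter(60)`, each new event yields an aggregate over the most recent 60 seconds, so the window moves with every record instead of resetting at fixed intervals.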

Kinesis Producer Library

  • Java library built on a native core (not to be confused with the Kinesis Client Library, which is for consumers)
  • Used by the producer to send data to a stream
  • Records are buffered and batched before being sent, which adds latency
    • Use the Kinesis API (PutRecords) directly if the lowest latency is needed
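
Calling PutRecords directly might look like the sketch below. It assumes `client` behaves like `boto3.client("kinesis")`; the function name, the `key_field` parameter and the retry count are hypothetical choices for this example, and a production version would likely add backoff between attempts.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def put_records_direct(client, stream_name, items, key_field="id", max_attempts=3):
    """Send items with PutRecords, retrying only the entries that failed.

    Returns the entries that still failed after all attempts.
    """
    entries = [
        {"Data": json.dumps(item).encode("utf-8"), "PartitionKey": str(item[key_field])}
        for item in items
    ]
    failed = []
    for start in range(0, len(entries), MAX_RECORDS_PER_CALL):
        batch = entries[start:start + MAX_RECORDS_PER_CALL]
        for _ in range(max_attempts):
            response = client.put_records(StreamName=stream_name, Records=batch)
            if response.get("FailedRecordCount", 0) == 0:
                batch = []
                break
            # Results come back in request order; failed ones carry an ErrorCode.
            batch = [
                entry
                for entry, result in zip(batch, response["Records"])
                if "ErrorCode" in result
            ]
        failed.extend(batch)
    return failed


# Usage (requires boto3 and AWS credentials; stream name is a placeholder):
#   import boto3
#   leftover = put_records_direct(boto3.client("kinesis"), "my-stream", [{"id": 1}])
```

A PutRecords call can partially succeed, so the response has to be inspected per record. These retries are also one source of duplicate messages on the consumer side.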
