Amazon Elasticsearch Service (AES) Cheat Sheet

  • Managed Elasticsearch + Kibana

Components

  • Domain = Elasticsearch cluster + options (instance type, counts, storage options, etc.)
  • Nodes = Instances
    • Dedicated master node (optional) = cluster management nodes, do not process data
      • Required for cluster with 10+ instances
    • Data node = stores data using EBS, processes data
    • UltraWarm node = stores data in S3, processes data

Encryption

  • At rest
    • Can be enabled, uses KMS to manage key
    • Indices, logs, swap, automatic snapshots, application directory are encrypted
      • NOT encrypted: manual snapshots, slow / error logs
    • Only symmetric CMK is supported
    • Packages (custom dictionaries) are always encrypted using AES-managed keys
  • Node-to-node
    • Each AES domain always resides in its own VPC to prevent traffic interception, regardless of whether it uses VPC access
    • By default, traffic within the AES VPC is not encrypted
    • When node-to-node encryption is enabled, in-VPC traffic uses TLS
    • Can only be enabled at domain creation and cannot be disabled afterwards; to change the setting, create a new domain and migrate to it
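The encryption settings above are chosen when the domain is created. The sketch below builds the matching arguments for boto3's create_elasticsearch_domain call. The helper name encryption_kwargs and the key ARN are illustrative assumptions; the code only builds a dictionary and does not contact AWS.

```python
def encryption_kwargs(at_rest=True, kms_key_id=None, node_to_node=True):
    """Build encryption arguments for boto3's es.create_elasticsearch_domain.

    encryption_kwargs is a hypothetical helper, not part of boto3.
    """
    at_rest_options = {"Enabled": at_rest}
    if at_rest and kms_key_id is not None:
        # Must refer to a symmetric CMK; omit to let the service pick its default key.
        at_rest_options["KmsKeyId"] = kms_key_id
    return {
        "EncryptionAtRestOptions": at_rest_options,
        "NodeToNodeEncryptionOptions": {"Enabled": node_to_node},
    }


# Placeholder key ARN, for illustration only.
EXAMPLE_KEY = "arn:aws:kms:us-east-1:123456789012:key/example-key-id"
example_kwargs = encryption_kwargs(kms_key_id=EXAMPLE_KEY)

# Intended use (not executed here, requires boto3 and AWS credentials):
#   boto3.client("es").create_elasticsearch_domain(
#       DomainName="my-domain", **example_kwargs)
```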

Access Control

  • Uses resource-based, identity-based or IP-based access control
  • Access can be controlled at cluster, index, document and field level
  • All requests to AES configuration API must be signed even if a domain is made to allow completely open access
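As an illustration of IP-based access control, the sketch below builds a resource-based policy that allows HTTP requests to a domain only from given CIDR ranges. The helper name ip_access_policy, the domain ARN, and the CIDR range are placeholders chosen for this example.

```python
import json


def ip_access_policy(domain_arn, allowed_cidrs):
    """Return a resource-based policy allowing HTTP access from the given CIDRs.

    ip_access_policy is a hypothetical helper for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                # The trailing /* covers the indices and APIs under the domain.
                "Resource": f"{domain_arn}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            }
        ],
    }


# Placeholder ARN and documentation-only CIDR range.
policy = ip_access_policy(
    "arn:aws:es:us-west-1:123456789012:domain/my-domain",
    ["192.0.2.0/24"],
)
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON string is the kind of value passed as the AccessPolicies argument when creating or updating a domain.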

VPC Support

  • Can be launched within a selected VPC and subnets
  • Places a private endpoint into one subnet, or multiple subnets in different AZ if multi-AZ is enabled
  • Also places ENI for each of the data nodes in the subnets
  • Kibana can only be accessed from within the VPC, using VPN or through a proxy
  • Reserve enough IP addresses in the subnets for the data node ENIs, plus headroom for blue/green deployments, which temporarily add nodes
  • A domain can be either in-VPC (private) or public, but cannot be both
  • Domain cannot change its mode from in-VPC to public, nor the reverse
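To make the IP reservation point concrete, here is a rough estimate of how many addresses each subnet needs. It assumes one ENI per data node, data nodes spread evenly across subnets, and a doubling of nodes during a blue/green deployment. This is an illustrative heuristic with hypothetical function names, not an official AWS sizing formula.

```python
import ipaddress
import math

# AWS reserves five addresses in every subnet (the first four and the last).
AWS_RESERVED_PER_SUBNET = 5


def estimate_ips_per_subnet(data_nodes, subnets=1, blue_green_factor=2):
    """Estimate IPs needed per subnet (assumption: nodes double during changes)."""
    if data_nodes < 1 or subnets < 1:
        raise ValueError("data_nodes and subnets must be positive")
    nodes_per_subnet = math.ceil(data_nodes / subnets)
    return nodes_per_subnet * blue_green_factor


def subnet_has_room(cidr, needed_ips):
    """Check subnet size only; addresses already in use are not considered."""
    usable = ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET
    return usable >= needed_ips


needed = estimate_ips_per_subnet(data_nodes=10, subnets=2)
fits = subnet_has_room("10.0.1.0/28", needed)
```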

Kibana

  • Preloaded for every domain at domain-endpoint/_plugin/kibana/
  • Access
    • Public access = Use Cognito or IP based policy
    • VPC access = Use Cognito or security groups; a proxy is needed to grant access from the Internet
  • Self-hosted Kibana can connect to AES domains

High Availability

  • Use dedicated master nodes to offload management tasks and increase availability
    • 1 master node = no backup if it fails; 2 master nodes = no quorum to elect a new master node if one fails; 3 master nodes = recommended minimum
    • Master nodes do not need to be as powerful as data processing nodes
    • Example: 10 instances = 7 data nodes + 3 master nodes (1 master node active, 2 standby)
  • Multi-AZ configuration can be enabled
    • You also need to create replicas; AES then tries to place each replica in a different AZ
    • Balanced Multi-AZ example: 3 AZs in the Region, 6 data nodes, 6 indices, 1 replica for each index
    • Imbalanced Multi-AZ example: 3 AZs in the Region, 5 data nodes, 5 indices, 2 replicas for each index; one of the AZs is overloaded. If 2 or more replicas are needed, the number of data nodes should be a multiple of 3 to avoid this situation
    • When Multi-AZ is enabled, dedicated master nodes are automatically distributed across 3 AZs
  • Configuration changes are deployed using a blue/green strategy: a new environment is created with the changes applied, then traffic is switched to it
    • During a configuration change, the number of nodes may temporarily double
    • You do not pay for the extra resources provisioned during a configuration change, unless you change to a new instance type, in which case you pay for both the old and the new instances
    • Example: during a configuration change the number of nodes grows from 11 to 22, then returns to 11 after the change is applied
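The master node counts and the Multi-AZ examples above can be checked with a little arithmetic. The sketch below assumes the usual majority rule for master election (more than half of the master-eligible nodes) and a simplified model in which shard copies are spread evenly across AZs and each index has one primary shard. All function names are hypothetical.

```python
def master_quorum(master_nodes):
    """Majority needed to elect a master (assumption: standard majority rule)."""
    return master_nodes // 2 + 1


def tolerated_master_failures(master_nodes):
    """How many master nodes can fail while a quorum is still reachable."""
    return master_nodes - master_quorum(master_nodes)


def nodes_per_az(data_nodes, azs=3):
    """Spread data nodes across AZs as evenly as possible."""
    base, extra = divmod(data_nodes, azs)
    return [base + (1 if i < extra else 0) for i in range(azs)]


def shard_copies_per_node(data_nodes, shards, replicas, azs=3):
    """Average shard copies held by each node, listed per AZ.

    Simplified model: every AZ holds an equal share of all shard copies.
    """
    if data_nodes < azs:
        raise ValueError("need at least one data node per AZ")
    copies_per_az = shards * (1 + replicas) / azs
    return [copies_per_az / n for n in nodes_per_az(data_nodes, azs)]


balanced = shard_copies_per_node(data_nodes=6, shards=6, replicas=1)
imbalanced = shard_copies_per_node(data_nodes=5, shards=5, replicas=2)
```

Under this model, 1 or 2 master nodes tolerate no failure while 3 tolerate one, and the single node in the third AZ of the imbalanced example carries twice the load of the nodes in the other AZs.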

UltraWarm

  • Uses S3 for read-only, “warm” data
  • Billed by time for the nodes, and by actual size of storage used
  • Performance is lower than “hot” nodes
    • Hot nodes use EBS or instance store, while UltraWarm nodes use S3 plus a sophisticated caching mechanism
  • Indices can migrate between hot and UltraWarm storage in either direction, and this process can be automated using Index State Management at no extra cost
  • UltraWarm nodes cannot be stopped and resumed; to disable UltraWarm, user must delete or migrate all indices to hot nodes
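As a sketch of automating hot-to-warm migration with Index State Management, the following builds a two-state policy body that moves an index to UltraWarm once it reaches a given age. The 7-day default, the state names, and the helper name are assumptions for illustration; the policy layout follows the Open Distro ISM format as I understand it, so check it against the ISM documentation for your domain version.

```python
import json


def hot_to_warm_policy(min_index_age="7d"):
    """Return an ISM policy body that migrates indices to UltraWarm by age.

    hot_to_warm_policy is a hypothetical helper; the age threshold is arbitrary.
    """
    return {
        "policy": {
            "description": "Move indices to UltraWarm after a fixed age",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "warm",
                            "conditions": {"min_index_age": min_index_age},
                        }
                    ],
                },
                {
                    "name": "warm",
                    # warm_migration asks the service to move the index to UltraWarm.
                    "actions": [{"warm_migration": {}}],
                    "transitions": [],
                },
            ],
        }
    }


policy_body = json.dumps(hot_to_warm_policy("30d"))
# Intended use (not executed here): send policy_body in a signed request to
#   PUT _opendistro/_ism/policies/<policy_id>
```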

Upgrading

  • AES service upgrade
    • Some are required and automatically applied, some are optional
    • The official documentation recommends requesting service upgrades at a low-traffic time, which suggests that service upgrades do not interrupt domain operation
  • Elasticsearch upgrade
    • Manually triggered; AES takes a snapshot before the upgrade
    • Takes from 15 minutes to several hours to complete

Migration

  • By index snapshot
    • Create a snapshot of the self-hosted Elasticsearch cluster, upload it to S3, grant the newly created AES domain permission to read the S3 bucket, then restore the snapshot on the domain
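The last two steps of the snapshot migration translate into two REST calls against the new domain: registering the S3 bucket as a snapshot repository, then restoring from it. The sketch below only builds the request descriptions and sends nothing. The repository, bucket, role, and snapshot names are placeholders, and the helper name is hypothetical.

```python
def migration_requests(repo, bucket, region, role_arn, snapshot, indices="*"):
    """Return (method, path, body) tuples for registering a repo and restoring.

    migration_requests is a hypothetical helper; requests to the domain
    normally have to be signed before they are sent.
    """
    register = (
        "PUT",
        f"/_snapshot/{repo}",
        {
            "type": "s3",
            "settings": {
                "bucket": bucket,
                "region": region,
                # IAM role the domain assumes to read the bucket.
                "role_arn": role_arn,
            },
        },
    )
    restore = (
        "POST",
        f"/_snapshot/{repo}/{snapshot}/_restore",
        # Skip cluster-wide state so only index data is restored.
        {"indices": indices, "include_global_state": False},
    )
    return [register, restore]


requests_to_send = migration_requests(
    repo="my-repo",
    bucket="my-snapshot-bucket",
    region="us-west-2",
    role_arn="arn:aws:iam::123456789012:role/ExampleSnapshotRole",
    snapshot="snapshot-1",
)
```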

Limitations