Skip to content
Overview
- A managed data warehouse
- Based on ParAccel solution, in turn based on PostgreSQL
Node Types
- DC2 = dense compute, SSD-backed, good for compute and I/O intensive workloads
- DS2 = dense storage, HDD-backed, good for huge dataset
- RA3 = good general purpose, recommended to replace DC2 and DS2
- It is odd that AWS does not have a page dedicated to node types, some info is buried in pricing page
Resizing
- Elastic resizing, can resize cluster and change node types, resizing done in minutes
- Classic resizing, resizing done in hours and days, not recommended anymore
- How data is distributed in cluster
- Styles
- AUTO = automatically select from below strategies
- ALL = full copy of data in every node, ok for smaller, not-updated-often tables
- No significant performance gains for smaller table
- EVEN = distribute in round-robin fashion, good if no joins
- KEY = distribute by selected column, good when joining by those columns
COPY
Command
- Use COPY command whenever you can, much more efficient than individual INSERTs
- Use a single COPY command, with multiple files, to enable parallel COPYing
- AVOID using multiple commands
- AVOID using a single file, split the file as a multiple of your slices
Like this:
Like Loading...