Join our "Building Machine Learning Models" interactive demo with Amazon SageMaker in New York July 18th

Register Now
Databases

Amazon Redshift’s Hardware Upgrade Improves Query Speed by up to 5x

Last year we compared the query speeds of Redshift, Snowflake and BigQuery. Thanks to a massive hardware upgrade, Redshift just got a significant boost. With this in mind, let’s revisit query speeds in Redshift!

Redshift offers two different node types — dense compute (DC) and dense storage (DS) — which are both available in two sizes each (Large and 8Xlarge, and Xlarge and 8Xlarge, respectively).

The DC nodes are pricier, but run on blazing-fast solid-state disks. We’ve historically used a mixture of dc1.large and dc1.8xlarge nodes at Periscope Data to best meet customer needs.

But now, second generation dense compute nodes are available! They come in the same two sizes, but have massive hardware upgrades across the board. DC2 nodes feature Intel E5-2686 v4 (Broadwell) CPUs, DDR4 memory, and low-latency NVMe-based solid-state disks (a huge upgrade from those used on the DC1s). Compared with DC1, DC2 nodes can deliver up to 2x more disk and network throughput!

Through some additional optimizations, dc2.8xlarge also improved data storage efficiency and will be a focus in the following benchmarks.

dc2 Specs

Evaluating Redshift’s Upgrades

Needless to say, we were excited to take the new node types for a spin (though there’s nothing really spinning about them)! We turned to some of our heaviest queriers and bucketed testing into three types of commonly seen clusters:

  • Small dcl - Low node count, dc1.large / dc2.large (<5 nodes)
  • Small dc8xl - Low node count, dc1.8xlarge / dc2,8xlarge (<5 nodes)
  • Large dcl - High node count, dc1.large / dc2.large (20–30 nodes)

After that, we carefully watched a variety of metrics, ranging from raw read/write latency to loading bar times in our product. We looked at performance improvements a few different ways:

Cluster-level performance

  • Read latency
  • Write latency
  • CPU utilization

Customer-facing performance

  • Loading bars in the product
  • Query throughput (queries serviced per hour)
  • Query execution time

As data-driven folk, we’re acutely aware of the query patterns that occur in our product. Though everyone’s use case is different, we generally see queries in four buckets:

  • Near-instant queries, which run in under 5 seconds
  • Fast queries, which run in under 10 seconds
  • Slow queries, which take between 20 seconds and four minutes to run
  • ETL queries, which can be quick, but often take up to 20 minutes or longer; these are run asynchronously in the background and produce reporting tables to power the aforementioned near-instant queries

Here’s a histogram that shows the distribution of query volume by current query run time:

Hardware’s Impact on Cluster-Level Performance

Our immediate instinct was to fire up our AWS console and take a look at what these new node types were doing.

As expected, with the new hardware, read and write latencies dropped across the board for all our cluster types: small dcl, large dcl and small dcxl.

Small dcxl

Large dcl

As much as lower read and write latencies are nice, our end users don’t see them. They see loading bars and query run times, so let’s focus on those.

Shorter Loading Bars and Query Run Times With Amazon Redshift

Small dc8xl

Our small dc8xl cluster sees a nearly 2x decrease in loading bar times!

And though loading bars are frustrating, reducing frustration doesn’t equate with a return on investment.

Instead we tried to proxy customer value with the number of queries we could serve in an hour.

Small dcxl

Small dc8xl

At peak load times, we see nearly double the number of queries returned per hour on the small dc8xl and even more on the small dcl!

What about actual query run times? This gets a little tricky. Of the queries run per day on our clusters, a plurality are brand new queries and are attributable to analysts, well, analyzing. It wouldn’t be fair to introduce a sample bias by just tracking the queries we had seen before, so let’s take a look at this by the heuristics introduced earlier: How are our near-instant, fast and ETL queries doing? The slow queries don’t stick around much anyway and are usually refactored into ETL, so we won’t dive into those.

Small dcl

Small dc8xl

Despite already being fast, near-instant queries speed up by almost a second. And in some cases, our ETL averages are nearly 5x the speed!

It wasn’t just outliers shifting around means: The median run times saw tons of improvement too.

We also saw improvements of up to 5x on a per-query level.

Ready to Improve Your Query Speeds?

You’ll be glad to know the Redshift team is providing these new node types at no extra cost. We know we’ll be switching all of our clusters over in just a bit.

Periscope Data and Amazon Web Services combine to provide the fastest and easiest way to deliver scalable, high-performance and secure cloud analytics.

Tags: 

Want to discuss this article? Join the Periscope Data Community!

Leon Tchikindas
Leon Tchikindas has been working in analytics for more than seven years, helping businesses grow sales revenue, increase product engagement, and amplify marketing ROI. He currently serves as Head of Analytics and Business Operations at Periscope Data.