Massive scale for time-series workloads: Introducing continuous aggregates for distributed hypertables in TimescaleDB 2.5

Massive scale for time-series workloads: Introducing continuous aggregates for distributed hypertables in TimescaleDB 2.5

Announcing TimescaleDB 2.5

Time-series data is everywhere, representing how systems, processes, behaviors, and values change over time. By analyzing data across the time dimension, developers can understand what is happening right now, how that is changing, and why that is changing. Time-series data helps developers solve problems in observability, product and user analytics, fintech and blockchain applications, and the Internet of Things, to name but a few industries.

And time-series data is also relentless. Because of the sheer volume and rate of information, time-series data can be complex to store, query, and analyze.

This challenge (and need!) is why we built TimescaleDB, a multi-node, petabyte-scale, completely free relational database for time-series. Multi-node TimescaleDB is a distributed system built with the most demanding time-series workloads in mind, scaling to over 10 million metrics ingester per second, storing petabytes of data and processing queries even faster thanks to better parallelization, yet remains built on the rock-solid foundation and maturity of PostgreSQL.

Today, there are nearly three million active TimescaleDB databases running mission-critical time-series workloads across industries. This includes a growing number of multi-node deployments, from startups to major enterprises, such as a Fortune 100 tech company that has deployed two production clusters of multi-node TimescaleDB, each with more than 20 servers, to ingest more than a billion rows of telemetry events per day per cluster.

TimescaleDB is built for time-series data, and how we designed its scale-out architecture is no exception. While most distributed database systems use one-dimensional sharding to partition data (e.g., by device, stock ticker, or some other application identifier), multi-node TimescaleDB leverages two-dimensional, fine-grained partitioning for greater flexibility, particularly around time-oriented data lifecycle management. This architecture enables not only distributed operation, but also flexible elasticity and data rebalancing, node heterogeneity, data retention policies, efficient data tiering, and more. See our announcement blog for more on the design of multi-node TimescaleDB.

One of our goals for multi-node TimescaleDB is to provide a developer experience that’s very similar to that of single-node TimescaleDB, so that eventually, a developer can start with a small single-node deployment, grow over time, and scale-out automatically when needed. Towards this end, recent releases have added functionality to multi-node TimescaleDB to close this gap, such as distributed compression policies, triggers, and generated columns for distributed operations, distributed restore points, additional distributed query push-down optimizations, and more.

Continuous aggregates for distributed hypertables

With today’s release of TimescaleDB 2.5, we are excited to release a highly-requested new capability: continuous aggregates support for distributed hypertables.

Continuous aggregates pre-compute results, so that queries on the pre-computed results can be orders of magnitude faster. But this isn’t just a simple key-value cache. With TimescaleDB continuous aggregates, users can query these pre-computed results with the full power of SQL and a powerful query engine, enabling a user to JOIN results against other relational tables, re-aggregate or rebucket the results (e.g., grouping minutely data into 10 minute buckets, or aggregating across many devices), and more.

While continuous aggregates are similar to materialized views in PostgreSQL, unlike materialized views, they are incrementally updated with a built-in refresh policy that makes the aggregates stay up-to-date. TimescaleDB’s continuous aggregates also properly handle late or out-of-order data (with minimal overhead), and support the real-time aggregation of pre-computed and raw data. More on this below.

Today’s release of continuous aggregates on distributed hypertables retains the same user experience and API that developers already know with hypertables on single-node TimescaleDB. Now, developers can compute continuous aggregates in a multi-node TimescaleDB cluster, massively speeding up workloads that need to process large volumes of data at the petabyte scale in a distributed environment.

It’s just a few commands to create a distributed hypertable and continuous aggregate:

-- Create a distributed hypertable partitioned on time and device
CREATE TABLE conditions (
   time TIMESTAMPTZ NOT NULL,
   device_id TEXT NOT NULL,
   temperature DOUBLE PRECISION NULL
);
SELECT create_distributed_hypertable('conditions', 'time', 'device_id');

-- Insert some data into the distributed hypertable
INSERT INTO conditions VALUES
   (now() - interval '15m', 'A001', 70.0),
   (now(), 'A001', 90.1),
   (now(), 'B015', 68.5),
   (now(), 'C183', 69.4),
   (now(), 'E602', 73.8);
   
-- Create a continuous aggregate on the distributed hypertable
CREATE MATERIALIZED VIEW conditions_hourly 
   (hour, device_id, mintemp, maxtemp)
WITH (timescaledb.continuous) AS
   SELECT time_bucket('1hour', time), device_id,
      min(temperature), max(temperature)
   FROM conditions
   GROUP BY time_bucket('1hour', time), device_id;

-- Query the continuous aggregate on the distributed hypertable
SELECT * FROM conditions_hourly
   WHERE hour > now() - interval '7day' AND maxtemp > 90;

In addition to continuous aggregates for distributed hypertables, this release also adds PostgreSQL 14 support, new time bucketing functionality for expanded timezone management, and a number of other bug fixes. See the release notes for a complete list of all the changes!

Want to learn more about how to use continuous aggregates in your distributed hypertables, together with other TimescaleDB features like data retention policies? Watch this demo by Miranda Auhl, Developer Advocate here at Timescale:

PostgreSQL 14 support

We are very proud to announce that TimescaleDB 2.5 includes support for PostgreSQL 14, which was released one month ago. As supporters of the Postgres community, we want our users and customers to benefit from the latest and greatest that Postgres has to offer when using TimescaleDB. See the main improvements introduced on this new version of Postgres.

We made the decision early in the design of TimescaleDB to build on top of PostgreSQL. We believed then, as we do now, that building on the world’s fastest-growing database has numerous benefits for our users and customers. Among these benefits is the rock-solid dependability that developers have come to expect from a database with 20+ years of production usage behind it, so that all the goodness and innovation happening in PostgreSQL itself also benefits TimescaleDB users. Building on top of PostgreSQL enables us to innovate rapidly with solutions for ingesting and analyzing time-series data while also leveraging (and contributing to) a database with a reputation for stability.

And in fact, while we were working on releasing support for PG14, our own development and testing processes uncovered a serious memory corruption bug in PG14 for which we proposed a patch to the upstream.

Timezone support in next-gen time bucketing

Additionally, TimescaleDB 2.5 introduces new timezone support to time_bucket_ng, enabling developers to bucket and analyze their data according to a specific timezone - which is particularly useful for applications spanning data or devices worldwide.

This time_bucket_ng function serves as the experimentalnext generation” version of time_bucket, but rewritten to introduce support for non-fixed time intervals (specifically, months and years, which have a variable duration based on which month or leap year). Currently, time_bucket only supports fixed time intervals (e.g., seconds, minutes, hours, days, weeks), which led to a (surprisingly complex) rewrite in time_bucket_ng for more flexible time bucketing, including for use in continuous aggregates. We plan to graduate this function out of the experimental schema in future releases.

Getting TimescaleDB 2.5

TimescaleDB 2.5 is available immediately for Timescale Cloud and self-managed deployments, and will be available in the coming weeks on Managed Service for TimescaleDB.

If you’re new to TimescaleDB, create a free account to get started with a fully-managed TimescaleDB instance (with a free 30-day trial, no credit-card required 🔥).

You can also visit our GitHub to learn more (and, as always, ⭐️ are appreciated!), or join our Slack community to share your results, ask questions, get advice, and connect with 7K+ other TimescaleDB enthusiasts. And, if you are interested in getting a lot more involved, we are hiring worldwide!

Interested in learning more about this new functionality? Read on for a refresher on multi-node TimescaleDB and continuous aggregates, and how we built support for continuous aggregates on distributed hypertables.

Shoutout to all the engineers who worked on features in TimescaleDB 2.5: Markos Fountoulakis, Mats Kindahl, Aleksander Alekseev, Alexsander Kuzmenkov, Sven Klemm, Gayathri Ayyappan, Erik Nordström, Fabrizio Mello, Dmitry Simonenko, Nikhil Sontakke, Duncan Moore, and the entire team of reviewers and testers!

We’d also like to give a special shoutout to all community members who’ve asked for improvements to multi-node TimescaleDB, time_bucket_ng, and PostgreSQL 14 support, all of which informed our choice of functionality introduced today.

A refresher on multi-node TimescaleDB

Multi-node TimescaleDB enables users to run petabyte-scale workloads across multiple physical TimescaleDB instances, allowing users to scale beyond the limits of a single TimescaleDB (or PostgreSQL) instance. To do this, we introduced the concept of a distributed hypertable.

A regular hypertable, one of TimescaleDB’s original innovations, is a virtual table in TimescaleDB that automatically partitions data into many sub-tables (“chunks”) on a single machine, continuously creating new ones as necessary, where each chunk includes data belonging to a specific range of timestamps. Yet, it provides the illusion of a single continuous table across all time (the “hypertable”).

A distributed hypertable is a hypertable that is spread across multiple machines, while still maintaining the illusion (and user-experience) of a single continuous table across all time. When using the recommended time-and-space partitioning, each chunk in a distributed hypertable is defined by both a time interval and some subset of the keys belonging to the additional partition key, and these chunks are automatically created across the distributed hypertable’s data nodes.

The multi-node TimescaleDB architecture consists of an access node (abbreviated as AN), which stores metadata for the distributed hypertable and performs query planning across the cluster, and a set of data nodes (abbreviated as DNs), which store individual (or replicated) chunks of the distributed hypertable, and execute queries against those chunks locally. TimescaleDB remains a single piece of software for operational simplicity; any database can become an access node when a user executes the add_data_node command, passing in the hostnames of other nodes.

A distributed hypertable covering one access node (AN) and three data nodes (DN1, DN2, DN3). A distributed hypertable is partitioned into chunks by both “time” and “space” dimensions, which are spread dynamically across available data nodes.

Once multi-node TimescaleDB is set up, creating a distributed hypertable is as simple as creating a regular hypertable, as also shown above:

-- Create a distributed hypertable partitioned on time and device
CREATE TABLE conditions (
   time TIMESTAMPTZ NOT NULL,
   device_id TEXT NOT NULL,
   temperature DOUBLE PRECISION NULL
);
SELECT create_distributed_hypertable('conditions', 'time', 'device_id');

Whenever data is inserted, the access node routes each data tuple to the proper chunk(s) on one or more of its data nodes (in this example, based on timestamp and device id). And as time progresses, the access node will automatically create new chunks on data nodes (corresponding to new time ranges and space partitions) as data continues to be ingested. This ensures that data is spread across the data nodes evenly.

Today multi-node TimescaleDB supports several capabilities which make it suited for large-volume time-series workloads, including efficient query pushdown, automated compression policies, improved data rebalancing for elasticity and high-availability, distributed object management (e.g., keeping roles, UDFs, and other objects consistent across nodes), and more, all while striving to provide the same experience and functionality as single-node TimescaleDB.

A refresher on continuous aggregates (and why they’re powerful)

Users are often interested in aggregates computed over data, such as the average sensor reading per device per day, the OHLC (open high low close) values per stock per interval, the maximum CPU utilization per 5 minutes, the number of distinct visitors on each web page per day, and so forth.

PostgreSQL has powerful support for computing various forms of aggregates, and TimescaleDB’s hyperfunctions add even more, including various approximate statistics and approximation/sketching algorithms. But computing the aggregates for large tables can be expensive, simply because you have to read and process all the data belonging to the desired interval.

To avoid the constant recomputation each time you want to read the aggregated data, many relational databases, including PostgreSQL, support materialized views. The use of materialized views avoid constant recomputation at query time, but come with three limitations for time-series data:

  1. The materialized view is not automatically refreshed, so it is necessary to do a regular refresh of the materialized view as new data comes in.
  2. When refreshing the materialized view, all the data is read again to compute the aggregate, even if only a small portion of the data is changed.
  3. This recomputation necessitates retaining all the underlying raw data in order to perform the refresh of the materialized view.

TimescaleDB’s continuous aggregates are designed to overcome these three limitations.

Continuous aggregates look and behave like materialized views, but they are incrementally updated with a built-in refresh policy that makes the aggregates stay up-to-date as new data is added. Further, they can efficiently handle late or out-of-order data: if data that is inserted, updated, or deleted actually has an older timestamp, the aggregates corresponding to that older period of time are correspondingly updated. But, through the use of clever internal “invalidation records”, the refresh procedure is careful to only refresh data in the materialized view that actually needs to be updated, which avoids recomputing data that did not change. This smart refresh procedure massively speeds up the incremental maintenance of materialized views, and ensures that the aggregate data is up-to-date.

Continuous aggregates similarly are amenable to data retention policies that are often applied in time-series workloads, such that aggregates about old data can be retained in the database, even after the corresponding raw data itself has been dropped.

When defining a continuous aggregate policy, you define a refresh window. The continuous aggregate will be updated inside this window only, and only the small portions of data that have been changed will be updated, according to internal invalidation records that track changes to old data. Data outside the refresh window won't be refreshed, enabling data retention policies to delete old raw data while still preserving the aggregates.
When defining a continuous aggregate policy, you define a refresh window. The continuous aggregate will be updated inside this window only, and only the small portions of data that have been changed will be updated, according to internal invalidation records that track changes to old data. Data outside the refresh window won't be refreshed, enabling data retention policies to delete old raw data while still preserving the aggregates.

TimescaleDB’s continuous aggregates also support real-time aggregation. With real-time aggregation, when querying a continuous aggregate, the query engine will transparently combine (i) results that are already pre-computed in the materialized view with (ii) results computed at query time over the newest raw data in underlying hypertable. (You can turn this off if desired, but the majority of developers want this behavior by default).

Real-time aggregation gives you the best of both worlds: the performance of continuous aggregates and the most up-to-date data for real-time queries, without the performance degradation of querying and aggregating all raw data from scratch. This makes real-time aggregation an ideal fit for many real-time monitoring, dashboarding, and analysis use-cases.

Continuous aggregates on distributed hypertables

In TimescaleDB 2.5, these features from continuous aggregates have been extended to distributed hypertables, and use an identical SQL CREATE command as with regular hypertables.

But in a multi-node architecture, the continuous aggregate is stored to optimize query performance. In particular, the underlying incrementally materialized view that’s part of a continuous aggregate is stored on the access node, while the raw data continues to reside across the data nodes. Data is asynchronously materialized from data nodes to the access node via the continuous aggregates refresh policy.

Queries on continuous aggregates over distributed hypertables achieve high performance. Rather than querying the raw data distributed on the various data nodes and aggregating it at query time, many or all of the results from a continuous aggregate query are already pre-computed and present on the access node. This turns a distributed query across multiple nodes into a local one, while of course also benefiting from results that have been pre-computed as part of the background materialization process, further minimizing the computation needed at query time.

Architecture diagram showing how continuous aggregates are stored on a multi-node TimescaleDB cluster. The continuous aggregate is stored on the access node (AN), while the raw data is distributed between the data nodes (DNs). The raw data is materialized in the background by the continuous aggregate refresh policy.

Developers familiar with TimescaleDB might wonder what this design means for real-time aggregates. While the pre-computed results in the continuous aggregate might already reside on the AN, the latest raw data (to be aggregated at query time) still resides across the data nodes. But continuous aggregates, even though now on distributed hypertables, continues to support this transparently: the query engine will push down aggregation to the needed DNs for the latest raw data, if a user’s query to a continuous aggregate includes a recent time range that has yet to be materialized.  

For example, if the continuous aggregate typically has materialized data up until the last 30 minutes, then queries that only touch data older than 30 minutes only need to involve the aggregate data already materialized on the access node. If a query involves newer data, however – say, “give me the max temperature per hour for the past six hours” – then the query engine combines this pre-aggregated (on the AN) with real-time aggregated data from the last 30 minutes (computed across the DNs).

This diagram illustrates what happens when you query a continuous aggregate on a distributed hypertable. If the new query involves new data which arrived since the last refresh of the continuous aggregate, the query engine combines the pre-aggregated data on the access node (AN) with this new not yet materialized data from the data nodes (in this case, DN 3) at query time. Note that this query will be considerably more efficient than querying raw data altogether, as the majority of results are pre-computed already on the access node - and only the latest data not yet materialized is fetched at query time from data node 3.

Real-time aggregation remains enabled by default, although it can be disabled anytime (including for individual sessions). Without real-time aggregation, a query against a continuous aggregate will be performed fully against data already pre-computed on the access node.

Supporting continuous aggregates across distributed hypertables did introduce some new technical challenges. Because the distributed hypertable is spread across multiple data nodes, each data node needs to maintain its own invalidation log to reflect changes to the distributed hypertable’s chunks that reside locally. Then, the refresh process needs to ensure that the proper regions are recomputed, even though these invalidation logs are distributed, and that materialization results from different data nodes are properly serialized (and avoid deadlocks) on the access node. One key step to processing these invalidation logs and refreshing the continuous aggregate involves executing all operations as distributed transactions within the multi-node cluster (using TimescaleDB’s two-phase commit protocol). This way, each stage remains atomic, as it is when the process takes place on single-node TimescaleDB.

In future releases, we plan to add functionality to distribute continuous aggregates themselves across data nodes (e.g., where the materialization hypertable underlying the continuous aggregate is itself distributed across the data nodes). This would enable much larger continuous aggregates than can be supported today (e.g., those that cannot fit on the access node).

Get started today

TimescaleDB 2.5 is available today for Timescale Cloud and self-managed deployments, and will be available in the coming weeks on Managed Service for TimescaleDB.

If you’re new to TimescaleDB, create a free account to get started with a fully-managed TimescaleDB instance (with a free 30-day trial, no credit-card required 🔥).

Join our Slack community to share your results, ask questions, get advice, and connect with 7K+ other developers (our co-founders, engineers, and passionate community members are active on all channels).

Visit our GitHub to learn more (and, as always, ⭐️ are appreciated!). And, if these are the types of challenges you’d like to help solve, we are hiring!

This post was a collaboration between
13 min read
Always Be Launching
Contributors

Related posts

TimescaleDB - Timeseries database for PostgreSQL

Explore TimescaleDB

Learn more about how TimescaleDB works, compare versions, and get technical guidance and tutorials.

Go to docs Go to products