Tuning your TimescaleDB database is essential to leveraging your existing hardware. We built a tool to help you do just that.
In case you missed our original announcement, timescaledb-tune (GitHub), initially packaged with TimescaleDB 1.1, is a new command-line tool that helps users tune and configure their PostgreSQL instances to get better performance from their existing hardware. Previously, users had to configure their postgresql.conf files by hand, a sometimes opaque process, especially for anyone new to the PostgreSQL ecosystem. With timescaledb-tune, users follow simple CLI prompts to configure reasonable settings for memory, parallelism, the WAL, and so on. In effect, the tool simplifies installation by “linting” a user’s configuration to make sure it’s ready to go.
Tune your database, Luke!
As many of you know, we have an active Slack channel where our community of users and contributors can engage directly with the Timescale engineering team and each other. We are constantly answering questions to help our users get the most out of their experience with TimescaleDB.
Through our Slack community, we came to realize that users often observed unexpected performance characteristics when using TimescaleDB for the first time. One of the most common reasons for this was that PostgreSQL’s default configurations are not suitable for most larger machines. Further, TimescaleDB extends and replaces much of the PostgreSQL schema management and query planning layers, so the configurations that work best for TimescaleDB as a time-series database don’t necessarily match the configurations that work best for using PostgreSQL for the OLTP use case.
We wanted to streamline the TimescaleDB experience so that users could achieve good performance out of the box, without needing background knowledge of how to tune their database. That’s what inspired us to develop timescaledb-tune.
timescaledb-tune: our smart installation companion
We wanted to achieve two main goals with timescaledb-tune: (1) guide users by suggesting appropriate configurations for their TimescaleDB instance and (2) make it easier to set up and install TimescaleDB. Prior to this tool, users had to manually edit their postgresql.conf file to add `timescaledb` to the `shared_preload_libraries` setting. Afterwards, they had to edit the relevant settings to achieve better performance, including memory settings, worker settings, etc. While we pointed users to a convenient website for recommendations, that still meant entering machine resources by hand and copying the recommendations over, a process that could be simpler.
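To make the manual step concrete, here is a minimal Go sketch of what "adding `timescaledb` to `shared_preload_libraries`" amounts to when done programmatically. The function name `ensurePreload` and its parsing approach are hypothetical and written for illustration only; this is not how timescaledb-tune itself is implemented, and it assumes the value is written with single quotes, as in the stock postgresql.conf.

```go
package main

import (
	"fmt"
	"strings"
)

const key = "shared_preload_libraries"

// ensurePreload (hypothetical helper) returns the config text with timescaledb
// listed in shared_preload_libraries. A commented-out entry is uncommented,
// which also activates any libraries already listed on that line.
func ensurePreload(conf string) string {
	lines := strings.Split(conf, "\n")
	for i, line := range lines {
		// Strip surrounding whitespace and an optional leading '#'.
		body := strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(line), "#"))
		if !strings.HasPrefix(body, key) {
			continue
		}
		// Locate the single-quoted value; skip lines that do not have one.
		start := strings.Index(body, "'")
		if start < 0 {
			continue
		}
		end := strings.Index(body[start+1:], "'")
		if end < 0 {
			continue
		}
		end += start + 1

		// Collect the existing libraries and note whether timescaledb is present.
		var libs []string
		found := false
		for _, lib := range strings.Split(body[start+1:end], ",") {
			lib = strings.TrimSpace(lib)
			if lib == "" {
				continue
			}
			if lib == "timescaledb" {
				found = true
			}
			libs = append(libs, lib)
		}
		if !found {
			libs = append(libs, "timescaledb")
		}
		// Rebuild the line, keeping whatever followed the closing quote.
		lines[i] = fmt.Sprintf("%s = '%s'%s", key, strings.Join(libs, ","), body[end+1:])
		return strings.Join(lines, "\n")
	}
	// No entry found at all: append a new one.
	return strings.TrimRight(conf, "\n") + "\n" + key + " = 'timescaledb'\n"
}

func main() {
	conf := "max_connections = 100\n#shared_preload_libraries = ''\t# (change requires restart)\n"
	fmt.Print(ensurePreload(conf))
}
```

Changes to `shared_preload_libraries` only take effect after PostgreSQL is restarted, so a real workflow would follow an edit like this with a restart.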
Now, users can simply run timescaledb-tune from the command line to receive suggestions on which settings to change based on their machine’s resources and to have the tool update their config file directly. The tool itself is written in Go (as our other auxiliary tools are) and is built and packaged for each system we support.
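As a rough illustration of what "suggestions based on machine resources" can look like, the Go sketch below derives a few settings from total memory and CPU count. It uses widely cited PostgreSQL rules of thumb (about 25% of RAM for `shared_buffers`, about 75% for `effective_cache_size`) as assumptions; the `Settings` type, the `recommend` and `render` functions, and the exact ratios are hypothetical and are not taken from the timescaledb-tune source, whose actual recommendations may differ.

```go
package main

import "fmt"

// Settings is a hypothetical container for a handful of postgresql.conf values.
type Settings struct {
	SharedBuffersMB             int
	EffectiveCacheSizeMB        int
	MaxParallelWorkers          int // available in PostgreSQL 10 and later
	MaxParallelWorkersPerGather int
}

// recommend applies assumed rules of thumb to the machine's resources:
// a quarter of RAM for shared_buffers, three quarters for
// effective_cache_size, one parallel worker per CPU, and half of the
// CPUs (at least one) per Gather node.
func recommend(totalMemMB, cpus int) Settings {
	perGather := cpus / 2
	if perGather < 1 {
		perGather = 1
	}
	return Settings{
		SharedBuffersMB:             totalMemMB / 4,
		EffectiveCacheSizeMB:        totalMemMB * 3 / 4,
		MaxParallelWorkers:          cpus,
		MaxParallelWorkersPerGather: perGather,
	}
}

// render formats the settings as postgresql.conf lines.
func render(s Settings) string {
	return fmt.Sprintf(
		"shared_buffers = %dMB\neffective_cache_size = %dMB\nmax_parallel_workers = %d\nmax_parallel_workers_per_gather = %d\n",
		s.SharedBuffersMB, s.EffectiveCacheSizeMB,
		s.MaxParallelWorkers, s.MaxParallelWorkersPerGather,
	)
}

func main() {
	// Example input: a machine with 32GB of RAM and 8 CPUs.
	fmt.Print(render(recommend(32*1024, 8)))
}
```

Keeping the calculation separate from the rendering makes it easy to show the suggested values to the user for confirmation before anything is written to disk, which mirrors the prompt-driven flow described above.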
A tuned database means faster write and read performance
Not tuning your database can have a significant impact on both your write and read performance. Underutilizing your server’s memory will cause indexes to be swapped to disk more often and prevent queries from utilizing faster sorts. The default WAL settings can lead to inefficient I/O that slows down writes. Parallel query settings that don’t utilize the total system resources will result in slower queries.
To demonstrate these effects, we ran some experiments on some larger cloud VMs (but by no means the largest, where these effects would only be exacerbated). For our experiment we ran TimescaleDB on an AWS m5.2xlarge instance, which has 8 CPUs, 32GB of RAM, and over 400MBps of disk throughput. We attached two 1 TB EBS volumes: one for the WAL and the other for data. The benchmark client, with the Time-Series Benchmark Suite (TSBS) installed, ran on an m4.2xlarge instance (8 CPUs, 32GB RAM).
First we ran TSBS’s insert benchmark both with default settings (“Untuned”) and after running timescaledb-tune (“Tuned”), using 1B metrics over 100M rows.
Average write performance improves by roughly 1M metrics per second, from 2.23M metrics/s to 3.28M metrics/s. As a result, the full dataset is inserted over two minutes faster (315s vs 465s).
Query performance also improves significantly, especially when the plan involves a large sort and can be parallelized. We measured performance over 25 queries of our double-groupby-1 query type. This query does a group by on both time and hostname over various 16M row subsets, which makes it a good candidate for parallelization and a heavy user of memory for sorting.
When tuned, the median query is roughly 3x faster (2.1s vs 6.3s) and shows less variability in performance (a range of 0.5s vs nearly 1.0s).
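For readers who want to check the arithmetic behind these comparisons, the short Go sketch below recomputes the gains from the figures quoted in this section. The helper names `speedup` and `percentGain` are made up for this example; the input numbers are the ones reported above.

```go
package main

import "fmt"

// speedup returns how many times faster the tuned run is, given the
// untuned and tuned durations in seconds.
func speedup(untunedSec, tunedSec float64) float64 {
	return untunedSec / tunedSec
}

// percentGain returns the relative throughput improvement in percent.
func percentGain(untuned, tuned float64) float64 {
	return (tuned/untuned - 1) * 100
}

func main() {
	// Insert benchmark: throughput in millions of metrics per second.
	fmt.Printf("insert throughput gain: %.2fM metrics/s (%.0f%%)\n",
		3.28-2.23, percentGain(2.23, 3.28))

	// Insert benchmark: total load time in seconds.
	fmt.Printf("insert time saved: %.0fs (%.2fx faster)\n",
		465.0-315.0, speedup(465, 315))

	// double-groupby-1: median query latency in seconds.
	fmt.Printf("median query speedup: %.1fx\n", speedup(6.3, 2.1))
}
```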
We are continuously working to improve the packaging and installation/onboarding experience for our users. We will ship this tool with future versions of TimescaleDB to make it easier to get up and running. With the release of TimescaleDB 1.2, slated for later this month, timescaledb-tune will move out of beta. In the meantime, we encourage you to try out timescaledb-tune (GitHub) and let us know if you have any feedback.
If you are new to TimescaleDB and ready to get started, follow the installation instructions. If you have questions, we encourage you to join our Slack community of more than 1,800 members. If you are looking for enterprise-grade support and assistance, please let us know.
Finally, if you are interested in building open-source tools like this one, we are hiring!