TimescaleDB Hypertables Deep Dive: Chunks, Compression, Continuous Aggregates
At 1 million events per second, a traditional PostgreSQL table becomes a bottleneck. Writes slow down, indexes bloat, and aggregation queries scan months of data to answer "what happened this hour?" TimescaleDB hypertables solve this by automatically partitioning data into time-based chunks, then compressing old chunks to columnar format. Continuous aggregates materialize real-time summaries without manual triggers. This post covers the full internals: hypertable architecture, chunk design trade-offs, compression mechanics, continuous aggregate refresh policies, and production tuning for massive ingest workloads.
Why TimescaleDB hypertables matter in 2026
Hypertables are the core abstraction that makes TimescaleDB so effective for time-series workloads. Instead of managing shards manually or writing custom partitioning logic, you declare a table as a hypertable with a time column and chunk interval, and TimescaleDB handles everything: automatic partition creation, intelligent query planning, compression, and retention. This design decision—invisible automatic chunking—has become the standard pattern across modern time-series databases because it eliminates the operational burden of manual maintenance windows and makes capacity planning predictable. In 2026, when IoT platforms and digital twins generate billions of events daily, this abstraction is non-negotiable.
Hypertable architecture: Automatic partitioning by time and space
A hypertable is a logical table backed by a collection of physical chunks. When you insert data into a hypertable, TimescaleDB routes each row to the appropriate chunk based on its timestamp and the configured chunk interval (e.g., 1 hour, 1 day, or 7 days per chunk). Behind the scenes, each chunk is a standard PostgreSQL table carrying CHECK constraints on its time range, so the planner can prune chunks at query time, scanning only the data you actually need.

Time dimension: Automatic chunk creation
The chunk interval is the most critical tuning parameter. A chunk spans a fixed time window; when data arrives outside any existing chunk's range, TimescaleDB creates a new chunk automatically. The default interval is 7 days for new hypertables, but you can customize it at creation time with the chunk_time_interval parameter:
SELECT create_hypertable(
  'metrics',
  'time',
  chunk_time_interval => INTERVAL '1 day'
);
Each chunk is a PostgreSQL table with constraints enforcing its time range:
-- Conceptually, TimescaleDB creates one child table per chunk with a
-- CHECK constraint on its time range (managed through its own catalog,
-- not PostgreSQL declarative partitioning):
--   Table: _timescaledb_internal._hyper_1_1_chunk
--   CHECK ("time" >= '2026-04-22' AND "time" < '2026-04-23')
Smaller chunks (1 hour to 1 day) keep the actively written chunk and its indexes small enough to stay in memory, which sustains high ingest rates, but they add metadata and planning overhead as the table count grows. Larger chunks (30 days) reduce that overhead but delay compression, since a chunk only becomes a compression candidate once it stops receiving writes, and they mix hot and cold data in the same physical table. A good rule of thumb: choose a chunk interval 1/1000 to 1/100 of your expected data retention period, and keep the active chunk plus its indexes small enough to fit comfortably in memory.
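If the initial interval turns out to be wrong, it can be changed for future chunks without touching existing ones. A sketch using the standard APIs (the 6-hour value is illustrative):

```sql
-- Change the interval used for chunks created from now on;
-- existing chunks keep their original boundaries.
SELECT set_chunk_time_interval('metrics', INTERVAL '6 hours');

-- Inspect the chunks currently backing the hypertable.
SELECT show_chunks('metrics');
```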
Space dimension: Optional partitioning
Time is the primary partitioning dimension, but you can add a secondary space dimension with the partitioning_column and number_partitions arguments (or later via add_dimension):
SELECT create_hypertable(
  'metrics',
  'time',
  partitioning_column => 'device_id',
  number_partitions => 8,
  chunk_time_interval => INTERVAL '1 day'
);
This hash-partitions device_id into 8 space partitions, so each time interval is split into 8 chunks, one per hash partition (not one per device ID). Space partitioning is most useful when you have high-cardinality data (thousands of distinct device IDs) and queries often filter by a single device: the optimizer can then prune both time and space dimensions at once, drastically reducing scan cost. For low-cardinality data (10-50 devices), space partitioning adds complexity without benefit.
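For a hypertable that already exists, the space dimension can be added afterwards with add_dimension; this is safest while the hypertable is still empty. A minimal sketch:

```sql
-- Add a hash-partitioned space dimension on device_id
-- (best done before the hypertable holds data).
SELECT add_dimension('metrics', 'device_id', number_partitions => 8);
```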
Chunk pruning at query time
When you run a query like SELECT AVG(value) FROM metrics WHERE time > now() - INTERVAL '1 week' AND device_id = 123, TimescaleDB’s planner:
- Identifies the time range from the WHERE clause.
- Calculates which chunks overlap that range.
- Excludes all other chunks from the scan.
- Uses space constraints to prune further if present.
This is constraint exclusion in action—PostgreSQL’s native mechanism for eliminating table partitions. The cost is negligible compared to the I/O savings.
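You can inspect the same pruning arithmetic directly: show_chunks accepts time bounds and lists only the chunks overlapping a given window, which is exactly the set a matching query would scan:

```sql
-- List only the chunks a "last week" query would need to touch.
SELECT show_chunks('metrics', newer_than => now() - INTERVAL '1 week');
```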
Chunk exclusion: The query plan advantage
Open a psql session against a TimescaleDB database and run:
EXPLAIN (ANALYZE, BUFFERS)
SELECT AVG(value) FROM metrics
WHERE time > now() - INTERVAL '1 day';
You’ll see something like:
Aggregate  (cost=..., rows=...)
  ->  Append  (cost=..., rows=...)
        ->  Seq Scan on _hyper_1_100_chunk (actual rows=...)
        ->  Seq Scan on _hyper_1_101_chunk (actual rows=...)
(other chunks pruned away)
TimescaleDB automatically restricts the scan to the relevant chunks. With a 1-day chunk interval, a query filtering for the last hour touches at most 2 chunks, whether the hypertable holds a week or a year of data. This is why hypertables scale: query cost tracks the time range you ask about, not the total size of the dataset.
Compression: From row-store to columnar format
Once a chunk is no longer receiving writes (typically after the chunk interval has passed), it’s a candidate for compression. TimescaleDB compresses chunks into columnar format, achieving 10-20x reduction in most time-series workloads.

Compression settings and ordering
Enable compression with:
ALTER TABLE metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id, sensor_type',
  timescaledb.compress_orderby = 'time DESC'
);
compress_segmentby specifies which columns define segment boundaries: all rows sharing the same (device_id, sensor_type) pair are grouped into the same compressed batches, and the segmentby values themselves are stored uncompressed, once per segment. This is critical for query performance: filters on segmentby columns can skip entire segments without decompressing anything, whereas without sensible boundaries a predicate like device_id = 123 forces every batch to be decompressed just to evaluate it.
compress_orderby determines the internal sort order within each segment. time DESC means the most recent data in each segment is stored first, so range queries that often filter on time can skip irrelevant pages.
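Note that the ALTER TABLE above only enables compression; chunks are actually compressed by a background policy or by hand. A sketch of both paths (the 7-day threshold is an example value):

```sql
-- Background job: compress chunks once they are older than 7 days.
SELECT add_compression_policy('metrics', INTERVAL '7 days');

-- Or compress eligible chunks manually, skipping already-compressed ones.
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('metrics', older_than => INTERVAL '7 days') AS c;
```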
Columnar storage and encoding
Under the hood, TimescaleDB uses a custom columnar format (not Parquet or ORC). Each compressed batch stores:
- A metadata header (min/max values per column, used for pruning).
- One compressed array per column.
- Type-specific encodings: delta-of-delta for timestamps and integers, Gorilla-style XOR encoding for floats, and dictionary encoding for low-cardinality text, with an LZ-based compressor applied on top where it helps.
For a timestamp column arriving at mostly regular intervals, delta-of-delta is extremely effective: 8-byte values often shrink to a byte or two, sometimes less. Floating-point columns (like sensor readings) typically compress around 8:1; integer counters 20:1 or better.
Example: 1 million rows of (time, device_id, temperature, humidity) where:
- time is monotonic (1-second intervals) → 0.1 bytes per row (delta-of-delta + LZ4).
- device_id is a small integer (1–1000) → 1 byte per row (dictionary encoded).
- temperature is a float (18–28°C, small deltas) → 0.5 bytes per row.
- humidity is an integer (30–95%) → 1 byte per row.
Total: 2.6 bytes per row in compressed form, vs ~100 bytes in row format = 38x compression.
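Rather than trusting back-of-envelope numbers, you can measure the achieved ratio per chunk with the built-in stats function:

```sql
-- Before/after sizes for each compressed chunk of the hypertable.
SELECT
  chunk_name,
  pg_size_pretty(before_compression_total_bytes) AS before,
  pg_size_pretty(after_compression_total_bytes)  AS after
FROM chunk_compression_stats('metrics')
WHERE compression_status = 'Compressed';
```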
Compression limitations and update semantics
Historically, compressed chunks were immutable: you could not INSERT, UPDATE, or DELETE rows in them without decompressing first. Recent TimescaleDB releases do allow DML on compressed chunks, but the affected segments are decompressed and rewritten under the hood, so heavy mutation is still expensive. The explicit workflow looks like this:
-- Decompress and allow updates
SELECT decompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
-- Make your changes
UPDATE metrics SET value = 999 WHERE time = '2026-04-20' AND device_id = 123;
-- Re-compress
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');
This is a deliberate design trade-off. Columnar format is optimized for append-only workloads; supporting in-place updates would require either row-level indirection or frequent re-compression, both of which destroy compression ratios. If your workload includes frequent corrections to historical data, compress less aggressively (or use a dedicated OLTP table and ETL corrections into a separate append-only warehouse).
Continuous aggregates: Materialized views with real-time refresh
Continuous aggregates are TimescaleDB’s killer feature for time-series analytics. They are materialized views that automatically stay up-to-date as new data arrives, and can be queried at real-time speed without scanning the underlying table.

Creating and querying continuous aggregates
Define a continuous aggregate using CREATE MATERIALIZED VIEW with aggregation:
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS hour,
device_id,
AVG(value) as avg_value,
MIN(value) as min_value,
MAX(value) as max_value,
COUNT(*) as sample_count
FROM metrics
GROUP BY hour, device_id;
Queries against metrics_hourly return quickly because the aggregates are already materialized. Behind the scenes, TimescaleDB:
- Identifies the time bucket column (hour = time_bucket('1 hour', time)).
- Tracks invalidations: when rows in metrics are inserted, updated, or deleted, the affected time ranges are recorded in an invalidation log. There are no per-row triggers rewriting metrics_hourly on every insert.
- Re-materializes invalidated regions at the next scheduled (or manual) refresh, so queries see consistent results. With real-time aggregation enabled, queries transparently combine materialized buckets with not-yet-materialized raw rows.
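Whether queries see the not-yet-materialized tail is controlled per view by the materialized_only storage option (its default has varied across TimescaleDB versions, so setting it explicitly is safest):

```sql
-- Serve materialized buckets plus fresh raw rows (real-time aggregation).
ALTER MATERIALIZED VIEW metrics_hourly
  SET (timescaledb.materialized_only = false);

-- Serve only already-materialized buckets (cheaper, possibly stale).
ALTER MATERIALIZED VIEW metrics_hourly
  SET (timescaledb.materialized_only = true);
```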
Real-time refresh policies
Continuous aggregates are not refreshed synchronously on every write; materialization happens in the background. Schedule it with add_continuous_aggregate_policy:
SELECT add_continuous_aggregate_policy(
  'metrics_hourly',
  start_offset => INTERVAL '2 hours',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '15 minutes'
);
This policy re-materializes the window between 2 hours ago and 1 hour ago, every 15 minutes. Buckets newer than the 1-hour end_offset are never materialized by the policy; with real-time aggregation enabled, queries compute them on the fly from the raw table. This trades throughput for freshness: recent data is always accurate but computed at query time, while older data is served cheaply from the materialized store.
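Outside the policy schedule, for example after bulk-backfilling historical rows, a specific window can be re-materialized by hand (the dates below are illustrative):

```sql
-- Recompute hourly buckets for a backfilled week.
CALL refresh_continuous_aggregate(
  'metrics_hourly',
  '2026-04-01 00:00:00+00',
  '2026-04-08 00:00:00+00'
);
```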
Hierarchical continuous aggregates
You can build continuous aggregates on top of continuous aggregates:
CREATE MATERIALIZED VIEW metrics_daily
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 day', hour) AS day,
device_id,
AVG(avg_value) as avg_daily_value,
SUM(sample_count) as daily_samples
FROM metrics_hourly
GROUP BY day, device_id;
This is far more efficient than re-aggregating from the raw metrics table. A two-level hierarchy (hourly → daily) reduces compute cost dramatically: refreshing the daily view scans only 24 hourly rows per device instead of 86,400 raw rows (at one sample per second). For workloads with monthly and yearly summaries, 3-4 levels of continuous aggregates is typical.
Retention and reorder policies
Time-to-live with add_retention_policy
Drop old data automatically:
SELECT add_retention_policy('metrics', INTERVAL '90 days');
This drops chunks once all their data is older than 90 days. Dropping a chunk is much faster than DELETE because it is a single metadata operation (drop the chunk table and its constraints), not a row-by-row scan. For compliance workloads (GDPR, financial audit trails), time-based retention handles "delete after N days" cheaply; note that per-subject erasure (removing everything for one device_id) still requires DELETEs, which compress_segmentby = 'device_id' makes cheaper by collocating each device's rows within a chunk.
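The policy is a scheduled wrapper around drop_chunks, which you can also call directly for one-off cleanups, and the policy itself can be removed if retention rules change:

```sql
-- One-off: drop all chunks whose data is entirely older than 90 days.
SELECT drop_chunks('metrics', older_than => INTERVAL '90 days');

-- Remove the scheduled retention policy.
SELECT remove_retention_policy('metrics');
```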
reorder_chunk: Optimizing access patterns
If your queries often group by device_id but your chunk is ordered by time, sequential scans are inefficient. Use reorder_chunk to physically reorder a chunk:
SELECT reorder_chunk(
  '_timescaledb_internal._hyper_1_1_chunk',
  '_hyper_1_1_chunk_device_id_idx'
);
This rewrites the chunk in the order of the specified index, much like PostgreSQL's CLUSTER. After reordering, queries with WHERE device_id = 123 touch far fewer heap pages, because matching rows are physically adjacent instead of interleaved by arrival time. Be aware: reordering rewrites the entire chunk and holds a heavy lock while it runs, so do it during maintenance windows, and only for frequently scanned chunks.
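Reordering can also be scheduled so each chunk is rewritten once, shortly after it stops being the active write target. The index name below is hypothetical; substitute an index that actually exists on your hypertable:

```sql
-- Background job: reorder each completed chunk by (device_id, time).
-- 'metrics_device_id_time_idx' is an illustrative index name.
SELECT add_reorder_policy('metrics', 'metrics_device_id_time_idx');
```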
Hyperfunctions: Time-series SQL toolkit
TimescaleDB ships with a library of time-series functions that accelerate common operations.
time_bucket and time_bucket_gapfill
time_bucket rounds timestamps to a regular interval:
SELECT time_bucket('5 minutes', time), COUNT(*)
FROM metrics
GROUP BY 1;
time_bucket_gapfill fills missing intervals with NULL or a default value:
SELECT
time_bucket_gapfill('5 minutes', time, '2026-04-22', '2026-04-23') AS bucket,
device_id,
AVG(value) FILTER (WHERE value > 0) as avg_value
FROM metrics
WHERE time >= '2026-04-22' AND time < '2026-04-23'
GROUP BY bucket, device_id
ORDER BY bucket;
This is invaluable for time-series visualization: it ensures every time bucket exists in the result set, even if no data arrived during that window.
locf and last
locf() combined with TimescaleDB's last() aggregate implements last-observation-carried-forward, a common interpolation strategy:
SELECT
  time_bucket_gapfill('1 hour', time) AS hour,
  device_id,
  locf(last(temperature, time)) AS temperature
FROM metrics
WHERE time > now() - INTERVAL '7 days'
GROUP BY hour, device_id;
For sparse data (sensor readings every few minutes), LOCF fills gaps by repeating the last observed value.
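When carrying the last value forward is too crude, for instance with smoothly varying analog signals, gapfill also offers linear interpolation between neighboring buckets via interpolate():

```sql
-- Linearly interpolate missing buckets instead of repeating the last value.
SELECT
  time_bucket_gapfill('1 hour', time) AS hour,
  device_id,
  interpolate(AVG(temperature)) AS temperature
FROM metrics
WHERE time > now() - INTERVAL '7 days'
GROUP BY hour, device_id;
```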
approx_percentile and stats_agg
Estimate percentiles with a sketch-based approximation from the TimescaleDB Toolkit (percentile estimation is backed by a UddSketch by default; hyperloglog-style sketches are for approximate distinct counts, not percentiles):
SELECT
  device_id,
  approx_percentile(0.95, percentile_agg(value)) AS p95_value,
  approx_percentile(0.99, percentile_agg(value)) AS p99_value
FROM metrics
WHERE time > now() - INTERVAL '1 month'
GROUP BY device_id;
These functions are orders of magnitude faster than exact percentile calculations (PERCENTILE_CONT) on large datasets, with error bounds typically <1%.
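The Toolkit's stats_agg covers one-pass descriptive statistics in the same two-step style: aggregate once, then apply accessor functions to the resulting summary:

```sql
-- Mean and standard deviation per device via the Toolkit's stats_agg.
SELECT
  device_id,
  average(stats_agg(value)) AS mean_value,
  stddev(stats_agg(value))  AS stddev_value
FROM metrics
WHERE time > now() - INTERVAL '1 month'
GROUP BY device_id;
```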
Trade-offs, gotchas, and what goes wrong
Chunk interval misalignment. Choosing a chunk interval that doesn’t match your query patterns leads to poor compression and slow aggregations. If most queries ask “what happened in the last hour?” but your chunks are 30 days, compression is wasted on frequently-accessed data. Measure your query distribution and tune chunk interval to match.
Over-compression with inappropriate segment boundaries. If you set compress_segmentby to a high-cardinality column like user_id, the compressor produces thousands of tiny segments per chunk, wrecking both the compression ratio and decompression performance; and if your workload is 95% queries by sensor_type, none of those segments help pruning anyway. Profile your queries first.
Continuous aggregate bloat from high-cardinality GROUP BY. If you aggregate by (device_id, metric_id, timestamp_ms) with millions of distinct combinations, the continuous aggregate table becomes larger than the raw data. This defeats the purpose. Use only low-cardinality dimensions (device_id, sensor_type) in aggregation GROUP BY; for high-cardinality dimensions, query the raw table or accept higher query latency.
Decompression stalls on cold reads. Reading even a single row from a compressed chunk requires decompressing an entire batch (up to roughly 1,000 rows per segment). The cost is CPU plus the I/O of fetching the whole batch, so point lookups routinely pay for far more data than they return, which adds up quickly when you run point queries against terabytes of compressed data. For real-time point-lookup workloads, keep recent data uncompressed and compress only archive data.
Continuous aggregate refresh lag. With 1M events/sec and a schedule_interval of 15 minutes, materialized buckets can lag real time by up to 15 minutes. If you need fresher answers, enable real-time aggregation (queries then also scan the recent raw data, at extra cost) or serve the hot window from an in-memory cache such as Redis.
Practical recommendations
Tune TimescaleDB hypertables for production workloads by following this checklist:
- Estimate chunk interval from retention. If you keep 90 days of data, aim for 90/100 = roughly 1-day chunks. Adjust for write concurrency; higher concurrency benefits from smaller chunks (1 hour) to reduce lock contention.
- Set compress_segmentby to low-cardinality columns. Typically (device_id, sensor_type) or just (device_id). Avoid high-cardinality grouping.
- Profile query patterns. Run your top 20 queries with EXPLAIN ANALYZE on a production clone. Identify slow aggregations and add appropriate continuous aggregates.
- Compress aggressively but monitor decompression cost. For archive data (>30 days old), compress everything. For warm data (7–30 days), consider leaving it uncompressed if your queries frequently touch it.
- Use continuous aggregates hierarchically. Build 1-hour → 1-day → 1-month → 1-year layers for large time-series.
- Set retention policies early. Compliance workloads need automatic TTL. Test data-deletion performance in advance on a scaled clone.
- Monitor chunk count and sizes. Run SELECT count(*) FROM timescaledb_information.chunks WHERE hypertable_name = 'metrics' periodically. If the chunk count grows into the tens of thousands, increase the chunk interval; vast numbers of tiny chunks bloat the catalog and slow planning.
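For per-chunk sizes rather than just counts, the built-in size functions give a quick overview:

```sql
-- Total hypertable size, and its 10 largest chunks.
SELECT pg_size_pretty(hypertable_size('metrics'));

SELECT chunk_name, pg_size_pretty(total_bytes) AS size
FROM chunks_detailed_size('metrics')
ORDER BY total_bytes DESC
LIMIT 10;
```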
Frequently asked questions
Can I change chunk_time_interval after creation?
Yes, but it only affects new chunks. Existing chunks retain their original interval. To re-chunk historical data, you must decompress, copy to a new hypertable with the desired interval, and migrate data. This is typically a migration operation; plan for downtime or use dual-writes during the transition.
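A minimal sketch of that migration, assuming metrics_new as the target name and that downtime (or dual-writes) covers the cutover:

```sql
-- 1. Create the target with the desired chunk interval.
CREATE TABLE metrics_new (LIKE metrics INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
SELECT create_hypertable('metrics_new', 'time',
       chunk_time_interval => INTERVAL '1 hour');

-- 2. Copy historical data (decompress source chunks first if needed).
INSERT INTO metrics_new SELECT * FROM metrics;

-- 3. Swap names in one transaction at cutover.
BEGIN;
ALTER TABLE metrics RENAME TO metrics_old;
ALTER TABLE metrics_new RENAME TO metrics;
COMMIT;
```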
Does TimescaleDB support real-time point updates in compressed chunks?
Not in the traditional sense. Compressed chunks were long treated as immutable: decompress, modify, re-compress. Recent releases accept INSERT/UPDATE/DELETE on compressed chunks, but each statement decompresses and rewrites the affected segments, so it is far slower than modifying uncompressed data. For OLTP workloads with frequent corrections, don't compress that data, or route corrections through a separate table and ETL them into the archive.
How does compression interact with replication?
Compression changes the physical on-disk format, and streaming (physical) replication ships disk blocks, so replicas automatically contain the same compressed chunks as the primary; there is nothing to compress twice. Logical replication is the tricky case: compressed chunks live in internal tables that logical decoding does not handle well, so either decompress before logically replicating, or replicate the raw inserts and run an independent compression policy on the subscriber.
What’s the memory overhead of continuous aggregates?
Minimal if your aggregate is small (a few dimensions, a few metrics). The continuous aggregate table is usually 1–2% the size of the raw table. However, if you have thousands of continuous aggregate definitions, maintaining them all becomes expensive; consolidate into a few hierarchical layers.
Can I use continuous aggregates with non-time-series queries?
Yes, but they’re optimized for time-series. If your query doesn’t have a time dimension, consider a standard PostgreSQL materialized view or a columnar table instead.
Further reading
- Pillar: Time-Series Database Internals: InfluxDB, TimescaleDB, QuestDB
- Sibling cluster: PostgreSQL vs YugabyteDB vs CockroachDB: Distributed SQL Compared
- Sibling cluster: Apache Iceberg Data Lakehouse: Production Deep Dive
- Cross-pillar: IoT Device Monitoring: Event Pipeline & Time-Series Strategies
- Cross-pillar: Digital Twin: Unified Namespace & Industrial Data Fabric
References
- TimescaleDB Official Documentation — Hypertables
- TimescaleDB Compression Technical Overview
- PostgreSQL Declarative Partitioning — Constraint Exclusion
- Timescale Engineering Blog — Continuous Aggregates at Scale
Last updated: April 22, 2026. Author: Riju (about).
