SCADA Historian Architecture: 2026 Design Patterns from First Principles

A SCADA historian is a domain-tuned time-series store that records every tag a control system can produce, keeps it for years, and serves it back at sub-second latency for trends, replays, and analytics. Modern designs split the data path into four layers (ingest, store, query, and replay) and may pick a different technology for each. This article walks through the SCADA historian architecture used in mid-2026 plants, contrasts AVEVA PI Server, AVEVA Historian (formerly Wonderware), Aspen IP.21, Honeywell PHD, GE Proficy Historian, InfluxDB 3, TimescaleDB, QuestDB, ClickHouse, and VictoriaMetrics, and sets out the trade-offs an engineer has to negotiate before signing the bill of materials.

If you are coming to this from the wire layer, the OPC UA technical guide and the Modbus protocol guide cover the protocols feeding the ingest layer described below.

Figure 1 — SCADA historian architecture: layered ingest, write buffer, compression, segments, query.

What a historian is (and is not)

A historian is not a general-purpose database that happens to hold time-stamped rows. It is a piece of plant software with five non-negotiable behaviours that distinguish it from a relational database, an OLAP warehouse, or a metrics back-end like Prometheus.

First, every value carries a quality code alongside its timestamp and number. OPC UA’s StatusCode and the older OPC DA quality bytes flow through unchanged. A historian that loses the quality code on the way in is a logging system, not a historian, because you cannot reconstruct whether a flow reading of zero meant “valve shut” or “transmitter unplugged”.
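As a hedged illustration of why the quality code has to travel with the value, the sketch below classifies an OPC UA StatusCode by its two severity bits (00 good, 01 uncertain, 10 bad). The Sample class and the usable_for_reporting helper are hypothetical names chosen for this example, not part of any historian API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One historian row; field names are illustrative assumptions."""
    tag: str
    timestamp: float   # seconds since epoch
    value: float
    status_code: int   # 32-bit OPC UA StatusCode


def severity(status_code: int) -> str:
    """Map the top two bits of an OPC UA StatusCode to a severity name."""
    bits = (status_code >> 30) & 0b11
    return {0b00: "good", 0b01: "uncertain", 0b10: "bad"}.get(bits, "reserved")


def usable_for_reporting(sample: Sample) -> bool:
    """Only good-quality samples count as real measurements in this sketch."""
    return severity(sample.status_code) == "good"


# Two zero-flow readings that only the quality code can tell apart.
valve_shut = Sample("FIC101.PV", 1_700_000_000.0, 0.0, 0x00000000)
unplugged = Sample("FIC101.PV", 1_700_000_001.0, 0.0, 0x80000000)
```

Both samples carry the same value of zero; only the status code distinguishes a shut valve from a dead transmitter.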

Second, the clock is the primary key. Tag plus timestamp identifies a row, and the engine is optimised for time-range scans rather than point lookups. Inserts are nearly always at the head — the latest timestamp — and out-of-order arrivals are a special case the engine has to handle without rewriting old segments.

Third, the engine answers interpolation queries natively. “What was tag FIC101.PV at 14:32:17.500?” requires the historian to pick the bracketing samples, decide whether the tag is configured as stepped or interpolated, and return a value that may never have been sampled. PI Server, AVEVA Historian, IP.21, PHD, and Proficy all do this in the query engine. TimescaleDB and InfluxDB 3 leave it to the application, which has to ask explicitly: time_bucket_gapfill with locf or interpolate, or a time-weighted average, in TimescaleDB, and the gap-filling functions of the SQL dialect in InfluxDB 3.
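The bracketing logic can be sketched in a few lines. This is a minimal illustration, assuming samples are already sorted by time and that the stepped flag comes from tag configuration; value_at is a hypothetical helper, not any vendor's query call.

```python
from bisect import bisect_right


def value_at(times, values, t, stepped=False):
    """Return the tag value at time t from time-sorted samples.

    times   -- ascending sample timestamps (seconds)
    values  -- sample values, same length as times
    stepped -- True holds the previous sample, False interpolates linearly
    """
    if not times or t < times[0] or t > times[-1]:
        return None                      # no extrapolation in this sketch
    i = bisect_right(times, t)           # first sample strictly after t
    if i == len(times):                  # t coincides with the last sample
        return values[-1]
    t0, v0 = times[i - 1], values[i - 1]
    if stepped or t == t0:
        return v0
    t1, v1 = times[i], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

For samples at 0 s and 10 s with values 100 and 120, a query at 5 s returns 110 for an interpolated tag and 100 for a stepped one.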

Fourth, retention is measured in years, not days. A refinery commissioned in 2010 will still need 1-second flow data from 2010 to defend a 2026 emissions claim. The historian has to assume the storage layout will outlive several generations of compute hardware, several operating system versions, and at least one corporate acquisition.

Fifth, the asset model is part of the product. PI Asset Framework, AVEVA’s System Platform Galaxy, Aspen’s Production Record Manager, and Proficy’s Plant Applications all map raw tags onto equipment, units, lines, and sites so that a question like “average yield of Reactor 2 last month” can be expressed in plant language rather than tag soup.

Prometheus, by contrast, is not a historian. It is a metrics system: its default retention is 15 days, it has no quality codes and no concept of stepped versus interpolated tags, and its retention model is designed for weeks, not decades. VictoriaMetrics extends Prometheus’ retention into the months-to-years range and is fine for IT-side observability of the historian itself, but it is not a process historian.

Reference architecture: ingest, store, query, replay

The cleanest mental model of a SCADA historian architecture splits the data path into four layers, each with its own scaling rules, failure modes, and replacement cycle.

Ingest. Tag values arrive from PLCs, DCS controllers, and RTUs via OPC UA, Modbus TCP, MQTT Sparkplug B, EtherNet/IP, or one of the legacy interfaces (PI’s UFL, OSIsoft’s RDBMS interface, Suitelink for AVEVA). The ingest layer terminates these protocols, normalises the value to engineering units, attaches a quality code, and writes to an in-memory ring backed by a write-ahead log. PI Server uses PI Interface nodes; AVEVA Historian uses IDAS; InfluxDB shops typically use Telegraf with the opcua plugin. The collector must be able to buffer locally — “store and forward” — when the upstream historian is unreachable, otherwise a 30-minute network blip turns into a gap that no after-the-fact compression can hide.
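A store-and-forward buffer can be sketched as a bounded queue that drops the oldest sample when full and drains oldest-first once the historian is reachable again. This is an in-memory illustration only, under the assumption that a real collector would persist the queue to disk; the class name and the send callback are hypothetical.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Bounded FIFO that tracks drops and its high-water mark."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.dropped = 0       # samples lost to overflow
        self.high_water = 0    # deepest the queue has ever been
        self._queue = deque()

    def __len__(self):
        return len(self._queue)

    def put(self, sample):
        """Queue a sample, discarding the oldest one if the buffer is full."""
        if len(self._queue) >= self.capacity:
            self._queue.popleft()
            self.dropped += 1
        self._queue.append(sample)
        self.high_water = max(self.high_water, len(self._queue))

    def flush(self, send):
        """Send oldest-first; stop at the first failure and keep the rest."""
        sent = 0
        while self._queue:
            if not send(self._queue[0]):
                break
            self._queue.popleft()
            sent += 1
        return sent
```

Exposing dropped and high_water as metrics gives the monitoring side something to alert on before the gap appears in a trend.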

Store. The write buffer flushes into compressed, time-partitioned segments on disk. PI calls these archives; InfluxDB 3 calls them Parquet files in object storage; TimescaleDB calls them chunks; ClickHouse calls them parts. The shape is similar in every case: append-only, columnar where possible, indexed by tag and time. The compression scheme is where the products diverge sharply — the next section is dedicated to it.

Query. Read traffic is dominated by three patterns: trend plots over a fixed time window (HMI, Grafana, PI Vision, AVEVA Insight), aggregations over an asset hierarchy (AF analytics, calc tags, ML feature pipelines), and exact-value lookups for compliance reports. Each pattern stresses a different part of the engine. Trends are bandwidth-bound and benefit from server-side decimation. Aggregations are CPU-bound and benefit from columnar layout and SIMD. Compliance lookups are seek-bound and benefit from per-tag indexes.
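Server-side decimation for trends is often done by keeping the minimum and maximum of each screen-width bucket, so spikes survive the reduction. The function below is a minimal sketch of that idea, assuming (time, value) tuples sorted by time; it is not the algorithm of any particular product.

```python
def decimate_min_max(samples, start, end, buckets):
    """Reduce (t, v) samples in [start, end) to at most 2 points per bucket."""
    width = (end - start) / buckets
    per_bucket = [[] for _ in range(buckets)]
    for t, v in samples:
        if start <= t < end:
            idx = min(int((t - start) / width), buckets - 1)
            per_bucket[idx].append((t, v))

    out = []
    for points in per_bucket:
        if not points:
            continue
        lowest = min(points, key=lambda p: p[1])
        highest = max(points, key=lambda p: p[1])
        # The set removes the duplicate when min and max are the same point;
        # sorting restores time order inside the bucket.
        out.extend(sorted({lowest, highest}))
    return out
```

A one-hour window at 1-second resolution holds 3,600 samples per tag; with 600 buckets the client receives at most 1,200 points and still sees every excursion.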

Replay. The often-forgotten layer. Replay is the ability to feed historical data back into a downstream consumer — a digital twin running in NVIDIA Omniverse or Bentley iTwin, an ML training pipeline, a what-if simulator — at any speed from 1x to 10000x. PI’s Event Frames and AVEVA’s Replay service expose this directly. With the lakehouse-style architectures discussed later, replay shifts to Iceberg or Delta tables read by Spark or DuckDB.
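The speed factor reduces to dividing the original inter-sample gaps by the replay speed. The generator below is a hypothetical sketch of that scheduling step; it yields the wait before each sample and leaves the actual sleeping and publishing to the consumer.

```python
def replay_schedule(samples, speed=1.0):
    """Yield (wait_seconds, sample) pairs for time-sorted (t, tag, value) rows.

    speed=1.0 reproduces the original timing; speed=10000.0 compresses
    almost three hours of history into one second of wall-clock time.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    previous_t = None
    for t, tag, value in samples:
        wait = 0.0 if previous_t is None else (t - previous_t) / speed
        previous_t = t
        yield wait, (t, tag, value)
```

A consumer would typically call time.sleep(wait) before forwarding each sample to the twin or simulator.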

Figure 1 above shows the four layers and the optional cold tier. Note the dotted lines: the WAL flows to a standby node synchronously for HA, and the on-disk segments tier asynchronously to object storage for cost.

Tag model, compression, and retention

A plant of any size has between 20,000 and 2,000,000 tags. A refinery sits comfortably in the upper half of that range, a small water utility in the lower half. The historian has to store them all and still answer trend queries in under a second, so compression is not an optimisation — it is the only reason the architecture works.

Figure 2 — Three families of historian compression compared on lossiness, query cost, and best fit.

Swinging door trending (SDT) is what AVEVA PI Server and AVEVA Historian have used since the late 1980s. The algorithm picks an admissible corridor around each retained sample and drops every later sample that falls inside the corridor. Two parameters — CompDev (compression deviation, in engineering units) and CompMax (maximum time between retained points) — control the trade-off. SDT is lossy but reconstructable to within CompDev, which is a guarantee plant operators understand intuitively. For a slow analog loop with CompDev = 0.1 % of span, retention rates of 5-10 % of raw samples are routine.
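The corridor test can be written compactly. What follows is a simplified textbook-style sketch of swinging-door compression, not AVEVA's implementation: the parameter names comp_dev and comp_max mirror CompDev and CompMax, and exception filtering and minimum-interval handling are left out.

```python
def swinging_door(points, comp_dev, comp_max=float("inf")):
    """Return the retained subset of time-sorted (t, v) points.

    A point is dropped while one straight line from the last retained point
    can still pass within comp_dev of every sample seen since then.
    """
    if len(points) <= 2:
        return list(points)

    kept = [points[0]]
    anchor_t, anchor_v = points[0]
    slope_hi = float("inf")     # steepest slope still inside the corridor
    slope_lo = float("-inf")    # shallowest slope still inside the corridor
    prev = points[0]

    for t, v in points[1:]:
        dt = t - anchor_t
        new_hi = min(slope_hi, (v + comp_dev - anchor_v) / dt)
        new_lo = max(slope_lo, (v - comp_dev - anchor_v) / dt)
        if new_lo > new_hi or dt > comp_max:
            # Doors have crossed (or the gap is too long): retain the
            # previous sample and restart the corridor from it.
            if prev != kept[-1]:
                kept.append(prev)
            anchor_t, anchor_v = prev
            dt = t - anchor_t
            slope_hi = (v + comp_dev - anchor_v) / dt
            slope_lo = (v - comp_dev - anchor_v) / dt
        else:
            slope_hi, slope_lo = new_hi, new_lo
        prev = (t, v)

    if prev != kept[-1]:
        kept.append(prev)       # always keep the most recent sample
    return kept
```

A clean ramp collapses to its two endpoints, while a step change forces the samples on either side of the step to be retained.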

Delta-of-delta plus XOR float compression is the Gorilla scheme published by Facebook in 2015 and adopted, with variations, by InfluxDB, VictoriaMetrics, and Prometheus. Timestamps are stored as the second derivative of the sequence — usually a single bit when sampling is regular — and float values are XORed against the previous value, with the leading and trailing zero runs stored compactly. The scheme is lossless but assumes regular sampling and steady-state values. It loses its edge on jittery, fast-changing tags.
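To make the two transforms concrete, the sketch below computes the delta-of-delta of a timestamp series and the XOR of consecutive float bit patterns. It illustrates only the arithmetic that makes the scheme compact and leaves out the variable-length bit packing; the function names are illustrative.

```python
import struct


def delta_of_delta(timestamps):
    """Second difference of a timestamp series; all zeros when regular."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


def float_bits(x):
    """Raw IEEE 754 bit pattern of a 64-bit float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """XOR each value's bits with its predecessor; 0 means unchanged."""
    bits = [float_bits(v) for v in values]
    return [b ^ a for a, b in zip(bits, bits[1:])]


def meaningful_bits(xored):
    """Bits left between the leading and trailing zero runs of an XOR."""
    if xored == 0:
        return 0
    leading = 64 - xored.bit_length()
    trailing = (xored & -xored).bit_length() - 1
    return 64 - leading - trailing
```

Regular 10-second sampling yields a delta-of-delta stream of zeros, and a value that does not change XORs to zero, which is why steady tags compress so well and jittery ones do not.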

Raw plus columnar compression is the ClickHouse and QuestDB approach. Every sample is kept, but columns are stored separately and compressed with ZSTD or LZ4, and tag identifiers are dictionary-encoded. This is the most query-friendly of the three because analytical SQL runs directly against the columnar layout, but it consumes the most disk. For tags that go into ML feature pipelines or ad-hoc Spotfire queries, the disk cost is usually worth it.

TimescaleDB sits in the middle: hypertable chunks use Gorilla-style compression for numeric columns and dictionary encoding for text, applied as a background job after the chunk closes. The trade-off is that compressed chunks are expensive to modify: older releases treated them as read-only, and newer ones still decompress the affected data before applying an update.

Retention is a tag-by-tag decision, not a database-wide setting. A typical plant policy looks like: raw 1-second data for 2 years on hot storage, 1-minute aggregates for 10 years on warm storage, 1-hour aggregates forever on cold object storage. PI’s Archive Lifecycle Management, AVEVA Historian’s tiered storage, and InfluxDB 3’s retention periods all express this directly. Aspen IP.21 and Honeywell PHD have historically been less flexible here; both vendors have improved this in the 2024-2026 release cycle, but moving very old archives off primary storage still requires planned downtime in many sites.
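The typical policy above can be written down as a small lookup table. This is a hypothetical sketch of how a tiering job might decide where data of a given age belongs; the POLICY structure and tier_for_age are assumptions for illustration, not a vendor configuration format.

```python
# (maximum age in days, tier name, resolution in seconds); None means no limit.
POLICY = [
    (2 * 365, "hot", 1),       # raw 1-second data for 2 years
    (10 * 365, "warm", 60),    # 1-minute aggregates for 10 years
    (None, "cold", 3600),      # 1-hour aggregates kept indefinitely
]


def tier_for_age(age_days, policy=POLICY):
    """Return (tier, resolution_seconds) for data that is age_days old."""
    for max_age, tier, resolution in policy:
        if max_age is None or age_days <= max_age:
            return tier, resolution
    raise ValueError("policy has no catch-all tier")
```

Keeping one such table per tag class, rather than one for the whole database, is what makes retention a tag-by-tag decision.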

Modern stacks (PI Server, AVEVA Historian, GE Proficy, InfluxDB, TimescaleDB, QuestDB, ClickHouse, OpenTSDB, M3DB, Cassandra-based)

The 2026 historian market has three clusters: the incumbent process-industry products, the open-source time-series databases that have grown into the historian niche, and the niche specialists.

AVEVA PI Server (2024 R2 and 2026 R1). Still the reference for process-industry historians. PI Data Archive is the time-series engine; PI Asset Framework is the asset model; PI Vision is the visualisation layer; PI Integrator for Business Analytics and the newer AVEVA Data Hub handle export to cloud. AVEVA’s strategic direction since it completed the acquisition of OSIsoft in 2021 has been to keep PI on-prem while pushing analytical workloads into AVEVA Data Hub (cloud SaaS) via the PI-to-Cloud agent. SDT compression, the AF asset model, and PI Vision trends are the three things every PI shop relies on every day.

AVEVA Historian (formerly Wonderware Historian). The sibling product, originally aimed at discrete manufacturing and tightly integrated with the AVEVA System Platform (Galaxy, ArchestrA). The engine is SQL-Server-based, which makes integration with corporate BI straightforward but caps single-node tag rates lower than PI. Many plants run both — PI for the process side, AVEVA Historian for the packaging line — and reconcile via AF.

Aspen IP.21 (InfoPlus.21). The dominant historian in oil refining and bulk chemicals, particularly where Aspen DMC3 advanced process control or Aspen Mtell sit alongside it. Strong on chemometric calculations and Aspen Process Explorer for trending. The Aspen Connect ecosystem in the 2026 release pulls IP.21 into the AspenTech Industrial AI platform.

Honeywell PHD (Process History Database). Tightly coupled to Honeywell Experion DCS sites. Honeywell’s Forge platform now wraps PHD with cloud analytics. The engine is competent but the differentiator is the Experion integration, not the historian per se.

GE Proficy Historian (2024 and later). Strong in power generation and water utilities, with a long-standing base in discrete manufacturing. The 2024 release introduced a native columnar store option alongside the legacy proprietary format. Proficy Operations Hub is the visualisation layer.

InfluxDB 3 (Core and Enterprise). The 2024 rewrite of InfluxDB on the FDAP stack — Flight, DataFusion, Arrow, Parquet. Storage is Parquet in object storage; compute is DataFusion. This is closer in shape to a lakehouse than to a traditional historian, which makes it attractive when analytics is a first-class workload but means the asset-model story is something you build yourself. Telegraf with the OPC UA plugin is the standard ingest path.

TimescaleDB (Timescale Cloud and self-hosted). A PostgreSQL extension that adds hypertables, native compression, and continuous aggregates. The killer feature is that the rest of your relational world — asset tables, work orders, batch genealogy — sits in the same engine. Compression is Gorilla-style for numerics. The continuous-aggregate machinery handles the rolled-up retention tiers automatically.

QuestDB (8.x). Open-source, written in Java with SIMD-heavy hot paths. Designed for high write throughput and low-latency trend queries. Strong fit when you want a historian-shaped engine without the licensing of a process-industry product, but you supply the asset model and the HMI yourself.

ClickHouse. Not a historian, strictly. It is an analytical columnar database that happens to be very good at time-series ingest. Several large operators use it as the analytical tier behind a traditional historian — PI on the plant, ClickHouse in the data centre, replication via PI-to-Kafka.

VictoriaMetrics, Prometheus. Useful for IT-side observability of the historian and its collectors. Prometheus is explicitly not a historian; VictoriaMetrics extends retention but does not add quality codes or an asset model.

OpenTSDB, M3DB, Cassandra-based historians. OpenTSDB on HBase is largely legacy in 2026; M3DB (Uber) and Cassandra-backed designs survive in places where the operations team already runs the underlying storage cluster. Greenfield projects rarely pick them.

A common pattern in 2026 is two engines per plant: an on-prem process historian (almost always PI, AVEVA, IP.21, PHD, or Proficy) for the operations workload, and a cloud-side analytical store (InfluxDB 3, TimescaleDB, ClickHouse, or a lakehouse) for the data-science workload. The on-prem historian is the source of truth; the cloud store is a copy. The next section covers the bridge.

Cloud-native historians and lakehouse offload patterns

The 2024-2026 lakehouse wave changed the shape of cloud-side historian architectures more than any change since the OPC UA standardisation push of the mid-2010s.

Figure 4 — Cloud offload: hot tier feeds CDC into bronze Iceberg, refined to silver and gold for BI and ML.

The reference pattern looks like this. The on-prem historian (PI, AVEVA Historian, IP.21, PHD, Proficy) keeps 30-90 days of hot data, sized for HMI trending and operator replay. A change-data-capture (CDC) agent — PI-to-Cloud Agent, AVEVA’s Data Hub Connector, an Aspen Connect bridge, or Telegraf for the open-source engines — streams new samples into a bronze layer in Apache Iceberg or Delta Lake, hosted on S3, ADLS, or MinIO. The bronze table holds the raw schema: tag identifier, timestamp, value, quality, source. Silver-layer jobs cleanse, asset-join, and gap-fill. Gold-layer tables hold KPI rollups, downsampled aggregates, and feature tables.

The reasons to do this are operational, not technical: cloud storage is cheaper than on-prem disk at the multi-petabyte scale a 20-year-old plant generates; ML and BI teams prefer SQL over PI’s PIPoint API or AVEVA’s MX queries; and disaster recovery becomes a property of the object store rather than a separate replication problem.

The reasons it goes wrong are also worth naming. Network transfer costs, meaning both the WAN link from the plant and the egress charges when data is read back out of the cloud, are a recurring monthly bill that often does not appear in the project business case. The CDC agent introduces a new failure mode: a stalled agent silently freezes the bronze layer’s freshness, so monitoring of the agent itself is essential. And tag identifiers in the bronze layer often duplicate work the AF model already does on-prem, leading to a silver layer that has to maintain a parallel asset hierarchy. Most mature implementations end up running an asset-model sync job rather than rebuilding the hierarchy from scratch.
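A freshness check for the bronze layer can be as simple as comparing the newest timestamp per source against the clock. The sketch below assumes the bronze table's source column identifies the CDC agent and that some query has already produced the latest timestamp per source; stale_sources and the 300-second default are hypothetical.

```python
def stale_sources(latest_by_source, now, max_lag_seconds=300.0):
    """Return the sources whose newest bronze row is older than the limit.

    latest_by_source -- mapping of source name to newest timestamp (epoch s)
    now              -- current time (epoch s), passed in for testability
    """
    return sorted(
        source
        for source, newest in latest_by_source.items()
        if now - newest > max_lag_seconds
    )
```

Checking per source rather than per tag avoids false alarms from tags that are legitimately quiet because compression is dropping their samples.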

Cloud-native historians proper — Azure Data Explorer with the OPC UA ingest path, AWS IoT SiteWise, GCP’s Cortex Framework time-series schema — sit in a different niche. They work well for new plants where there is no on-prem historian to migrate, and for fleets of small assets (wind turbines, retail HVAC, water pumping stations) where running PI per site is not economic. They struggle to displace incumbents at large processing sites because the engineering teams already know PI or AVEVA, the HMIs are wired to it, and the regulatory record-keeping has been validated against it.

Query patterns and PI AF/Asset Framework analogues

The query side of a historian architecture is where the asset model earns its keep.

Figure 3 — Tag to asset model: raw PI tags map to AF elements, parent units, sites, and KPI calculations.

Raw tags are unfriendly. A tag named UNIT01.FIC101.PV means “process variable of flow indicator controller 101 in unit 01” only to someone who knows the naming convention. AF wraps this tag in an element called FIC101, gives it attributes (PV, SP, OP), and places it in a hierarchy: enterprise → site → area → unit → equipment → instrument. A query against AF can now be expressed as “average flow at all reactors at site North last week” and the engine resolves the tag list at query time.
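Resolving a tag list at query time amounts to a walk over the hierarchy. The nested dictionary below is a toy stand-in for an asset model, with made-up element names and a made-up second tag; it is not the AF data structure or API.

```python
# Hypothetical site fragment: each node has optional children and attributes.
SITE_NORTH = {
    "children": {
        "Area1": {
            "children": {
                "Reactor1": {"attributes": {"PV": "UNIT01.FIC101.PV"}},
                "Reactor2": {"attributes": {"PV": "UNIT02.FIC201.PV"}},
                "Pump1": {"attributes": {"PV": "UNIT01.PI301.PV"}},
            }
        }
    }
}


def resolve_tags(node, attribute, name_prefix=""):
    """Collect the tags bound to `attribute` on elements matching the prefix."""
    found = []
    for name, child in node.get("children", {}).items():
        attributes = child.get("attributes", {})
        if name.startswith(name_prefix) and attribute in attributes:
            found.append(attributes[attribute])
        found.extend(resolve_tags(child, attribute, name_prefix))
    return found
```

Calling resolve_tags(SITE_NORTH, "PV", "Reactor") returns the two reactor flow tags, which the query layer would then hand to the time-series engine.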

AVEVA Galaxy in the System Platform serves the same role for AVEVA Historian. Aspen’s Production Record Manager handles batch and continuous unit hierarchies. Honeywell’s Unit Operations Suite layers on PHD. Proficy Plant Applications layers on Proficy Historian. The pattern is the same across vendors: a logical model in front of the raw tag store, with calculations and KPIs defined against the model rather than against tags.

In the open-source world, the asset model is what you build yourself. TimescaleDB users typically put the asset hierarchy in regular Postgres tables and join. InfluxDB 3 users tag samples with asset identifiers and rely on SQL or InfluxQL queries over the Parquet data to do the joining. ClickHouse users typically maintain a separate dimensional table. The trade-off, picked up again under Trade-offs and gotchas, is that hand-rolled asset models are flexible but drift, so the integration team has to keep them in sync with the plant.

Event frames, a 2010s innovation from PI that other vendors have since copied, add a further query pattern. An event frame is a named time interval (a batch, a downtime event, an emissions episode, a startup) with attributes captured at the start and end. Queries can ask “show me the average flow during all batches of Product X last quarter” without the application having to know when those batches ran. AVEVA System Platform has trip events; IP.21 has Production Record Manager event journals; Proficy has events in Plant Applications.
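An event-frame query boils down to filtering samples by a set of intervals before aggregating. The helper below is a minimal sketch, assuming half-open (start, end) intervals and a plain sample mean rather than the time-weighted average a historian would normally offer.

```python
def mean_during_frames(samples, frames):
    """Average the values of (t, v) samples that fall inside any frame.

    frames -- iterable of (start, end) intervals, start inclusive, end exclusive
    """
    selected = [
        v for t, v in samples
        if any(start <= t < end for start, end in frames)
    ]
    return sum(selected) / len(selected) if selected else None
```

The application supplies only the frame list, for example every batch of one product, and never needs to hard-code when those batches ran.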

Failure modes, capacity planning, and cost model

A historian architecture fails in five characteristic ways, and capacity planning is mostly about staying ahead of them.

Figure 5 — HA topology: primary plus standby with WAL sync, plus async DR replica and store-and-forward buffer.

Collector buffer overflow. The interface node loses connectivity to the historian, the local buffer fills, and the oldest samples are dropped. Mitigation: size the store-and-forward buffer for the worst plausible outage (often 24-72 hours) and monitor the high-water mark.
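Sizing the store-and-forward buffer is simple multiplication. The sketch below assumes 16 bytes per sample, the same figure used in the capacity estimate later in this section, and a hypothetical interface node carrying 50,000 tags; both are inputs to replace with measured values.

```python
def buffer_bytes(tags, samples_per_second, outage_hours, bytes_per_sample=16):
    """Bytes needed to hold every sample for the whole outage, uncompressed."""
    return int(tags * samples_per_second * outage_hours * 3600 * bytes_per_sample)


# Hypothetical node: 50,000 tags at 1 sample/s through a 72-hour outage.
needed = buffer_bytes(50_000, 1, 72)
needed_gb = needed / 1e9    # about 207 GB
```

At that size the buffer fits comfortably on a commodity SSD, which is why sizing for the long outage rather than the typical one costs little.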

Archive corruption from unplanned shutdown. Power loss during a write to the active segment can leave the segment inconsistent. PI’s recovery, AVEVA’s reconstitution, and Influx 3’s Parquet-with-WAL all handle this, but recovery time scales with segment size. Mitigation: keep segments small enough that recovery fits in the maintenance window.

Cardinality explosion. Someone adds a derived tag per asset and the tag count jumps from 200,000 to 2,000,000 overnight. Most engines handle this, but query latency on the asset hierarchy degrades. Mitigation: cap derived tag creation behind a governance step.

Replication lag during burst load. A trip event generates a burst of high-frequency data (analog tags going to alarm sampling rates). Async DR replication falls behind. Mitigation: size the DR link for peak rather than average, and alert on lag.

Silent compression mismatch. A tag’s CompDev is set too loose; the operator looking at a trend sees a flat line because real variation is being compressed out. Mitigation: routine audit of compression parameters against tag standards.

Capacity planning starts with three numbers: tag count, average write rate per tag, and retention. For a refinery with 500,000 tags, an average of 1 sample per second per tag before compression, and 5 years of hot storage, raw uncompressed storage is roughly 500,000 × 1 × 86,400 × 365 × 5 × 16 bytes (timestamp + value + quality), or about 1.26 PB. Realistic compressed storage is 5-15 % of that depending on tag mix and compression scheme, so somewhere between roughly 60 and 190 TB hot. The cost model is dominated by hot disk for the on-prem tier and by egress plus query compute for the cloud tier.
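The arithmetic is easy to check in code. The sketch below reproduces the refinery figures from the text; the function name is illustrative, and the 5 % and 15 % ratios are the text's planning range rather than measured values.

```python
def raw_bytes(tags, samples_per_second, years, bytes_per_sample=16):
    """Uncompressed storage for the full retention period, in bytes."""
    seconds = 86_400 * 365 * years
    return tags * samples_per_second * seconds * bytes_per_sample


raw = raw_bytes(500_000, 1, 5)     # 1,261,440,000,000,000 bytes
raw_pb = raw / 1e15                # about 1.26 PB
low_tb = raw * 5 // 100 / 1e12     # about 63 TB at a 5 % ratio
high_tb = raw * 15 // 100 / 1e12   # about 189 TB at a 15 % ratio
```

Taking the 5-15 % range at face value gives about 63 to 189 TB, the same order as the hot-tier estimate above; the spread between the two ends is a reminder of how much the result depends on the assumed compression ratio.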

Avoid the temptation to plug in vendor brochure compression ratios. Real ratios depend heavily on the mix of analog versus digital tags, sample rate, and the noise floor of the transmitters. Benchmark on your tag mix before signing the storage order.

Trade-offs and gotchas

Compression versus query accuracy. SDT is reconstructable to CompDev, not exact. A regulator asking for “the actual value at 14:32:17” gets an interpolated value unless a sample happened to be retained at that instant. A historian with compression disabled on the tag, or one of the open-source delta-of-delta engines, avoids this by keeping all samples. Pick consciously.

Retention versus cost. Hot disk at the petabyte scale is the dominant line item. The lakehouse offload pattern moves the cost into object storage and query compute, but introduces egress and a new failure mode. The “right” retention is a regulatory and operational question, not a technical one.

Asset-model rigidity. PI AF, AVEVA Galaxy, IP.21 PRM, and PHD’s Unit Operations Suite are powerful but opinionated. Plants that grow organically often end up with AF templates that no longer fit the equipment they describe. Refactoring an AF model in a live plant is non-trivial because every downstream PI Vision display, every analytic, and every external integration references the current structure.

Cloud egress. Streaming 500,000 tags to a cloud bronze layer at 1 sample/second is about 64 Mbit/s raw at 16 bytes per sample and, in principle, a few megabits per second after compression. In practice, every batch boundary, every CDC checkpoint, and every retry adds overhead. Budget for 2-3x the nominal bandwidth and confirm with a real bridge running for a month before committing.

Prometheus is not a historian. It is excellent for monitoring the historian’s collectors, ingest queue depth, and replication lag. It is not appropriate for storing tag data. Use the right tool.

Open-source TCO. The license cost of InfluxDB or TimescaleDB is zero or modest; the integration cost — collectors, asset model, HMI, replay, operator training — is the same as for the commercial products. Greenfield open-source historians make sense; rip-and-replace at an operating plant rarely does.

Practical recommendations

A 2026 plant building or refreshing its historian architecture should make five decisions in this order.

First, pick the on-prem engine based on the DCS and the operations team’s skills. If the plant runs Honeywell Experion, PHD is the path of least resistance; if the site is standardised on AspenTech’s control and optimisation software, IP.21; if Emerson DeltaV with PI Interface, PI Server; if AVEVA System Platform, AVEVA Historian. Do not change the on-prem historian unless you are also changing the DCS.

Second, design the asset model before you populate it. AF templates, Galaxy templates, or your own Postgres schema should be drafted, reviewed, and signed off before the first tag goes in. Refactoring later is expensive.

Third, decide the cold tier early. Settle whether the cloud-side store is a lakehouse (Iceberg or Delta on S3) or a cloud time-series engine (Azure Data Explorer, SiteWise, InfluxDB Cloud) before you build the bridge. The bridge code is different for each.

Fourth, size store-and-forward and DR for the worst plausible event, not the average. A 24-hour WAN outage is rare but plausible; the buffer that survives it costs little extra.

Fifth, run a real benchmark on your tag mix. Vendor numbers will not match your reality. A two-week pilot with your tags, your sample rates, and your compression settings is the only reliable input to the capacity plan.

FAQ

Q1. Is InfluxDB 3 a real historian replacement?
For greenfield projects without an existing process historian, yes — particularly with Telegraf for OPC UA ingest and a hand-built or downstream asset model. For replacing an operating PI or IP.21 site, no — the asset model, HMI integration, and operator workflow are years of work to recreate.

Q2. How does swinging-door compression affect regulatory reporting?
The retained samples are exact, and the dropped samples are reconstructable to within the configured deviation. For most regulatory regimes this is acceptable; some emissions reporting frameworks require lossless storage of specific tags. Configure those tags with compression switched off, or with CompDev = 0 as a near-lossless setting, and accept the disk cost.

Q3. Should the historian and the lakehouse have the same tag schema?
The bronze layer should mirror the historian’s raw tag schema exactly. The silver and gold layers should align with the asset model rather than the raw tags. Trying to make the bronze layer “smart” creates a maintenance burden that grows with every new tag.

Q4. What is the relationship between AF and an ISA-95 data model?
AF is a flexible hierarchy that can be configured to match ISA-95 (enterprise, site, area, work centre, work unit) but is not constrained to it. Most mature PI sites configure AF to follow ISA-95 because the corporate MES and ERP layers expect that shape.

Q5. Can Prometheus or VictoriaMetrics store plant tags?
They can store the numbers, but they lose the quality code, do not interpolate, and have no asset model. Use them for IT observability of the historian, not as a historian.

Q6. How long should the on-prem hot tier hold data before offload?
Long enough that operator-initiated replay and troubleshooting never need to hit cloud storage. For most process plants this is 60-90 days. Sites with longer troubleshooting cycles (refining turnarounds, pharma batch investigations) often size the hot tier at 12-18 months.
