These are our top 5 problems with process historians. They apply most readily when the historian sits in the enterprise landscape, though not exclusively. We reserve the right to revise this list as new developments surface, or as new approaches currently being explored fail to match the benefits people have decided they can't do without. That isn't to say those benefits are beyond question. New entrants to the field may not value them the same way, and that ongoing debate is part of what keeps redefining the primary purpose of the software we use today and what we're preparing to do without tomorrow.
Retrieving historical data quickly and running complex analysis can be genuinely challenging with large datasets. Historians need tools and features built for efficient retrieval and analysis, though data retrieval is resource-intensive for any technology, not just historians. Inside the historian's own ecosystem, the core system is tuned for strong performance. Outside it, performance is measured differently, and enriched data means more than sub-second real-time process information. The historian provides very little data analysis capability on its own; that happens elsewhere.
Tip: Evaluate technology performance requirements against the wider business ecosystem, not just the historian ecosystem in isolation.
Closely related to retrieval is data integration. Historians often need to integrate with legacy systems, sensors, and devices using different data formats and protocols, and that integration can be a significant challenge. This is core to a data historian's capability on the front end, where it acts as a data consumer. The real challenge sits on the back end, where the historian is the server. Access to that ecosystem can be limited in both performance and flexibility. SDKs and APIs help programmers, but only when the API is genuinely complete. Conventional automation interfaces are almost always available, yet they may meet the operational need without quite meeting the business need.
Tip: Interoperability is a starting point, not capability. Weigh the effort required to integrate and maintain any customisation done to “connect with ease.”
Data historians depend on accurate, reliable data, usually captured from other software systems or scanned directly from edge devices. Inconsistent or erroneous data leads to incorrect analysis and decisions, and maintaining data quality through cleansing is an ongoing challenge, especially since historians are positioned as the single source of truth for raw data. There are rudimentary ways to clean up raw data on ingestion, but unless a second data set is kept, the original is lost. If the clean-up rules weren't right, or a new use case later needs the raw data, that value is gone for good. Easy clean-up functionality can create a false sense of data integrity.
Tip: Think about the data historian's primary purpose in the wider business ecosystem. Does data need cleansing inside the historian, or can it be cleaned in real time as it's analysed downstream? Can technically savvy operational users tolerate some raw-data imperfection for the sake of retention, while business users need noise removed entirely? A process historian rarely satisfies both needs at once.
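One way to picture cleaning downstream rather than on ingestion is a read-time filter over an untouched raw archive. The sketch below is illustrative only: the `Sample` type, the `"good"`/`"bad"` quality codes, the value limits, and the `cleaned_view` function are all hypothetical names, not any historian's API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Sample:
    """One raw reading. Field names and quality codes are assumptions."""
    time: float    # seconds since some epoch (assumed unit)
    quality: str   # "good" or "bad" (hypothetical codes)
    value: float


def cleaned_view(raw: Iterable[Sample], low: float, high: float) -> Iterator[Sample]:
    """Yield only samples that pass the rules; the raw input is never modified."""
    for sample in raw:
        if sample.quality == "good" and low <= sample.value <= high:
            yield sample


# The raw archive keeps everything, including bad-quality and out-of-range points.
raw_archive = [
    Sample(0.0, "good", 20.1),
    Sample(1.0, "bad", 0.0),      # flagged bad by the source
    Sample(2.0, "good", 999.0),   # implausible spike
    Sample(3.0, "good", 20.4),
]

# A business-facing consumer applies its own rules at read time.
business_view = list(cleaned_view(raw_archive, low=-50.0, high=150.0))
```

Because the rules live in the consumer, an operational user can still read `raw_archive` as-is, and a rule that later turns out to be wrong can be changed without any loss of the original data.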
Metadata and context are closely tied to systems integration. Contextual information that deepens process data analysis is mostly provided outside the core historian archive, and metadata can be genuinely limited on older technologies. Historians can acquire metadata reasonably well when the source is another software system like SCADA or a second historian. For most “intelligent” edge devices, though, the transport protocol typically carries only time, quality, and value (TQV), with the tag or point name unavailable from the source, so that identification has to live in the historian instead. That's fine; it has to live somewhere. But the historian needs to support whatever metadata the business actually needs to consume.
Tip: Consider the core historian technology in terms of metadata. Does it need supporting wrappers to provide context? Can it acquire more than TQV from other software systems that already hold richer, concentrated data?
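As a rough illustration of identification living in the historian rather than at the source, the sketch below joins a bare TQV sample to a metadata registry keyed by source address. Everything here is hypothetical: the `TQV` and `TagMetadata` types, the `"plc1:40001"` address format, the tag name, and the `with_context` function are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TQV:
    """What the transport carries: time, quality, value and nothing else."""
    time: float
    quality: str
    value: float


@dataclass(frozen=True)
class TagMetadata:
    """Context the business needs; held on the historian side (assumed fields)."""
    name: str
    unit: str
    description: str


# Hypothetical registry mapping a source address to its metadata.
TAG_REGISTRY = {
    "plc1:40001": TagMetadata("FT-101.PV", "m3/h", "Feed flow"),
}


def with_context(address: str, sample: TQV) -> dict:
    """Attach registry metadata to a sample; unknown addresses get None fields."""
    meta = TAG_REGISTRY.get(address)
    return {
        "address": address,
        "tag": meta.name if meta else None,
        "unit": meta.unit if meta else None,
        "description": meta.description if meta else None,
        "time": sample.time,
        "quality": sample.quality,
        "value": sample.value,
    }


record = with_context("plc1:40001", TQV(0.0, "good", 12.5))
orphan = with_context("plc1:40002", TQV(0.0, "good", 3.0))
```

The `orphan` case is the one worth watching: data arrives and is stored, but without a registry entry it carries no context a business consumer can use.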
The final topic, scalability, is shaped by everything before it. As enterprises generate vast data volumes, historians need to scale to handle that growth while holding performance steady, and balancing scalability against performance gets tricky during peak loads. The whole premise of a process historian is storing genuinely unique process data, so scaling and expansion really come down to whether the previous four topics were addressed properly. When they weren't, scaling and expansion become delicate surgery, no different from any large enterprise software solution in that respect.
Tip: Architect the business data ecosystem end to end, from edge device to business platform to platform consumers. Get past OT isolation.
When building a business data ecosystem that includes a process historian, consider its full scale and how it may need to transition as user numbers grow, data demand rises, and acquisition methods diversify, alongside the five topics above. Don't base important architectural decisions on interoperability and compliance statements alone. If you need help, ask. We've had some successes, and we've learned plenty from our failures too.