Facts to Intelligence - Data Model Evolution

Shailendra Savardekar
Head of Product & Technology Strategy
7 min
Published on June 26, 2026

There is a question that breaks most data platforms: "Who supplies the parts that go into the products that are failing in the field?"

It sounds simple. It is not a complex join. It is not a giant query. It is a cross-domain, semantically linked, temporally aware question, and most architectures cannot answer it. Not because the data doesn't exist, but because it was never connected with meaning.

That gap is what the evolution of data modeling is actually about. Not just storage efficiency or query speed but progressively adding the dimension that the previous layer could not express. Scale. Vocabulary. Meaning. Reasoning. Live state.

Here is the full arc, from the first CREATE TABLE to a live digital twin, and why every layer still matters.

First, The Spectrum

Five structures sit along a spectrum. As you move right, semantic richness and reasoning power increase, while raw query performance and tooling maturity decrease:

Semantic richness increases left to right:

Relational Schema → NoSQL Schemas → Taxonomies → Ontologies → Knowledge Graphs / DTDL

The critical thing to internalize: each layer does not replace the previous one. Rather, it adds an expressiveness the previous does not provide. They all coexist in a mature architecture, each doing what it does best.

Layer 1 — Relational: The Authoritative Record

Examples: PostgreSQL · MySQL · Oracle · Aurora  |  Dominant Period: 1970s–2000s

The foundation. Data as tables, rows, columns, and foreign keys. ACID transactions. Referential integrity. The architecture that ended double-billing, lost orders, and inventory chaos across every industry simultaneously.

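-- Assumed minimal categories table so the foreign key below resolves
CREATE TABLE categories (
    id    INT PRIMARY KEY,
    name  VARCHAR(100) NOT NULL
);
CREATE TABLE products (
    product_id   UUID PRIMARY KEY,
    sku          VARCHAR(50) UNIQUE NOT NULL,
    category_id  INT REFERENCES categories(id),
    price        DECIMAL(10,2)
);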

What it does brilliantly: structured uniform records, aggregate queries, audit trails. What it cannot do: it knows a product has a category_id but not what that relationship means. It stores facts. It cannot reason about them.

The limitation that launched a thousand workarounds: a parent_id self-join needs one join per level, so variable-depth hierarchies push you into recursive queries that get expensive at scale. Entities with highly variable attribute sets produce the EAV (entity-attribute-value) anti-pattern: a table of name/value pairs that is simultaneously flexible and analytically useless.
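To make the hierarchy problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The category names and ids are illustrative assumptions, not data from any real catalog. It shows that walking a parent_id chain of unknown depth needs a recursive query rather than a fixed number of joins.

```python
import sqlite3

# Hypothetical category tree stored as an adjacency list (parent_id self-reference).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES categories(id)
    );
    INSERT INTO categories VALUES
        (1, 'Product',         NULL),
        (2, 'Furniture',       1),
        (3, 'Chair',           2),
        (4, 'Ergonomic Chair', 3);
""")

def ancestors(category_id: int) -> list:
    """Walk up the tree from one category to the root, nearest ancestor first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM categories WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.parent_id, chain.depth + 1
            FROM categories AS c
            JOIN chain ON c.id = chain.parent_id
        )
        SELECT name FROM chain WHERE depth > 0 ORDER BY depth
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]

print(ancestors(4))  # ['Chair', 'Furniture', 'Product']
```

The recursion works, but the database still has no idea what "parent" means. It only follows integers.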

Layer 2 — NoSQL: Polyglot Persistence

Examples: MongoDB · Cassandra · DynamoDB · Redis  |  Dominant Period: 2005–2015

NoSQL is not one thing. It is a family of models, each solving a specific relational failure mode.

The document model (MongoDB, Firestore) handles entities with variable attribute sets: an ergonomic chair and a pharmaceutical product can both be "products" without requiring schema contortions. The wide-column model (Cassandra, DynamoDB) handles time-series writes at a horizontal scale that would crush a relational engine. The graph model (Neo4j) makes relationships first-class citizens: they carry their own properties, they are traversable, they are the data.

The result: faster development cycles, horizontal scale, and polyglot persistence, with the right model for each access pattern. Microservices architecture became viable precisely because each service could own its own data store.

The problem NoSQL created: data islands. "Category" in MongoDB means nothing to a Cassandra table. Users of NoSQL gained agility and lost shared meaning entirely: systems could not agree on what things were, let alone reason about them.
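A small sketch of both sides of that trade, using plain Python dictionaries as stand-ins for a document collection and a wide-column table. Every field name and value here is a hypothetical example. The documents happily hold different attributes, but the two stores label the same category differently, so a naive cross-store lookup finds nothing.

```python
# Two hypothetical product documents in one collection: no shared schema required.
products = [
    {"sku": "CHAIR-001", "category": "Chairs", "lumbar_support": True, "max_load_kg": 130},
    {"sku": "PHARM-001", "category": "Analgesics", "dosage_mg": 500, "shelf_life_months": 24},
]

# A separate, hypothetical wide-column style table built for time-series writes.
# It names the same concept differently: "cat_code" instead of "category".
stock_events = [
    {"sku": "CHAIR-001", "cat_code": "FURN-CH", "ts": "2026-06-01T09:00:00Z", "delta": -3},
    {"sku": "CHAIR-001", "cat_code": "FURN-CH", "ts": "2026-06-01T10:00:00Z", "delta": -1},
]

def events_for_category(category: str) -> list:
    """Naive cross-store lookup that assumes both stores label categories the same way."""
    return [e for e in stock_events if e["cat_code"] == category]

# The document store says "Chairs"; the event store says "FURN-CH".
print(len(events_for_category("Chairs")))   # 0
print(len(events_for_category("FURN-CH")))  # 2
```

Nothing is broken in either store. What is missing is an agreement, outside both of them, that "Chairs" and "FURN-CH" name the same thing.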

Layer 3 — Taxonomies: Controlled Vocabulary

Examples: SKOS · MDM · Controlled Vocabulary  | Dominant Period: ~2010, accelerating

The first step toward shared meaning. Taxonomy is data as controlled hierarchical vocabulary — in other words, a formal agreement across systems that "Chair" means the same thing everywhere, and that Chair → Furniture → Product is a navigable hierarchy.

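@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/catalog#> .   # placeholder namespace (assumed)

ex:Chair a skos:Concept ;
    skos:prefLabel "Chair"@en ;
    skos:altLabel  "Seat"@en, "Seating"@en ;
    skos:broader   ex:Furniture ;
    skos:definition "A separate seat for one person" .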

E-commerce faceted search, library classification systems, regulatory product codes — all of these are taxonomies. The business value produced: search relevance jumps, catalog consistency becomes enforceable, cross-system classification becomes possible.
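The two things a controlled vocabulary buys you, label resolution and a navigable hierarchy, can be sketched in a few lines. This is an in-memory stand-in that mirrors the SKOS properties prefLabel, altLabel, and broader. The dictionary layout and the function names are assumptions for illustration, not a SKOS library API.

```python
from typing import Optional

# Hypothetical in-memory taxonomy mirroring SKOS prefLabel / altLabel / broader.
CONCEPTS = {
    "ex:Product":   {"pref": "Product",   "alt": [],                  "broader": None},
    "ex:Furniture": {"pref": "Furniture", "alt": [],                  "broader": "ex:Product"},
    "ex:Chair":     {"pref": "Chair",     "alt": ["Seat", "Seating"], "broader": "ex:Furniture"},
}

def resolve(label: str) -> Optional[str]:
    """Map any preferred or alternative label to its concept id (case-insensitive)."""
    wanted = label.casefold()
    for concept_id, concept in CONCEPTS.items():
        labels = [concept["pref"], *concept["alt"]]
        if any(wanted == candidate.casefold() for candidate in labels):
            return concept_id
    return None

def broader_chain(concept_id: str) -> list:
    """Follow broader links upward until the top concept is reached."""
    chain = []
    parent = CONCEPTS[concept_id]["broader"]
    while parent is not None:
        chain.append(parent)
        parent = CONCEPTS[parent]["broader"]
    return chain

print(resolve("seating"))         # ex:Chair
print(broader_chain("ex:Chair"))  # ['ex:Furniture', 'ex:Product']
```

Any system that resolves its local label through the same vocabulary lands on the same concept id, which is what makes cross-system classification possible.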

The ceiling of this layer, however, is that taxonomies only know broader, narrower, and related. They cannot express that a Chair must have either 3 or 4 legs, or that a product cannot simultaneously be physical and digital. They have vocabulary but no formal logic.

Layer 4 — Ontologies: Formal Reasoning

Examples: OWL · RDF · DTDL · SHACL  |  Dominant Period: 2015–present

This is where data modeling becomes knowledge engineering. Ontologies apply formal logic to domain knowledge: cardinality constraints, class disjointness, multiple inheritance, and, most importantly, inference.

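@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex:  <http://example.org/catalog#> .   # placeholder namespace (assumed)

# A product cannot be both physical and digital
ex:PhysicalProduct owl:disjointWith ex:DigitalProduct .

# Inference: any Product whose hasCategory is Furniture IS a FurnitureProduct
ex:FurnitureProduct owl:equivalentClass [
    a owl:Class ;
    owl:intersectionOf (ex:Product
    [ a owl:Restriction ; owl:onProperty ex:hasCategory ; owl:hasValue ex:Furniture ])
] .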
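What a reasoner does with those two axioms can be approximated by hand. The sketch below is a deliberately tiny stand-in, not an OWL reasoner: it applies only the "sufficient condition" direction of the equivalence and checks one disjointness pair. The individuals (ex:chair42, ex:ebook7) and the function names are hypothetical.

```python
# Hypothetical facts as (subject, predicate, object) triples.
facts = {
    ("ex:chair42", "rdf:type", "ex:Product"),
    ("ex:chair42", "rdf:type", "ex:PhysicalProduct"),
    ("ex:chair42", "ex:hasCategory", "ex:Furniture"),
    ("ex:ebook7", "rdf:type", "ex:Product"),
    ("ex:ebook7", "rdf:type", "ex:DigitalProduct"),
}

# Class pairs declared disjoint, mirroring owl:disjointWith.
DISJOINT = [("ex:PhysicalProduct", "ex:DigitalProduct")]

def infer_furniture_products(triples: set) -> set:
    """Add rdf:type ex:FurnitureProduct for every Product whose category is Furniture."""
    inferred = set(triples)
    for s, p, o in triples:
        is_furniture = (p, o) == ("ex:hasCategory", "ex:Furniture")
        if is_furniture and (s, "rdf:type", "ex:Product") in triples:
            inferred.add((s, "rdf:type", "ex:FurnitureProduct"))
    return inferred

def disjointness_violations(triples: set) -> list:
    """Return subjects typed with both members of any disjoint class pair."""
    bad = []
    for a, b in DISJOINT:
        for s, p, o in triples:
            if p == "rdf:type" and o == a and (s, "rdf:type", b) in triples:
                bad.append(s)
    return sorted(bad)

closed = infer_furniture_products(facts)
print(("ex:chair42", "rdf:type", "ex:FurnitureProduct") in closed)  # True
print(disjointness_violations(closed))                               # []
```

Nobody asserted that chair42 is a FurnitureProduct. The fact follows from the class definition, which is the difference between storing facts and reasoning about them.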

The business unlock is the cross-domain question. An ontology can traverse the following: this batch of products → belongs to this category → manufactured by this supplier → who also supplies components used in → field failures. That query is not practical in a plain relational schema, because it requires formally typed relationships and inference rules.
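Here is the opening question, "who supplies the parts that go into the products that are failing in the field?", as a traversal over typed edges. This is a minimal sketch over an in-memory edge list. The relationship names (observedIn, hasComponent, suppliedBy) and every identifier are assumptions chosen for illustration. In practice this would more likely be a SPARQL or graph query.

```python
from collections import defaultdict

# Hypothetical typed edges: (subject, relationship, object).
EDGES = [
    ("failure:F-17", "observedIn", "product:CHAIR-001"),
    ("failure:F-18", "observedIn", "product:DESK-004"),
    ("product:CHAIR-001", "hasComponent", "part:GAS-LIFT-9"),
    ("product:CHAIR-001", "hasComponent", "part:CASTER-2"),
    ("product:DESK-004", "hasComponent", "part:GAS-LIFT-9"),
    ("part:GAS-LIFT-9", "suppliedBy", "supplier:Acme"),
    ("part:CASTER-2", "suppliedBy", "supplier:Rollwell"),
]

# Index edges by (subject, relationship) so each hop is a dictionary lookup.
index = defaultdict(set)
for s, rel, o in EDGES:
    index[(s, rel)].add(o)

def follow(nodes, rel):
    """Return every node reachable from `nodes` over one `rel` edge."""
    out = set()
    for node in nodes:
        out |= index.get((node, rel), set())
    return out

def suppliers_behind_failures(failures):
    """failure -> product -> component -> supplier."""
    products = follow(failures, "observedIn")
    parts = follow(products, "hasComponent")
    return follow(parts, "suppliedBy")

print(sorted(suppliers_behind_failures({"failure:F-17", "failure:F-18"})))
# ['supplier:Acme', 'supplier:Rollwell']
```

Each hop crosses a domain boundary (quality, product, bill of materials, procurement). The traversal only works because every edge carries an agreed, typed meaning.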

One gap remains: ontologies are static. They describe what things are, not what they are doing right now.

Layer 5 — DTDL / Digital Twins: Live Operational Intelligence

Examples: DTDL · Azure Digital Twins · IoT Hub  |  Dominant Period: 2020–present

Digital Twins Definition Language (DTDL) is ontology made executable for live systems. A DTDL interface is not just a schema: it carries telemetry streams, commands you can trigger, and graph relationships you can traverse in real time. It mirrors physical and business entities as they operate.

{
  "@context": "dtmi:dtdl:context;2",
  "@id": "dtmi:aquerius:catalog:Product;1",
  "@type": "Interface",
  "contents": [
    { "@type": "Property",     "name": "sku",        "schema": "string"  },
    { "@type": "Telemetry",    "name": "stockLevel", "schema": "integer" },
    { "@type": "Command",      "name": "reindex"     },
    { "@type": "Relationship", "name": "hasCategory" }
  ]
}

The business value: predictive operations, simulation before action, real-time decisions at the entity level. Not "what did stock levels look like last week" but "what is SKU-001's stock level right now, and when should we trigger a reorder given this supplier's current on-time, in-full (OTIF) record."
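A rough sketch of that kind of entity-level decision, written as a plain Python object rather than against any digital twin service. The ProductTwin class and the should_reorder rule are hypothetical, including the choice to stretch lead time by the supplier's OTIF rate. The point is only that the decision reads live state instead of last week's report.

```python
from dataclasses import dataclass, field

@dataclass
class ProductTwin:
    """Hypothetical in-process stand-in for a Product twin."""
    sku: str                                           # Property
    stock_level: int = 0                               # latest Telemetry value
    relationships: dict = field(default_factory=dict)  # e.g. {"hasCategory": "Furniture"}
    reindex_requests: int = 0

    def on_telemetry(self, stock_level: int) -> None:
        """Apply an incoming stockLevel reading."""
        self.stock_level = stock_level

    def reindex(self) -> None:
        """Command: this sketch only counts requests instead of calling a service."""
        self.reindex_requests += 1

def should_reorder(twin: ProductTwin, daily_demand: float,
                   lead_time_days: float, supplier_otif: float) -> bool:
    """Assumed rule: reorder when stock covers less than the OTIF-adjusted lead time."""
    # A supplier that delivers on time and in full less often
    # is treated as having a longer effective lead time.
    effective_lead_time = lead_time_days / supplier_otif
    return twin.stock_level < daily_demand * effective_lead_time

twin = ProductTwin(sku="SKU-001", relationships={"hasCategory": "Furniture"})
twin.on_telemetry(120)

print(should_reorder(twin, daily_demand=10, lead_time_days=10, supplier_otif=1.0))  # False
print(should_reorder(twin, daily_demand=10, lead_time_days=10, supplier_otif=0.8))  # True
```

With the same stock level, the answer flips when the supplier's reliability drops. That is the sort of decision a static ontology cannot make on its own.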

Summary: The Full 5-Layer Architecture

A mature platform stacks all five, each doing what it does best, with a clear directional logic: data flows up through layers, governance flows down.

Layer 5 — DTDL / Digital Twins

"What is the live state of every entity?"

Layer 4 — Ontologies + Knowledge Graph

"What does it mean? What can we infer?"

Layer 3 — Taxonomies + Vocabularies

"What do we call things? How are they classified?"

Layer 2 — NoSQL Schemas

"How do we store varied data at scale?"

Layer 1 — Relational Schema

"What are the authoritative records?"

↑ Data flows up through layers, ↓ Governance flows down through layers

Effectively Leveraging The Structure

  • Rule 1: Let relational own truth, not meaning. Postgres stores the authoritative record. It should not try to encode semantics through EAV tables or JSON blobs; that is the ontology layer's job.
  • Rule 2: NoSQL for access patterns, not to avoid discipline. Choose a NoSQL model based on how data is read and written, not because you want to skip schema thinking. That discipline moves up to the ontology layer.
  • Rule 3: Taxonomy before ontology. An ontology built on inconsistent vocabulary is worse than no ontology. Controlled vocabulary first, always.
  • Rule 4: Ontologies govern, databases execute. The ontology defines what is valid. The database stores and retrieves it. Business rules belong in the layer where they can be reasoned about.
  • Rule 5: DTDL is the bridge to live reality. Use DTDL when your ontology needs to reflect real-time state: not just classify what things are, but mirror what they are doing right now.
  • Rule 6: Govern top-down, store bottom-up. Governance flows: Ontology → Taxonomy → Schema → Storage. Data flows the opposite direction. Never confuse the two.
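Rule 4 can be sketched as a validation step that sits in front of the write path. In this minimal example the rules are a hand-written dictionary standing in for constraints that would be derived from the ontology. The ONTOLOGY structure, the validate and save functions, and the record shape are all assumptions for illustration.

```python
import sqlite3

# Hypothetical constraints that would be derived from the ontology layer.
ONTOLOGY = {
    "disjoint": [("PhysicalProduct", "DigitalProduct")],
    "known_categories": {"Furniture", "Pharmaceutical"},
}

def validate(record: dict) -> list:
    """Return a list of rule violations; an empty list means the record is valid."""
    errors = []
    kinds = set(record.get("kinds", []))
    for a, b in ONTOLOGY["disjoint"]:
        if a in kinds and b in kinds:
            errors.append(f"{a} and {b} are disjoint")
    if record.get("category") not in ONTOLOGY["known_categories"]:
        errors.append(f"unknown category: {record.get('category')}")
    return errors

# The database only stores and retrieves; it holds no business rules of its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, category TEXT NOT NULL)")

def save(record: dict) -> bool:
    """Write the record only if the governing rules accept it."""
    if validate(record):
        return False
    with conn:
        conn.execute("INSERT INTO products VALUES (?, ?)",
                     (record["sku"], record["category"]))
    return True

print(save({"sku": "CHAIR-001", "category": "Furniture", "kinds": ["PhysicalProduct"]}))
# True
print(save({"sku": "ODD-001", "category": "Furniture",
            "kinds": ["PhysicalProduct", "DigitalProduct"]}))
# False
```

The rules live in one place where they can be inspected and reasoned about, and the table definition stays free of encoded semantics.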

The Practical Decision Table

The question you're answering  |  The layer you need
"Give me order #12345"  |  Relational: SELECT * FROM orders WHERE id = 12345
"Store this product with 50 custom attributes"  |  Document NoSQL: flexible schema per entity
"What category does Ergonomic Chair belong to?"  |  Taxonomy: SKOS concept lookup
"What products are related to this failed component?"  |  Ontology: SPARQL graph traversal with inference
"What is the live stock level of SKU-001 right now?"  |  DTDL twin: real-time telemetry query
"Predict when this machine will fail"  |  DTDL + ML: twin telemetry feeding a model

"The evolution from relational schemas to DTDL ontologies is not a replacement story; it is an enrichment story. Each layer adds a dimension the previous could not express: scale, vocabulary, meaning, reasoning, and finally live operational intelligence."

The platforms that leverage all five layers effectively (each doing what it does best, governed top-down by ontologies, executed bottom-up by databases) are the ones that can actually answer the hard questions. Not just retrieve records. Not just count rows. But reason across domains, in real time, with provable correctness.

That is the difference between a data platform and a knowledge platform. And it matters more now than it ever has, because the AI layer sitting on top of all of this can only reason as well as the layers below allow it to.

The Layers are the Moat

Every organization has data. Fewer have a controlled vocabulary. Fewer still have a working ontology. Almost none have operational digital twins connected to their business entities in real time. Each layer you add is not just a technical capability — it is a compounding distance between you and platforms that stopped at layer one. Build the stack deliberately. The reasoning you unlock at the top is only as good as the foundations you laid at the bottom.
