A Meta-architecture for Data Mesh

4 min read · Jan 1, 2022

One of the essential requirements for successful data-driven decisions is speed. There are many ways to improve the delivery of data products, and we are pleased to see Data Mesh offer a structured approach. This article discusses the paradigm shift for next-gen data ecosystems, explores its core principles, and presents a meta-architecture as a reference model for developing self-service data products.

Data Mesh presents an approach to align ownership, architecture, and organization with the axis of change. Four principles differentiate it from other enterprise data management and architecture models:

product thinking
decentralized data ownership
decentralized architecture
federated governance

Product thinking treats data products as the atomic unit of change, which requires less coordination than traditional pipeline-stage thinking and “shared by default” datasets in a central system. Owners and domain experts can still “publish” data for use by other data products or domains, with guarantees and federated controls. Decentralizing ownership leaves the management of data and its lifecycle to domain experts, which increases autonomy and accountability and reduces data duplication. When data products are managed with specialized knowledge, they can be better tailored to the needs of their domain and consumers. Product thinking and decentralized ownership imply a decentralized architecture, which shifts domain-specific decisions, as implementation details, to the domains themselves. In a distributed data product ecosystem, data products can be created and exchanged securely and in compliance with federated governance.

Translating these core principles into a meta-architecture, we see three significant deviations from traditional models:

prescriptive solutions and offerings vs. enablement
centralized, deploy-once, serve-all systems vs. self-service, re-deployable solutions
basic metadata vs. deep insights, tracing, and automated metadata extraction

The meta-architecture has the following coarse layers:

Enablement Layer: This layer enables product teams with self-service solutions. It covers any foundational pieces that speed up the delivery of data products at every stage: authoring, experimentation, deployment, serving, and evaluation. Core enablement teams provide standard templates and re-deployable solutions, and any product team can contribute new ones, either created from scratch or built on existing ones through extensibility points.
Data Literacy & Governance Layer: This layer is responsible for anything that provides a deep understanding of the data published and consumed in the mesh. Quality, performance, cost/value ratio, stability, and health signals are collected and displayed in the catalog listings. Data literacy improves through lineage, tracking, and high-coverage metadata; governance is achieved through global and local policies (sovereignty), rule enforcement, controls, and audits. The standards and schema rules for exchange and interoperability are defined and tracked here.
Marketplace Layer: This layer makes data product exchange possible. Users can publish and consume within the compliance and security requirements and quality gates. It presents catalogs of products, self-service solutions, and deployable templates. Any party in the system can publish these as long as they pass the global requirements. Consumers can “shop” for data products by the properties in the listing, via the API or dashboards.
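To make the interplay between the Governance and Marketplace layers more concrete, here is a minimal Python sketch of a publish step guarded by global and local policies. Everything in it is an assumption made for illustration: the names `DataProduct`, `Policy`, `Marketplace`, `has_owner`, and `min_quality`, the 0.8 quality threshold, and the `pii-reviewed` tag are hypothetical and do not come from any Data Mesh specification or product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class DataProduct:
    """Hypothetical marketplace listing; the fields are illustrative only."""
    name: str
    domain: str
    owner: str
    quality_score: float  # 0.0 to 1.0, e.g. derived from health signals
    tags: List[str] = field(default_factory=list)


# A policy returns a violation message, or None when the product passes.
Policy = Callable[[DataProduct], Optional[str]]


def has_owner(product: DataProduct) -> Optional[str]:
    return None if product.owner else "an accountable owner is required"


def min_quality(threshold: float) -> Policy:
    def check(product: DataProduct) -> Optional[str]:
        if product.quality_score < threshold:
            return f"quality score {product.quality_score} is below {threshold}"
        return None
    return check


class Marketplace:
    """Lists a product only if it passes global and domain-local policies."""

    def __init__(self, global_policies: List[Policy]) -> None:
        self.global_policies = list(global_policies)
        self.local_policies: Dict[str, List[Policy]] = {}
        self.listings: Dict[str, DataProduct] = {}

    def add_local_policy(self, domain: str, policy: Policy) -> None:
        self.local_policies.setdefault(domain, []).append(policy)

    def publish(self, product: DataProduct) -> List[str]:
        """Return the list of violations; an empty list means it was listed."""
        policies = self.global_policies + self.local_policies.get(product.domain, [])
        violations = []
        for policy in policies:
            message = policy(product)
            if message is not None:
                violations.append(message)
        if not violations:
            self.listings[product.name] = product
        return violations

    def shop(self, tag: str) -> List[DataProduct]:
        """Let consumers find listed products by a property of the listing."""
        return [p for p in self.listings.values() if tag in p.tags]


# Example wiring: two global policies plus one policy local to "sales".
market = Marketplace(global_policies=[has_owner, min_quality(0.8)])
market.add_local_policy(
    "sales",
    lambda p: None if "pii-reviewed" in p.tags else "sales products need a PII review tag",
)
```

In this sketch, global policies apply to every publisher, while a domain can add stricter local rules without changing the marketplace itself, which is one way to read "global and local policies (sovereignty)". A real mesh would likely back the listings with a catalog service rather than an in-memory dictionary.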

With these layers, data products are:

self-describing and addressable via a data catalog API with CRUD operations on the data product description, which stores the mapping from a logical name to a physical address along with all other metadata
interoperable via standardized schemas and join keys, controlled and enforced by policies
discoverable via the marketplace, the catalog, and the API
trustworthy via global rules and policies, and mechanisms that ensure the data is high-quality, secure, and privacy-compliant
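The properties above can be sketched as a small in-memory catalog. This is a hedged illustration, not a definitive catalog API: the class and method names (`CatalogEntry`, `DataCatalog`, `resolve`, `discover`), the idea of a `standard_join_keys` mapping, and the example bucket path are all assumptions introduced here.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass
class CatalogEntry:
    """Hypothetical data product description stored in the catalog."""
    logical_name: str
    physical_address: str
    schema: Dict[str, str]   # column name -> type
    join_keys: List[str]
    description: str = ""


class DataCatalog:
    """CRUD over product descriptions, with a simple interoperability check."""

    def __init__(self, standard_join_keys: Dict[str, str]) -> None:
        # Assumed mesh-wide standard: join key name -> required type.
        self._standard = dict(standard_join_keys)
        self._entries: Dict[str, CatalogEntry] = {}

    def _check_interoperable(self, entry: CatalogEntry) -> None:
        for key in entry.join_keys:
            if key not in entry.schema:
                raise ValueError(f"join key {key!r} is missing from the schema")
            expected = self._standard.get(key)
            if expected is not None and entry.schema[key] != expected:
                raise ValueError(f"join key {key!r} must have type {expected!r}")

    def create(self, entry: CatalogEntry) -> None:
        if entry.logical_name in self._entries:
            raise ValueError(f"{entry.logical_name!r} already exists")
        self._check_interoperable(entry)
        self._entries[entry.logical_name] = entry

    def read(self, logical_name: str) -> Optional[CatalogEntry]:
        return self._entries.get(logical_name)

    def update(self, logical_name: str, **changes) -> None:
        updated = replace(self._entries[logical_name], **changes)
        self._check_interoperable(updated)
        self._entries[logical_name] = updated

    def delete(self, logical_name: str) -> None:
        self._entries.pop(logical_name, None)

    def resolve(self, logical_name: str) -> str:
        """Addressable: map a logical name to its physical address."""
        return self._entries[logical_name].physical_address

    def discover(self, keyword: str) -> List[str]:
        """Discoverable: match a keyword against names and descriptions."""
        keyword = keyword.lower()
        return [
            e.logical_name
            for e in self._entries.values()
            if keyword in e.logical_name.lower() or keyword in e.description.lower()
        ]


# Example usage with a made-up address.
catalog = DataCatalog(standard_join_keys={"customer_id": "string"})
catalog.create(CatalogEntry(
    logical_name="sales.orders",
    physical_address="s3://example-bucket/sales/orders/",
    schema={"customer_id": "string", "amount": "double"},
    join_keys=["customer_id"],
    description="Daily orders per customer",
))
```

Because consumers call `resolve` with the logical name, the owning domain can move the underlying data and only update the catalog entry. The join-key check is a deliberately simple stand-in for the policy-enforced standardization described above.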

Recap

This article discussed Data Mesh high-level concepts and principles to optimize value extraction by focusing on the delivered products. In addition, we shared a meta-architecture to reflect these as fundamental layers.

We hope this article will help with your Data Mesh journey and would like to hear about your data experience, challenges, and solutions that worked well.

Further reading on Data Mesh:

Comprehensive list of resources
Reference architectures for GCP and AWS

Other Data Mesh articles in the series:

Data Mesh Catalog with React, Relay, and GraphQL

Our previous articles provided a high-level architecture for Data Mesh and an approach to model the catalog with ORM…

Visualizing Data Mesh with React ForceGraph2D

Our previous articles looked at meta-architecture for Data Mesh, an approach to model the catalog with ORM, and the…

Hydrating Data Mesh from AWS Lake Formation, Glue, and DataBrew

In our previous articles, we looked at defining and building a data mesh experience. Whether we already have existing…

Announcing MeshLens: A SaaS Solution for Data Mesh

We are happy to announce that MeshLens™ is available on AWS Marketplace as a Data Mesh enablement SaaS product.

Written by Hülya Pamukçu Crowell