A Meta-architecture for Data Mesh

Hülya Pamukçu Crowell
Jan 1, 2022


One of the essential requirements for successful data-driven decisions is speed. While there are many ways to improve the delivery of data products, we are pleased to see a structured approach, Data Mesh. This article will discuss the paradigm shift for next-gen data ecosystems, explore its core principles, and present a meta-architecture as a reference model for developing self-service data products.

Data Mesh presents an approach to align ownership, architecture, and organization with the axis of change. Four things differentiate it from other enterprise data management and architecture models:

  • product thinking
  • decentralized data ownership
  • decentralized architecture
  • federated governance

Product thinking treats data products as the atomic unit of change, which requires less coordination than traditional pipeline-stage thinking and “shared by default” datasets in a central system. Owners and domain experts can still “publish” data for use by other data products or domains, with guarantees and federated controls. Decentralizing ownership leaves the management of data and its lifecycle to domain experts, which increases autonomy and accountability and reduces data duplication. When data products are managed with specialized knowledge, they can be better tailored to the needs of their domain and consumers. Product thinking and decentralized ownership together imply a decentralized architecture, which shifts domain-specific decisions, as implementation details, to the domains themselves. In a distributed ecosystem of data products, federated governance allows data products to be created and exchanged securely and compliantly.

Translating these core principles into a meta-architecture, we see the following significant deviations from traditional models (traditional vs. Data Mesh):

  • prescriptive solutions and offerings vs. enablement
  • centralized, deploy-once, serve-all solutions vs. self-service, re-deployable solutions
  • basic metadata vs. deep insights, tracing, and automated extraction of metadata

In the meta-architecture below, we have the following coarse layers:

  • Enablement Layer: This layer enables product teams with self-service solutions. It covers any foundational pieces that speed up the delivery of data products at every stage: authoring, experimentation, deployment, serving, and evaluation. Core enablement teams provide standard templates and re-deployable solutions, but any product team can also contribute new ones, either created from scratch or built on existing ones through extensibility points.
  • Data Literacy & Governance Layer: This layer is responsible for anything that provides a deep understanding of the data published and consumed in the mesh. Quality, performance, cost/value ratio, stability, and health signals are collected and displayed in the catalog listings. Data literacy improves through lineage, tracking, and high-coverage metadata; governance is achieved through global and local policies (sovereignty), rule enforcement, controls, and audits. The standards and schema rules for exchange and interoperability are defined and tracked here.
  • Marketplace Layer: This layer makes data product exchange possible. Users can publish and consume within the compliance and security requirements and quality gates. It presents catalogs of products, self-service solutions, and deployable templates. Any party in the system can publish these as long as they pass the global requirements. Consumers can “shop” for data products by the properties in the listing, via the API or dashboards.
Data Mesh meta-architecture
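
To make the interplay between the governance and marketplace layers more concrete, here is a minimal Python sketch of a publishing gate. Every name in it (`Listing`, `Marketplace`, `require_owner`, `require_min_quality`, the `quality` signal) is hypothetical and chosen only for illustration; the article does not prescribe an implementation. It assumes that policies are plain functions, that global policies apply to every listing, and that a domain can add local policies on top.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Listing:
    """A catalog listing for a data product (hypothetical shape)."""
    name: str
    domain: str
    owner: str
    signals: Dict[str, float] = field(default_factory=dict)  # e.g. quality, cost


# A policy returns None when the listing passes, or a violation message.
Policy = Callable[[Listing], Optional[str]]


def require_owner(listing: Listing) -> Optional[str]:
    return None if listing.owner else "listing has no accountable owner"


def require_min_quality(threshold: float) -> Policy:
    def check(listing: Listing) -> Optional[str]:
        score = listing.signals.get("quality", 0.0)
        if score >= threshold:
            return None
        return f"quality {score:.2f} is below {threshold:.2f}"
    return check


class Marketplace:
    def __init__(self, global_policies: List[Policy]) -> None:
        self.global_policies = list(global_policies)
        self.local_policies: Dict[str, List[Policy]] = {}
        self.listings: Dict[str, Listing] = {}

    def add_local_policy(self, domain: str, policy: Policy) -> None:
        """Register a policy that applies only to one domain."""
        self.local_policies.setdefault(domain, []).append(policy)

    def publish(self, listing: Listing) -> List[str]:
        """List the product only if all global and local policies pass.

        Returns the violations; an empty list means the product was listed.
        """
        policies = self.global_policies + self.local_policies.get(listing.domain, [])
        violations = []
        for policy in policies:
            message = policy(listing)
            if message is not None:
                violations.append(message)
        if not violations:
            self.listings[listing.name] = listing
        return violations

    def shop(self, min_quality: float = 0.0) -> List[Listing]:
        """Let consumers filter listings by a published signal."""
        return [
            item for item in self.listings.values()
            if item.signals.get("quality", 0.0) >= min_quality
        ]


market = Marketplace([require_owner, require_min_quality(0.9)])
good = Listing("orders_daily", "sales", "sales-team", {"quality": 0.97})
bad = Listing("clicks_raw", "web", "", {"quality": 0.5})
print(market.publish(good))  # no violations, so the product is listed
print(market.publish(bad))   # two violations, so the product is rejected
```

Returning the violations instead of raising keeps the gate self-service in spirit: a product team sees exactly which requirement failed and can fix it without involving a central team.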

With these layers, data products are:

  • self-describing and addressable via a data catalog API with CRUD operations on the data product description, which stores the mapping from logical name to physical address along with all other metadata
  • interoperable via standardized schemas and join keys, controlled and enforced by policies
  • discoverable via the marketplace, catalog, and the API
  • trustworthy via global rules and policies, and mechanisms that ensure the data is high-quality, secure, and privacy-compliant

Recap

This article discussed the high-level concepts and principles of Data Mesh, which optimize value extraction by focusing on the delivered products. In addition, we shared a meta-architecture that reflects these principles as fundamental layers.

We hope this article helps with your Data Mesh journey, and we would like to hear about your data experience, challenges, and the solutions that worked well.

