Visualizing Data Mesh with React ForceGraph2D

Hülya Pamukçu Crowell
4 min read · Jan 25, 2023


Our previous articles looked at a meta-architecture for Data Mesh, an approach to modeling the catalog with an ORM, and the catalog experience with React, Relay, and GraphQL. So far, we have a way to get basic analytics and search capabilities for the mesh entities. However, to unlock the full potential of the entity relations and concepts we built, we need a visualization mechanism for deeper insights. This article outlines how we use React ForceGraph2D to visualize the mesh and help users make informed decisions about data products, datasets, and domains.

In this first iteration, we are focusing on a few basic features where users will be able to complete the following journeys:

  • As part of an audit, find all input and output datasets of a data product
  • Given an unhealthy data product, find all outgoing paths to determine all impacted datasets and other data products

As preliminary steps to build these journeys, we will build graph visualization, filtering, search, and zoom capabilities. Let’s start by forming the GraphQL query that pulls in the nodes and relations we are interested in. With our previous Relay setup, the retrieved data will contain collections of data products, each with a few properties and with edges to its domain, tags, input datasets, and output datasets.

One important choice we made when forming the query is to have data products as top-level nodes, with all relations defined through them. This later simplifies generating the graph data.

We can now use this data to create a graph data structure as defined by the contract of the component. For example, the following code shows the recursive parsing of an “object” as a graph node.

Note that we allow customization of the id, type, and name fields, as well as directional relations, through ParseConfig. This makes the parser generic enough to be used with any GraphQL result data. We will use the direction information later to find all outgoing paths from a node.
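To make the parsing step concrete, here is a minimal sketch of what such a recursive parser could look like. The original ParseConfig is not reproduced in this text, so the field names below (idField, typeField, nameField, relations), the Direction values, and the assumption that relations arrive as plain nested objects or arrays rather than Relay connection wrappers are all hypothetical.

```typescript
// Hypothetical shapes: every field name here is an assumption for illustration.
type Direction = "in" | "out" | "none";

interface ParseConfig {
  idField: string;
  typeField: string;
  nameField: string;
  // Relation field name -> direction of the link relative to the parent object.
  relations: Record<string, Direction>;
}

interface GraphNode { id: string; type: string; name: string; }
interface GraphLink { source: string; target: string; directional: boolean; }
interface GraphData { nodes: GraphNode[]; links: GraphLink[]; }
type Obj = Record<string, unknown>;

// Recursively turns one object and its configured relations into nodes and links.
function parseObject(obj: Obj, config: ParseConfig, graph: GraphData, seen: Set<string>): string {
  const id = String(obj[config.idField]);
  if (seen.has(id)) return id; // node already emitted; the caller still adds its link
  seen.add(id);
  graph.nodes.push({
    id,
    type: String(obj[config.typeField]),
    name: String(obj[config.nameField]),
  });
  for (const [field, direction] of Object.entries(config.relations)) {
    const value = obj[field];
    if (value == null) continue;
    const children = (Array.isArray(value) ? value : [value]) as Obj[];
    for (const child of children) {
      const childId = parseObject(child, config, graph, seen);
      // "in" means data flows from the child into this object, so the link is reversed.
      graph.links.push(
        direction === "in"
          ? { source: childId, target: id, directional: true }
          : { source: id, target: childId, directional: direction === "out" }
      );
    }
  }
  return id;
}

// Parses the collection of top-level objects (data products in our case).
function parseCollection(objects: Obj[], config: ParseConfig): GraphData {
  const graph: GraphData = { nodes: [], links: [] };
  const seen = new Set<string>();
  for (const obj of objects) parseObject(obj, config, graph, seen);
  return graph;
}
```

The resulting object has the nodes and links arrays that the graph component expects as graphData. Recording the direction on each link at parse time is what keeps the later traversal simple.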

Once we translate the collection of top-level objects and their relationships into graph data, we can load ForceGraph2D with it. For example, in the following image of a sample mesh, we also filtered out Tag nodes using the nodeVisibility property.
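As a rough illustration of the filtering, nodeVisibility can be given a predicate that returns false for the node types to hide. The MeshNode shape and the hideTypes helper below are hypothetical names chosen for this sketch.

```typescript
// Hypothetical node shape; "type" is assumed to carry the entity type name.
interface MeshNode { id: string; type: string; name: string; }

// Builds a predicate that can be passed as ForceGraph2D's nodeVisibility prop.
function hideTypes(hidden: string[]): (node: MeshNode) => boolean {
  const hiddenSet = new Set(hidden);
  return (node) => !hiddenSet.has(node.type);
}

const nodeVisibility = hideTypes(["Tag"]);
// Intended use (JSX, not executed here):
// <ForceGraph2D graphData={graphData} nodeVisibility={nodeVisibility} />
```

Links attached to hidden nodes may need a matching linkVisibility predicate as well; that is worth verifying against the component's behavior in your version.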

Finding all input and output datasets

In this user journey, we search for a data product node in the graph and identify all of its input and output datasets.

Each graph node object carries its coordinates. Using these, together with the underlying force graph reference and its methods, we can center on the node and zoom in. When the user hovers over a node, all of its first-level neighbors in the graph are highlighted.
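The two behaviors above could be sketched as follows. GraphHandle is a hypothetical minimal interface modelled on the centerAt and zoom methods exposed through the ForceGraph2D ref; the default scale and duration are arbitrary, and the neighbor lookup assumes links that still reference nodes by id.

```typescript
interface PositionedNode { id: string; x?: number; y?: number; }
interface IdLink { source: string; target: string; }

// Assumed subset of the graph ref: only the two methods used below.
interface GraphHandle {
  centerAt(x?: number, y?: number, durationMs?: number): unknown;
  zoom(scale: number, durationMs?: number): unknown;
}

// Centers the view on a node and zooms in; returns false if the node has no position yet.
function focusNode(graph: GraphHandle, node: PositionedNode, scale = 4, durationMs = 800): boolean {
  if (node.x === undefined || node.y === undefined) return false;
  graph.centerAt(node.x, node.y, durationMs);
  graph.zoom(scale, durationMs);
  return true;
}

// Collects the ids of all nodes one link away from the given node, in either direction.
function firstLevelNeighbors(nodeId: string, links: IdLink[]): Set<string> {
  const neighbors = new Set<string>();
  for (const { source, target } of links) {
    if (source === nodeId) neighbors.add(target);
    if (target === nodeId) neighbors.add(source);
  }
  return neighbors;
}
```

In a component, focusNode would typically be called with the ref's current value after a search hit, and the neighbor set would be stored in state from a hover handler so the node painting logic can highlight its members.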

Finding the outgoing path from a data product

Let’s say we have an unhealthy node and would like to find all the affected data products and datasets.

When the user clicks on any Dataset or DataProduct node, we find the outgoing path via a recursive traversal of all directional links. Note that whether an edge is directional, and in which direction it points, is configured during the parsing phase above.
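A minimal sketch of such a traversal is shown below, assuming each link carries a directional flag and that source and target are node ids pointing in the direction of data flow. The DirectedLink shape and the outgoingPath name are hypothetical.

```typescript
interface DirectedLink { source: string; target: string; directional: boolean; }

interface OutgoingPath { nodes: Set<string>; links: Set<DirectedLink>; }

// Recursively follows directional links downstream from the start node.
function outgoingPath(startId: string, links: DirectedLink[]): OutgoingPath {
  // Index directional links by their source so each step is a cheap lookup.
  const bySource = new Map<string, DirectedLink[]>();
  for (const link of links) {
    if (!link.directional) continue; // e.g. domain or tag relations are skipped
    const list = bySource.get(link.source) ?? [];
    list.push(link);
    bySource.set(link.source, list);
  }

  const nodes = new Set<string>([startId]);
  const pathLinks = new Set<DirectedLink>();
  const visit = (id: string): void => {
    for (const link of bySource.get(id) ?? []) {
      pathLinks.add(link);
      // The visited check keeps cycles from recursing forever.
      if (!nodes.has(link.target)) {
        nodes.add(link.target);
        visit(link.target);
      }
    }
  };
  visit(startId);
  return { nodes, links: pathLinks };
}
```

The returned sets can drive highlighting: every node in nodes is potentially impacted by the unhealthy starting node, and every link in links lies on a downstream path.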

Also, note that the .gif below shows the direction of data flow using ForceGraph2D’s linkDirectionalParticles property.
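One way to limit the particles to the highlighted path is to pass an accessor function to linkDirectionalParticles that returns a particle count per link. The helper names and the "source->target" key format below are assumptions for this sketch; the endpoint handling reflects that link endpoints are typically replaced by node objects once the graph has loaded the data.

```typescript
// A link endpoint may be a plain id or a node object, depending on load state.
type EndRef = string | { id: string };
interface FlowLink { source: EndRef; target: EndRef; }

const endId = (end: EndRef): string => (typeof end === "string" ? end : end.id);

// Stable key for a link regardless of whether endpoints are ids or node objects.
const linkKey = (link: FlowLink): string => endId(link.source) + "->" + endId(link.target);

// Returns an accessor usable as the linkDirectionalParticles prop:
// links on the path get `count` particles, all others get none.
function particlesFor(pathKeys: Set<string>, count = 4): (link: FlowLink) => number {
  return (link) => (pathKeys.has(linkKey(link)) ? count : 0);
}
```

Keying by endpoint ids rather than by object identity keeps the accessor working even if the path was computed on a copy of the link data.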

Recap

In this article, we built a basic graph visualization for Data Mesh as we continue to define the user experience. From here, we can add features like subgraph viewing and searching, creating exports from the selected nodes, and 3D visualizations (preview is below).

Previous articles in the series
