08/04/2022 • Tom De Wolf

Applying Data Mesh principles to an IoT data architecture

In a previous blog post, we described an Internet of Things (IoT) case in which the IoT data became part of an enterprise-grade data platform. This allows you to combine the IoT data with other datasets you may have available in your enterprise. The challenge is to design your data platform so that it scales with the number of datasets, provides the necessary business agility, does not end up as an unmaintainable monolith, and does not make a single team the bottleneck for getting things done. This blog post shows how a data mesh approach helps: it explains what data mesh is, outlines its principles, and applies them to the IoT data case as an example.

The evolution towards data mesh

Let's start by explaining what 'data mesh' is and which principles it aims for. We'll begin by looking at how the operational space evolved over the last decade.

From monolithic architectures towards microservices

Schema illustrating monolithic software applications evolving towards applications based on microservices

In the past, software applications were often built as big monolithic systems, with the problems that typically come with them. A monolithic architecture tends to evolve into a 'big ball of mud', which makes it hard to maintain, hard to change, and unable to provide the business agility a company needs. At the same time, when scaling towards multiple teams within a company, such an architecture does not provide enough flexibility and makes it unclear which part of the software is owned by which team. As a solution, the operational space evolved towards a microservices architecture. Using techniques from domain-driven design, the system is decomposed along business domains into software services. The challenge is to find the right granularity to enable the desired business agility in a composable architecture. This decomposition also enables scaling to multiple teams. Each team is responsible for a business domain, which implies that each microservice is clearly owned by one team.

Schema illustrating that the analytics and data space is monolithic but is moving towards a microservices-like architecture

What we see today is a similar trend in the analytics and data space. If we put the analytics and data space next to the operational space, we again see a monolithic structure, in the form of data lakes and data warehouses owned by a separate team of data engineers. So even if there is a clear decomposition in the operational space, there is still a monolith in the analytics space, resulting in similar problems. Data pipelines tend to grow over time into an unmaintainable mess of chained pipelines with long execution times, high storage requirements, all-or-nothing upgrades with global downtime, etc. Responsibility for structuring the data and making it usable is assigned to a central team of data engineers, which becomes a bottleneck when the number of datasets grows and the frequency of changes increases. This again makes it hard to provide the business agility a company needs.

Data products for more structure and ownership

For the analytics and data space, then, we also need to find a suitable decomposition that aligns with the business domains for which business agility is desired. The unit of this decomposition is called a 'data product': it consumes data from operational services and other data products, and produces data with a clear API or data contract. These data products are owned by the respective business domains, together with the microservices for that domain. A cross-functional team of software engineers and data engineers is responsible for building, maintaining and evolving a domain. As a result, a network of interconnected data products appears, which is called a 'data mesh'. Note that there are still connections between services and between data products, and these can form complex networks. The big difference is that these connections follow clear APIs or contracts defined by components that clearly structure the IT landscape and its ownership.

The concept of data mesh was introduced by Zhamak Dehghani. You can refer to her recently published book for all the details. As a recap, there are four principles which a data mesh journey aims to achieve. These principles complement each other, and each addresses new challenges that may arise from the others:

Illustration of four principles which a data mesh journey aims to achieve

Domain-oriented ownership: decentralize the ownership of analytical data to business domains closest to the data — either the source of the data or its main consumers.
Data as a product: avoid isolation in domain silos by stimulating sharing data as a product. Apply techniques from product thinking and product ownership to design a new autonomously evolvable and deployable architectural unit with a data contract API that is optimized for usability by data users, data analysts and data scientists.
Self-serve data platform: reduce total cost of ownership and remove friction from the journey of data sharing, access, and consumption with a self-service platform that manages the full life cycle of individual data products (build, deploy and maintain), and provides mesh-level capabilities to discover available data products and increased observability through knowledge graphs, data lineage, and data quality/usage metrics across the mesh.
Federated computational governance: instead of central governance, increase domain engagement by enabling federated decision-making and accountability, with a team composed of domain representatives, data platform representatives, and subject-matter experts (e.g. legal, compliance, security). This model balances the autonomy and agility of domains with the global interoperability of the mesh. This interoperability enables higher-order value by making it easy to interconnect data products. The 'computational' aspect refers to automating the governance policies for every data product and enforcing them through reliable self-service platform capabilities.
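To make the 'computational' part of the fourth principle more concrete, here is a minimal sketch of governance policies expressed as code that a platform could run against every data product. The descriptor fields (owner, description, output_schema) and the policy names are assumptions made for illustration; they do not come from the book or from any specific platform.

```python
from typing import Callable, Dict, List

# Hypothetical mesh-wide policies, agreed on by the federated governance team.
# Each policy is a named check over a data product descriptor (a plain dict here).
POLICIES: Dict[str, Callable[[dict], bool]] = {
    "has_owner": lambda d: bool(d.get("owner")),
    "has_description": lambda d: bool(d.get("description")),
    "declares_output_schema": lambda d: bool(d.get("output_schema")),
}


def violations(descriptor: dict) -> List[str]:
    """Return the names of all policies the descriptor does not satisfy."""
    return [name for name, check in POLICIES.items() if not check(descriptor)]


if __name__ == "__main__":
    # Illustrative descriptor: it names an owner but lacks the other metadata.
    draft = {"name": "IoT Telemetry", "owner": "iot-team"}
    print(violations(draft))
```

In such a setup the domains stay autonomous in how they build their data product, while the platform could refuse to deploy a product for which the list of violations is not empty.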

Data product as new architectural quantum

According to the book 'Building Evolutionary Architectures', an architectural quantum is an independently deployable component with high functional cohesion, which includes all the structural elements required for it to function properly. As such, the 'data product' in our data mesh is a new architectural quantum. It can be visualized as follows:

Visualization of the 'data product' in a data mesh as a new architectural quantum

A data product encapsulates these structural elements required for providing the data as a product:

1 or more input ports which take in data from source systems or other data products
1 or more output ports which serve the data in (multiple) format(s) and through (multiple) protocol(s) following a data contract API. Note that 'API' is not limited to a typical REST API. It refers to an agreed-upon technology, format and protocol to exchange data, which can be a REST API but could also be an SQL database connection, an S3 storage, etc. However, it should never be the internal model of an operational system, but an explicitly designed external model/table/schema that serves as an API.
the data storage needed internally or to serve the data in an output port
the actual code that applies the transformation logic from input ports to output ports
provided governance policies that are enforced within the data product
metadata that makes the data product discoverable and self-documenting (discovery port)
monitoring (i.e. metrics) and management of the data product (control port)

Thinking in terms of this data product and its different aspects allows us to reason about and design a suitable decomposition of a data platform.
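The structural elements listed above can be sketched as a small data model. This is only an illustration under assumptions: the class names, field names and protocol strings are hypothetical, and a real data product would be a deployable unit with its own storage and infrastructure rather than an in-memory object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Port:
    """An input or output port: a name, an agreed protocol and a schema."""
    name: str
    protocol: str  # e.g. "pubsub", "bigquery-sql", "rest" (illustrative values)
    schema: Dict[str, str] = field(default_factory=dict)


@dataclass
class DataProduct:
    name: str
    owner: str                              # owning domain team
    input_ports: List[Port]
    output_ports: List[Port]
    transform: Callable[[dict], List[dict]]  # transformation logic
    description: str = ""
    policies: List[str] = field(default_factory=list)  # enforced governance policies
    records_in: int = field(default=0, init=False)
    records_out: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        # The text above requires one or more input and output ports.
        if not self.input_ports or not self.output_ports:
            raise ValueError("a data product needs at least one input and one output port")

    def process(self, record: dict) -> List[dict]:
        """Apply the transformation and keep simple counters for monitoring."""
        rows = self.transform(record)
        self.records_in += 1
        self.records_out += len(rows)
        return rows

    def discovery(self) -> dict:
        """Discovery port: metadata that makes the product self-documenting."""
        return {
            "name": self.name,
            "owner": self.owner,
            "description": self.description,
            "inputs": [p.name for p in self.input_ports],
            "outputs": [{"name": p.name, "protocol": p.protocol, "schema": p.schema}
                        for p in self.output_ports],
        }

    def control(self) -> dict:
        """Control port: basic metrics about the data product."""
        return {"records_in": self.records_in, "records_out": self.records_out}
```

Modelling the discovery and control ports as methods next to the transformation keeps everything the product needs in one unit, which is the point of treating it as an architectural quantum.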

IoT data as a simple use case

Let's take an example use case to illustrate how a mesh of data products can already help as a useful design paradigm. The use case combines Internet of Things (IoT) data with other enterprise data to provide valuable insights into the well-being and health of employees in the workplace and children in schools. For a full description of the use case, we refer to our previous blog post entitled 'Using IoT and digital canaries to improve health'.

In short, there are 3 operational systems involved:

an IoT platform reading telemetry data from the IoT devices using Google Cloud IoT Core and Google Cloud Pub/Sub
a Google sheet in which metadata about the IoT devices is captured (location, building, floor, outside CO2 level, ...)
a Google sheet in which a logbook is kept of the actions that are taken to improve the health of the working environment, and thus improve the values sensed by the IoT devices

All these systems belong to the IoT domain and are owned by one team: the IoT team.

The 3 operational systems involved are owned by the IoT team

For analytics and reporting, Google Data Studio is used, which is owned by the data analytics team. An example of a resulting dashboard is shown below. In what follows, we show how a mesh of data products emerged from designing the data platform. The resulting data mesh gets the data from the operational systems into the desired dashboard.

An example of a dashboard in Google Data Studio

Evolution of the IoT data mesh

In IoT, the core data is of course the telemetry data coming from the IoT devices themselves. The team owning this IoT system is now also responsible for sharing this time series data as a data product on the data mesh platform. This introduces our first data product, 'IoT Telemetry', which takes the IoT events, each containing multiple metrics, from Google Pub/Sub and transforms them using Google Dataflow into an SQL-queryable table in Google BigQuery with one row for each metric. The deviceId is an important identifier here. When using a centralized technology service like BigQuery for decentralized data mesh ownership, it is important to clearly define boundaries within BigQuery. In this case, each data product becomes a separate dataset within BigQuery, so that each team can be given access rights to change and populate only its own data product. In data mesh, this kind of data product is called a source-aligned data product, because it is closely linked to the operational source system and exposes its data to the mesh. To show this data in a graph, the team responsible for the dashboard in Google Data Studio can read directly from the output port of this data product.
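The core of the 'IoT Telemetry' transformation, turning one event with multiple metrics into one row per metric, can be sketched in plain Python. This is not the actual Dataflow pipeline: the event layout (a "metrics" mapping next to "deviceId" and "timestamp") and the output column names are assumptions made for illustration, and the function name is hypothetical.

```python
from typing import Any, Dict, Iterator


def event_to_rows(event: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one row per metric contained in a single IoT event.

    Assumed event layout (hypothetical):
        {"deviceId": "...", "timestamp": "...", "metrics": {"co2": 612, ...}}
    """
    for metric, value in event["metrics"].items():
        yield {
            "deviceId": event["deviceId"],    # identifier used to join with other data
            "timestamp": event["timestamp"],
            "metric": metric,
            "value": value,
        }


if __name__ == "__main__":
    # Illustrative event, not real telemetry.
    sample = {
        "deviceId": "device-001",
        "timestamp": "2022-04-08T09:00:00Z",
        "metrics": {"co2": 612, "temperature": 21.4},
    }
    for row in event_to_rows(sample):
        print(row)
```

A function of this shape could be applied to each incoming message in a streaming pipeline, with the resulting rows written to the table that forms the output port of the data product.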