24/02/2025 • Marnick Vanloffelt

Data mesh governance: a blueprint for decentralized data management

Data mesh is revolutionizing the way organizations manage data. Unlike traditional centralized models, data mesh uses a decentralized, domain-oriented structure. But how does governance work in such a distributed system?

At ACA Group, we believe data mesh is an answer to the challenge of managing data by focusing on building a decentralized, self-serve data ecosystem. The goal is to embed data-driven innovation within each department or team, making everyone in the organization responsible for creating reusable data that fuels new products and services across departments.

A data mesh changes more than the management of ownership and infrastructure. The key to success is transforming data governance itself. Instead of making a centralized IT team responsible for data governance, data mesh distributes that responsibility across different teams.

This approach, known as "federated computational governance", ensures active participation from both data-producing and data-consuming teams in crafting and adopting governance policies.

Four pillars of data mesh and their governance challenges

To understand the importance of governance in a data mesh, we need to break down the core principles of a data mesh and how they relate to data governance challenges:

1. Decentralization

In a data mesh, data ownership and responsibility are distributed across different business domains or teams. Each domain becomes a self-contained unit, managing its own data products. This also means that each data product and domain is self-governing, but needs to be interoperable with other data products and domains.

2. Domain-oriented approach

Instead of a monolithic data warehouse, a data mesh is made up of interconnected data products. This implies that each data product might come with its own “local dialect”. The challenge here is how to speak the same language, without speaking the same language.

3. Data as a product

This approach treats data as a product, with each domain creating and maintaining data products that are discoverable, accessible, and reusable. Metadata management becomes an important topic, since metadata is used to discover, access, integrate with and use the data encapsulated within a data product.

4. Self-serve platform

The self-serve platform is the engine and control panel that empowers data producers and consumers alike. Developer portals, data catalogs, lineage tools, and collaboration spaces facilitate seamless navigation, while automated policy enforcement and regular audits ensure compliance and promote data product quality without manual intervention. Automation of governance is a core challenge associated with the self-serve platform.

Now that you have a better understanding of the central building blocks and challenges of data governance in a data mesh, let’s take a closer look at each of these challenges individually.

Federated Governance

A standout feature of data mesh is federated governance. But what does it actually mean?

“Federated” refers to the fact that while each domain (and data product within those domains) has its own autonomy, they come together to hash out a few things that are relevant and valuable for everyone. You might think of it as a parliamentary democracy, where representatives come together to make joint decisions, which then need to be broadly implemented. 

This cross-domain collaboration means that quite a few teams are going to be involved. 

Federated Governance Team

This is a group of domain representatives and experts who collaborate across business units and areas of expertise. They ensure data quality, compliance, and alignment with organizational goals. They oversee tasks such as:

  • Automated data quality assessments
  • Data access and privacy management
  • Ensuring data products and datasets can be shared and reused

This team defines the standardized data governance policies that keep data products and datasets shareable and reusable, while safeguarding overall quality. To continue our earlier comparison, the Governance team is like a “parliament” that discusses and passes “laws”.

Platform Team

This team is essential to automate and enforce the governance policies defined by the Governance Team on the self-serve platform. They ensure that policies can be adopted by Data Products on a low-effort basis, promoting interoperability and collaboration without introducing unnecessary overhead.

Domain Teams

Aligned with business units, domain teams handle operational data governance within their own domains. Responsibilities include:

  • Data mapping and documentation
  • Ensuring data quality
  • Implementing standards defined by the federated governance team 

Importantly, each domain team has the autonomy and resources to execute the standards defined by the federated governance team.

In summary

While local domain teams make decisions specific to their domain, federated data governance ensures global rules are applied to all data products and their interfaces. These rules must ensure a healthy and interoperable ecosystem.

How does federated data governance work?

Let’s start with an important note: Federated Governance requires a different way of thinking compared to more traditional governance approaches. 

Federated governance is focused on promoting autonomy and interoperability as much as possible, keeping interference by a centralized team to an absolute minimum. Do you want to successfully implement federated data governance in your organization? Then, make sure you establish the following key foundations:

  1. Culture of ownership
    Teams must feel accountable for their data. This requires a high level of maturity in data literacy, and a willingness to invest in training and continuous education on data management and governance best practices.

  2. Robust data infrastructure
    You need to be ready to invest in scalable and flexible data infrastructure that supports decentralized data management.

  3. Governance framework
    You will need a clear governance framework that defines roles, responsibilities, and processes. This framework should be flexible enough to adapt to the needs of different domains while maintaining overall coherence.

  4. Cross-functional collaboration
    Collaboration between IT, data professionals, and business units is essential.

Enterprise ontology: bridging domain-specific language gaps

Each domain can have its own specific lingo, creating challenges when terms differ in definition across teams. To bridge the gaps between domains, we need a solid basis for “translation” and a common understanding of terms. This is where the enterprise ontology comes in. 

What is an enterprise ontology?

You can see it as a large, hierarchically structured “dictionary” that links concepts used in different domains to each other based on a common denominator.

For example: a sales team and a finance team both use the term “customer”, but the definitions for this term used by each team are somewhat different.

  • The Sales team defines a "customer" as anyone who has received a quote.
  • The Finance team defines a "customer" as someone with a signed contract and invoicing details. Others are referred to as “prospects”.

Without a shared ontology, combining the data products from these teams would yield inconsistent results, highlighting the need for clarity. 
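
To make the inconsistency concrete, here is a minimal Python sketch. The records, the field names (`has_quote`, `has_signed_contract`) and the two helper functions are hypothetical and only illustrate the two definitions described above; the invoicing details required by Finance are left out for brevity.

```python
# Hypothetical records; the field names are assumptions for illustration only.
people = [
    {"email": "ann@example.com", "has_quote": True, "has_signed_contract": True},
    {"email": "bob@example.com", "has_quote": True, "has_signed_contract": False},
    {"email": "cy@example.com", "has_quote": False, "has_signed_contract": False},
]


def is_sales_customer(person: dict) -> bool:
    """Sales definition: anyone who has received a quote."""
    return person["has_quote"]


def is_finance_customer(person: dict) -> bool:
    """Finance definition (simplified): only people with a signed contract."""
    return person["has_signed_contract"]


sales_customers = [p["email"] for p in people if is_sales_customer(p)]
finance_customers = [p["email"] for p in people if is_finance_customer(p)]

# The same question ("how many customers do we have?") gets two answers.
print(len(sales_customers), len(finance_customers))
```

With this sample data the Sales definition yields two customers and the Finance definition only one, which is exactly the kind of mismatch a shared ontology is meant to surface.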

How an enterprise ontology works

By tagging domain-specific terms to a unified concept (e.g., "customer") in the ontology, teams can reconcile differences and enable cross-domain understanding.

To bridge the gaps between domain-specific terms:

  1. Tag terms to a common ontology: Terms from each domain are linked to a unified concept in the enterprise ontology using tags. For instance, "sales customer" and "finance customer" might both map to a universal "customer" term.
  2. Leverage unique identifiers: When consulting the ontology, you might discover that the unique identifier across all “customers” is their email address. Finding an identifier that is shared by all terms linked to the same concept is valuable, because it allows you to correlate data about that concept across domains.
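
The two steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the `ONTOLOGY_TAGS` and `CONCEPT_IDENTIFIER` tables, the `resolve_concept` and `correlate` functions, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical ontology tags: (domain, local term) -> enterprise concept.
ONTOLOGY_TAGS = {
    ("sales", "customer"): "customer",
    ("finance", "customer"): "customer",
    ("finance", "prospect"): "prospect",
}

# Assumed shared identifier per concept (the email address, as in the text).
CONCEPT_IDENTIFIER = {"customer": "email"}


def resolve_concept(domain: str, term: str):
    """Return the enterprise concept a domain term is tagged with, if any."""
    return ONTOLOGY_TAGS.get((domain, term))


def correlate(concept: str, datasets: dict) -> dict:
    """Group records from several domains by the concept's shared identifier.

    `datasets` maps (domain, term) to a list of records (dicts).
    Terms that are not tagged with `concept` are ignored.
    """
    key = CONCEPT_IDENTIFIER[concept]
    merged = defaultdict(dict)
    for (domain, term), records in datasets.items():
        if resolve_concept(domain, term) != concept:
            continue
        for record in records:
            merged[record[key]][domain] = record
    return dict(merged)


result = correlate(
    "customer",
    {
        ("sales", "customer"): [
            {"email": "ann@example.com", "quote_id": "Q-1"},
            {"email": "bob@example.com", "quote_id": "Q-2"},
        ],
        ("finance", "customer"): [
            {"email": "ann@example.com", "contract_id": "C-9"},
        ],
        ("finance", "prospect"): [{"email": "bob@example.com"}],
    },
)
print(sorted(result["ann@example.com"]))  # domains that know this customer
```

In this sketch, Ann is found in both the sales and finance data products, while Bob only appears as a customer on the sales side, because Finance has tagged him as a prospect.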

Metadata: Enabling prevention, validation, and auditing

Metadata, often described as "data about data," plays a crucial role in Federated Data Governance within a data mesh. It provides the necessary context to make data understandable, accessible, and usable across different domains. 

Key roles of metadata in federated data governance

  • Enhancing data discoverability
    Metadata enables users to easily find and understand data across the organization. It includes practical information such as the data source(s), creation date, format, and usage instructions, but also information specifically linked to discoverability, like which enterprise ontology tags are applicable, who the owner is, or associated data products. This makes it easier for teams to locate (and integrate with) relevant data products.
  • Improving data quality and trust
    Metadata includes (or should include) data quality metrics and lineage information, helping teams ensure data accuracy and reliability. It allows users to trace data back to its origin, understand transformations it has undergone, and assess its quality.
  • Facilitating compliance and security
    Metadata helps in maintaining compliance with data privacy and security regulations. The data product team can specify who or which roles can access the data and for what purpose, ensuring accountability and transparency. Furthermore, tagging sensitive data elements helps to automatically apply data privacy and masking policies, ensuring regulatory compliance.
  • Enabling interoperability
    Metadata ensures that data from different domains can be integrated and used together. Standardized metadata formats and definitions enable seamless data exchange and interoperability.
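
As a rough illustration of what such metadata could look like, the sketch below models a data product descriptor as a Python dataclass and uses ontology tags for discovery. The `DataProductMetadata` class, its field names, the `find_by_tag` helper, and the sample catalog are assumptions made for this example; they do not describe a specific catalog tool or standard.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataProductMetadata:
    """Hypothetical descriptor published alongside a data product."""
    name: str
    owner: str
    sources: list[str]
    created: date
    data_format: str
    ontology_tags: list[str] = field(default_factory=list)     # discoverability
    sensitive_fields: list[str] = field(default_factory=list)  # privacy and masking
    allowed_roles: list[str] = field(default_factory=list)     # access control


def find_by_tag(catalog: list[DataProductMetadata], tag: str) -> list[DataProductMetadata]:
    """Return every data product tagged with the given ontology concept."""
    return [product for product in catalog if tag in product.ontology_tags]


catalog = [
    DataProductMetadata(
        name="sales-quotes", owner="sales-team", sources=["crm"],
        created=date(2025, 1, 15), data_format="parquet",
        ontology_tags=["customer", "quote"], sensitive_fields=["email"],
        allowed_roles=["analyst"],
    ),
    DataProductMetadata(
        name="finance-contracts", owner="finance-team", sources=["erp"],
        created=date(2025, 2, 3), data_format="parquet",
        ontology_tags=["customer", "contract"], sensitive_fields=["email"],
        allowed_roles=["finance-admin"],
    ),
    DataProductMetadata(
        name="warehouse-stock", owner="logistics-team", sources=["wms"],
        created=date(2025, 2, 10), data_format="csv",
        ontology_tags=["product"],
    ),
]

print([product.name for product in find_by_tag(catalog, "customer")])
```

Because every descriptor uses the same fields, a consumer can search across domains by ontology tag without knowing how each team stores its data.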

Best practices for metadata management in data mesh

In a data mesh, metadata should be managed as close to the source as possible. Each data product team is responsible for carefully authoring and curating the metadata associated with its data product. Exceptions, like the automated addition of data quality metrics from the self-serve platform, can apply, but the data product itself remains the source of truth for its metadata, and that metadata should be managed as such. In short, metadata should be decentrally managed, but centrally consumable.

Metadata management should be automated as much as reasonably possible and integrated with data governance tools to ensure accuracy and consistency. Key practices include:

  • Careful metadata authoring and curation: Use tools that automatically capture and update metadata. Introduce processes and practices that motivate data product owners to take special care when they create and modify the metadata associated with their data product. The data product owner should ensure that the metadata presented to consumers gives a truthful representation of the content of the data product, so these consumers can make an informed decision about the value of the product for their use case.
  • Standardization: Implement standardized metadata formats and definitions across all domains (where appropriate) to ensure maximal interoperability and ease of use.
  • Automated validation: Define procedures and policies to automatically validate metadata, in order to spot mistakes and inconsistencies early on and prevent error propagation throughout the system. As always, prevention and validation come first, audits second.
  • Regular audits: Conduct regular automated audits to ensure metadata accuracy and compliance with governance policies.
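
Automated validation can start very small. The sketch below checks a metadata record against a hypothetical organization-wide standard; the `REQUIRED_FIELDS` and `ALLOWED_FORMATS` rules and the `validate_metadata` function are assumptions for illustration, and a real platform would likely rely on a schema language or a dedicated tool instead.

```python
# Hypothetical standard agreed on by the federated governance team.
REQUIRED_FIELDS = {"name": str, "owner": str, "format": str, "ontology_tags": list}
ALLOWED_FORMATS = {"parquet", "csv", "json"}


def validate_metadata(metadata: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field_name, expected_type in REQUIRED_FIELDS.items():
        if field_name not in metadata:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(metadata[field_name], expected_type):
            errors.append(
                f"wrong type for {field_name}: expected {expected_type.__name__}"
            )
    data_format = metadata.get("format")
    if isinstance(data_format, str) and data_format not in ALLOWED_FORMATS:
        errors.append(f"unsupported format: {data_format}")
    if metadata.get("ontology_tags") == []:
        errors.append("ontology_tags must not be empty")
    return errors


good = {
    "name": "sales-quotes",
    "owner": "sales-team",
    "format": "parquet",
    "ontology_tags": ["customer"],
}
bad = {"name": "legacy-export", "format": "xml", "ontology_tags": []}

print(validate_metadata(good))
print(validate_metadata(bad))
```

Running a check like this whenever metadata is published catches problems before they reach consumers, while the same function can be reused later for periodic audits of the whole catalog.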

The self-serve platform: automating governance

The self-serve platform embodies "Federated Computational Governance." It provides tools and infrastructure that allow both users and creators to independently access and manage data products without relying on a central IT team. 

Key features of a self-serve platform

 

  • Empowering domain teams: Self-serve platforms enable domain teams to take ownership of their data. They can create, manage, and use data products independently, fostering a sense of accountability.
  • Ensuring compliance: Self-serve platforms integrate governance controls, ensuring that data usage complies with organizational policies and regulations, balancing autonomy with oversight.
  • Metadata management: Through the use of the right tooling, the self-serve platform can facilitate the careful curation and automated validation of metadata. This eases both integration with the self-serve platform and management of metadata within the individual data products.
  • Policy management: Governance policies can be translated to automated processes, which can be enforced through the platform. Automated policy enforcement ensures that data usage complies with internal guidelines and external regulations.
  • Monitoring and auditing: Monitoring and auditing capabilities can be used to track data usage and ensure compliance. Regular audits help identify and address any governance issues. Alerting data product or domain teams to these issues and their consequences allows them to address them in their own way and in their own time.
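
The policy management and auditing features can be illustrated with a minimal "policy as code" sketch. Everything here is hypothetical: the metadata keys (`allowed_roles`, `unmasked_roles`, `sensitive_fields`), the `read_record` function, and the in-memory `AUDIT_LOG` stand in for whatever the platform actually provides.

```python
from datetime import datetime, timezone

AUDIT_LOG = []  # in-memory stand-in for the platform's audit trail
MASK = "***"


def read_record(record: dict, metadata: dict, role: str) -> dict:
    """Apply the access and masking policies declared in a product's metadata."""
    allowed = role in metadata["allowed_roles"]
    # Every access attempt is logged, whether or not it is allowed.
    AUDIT_LOG.append({
        "product": metadata["name"],
        "role": role,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    if not allowed:
        raise PermissionError(f"role {role!r} may not read {metadata['name']!r}")
    unmasked = role in metadata.get("unmasked_roles", [])
    return {
        key: value if unmasked or key not in metadata["sensitive_fields"] else MASK
        for key, value in record.items()
    }


metadata = {
    "name": "finance-contracts",
    "allowed_roles": ["analyst", "finance-admin"],
    "unmasked_roles": ["finance-admin"],
    "sensitive_fields": ["email"],
}
record = {"email": "ann@example.com", "contract_id": "C-9"}

masked_view = read_record(record, metadata, "analyst")
full_view = read_record(record, metadata, "finance-admin")
print(masked_view)
print(full_view)
```

The data product team only declares which fields are sensitive and which roles may read them; the enforcement and logging logic lives in the platform, so the policy is applied the same way to every data product.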

Conclusion: striking the balance between autonomy and oversight

Embracing a data mesh architecture requires a different approach to governance. The traditional centralized model of managing data no longer suffices in a world where agility, autonomy, and cross-functional collaboration are paramount. 

Federated data governance empowers domain teams to take ownership of their data products while ensuring alignment with global organizational standards. By distributing responsibilities across domain teams, supported by a self-serve platform and strong metadata management practices, organizations can enhance data quality, interoperability, and compliance without adding unnecessary complexity.

However, the success of data mesh governance depends on fostering a strong culture of data ownership, building a robust self-service platform, and establishing clear frameworks that promote seamless cross-domain collaboration. 

That’s a lot of buzzwords for one sentence, but it rings true nonetheless: 

  • Data ownership holds people accountable for the data they create and maintain, while allowing them to take full control of their data products.
  • Strong infrastructure and a self-service platform are needed to facilitate this practice of ownership, giving data product teams the autonomy they need to put their product out there, while also allowing for collaboration and sharing.
  • Clear governance frameworks are needed to establish what quality looks like and to guide data product teams in implementing best practices related to integration, collaboration, and more.

The key to thriving in a data mesh is a governance model that strikes the right balance between autonomy and oversight, allowing teams to produce while safeguarding the integrity and value of the organization's data ecosystem.

 
