Data Relationships: How Do I Identify Them?

Understanding the relationships between data is essential to improve the quality of your software and reduce the costs of managing test data.

To obtain the right data, we must undertake a segmentation process through which we extract coherent subsets of data from a source environment and deliver them to a target environment.

How can the discovery of data relationships support a test data management (TDM) strategy?

In this article, you will find the answer to this question. We will also discuss the common procedure used for designing segmentation based on these relationships and the test requirements, and we will explore the capabilities that a segmentation tool should have.

Let's begin!

Why is it so necessary to identify data relationships?

Data segmentation relies on understanding the relationships between data domains to discern which records from a database belong to the structure we wish to move.

Thus, through relationships, we identify the path that allows us to move from one data point or record to another.

Therefore, it is essential to investigate and discover these relationships; without them, the segmentation process will be neither complete nor accurate.

Discovering Data Relationships

When we consider this technique in the design of a TDM strategy, the first question we encounter is: how do we find out the relationships between the data?

It's a good question.

If the goal is to extract a coherent subset of data, the first challenge is to know how to navigate the database in an orderly fashion.

Where do we get this information? Who can tell us?

Next, we will present three sources to which we can turn for this information:

  1. The application's database itself, when it implements relationships explicitly through foreign keys.
  2. Documentation that describes the data model. Sometimes, it is up to date.
  3. The application's development and maintenance team, which knows a great deal about the data model relationships. Always, or almost always.
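
When the database does declare foreign keys, the first source can be read directly from the database catalog. The following is a minimal sketch, assuming a SQLite database; the `customer` and `orders` tables and the `declared_relationships` helper are hypothetical names used only for illustration, and other engines expose the same information through their own catalog views.

```python
import sqlite3


def declared_relationships(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples
    for every foreign key declared in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    found = []
    for table in tables:
        # PRAGMA foreign_key_list returns: id, seq, table, from, to, ...
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            found.append((table, fk[3], fk[2], fk[4]))
    return found


# Hypothetical schema, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
""")
relationships = declared_relationships(conn)
print(relationships)
```

A list like this only covers what the database declares, so it is a starting point to be completed from the other two sources.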

However, what if the relationships are not explicit in the applications, the documentation is not up-to-date, or the development team is new?

Typically, at this point, someone points out with alarm that knowing all the data model relationships is a daunting task.

The argument goes that even when relationships are defined in the database, they are not all there, and only the software, the code, "knows" all of them.

So, do we have to go through all the application code?

Fortunately, no.

We have moved hundreds of thousands, even millions, of customers, accounts, policies, services, orders, and claims in segmentation processes, and we have not yet had to review the source code of any of the dozens of applications we have handled.

Subsetting Design

The first key is to realize that our problem is not to document all the application's data model relationships, only the ones we need.

This is a challenge, but of a smaller magnitude.

From here, we must take advantage of what we have and follow this process:

  1. Collect the available model documentation and find out if there is active referential integrity in the database. If so, we're ahead of the game, but we must not be complacent.
  2. Identify the strong entities that testers demand. They usually say, "I need customers with such and such characteristics," or invoices, or orders, or any other relevant functional concept. It's rare that they ask for "audit records from table HJK." That entity gives us the entry point into the model.
  3. If we have documentation and/or relationships in the database, we identify the entity in the model and the relationships.
  4. If not, we ask. Programmers and testers generally know those tables, and the main related ones too.
  5. In any case, we now have a starting point to work on designing the segmentation. From here, we identify the data domains and, incrementally, we can advance in discovering the data model.
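
The process above can be sketched as a walk that starts at the entry entity and follows only the relationships we have identified. This is an illustrative sketch, not a description of any particular tool: the tables, the `RELATIONSHIPS` list, and the `extract_subset` function are hypothetical, and it assumes every table has an `id` key and that each relationship points to the parent's `id`.

```python
import sqlite3
from collections import deque

# Hypothetical relationships gathered from documentation, foreign keys, or the team:
# (child_table, child_column, parent_table, parent_column)
RELATIONSHIPS = [
    ("orders", "customer_id", "customer", "id"),
    ("order_line", "order_id", "orders", "id"),
]


def extract_subset(conn, entry_table, entry_ids):
    """Collect the ids of the entry rows and of every dependent row reachable
    through RELATIONSHIPS, visiting each table once (breadth-first)."""
    subset = {entry_table: sorted(entry_ids)}
    queue = deque([entry_table])
    while queue:
        parent = queue.popleft()
        parent_ids = subset[parent]
        for child, child_col, rel_parent, _parent_col in RELATIONSHIPS:
            if rel_parent != parent or child in subset or not parent_ids:
                continue
            marks = ",".join("?" * len(parent_ids))
            rows = conn.execute(
                f"SELECT id FROM {child} WHERE {child_col} IN ({marks}) ORDER BY id",
                parent_ids)
            subset[child] = [row[0] for row in rows]
            queue.append(child)
    return subset


# Hypothetical source data, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE order_line (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO customer VALUES (1), (2);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
    INSERT INTO order_line VALUES (100, 10), (101, 11), (102, 12);
""")
subset = extract_subset(conn, "customer", [1])
print(subset)
```

Starting from customer 1, the walk keeps that customer's orders and their lines and leaves out everything that belongs to customer 2. Adding a newly discovered relationship to the list extends the subset without changing the traversal, which is what makes the incremental approach practical.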

What are the desirable capabilities in a segmentation tool?

Obviously, segmentation is not done by hand.

If you are still doing this task manually, that needs to change, and you will want to know which characteristics the tool that helps you should have.

If, on the other hand, you already use software for this task, this will also help you verify if the one you use meets these requirements.

The three capabilities we can expect from our segmentation tool are:

It should allow designing and modifying the segmentation process, entities, and relationships, in a simple and agile way

The prior knowledge we obtain of the model relationships will almost certainly be incomplete. One way to complete it is to enter into an iterative process of design-execution-verification-improvement, and this requires easy maintenance.

It should allow designing relationships in a metamodel

These relationships do not necessarily have to exist in the database or in the code of the applications that use it.

The reasons are several. For example, to segment, you can use alternative, not necessarily functional, paths to improve the efficiency of the data reading and delivery process.

Moreover, physically adding relationships to the database means the database must check them constantly, a cost that design and service teams cannot always take on.
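
One way to picture such a metamodel is as a catalog of relationships kept outside the database, where each entry records whether a foreign key backs it. The sketch below is an assumption made for illustration only; the `Relationship` and `Metamodel` classes and the table names are hypothetical and do not describe how any specific tool stores its metamodel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    child_table: str
    child_column: str
    parent_table: str
    parent_column: str
    physical: bool  # True if a foreign key in the database backs this relationship


class Metamodel:
    """Catalog of relationships kept outside the database."""

    def __init__(self):
        self._relationships = []

    def add(self, relationship):
        # Ignore exact duplicates so repeated discovery runs stay idempotent.
        if relationship not in self._relationships:
            self._relationships.append(relationship)

    def children_of(self, parent_table):
        return [r for r in self._relationships if r.parent_table == parent_table]


model = Metamodel()
# Declared in the database as a foreign key.
model.add(Relationship("orders", "customer_id", "customer", "id", physical=True))
# Known only to the team: no constraint exists, so it lives in the metamodel alone.
model.add(Relationship("claim", "customer_ref", "customer", "id", physical=False))

logical_only = [r.child_table for r in model.children_of("customer") if not r.physical]
print(logical_only)
```

Because the logical relationship exists only in the catalog, the segmentation can follow it without the database having to enforce a new constraint.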

It should offer tools for automated discovery of relationships

It's an extraordinary complement to the above sources of information and often the main one. In the case of icaria Technology, this is managed through data prospecting.
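
To give an idea of what automated discovery can look like, the sketch below applies a generic data-profiling heuristic: a column is a candidate relationship when every non-null value it holds also appears in a known key column. This is not a description of icaria's prospecting; the tables, columns, and the `candidate_relationships` function are hypothetical, and the heuristic can produce false positives, so its output is an estimate to be reviewed.

```python
import sqlite3


def candidate_relationships(conn, key_columns, candidate_columns):
    """Return (child_table, child_column, parent_table, parent_column) tuples
    where every non-null child value exists in the parent key column."""
    found = []
    for p_table, p_col in key_columns:
        for c_table, c_col in candidate_columns:
            if c_table == p_table:
                continue
            total = conn.execute(
                f"SELECT COUNT(*) FROM {c_table} WHERE {c_col} IS NOT NULL"
            ).fetchone()[0]
            orphans = conn.execute(
                f"SELECT COUNT(*) FROM {c_table} c "
                f"WHERE c.{c_col} IS NOT NULL AND NOT EXISTS "
                f"(SELECT 1 FROM {p_table} p WHERE p.{p_col} = c.{c_col})"
            ).fetchone()[0]
            # Keep the pair only if there is data and no value lacks a parent.
            if total > 0 and orphans == 0:
                found.append((c_table, c_col, p_table, p_col))
    return found


# Hypothetical data with no foreign key declared between the tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE policy (id INTEGER PRIMARY KEY, holder_code INTEGER, premium INTEGER);
    INSERT INTO customer VALUES (1), (2), (3);
    INSERT INTO policy VALUES (1, 1, 500), (2, 3, 750);
""")
candidates = candidate_relationships(
    conn,
    key_columns=[("customer", "id")],
    candidate_columns=[("policy", "holder_code"), ("policy", "premium")],
)
print(candidates)
```

Here `policy.holder_code` is reported as a candidate because all its values exist in `customer.id`, while `policy.premium` is discarded. A column whose values merely happen to fall inside the key's range would also pass, which is why the results still need validation against the other sources.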

How does icaria TDM approach the discovery, configuration, and treatment of relationships between data?

icaria TDM meets the previously mentioned characteristics, but how does it do it?

The icaria tool allows you to configure and modify the extraction tree easily through a graphical interface designed for that purpose.

It also allows relationships to be configured and documented in its own metamodel, so that it serves as a guide for the various phases of:

  • Design
  • Implementation
  • Maintenance of the segmentation

Thanks to this, we can reduce the application time from hours to minutes, because a relationship configured in icaria moves smoothly to the execution environment without requiring restarts, code generation, or the similar steps proposed by other applications.

Moreover, icaria has a self-discovery tool for relationships, called prospecting, that provides an estimate of the existing relationships, both physical and logical.


The key to understanding the relationships between the data is knowing where to find them.

In this sense, there are three main sources: the application's database itself, the documentation where these relationships are recorded, and the development team.

To design a segmentation, it is not necessary to review all the application code, only to identify the relationships the tests require. In addition, there are segmentation tools that identify these relationships automatically, which represents significant savings in time and cost.

Do you want to reduce the waiting times and costs of your tests? Contact icaria Technology now and start improving the quality of your test data management software.