Overview of Lineage Models

Data lineage describes what happens to data as it moves through different processes.

If the assets you govern are connected by relationships that describe data lineage, you can click on any asset and see its lineage or impact across the enterprise by selecting the corresponding option from the visualization menu. Lineage shows how data flows between data sources and applications and on to users, in support of business activities and functions and the enablement of enterprise capabilities. The information is presented in an interactive diagram called a LineageGram.

One example of such a diagram is shown below. It presents business (also known as logical) lineage: all systems that feed into one organization's CRM Platform in the context of a product registration activity and its associated information.

It is fairly common in an enterprise to have many different connections between individual assets, where the connections belong to different contexts and serve different purposes. For example:

  • A CRM system is fed with customer information not only as a result of product registration, but also as a result of other CRM processes and activities such as marketing or customer support.
  • Similarly, employee information may flow between HR-related applications in the context of HR processes, and it may also flow between applications used for customer support or customer acquisition in the context of CRM processes.

These are different enterprise flows. If they were all captured and no context of interest was specified, asking for the lineage of a system, data source or data element would display not only the dependencies in play for product registration, but also many other links and feeds, making it hard for users to understand lineage as it relates to a given business activity. Seeing this fuller picture can be important, especially for impact analysis, and EDG provides it. However, users analyzing data lineage will often want a focused and bounded exploration of lineage.

To support this, EDG lets you create Lineage Models as separate asset collections that capture contextualized lineage relationships. The role of a Lineage Model asset collection is to contain the context-specific relationships between data, applications and other assets. Each collection can store lineage for one or more enterprise flows. It often makes sense to keep lineage for related processes (e.g., HR) within the same collection.
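The difference between the full picture and a context-bounded view can be illustrated with a small sketch. This is not EDG code or the EDG data model: the `Feed` record, the `upstream` function, and all system names other than "CRM Platform" are hypothetical, chosen only to show how restricting traversal to one context narrows the result.

```python
from collections import defaultdict, deque
from typing import NamedTuple, Optional


class Feed(NamedTuple):
    """One lineage relationship: data flows from source to target."""
    source: str
    target: str
    context: str  # business activity this feed belongs to (assumed field)


# Hypothetical feeds into a CRM platform from several business contexts.
FEEDS = [
    Feed("Product Registration Portal", "CRM Platform", "product registration"),
    Feed("Warranty Database", "Product Registration Portal", "product registration"),
    Feed("Marketing Automation", "CRM Platform", "marketing"),
    Feed("Support Desk", "CRM Platform", "customer support"),
]


def upstream(asset: str, feeds: list[Feed], context: Optional[str] = None) -> set[str]:
    """Return every asset that feeds `asset`, directly or indirectly.

    When `context` is given, only feeds tagged with that context are followed.
    """
    incoming = defaultdict(list)
    for feed in feeds:
        if context is None or feed.context == context:
            incoming[feed.target].append(feed.source)

    seen = {asset}
    queue = deque([asset])
    while queue:  # breadth-first walk against the direction of flow
        current = queue.popleft()
        for source in incoming.get(current, []):
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen - {asset}


if __name__ == "__main__":
    print(sorted(upstream("CRM Platform", FEEDS)))
    print(sorted(upstream("CRM Platform", FEEDS, "product registration")))
```

Without a context the walk returns all four upstream systems, which is the broad view useful for impact analysis. With the "product registration" context it returns only the two systems involved in that activity.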

EDG also includes an asset type called Lineage Model. It shares its name with the asset collection that is intended to store lineage information, but it is an entity of its own that is created within the collection. Its use is optional, and its role is to provide a convenient “starting point” for users who want to explore lineage. A Lineage Model asset is linked to assets that participate in the lineage using one of the relationships designed for this purpose, for example, “uses software executable”, as shown in the screenshot below.

Only the last application or data source in the lineage chain needs to be linked to a Lineage Model asset, as shown above. When users click on a Lineage Model asset collection, they will see the Lineage Model assets presented in a table. They can select one and choose the option to show the Lineage diagram.

With this, we see a full flow for product registration information, from the beginning to the end. As shown below, it does not stop with the CRM Platform, but continues into a Data Warehouse and ultimately a Reporting and Analysis Toolset.
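One way to see why linking only the last asset in the chain is sufficient: every earlier asset can be reached by following feeds backwards from it. The sketch below is an illustration under assumptions, not a description of how EDG computes a diagram. The `FLOWS` mapping, the `flow_paths` function, the "Registration Web Form" system and the "Product Registration Lineage" entry are hypothetical; the other three system names come from the example above.

```python
# Direct feeds, keyed by target: target -> systems that feed it (hypothetical data).
FLOWS = {
    "Reporting and Analysis Toolset": ["Data Warehouse"],
    "Data Warehouse": ["CRM Platform"],
    "CRM Platform": ["Registration Web Form"],
}

# A lineage model entry that points only at the last asset in the chain (assumed shape).
LINEAGE_MODEL_ENTRY = {"Product Registration Lineage": "Reporting and Analysis Toolset"}


def flow_paths(terminal: str, flows: dict[str, list[str]]) -> list[list[str]]:
    """Enumerate every path that ends at `terminal`, ordered from origin to terminal."""
    paths: list[list[str]] = []

    def walk(asset: str, suffix: list[str]) -> None:
        path = [asset] + suffix
        # Skip assets already on the path so a cyclic feed cannot loop forever.
        sources = [s for s in flows.get(asset, []) if s not in path]
        if not sources:
            paths.append(path)  # nothing feeds this asset: it is an origin
        for source in sources:
            walk(source, path)

    walk(terminal, [])
    return paths


if __name__ == "__main__":
    end_point = LINEAGE_MODEL_ENTRY["Product Registration Lineage"]
    for path in flow_paths(end_point, FLOWS):
        print(" -> ".join(path))
```

Starting from the single linked end point, the walk recovers the whole chain from the originating system through the CRM Platform and Data Warehouse to the Reporting and Analysis Toolset.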

To see a sample Lineage Model in EDG, create a new Lineage Model asset collection, select Import RDF on the Import tab, and import this attached file:

Once the data is loaded, click the Assets tab and double-click the asset FRY9C-SECURITIZATION. You will see the following table.


In the visualizations menu, select Lineage:

You can also select Lineage by highlighting the FRY9C-SECURITIZATION asset in the table and using the visualizations menu link above the table's header.

Unlike the product registration lineage shown above, the FRY9C-SECURITIZATION example shows a more detailed data flow that goes beyond business lineage and includes specific data sources and data elements. EDG can capture and connect lineage at different levels.

To learn more about the full scope of lineage supported by EDG, the relationships used to capture lineage information, and the capabilities of the LineageGram, click the Lineage Model link in the blue left-hand navigation bar. Then open the interactive tutorial by following the link in the text "A tutorial explaining EDG's visualization of Lineage Models can be accessed at this page".

Data/Technical/Enterprise Assets Models to include: These asset models provide the contents of the information flows. If they are not selected, it is assumed that you will be storing relevant data, technical and enterprise assets directly in a lineage model.