Skip to end of metadata
Go to start of metadata

Page Contents

Validate and/or Import Standard Lineage Spreadsheet

This imports Lineage Model contents (data transformations) from a CSV file in the format described here. Each CSV row represents one data transformation step with three logical parts: (1) a source input to an executable, (2) the executable itself, and (3) a target output. Source inputs might be database columns, files, etc. Executables might be stored procedures or scripts. Target outputs might be other columns, reports, etc. An example CSV template is provided in the EDG workspace project: edg.topbraidlive.org\1.0\templates\import\ImportStandardLineageSpreadsheet.csv.

Optionally, the transformation steps can be identified for specific pipeline steps. A second example template with pipeline steps is provided in project: edg.topbraidlive.org\1.0\templates\import\ImportStandardLineageTemplateWithPipelines.csv.

The CSV columns are as follows. Note that not all columns are used in all situations. See subsequent sections for details.

(0) Pipeline columns (optional)Description
PIPELINE_NAMEThis identifies a set of related pipeline steps
PIPELINE_STEP1, 2, 3, ...
(1) Input Source columnsDescription ( * indicates required fields)
SOURCE_DATABASE_TYPE *

Supported values: DB2, MSSQL, NETEZZA, ORACLE, TERADATA, FILE SYSTEM, HIVE.
Any other value is treated as a custom object; see Custom Sources and Targets for details.

SOURCE_SERVER_NAMEIf applicable, hostname of the ..._DATABASE_TYPE
SOURCE_DATABASE_NAMEif applicable, the database (server instance) or directory path
SOURCE_DATABASE_SCHEMAif applicable, the database/schema (persistent data)
SOURCE_DATASET_NAMEName of the data element's table/view/synonym/directory-structure, etc.
SOURCE_DATASET_TYPEType of the data element's table/view/synonym/directory-structure, etc.
SOURCE_DATA_ELEMENT_NAME

Name of the specific element/column of the dataset that the executable uses

(2) Executable columnsDescription
EXECUTABLE_SYSTEM_TYPEAll executables have an executable system type. For example, if the executable is a stored procedure or a trigger (etc.) the executable system type might be ORACLE, MSSQL, TERADATA, PLSQL but it can be also IFPC, MSSQL scripts, Oracle scripts etc. for external scripts and workflows. For more details in scripts or custom executables see the corresponding sections.
EXECUTABLE_SERVER_NAMEhostname of the executable
EXECUTABLE_DATABASE_NAME if applicable, the exe's declaring database (server instance) or script's file-path
EXECUTABLE_SCHEMA_NAME if applicable, the executable's database/schema (persistent data)
EXECUTABLE_NAME *the program, script, stored procedure, job, etc.
EXECUTABLE_TYPE *Allowed values:
  • SCRIPT
  • SOFTWARE FUNCTION
  • ETL JOB
  • STORED PROCEDURE
  • SOFTWARE MACRO
  • SOFTWARE TRIGGER
  • VIEW
  • SYNONYM
  • INFORMATICA SESSION
  • CUSTOM TRANSFORM
(3) Output Target columnsDescription ( * indicates required fields)
TARGET_DATABASE_TYPE *

Supported values: DB2, MSSQL, NETEZZA, ORACLE, TERADATA, FILE SYSTEM, HIVE.
Any other value is treated as a custom object; see Custom Sources and Targets for details.

TARGET_SERVER_NAMEIf applicable, hostname of the ..._DATABASE_TYPE
TARGET_DATABASE_NAMEif applicable, the database (server instance) or directory path
TARGET_DATABASE_SCHEMAif applicable, the database/schema (persistent data)
TARGET_DATASET_NAMEName of the data element's table/view/synonym/directory-structure, etc.
TARGET_DATASET_TYPEType of the data element's table/view/synonym/directory-structure, etc.
TARGET_DATA_ELEMENT_NAME

Name of the specific element/column of the dataset that the executable uses

Detailed Examples for Sources and Targets

The properties of source and target data may vary depending on their type (as declared in SOURCE_DATABASE_TYPE /  TARGET_DATABASE_TYPE). The following sections indicate the required values and constraints for the different types of data sources or targets.

Relational Database Sources and Targets

(1) Input Source columns(3) Output Target columnsDescription ( * indicates required fields)
SOURCE_DATABASE_TYPE *TARGET_DATABASE_TYPE *

Supported RDB values: DB2, MSSQL, NETEZZA, ORACLE, TERADATA.
Any other value is treated as a custom type; see Custom Sources and Targets for details.

SOURCE_SERVER_NAMETARGET_SERVER_NAMEOptional: Hostname of the corresponding ..._DATABASE_TYPE
SOURCE_DATABASE_NAME *TARGET_DATABASE_NAME *Name of the database (server instance)
SOURCE_DATABASE_SCHEMATARGET_DATABASE_SCHEMA Optional:  Name of the database/schema (persistent data)
SOURCE_DATASET_NAME *TARGET_DATASET_NAME *Name of the corresponding TABLE, VIEW, or SYNONYM
SOURCE_DATASET_TYPE *TARGET_DATASET_TYPE *Supported RDB values: TABLE, VIEW, or SYNONYM
SOURCE_DATA_ELEMENT_NAME *TARGET_DATA_ELEMENT_NAME *

Name of the specific element/column of the TABLE, VIEW, or SYNONYM that the executable uses

Example NeighborGram for relational database I/O

Relational database CSV row:

Corresponding Neighborgram:

File Sources and Targets

(1) Input Source columns(3) Output Target columnsDescription ( * indicates required fields)
SOURCE_DATABASE_TYPE *TARGET_DATABASE_TYPE *

Supported value: FILE SYSTEM

SOURCE_SERVER_NAMETARGET_SERVER_NAMEHostname of the file if applicable
SOURCE_DATABASE_NAME *TARGET_DATABASE_NAME *Directory path where the file resides
SOURCE_DATABASE_SCHEMA TARGET_DATABASE_SCHEMA NOT APPLICABLE
SOURCE_DATASET_NAMETARGET_DATASET_NAMEThe name of the file
SOURCE_DATASET_TYPETARGET_DATASET_TYPESupported value: FILE
SOURCE_DATA_ELEMENT_NAMETARGET_DATA_ELEMENT_NAME

Column of the file (1,2,3 etc.) 

Example NeighborGram for a file output

An example SCRIPT has an input from a database column and an output to a column in a file:

Custom Sources and Targets

(1) Input Source columns(3) Output Target columnsDescription ( * indicates required fields)
SOURCE_DATABASE_TYPE *TARGET_DATABASE_TYPE *

Any unsupported value is considered to be a custom Data Container, and the source element is considered a Custom element.

NOTE: this field is required, so if the custom object does not belong to a specific container, the field can have value: Custom Data Container

SOURCE_SERVER_NAME TARGET_SERVER_NAME NOT APPLICABLE
SOURCE_DATABASE_NAME TARGET_DATABASE_NAME NOT APPLICABLE
SOURCE_DATABASE_SCHEMA TARGET_DATABASE_SCHEMA NOT APPLICABLE
SOURCE_DATASET_NAMETARGET_DATASET_NAMELevel hierarchy if exists without the last level that is stored in the ELEMENT ( eg. type-level5/type-level4/type-level3/type-level2)
SOURCE_DATASET_TYPETARGET_DATASET_TYPEType of the custom Element if Exists. If the type is an existing class in EDG then the custom object will be created as an instance of this type. In all cases, the custom object will be type of CustomObject
SOURCE_DATA_ELEMENT_NAMETARGET_DATA_ELEMENT_NAME

Name of the specific element/column of the dataset that the executable uses

Detailed Examples for Executables

Executables can either reside in relational databases (e.g., stored procedures, functions, triggers) or be (external) scripts. Also, whenever the row describes the creation of a view, synonym or a custom target, then the executable is correspondingly VIEW, SYNONYM, or CUSTOM TRANSFORM. The following sections describe different executable types.

Relational Database Executables

NOTE: To declare the create View / Synonym function in a database, the EXECUTABLE_TYPE is VIEW or SYNONYM.

Executables(for Relational Databases)
EXECUTABLE_SYSTEM_TYPESupported RDB values: DB2, MSSQL, NETEZZA, ORACLE, TERADATA, HIVE .
EXECUTABLE_SERVER_NAMEif applicable, hostname of the script
EXECUTABLE_DATABASE_NAME The executable's (stored procedure's) database (server instance)
EXECUTABLE_SCHEMA_NAMEif applicable, the executable's (stored procedure's) database/schema (persistent data)
EXECUTABLE_NAME *name of the script
EXECUTABLE_TYPE *Allowed values:
  • SOFTWARE FUNCTION
  • ETL JOB
  • STORED PROCEDURE
  • SOFTWARE MACRO
  • SOFTWARE TRIGGER
  • VIEW
  • SYNONYM
  • INFORMATICA SESSION

Script Executables

Executables(for Scripts)
EXECUTABLE_SYSTEM_TYPEWhen EXECUTABLE_TYPE is SCRIPT, then see APPENDIX A for the supported script-type values of EXECUTABLE_SYSTEM_TYPE. If the script type is unsupported, then the SCRIPT will be created in EDG without a specific type.
EXECUTABLE_SERVER_NAMEif applicable, h ostname of the script
EXECUTABLE_DATABASE_NAME if applicable, the directory structure of the executable. It will be created as a FILE SYSTEM in edg
EXECUTABLE_SCHEMA_NAMENOT APPLICABLE
EXECUTABLE_NAME *name of the script
EXECUTABLE_TYPE *Value: SCRIPT

Custom Executables

Executables are custom when they have CUSTOM TRANSFORM. Conventionally it covers the cases where the target object is custom. In EDG, the executable it is created as a Mapping between the source element and the target custom object

Executables(for Custom executables)
EXECUTABLE_SYSTEM_TYPE

This is the executable's container, which is often the same as the target database type, but this is not restricted.

EXECUTABLE_SERVER_NAMENOT APPLICABLE
EXECUTABLE_DATABASE_NAMENOT APPLICABLE
EXECUTABLE_SCHEMA_NAMENOT APPLICABLE
EXECUTABLE_NAME *

Name of the custom transform. We consider it to be

the name of the target custom element
EXECUTABLE_TYPE *Value: CUSTOM TRANSFORM

Example row of a custom executable and related NeighborGram

Appendix A: Executable System Types

Supported values for EXECUTABLE_SYSTEM_TYPE:

  • DB2 SCRIPTS
  • HIVE SCRIPTS
  • IFPC
  • JAVA
  • MSSQL SCRIPTS
  • MSSQL TRANSACT-SQL
  • NETEZZA SCRIPTS
  • ORACLE PL/SQL
  • ORACLE SCRIPTS
  • PLSQL
  • PYTHON
  • SCALA
  • TERADATA BTEQ
  • TERADATA PT
  • TERADATA SCRIPTS

  • No labels