Reference data are standardized codes or data entities that are typically used by multiple applications as lists or tables. In fact, they are often called "code tables." An individual code table may seem like a simple thing, but a well-managed collection of code tables and related reference data spread across an enterprise is a resource that can bring great value to that enterprise—or cause great problems if it is not well maintained. EDG lets you control your reference data so that you can put it to work for you as efficiently as possible.

For additional information on reference datasets see:

This document is organized by roles showing how:

  • Reference Data Stewards can create and modify enterprise reference datasets and ontologies, import reference data, and manage information about it.

  • Data Stewards can create reference datasets that reflect reference data in sources they are responsible for. They can then use crosswalks to align these with the enterprise reference datasets for the same entity.
  • Data Managers can export and provision reference data for use in their applications.

  • Business Analysts and other users can consult EDG to learn more about codes and code sets important to their work.

Reference Data Management

Getting Started for the Reference Data Steward

Defining the Structure for Reference Dataset

Each reference dataset designates some ontology class as the dataset's main entity, which defines the type of the dataset's reference instances, i.e., the individual code items.

Ontologies describe business entities, including entities for which you will govern reference data (codes). Ontologies can be thought of as a powerful flexible representation of business glossaries. An ontology may contain a class (entity) such as country, product category, industry and so on. Each of these entities can have different fields (properties) making it easy to support different types of reference data. Reference datasets in EDG are not limited to having only a handful of predefined fields such as a code and a description. They can have any property you may need to capture. For example, a reference dataset for country codes may have properties such as the various ISO codes, capital, gross national product, and language.

In order to create reference data, we need to first define the corresponding entity and its properties in an ontology.

Select the Ontologies in the left hand side navigation menu to see the list of ontologies you have access to. EDG lets you create a single enterprise ontology or a set of individual ontologies (for example, per department or business area) which can be combined with one another using the "includes" mechanism.

The TopBraid EDG Samples project includes a number of sample ontologies and datasets.

This tutorial uses the ontology: Enterprise Ontology - Example. To obtain this file please download the EDG samples.

In this tutorial, we will extend the Enterprise Ontology model with the definitions necessary to support a new reference dataset. Alternatively, a new ontology can be created. For information on creating a new ontology, see Create New Ontology in the User Guide.

Select Enterprise Ontology from the table to go to a page where you can perform various operations on it: make changes, import data into it, export it, etc.

Users that have edit privileges can make ad hoc changes to a given ontology or dataset. Otherwise, they must follow a more formal process of modifying an ontology by using Workflows, which sandbox all changes in an isolated working copy until they are reviewed and approved. See Workflow Overview for details. In this tutorial we will make the change without using a workflow.

Creating a New Class

You will see several panels presenting ontology content.

The left pane shows a Class Hierarchy with classes (entities) and their properties (both attribute/datatype and relationship/object properties) shown as nodes in a tree.

Below the Class Hierarchy, you will see a panel that lets you create class members or instances. For example, if you select a class Country, you will be able to create countries. Best practice, however, is to keep schema and data separate. Later in this tutorial we will be creating a reference dataset for the class instance data.

You can disable the Instances panel by clicking on the Manage tab and switching the ontology into No Instances mode. You will need Manager permission for the ontology in order to see the Manage tab.

The colored buttons at the top of the class hierarchy, next to the quick search field, create a new class, attribute property, or relationship property, or add a property shape for the selected class in the hierarchy (associating an already existing property or set of properties with the class).

There is also a button that lets you view and create node shapes. These model elements support creating different role-specific views into reference data. We will not use this feature in this guide, but you can learn about it by looking at the user guide for Ontologies.

As shown in the screenshot above, clicking on a node in the tree (such as a class Country), displays information about it in the form to the right of the tree. The Edit button at the top of the View/Edit form, switches the form into edit mode, making all fields on the form editable. It may also display and let you edit fields that currently have no data and, thus, you will not see them in the view mode. Alternatively, you can edit values for each field in-line by clicking on the pencil icon that will appear when you position your mouse to the right of the field's name.

Later in this tutorial a reference dataset of airport codes will be created and populated with data from a spreadsheet. The following fragment shows data in this spreadsheet:




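[Spreadsheet fragment not recoverable from the page extraction. Surviving cells: the column header "Country Code" and the values "Keflavik International Airport", "Sault Ste Marie", "Sault Sainte Marie", "Winnipeg St Andrews", "St Anthony", and "St. Anthony".]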






To add model support for this information, create a class named 'Airport' that will be used as the main entity in the reference dataset. To do this, select the top-level class named 'Thing' in the class hierarchy, click the yellow button in the header of the Class Hierarchy pane, enter the name "Airport" and click OK.

You will see the newly created class displayed in the Edit/View pane.

If desired, provide a description of your new class in the comment field. Scroll down to see the GraphQL Schema section.

Click on the "public class of" field and select your ontology from the dropdown - as shown below:

Click the Save Changes button at the top of the pane.

To the right of the View/Edit form, you will find a Search panel (collapsed in the screenshot above). The search panel is collapsible and expandable by clicking on the blue "candy stripe" pattern in the vertical divider to the right of the Form panel.

Creating Attributes

We will now create the following attributes for airports.

Attribute Name (Label) | Description (Comment)
airport city | Main city served by the airport. May be spelled differently from the airport's name.
IATA airport code | An IATA airport code, also known as an IATA location identifier, IATA station code or simply a location identifier, is a three-letter code designating many airports around the world, defined by the International Air Transport Association (IATA).
latitude | A horizontal position of a location on the Earth according to a geographical coordinate system in decimal degrees, usually to six significant digits. Positive latitude is above the equator (North), and negative latitude is below the equator (South).
longitude | A vertical position of a location on the Earth according to a geographical coordinate system in decimal degrees, usually to six significant digits. Positive longitude is East of the prime meridian, and negative longitude is West of the prime meridian.


Create attributes by selecting the Airport class and clicking on the green icon at the top of the Class Hierarchy pane. After entering the name of the attribute and clicking OK, you will see the data entry form shown below.

To set datatype, scroll down to the Type of Values section:

Press Save to save the descriptions and datatypes.
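As a rough illustration only, the attributes defined so far can be pictured as a typed record with a few sanity checks. The class name, field names, and validation rules below are assumptions made for this sketch, not part of EDG (where such constraints live in the ontology), and the sample values are illustrative.

```python
import re
from dataclasses import dataclass

# Assumption for this sketch: codes are written as three uppercase letters.
IATA_PATTERN = re.compile(r"^[A-Z]{3}$")


@dataclass(frozen=True)
class Airport:
    label: str          # airport name (held by the built-in "label" attribute)
    iata_code: str      # "IATA airport code"
    airport_city: str   # "airport city"
    latitude: float     # decimal degrees, positive = North
    longitude: float    # decimal degrees, positive = East

    def validation_errors(self) -> list:
        """Return a list of human-readable problems; empty if none."""
        errors = []
        if not IATA_PATTERN.match(self.iata_code):
            errors.append("IATA airport code must be three uppercase letters")
        if not -90.0 <= self.latitude <= 90.0:
            errors.append("latitude must be between -90 and 90")
        if not -180.0 <= self.longitude <= 180.0:
            errors.append("longitude must be between -180 and 180")
        return errors


# Illustrative values only.
kef = Airport("Keflavik International Airport", "KEF", "Keflavik", 63.985, -22.6056)
print(kef.validation_errors())
```

Keeping the checks in one method makes it easy to report every problem with a row at once instead of stopping at the first.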

As an alternative to manually entering classes and properties, you can use Import > Import Schema from Spreadsheet to create them automatically from the first row of the spreadsheet and then adjust as necessary.

Creating Label Attribute

Note that an attribute for the airport name has not been created. This is because there is a built-in attribute "label" which is intended to hold names. Label is always asked for in the Create New dialog for reference data items. If we want to edit this field later, we need to tell TopBraid EDG to show it on the form. Since this is a special built-in field, this requires some additional setup.

Click on the Airport class, then on the Add Property Shape icon. Start typing "label" in the Create Property Shape dialog and select it from the dropdown.

On the form that appears, set the GraphQL field name to "rdfs_label" (in the This Shape section) and the datatype to string (in the Type of Values section).

Defining Attribute to be used as a Primary Key

TopBraid EDG will always create a globally unique resource identifier, a URI, for each resource you create. There are different options for how the URI may be constructed; they are described in detail in the User Guide.

For reference datasets, each entry in a dataset gets a URI derived from the reference data code. To enable this, you need to identify the field which will contain code values. The selected field is declared to be a primary key for the entity. Note that the field used as a primary key must always have unique values for a given class of codes.

We will use IATA airport code. Click on this property and then on the Edit button. Scroll down to the Primary Key section and type in a namespace to prepend when creating the unique identifiers. For example,

Click on Save.
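The idea of deriving each entry's URI from its primary key can be sketched as follows. This is a minimal illustration, assuming a hypothetical namespace and simple percent-encoding of the code value; EDG's own URI construction options are described in the User Guide and may differ.

```python
from urllib.parse import quote

# Hypothetical namespace, used here only for illustration.
NAMESPACE = "http://example.org/airport/"


def uri_for_code(code: str, namespace: str = NAMESPACE) -> str:
    """Build an identifier by prepending the namespace to the code value."""
    code = code.strip()
    if not code:
        raise ValueError("primary key value must not be empty")
    return namespace + quote(code, safe="")


def find_duplicate_keys(codes):
    """Return codes that occur more than once; a primary key must be unique."""
    seen, duplicates = set(), set()
    for code in codes:
        if code in seen:
            duplicates.add(code)
        seen.add(code)
    return sorted(duplicates)


print(uri_for_code("KEF"))
print(find_duplicate_keys(["KEF", "LGA", "KEF"]))
```

Checking for duplicate key values before an import is worthwhile because two rows with the same code would otherwise resolve to the same identifier.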

Creating Relationships

Next, click the Create Relationship button at the top of the class hierarchy to create a relationship property named "airport country". In the comment field, describe it as "A country where an airport is located". Set its class of values to the Country class. (Failing to do this can cause problems when it's time to import data into the new reference dataset.) To do so, start typing "Count" in the class field in the Type of Values section and pick "Country" as it appears in the autocomplete.

Since the primary key for ISO Country is its two-character ISO country code and the spreadsheet contains this information, EDG will be able to create a relationship between airports and countries as we import spreadsheet data. Note that we have not created a field for the country name; names of the countries are already maintained as part of the country codes, so importing them again would only duplicate that information.

In the next step a reference dataset will be created that will store reference data for the airports.

For more information on working with ontologies, and especially on creating property shapes that will let you validate reference data, see the User Guide.

Creating Reference Datasets

Go back to the home page by clicking the TopBraid EDG logo in the upper-left. You can now click on the Reference Datasets link in the left hand navigator menu, see the page with all reference datasets you have access to and create a new reference dataset using a link on that page.

However, we want to associate the reference dataset we're about to create with a particular "governance area". We can do this by creating the dataset directly from the Governance Areas page.

Governance areas group asset collections according to an organization's business or data subject concerns. Governance areas are used to define a delineated part of stewardship. They partition and delegate ownership of assets, and define a meaningful context for assets that are associated with a governance area.

Select the Governance Areas link located in the left menu under the Governance Model section. First, create a new governance area: click the Create Data Subject Area button and add a data subject area with the label Logistics.

Not every user will have permissions necessary to modify governance areas. If you can't create a new governance area, contact your EDG Administrator.

Now you're ready to create the dataset. Choose Reference Dataset in the Choose type dropdown.

You will see the following page:

Enter Airports as the label (or name) of the dataset and for its description enter: Reference dataset of airports with IATA codes. The Ontology to Include option lists the ontologies that are available to the user, which in turn will provide the class for the dataset's main entity. In this case, select the Enterprise Ontology Example, which has the Airport class that you defined. Click Submit. You will see a message that the dataset was created and you will be forwarded to the Import page where you can load data. However, before we can do this, we must finish setting up the new dataset by identifying its main entity and, if one was not already specified in the ontology, a primary key for that entity.

Setting the Main Entity

The ontology used for creating a reference dataset will typically contain several classes (entity types). After creating the reference dataset and before importing the airports data, you need to tell TopBraid EDG what reference data will be in the dataset. This is done by identifying the "main entity" for the dataset. In our example, it is the Airport class. There are two ways to set the main entity initially.

  • If the main entity is unset, then editing the reference dataset will first require the main entity class to be selected. Click on the Codes tab and select Airport from the provided dropdown, which lists classes from the included ontology.
  • A reference dataset's main entity class can also be set or changed via its Settings utility: Settings > Metadata > Edit > main entity (class).

Make the Airport class the main entity using the first method.

You will now use the Import tab that will let you import reference data from the spreadsheet you downloaded earlier.

Importing Reference Data

Select Import > Import Spreadsheet using Pattern. Then click Choose File to select the spreadsheet. (Download the airports.xlsx spreadsheet to get a local copy to import.) This page has two more fields:

  • Sheet index: by default this is 1. This spreadsheet has only one worksheet, so there is no need to edit it.

  • Entity type: a list of classes from the included ontology (the enterprise ontology) to indicate which one is being populated by the airport data. Ensure that Airport is selected.

Clicking Next shows several potential patterns for spreadsheet data. Select No Hierarchy. (Note: Reference data supports managing hierarchies as well as flat lists. However, the spreadsheet we are importing does not contain any hierarchical structures.)

The next step is to map the spreadsheet columns to the properties of the Airport class as shown below, which maps the columns to the properties defined above and to the built-in "label" property. Note that in the image below the Altitude column was intentionally left unmapped to demonstrate that only mapped columns are imported. The Country column was also not mapped because it contains country names that are already managed as part of the ISO Country Codes reference dataset, which is also included in the samples project.
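The effect of a column mapping, where unmapped columns are simply dropped, can be sketched in a few lines. The column headers other than "Country Code", "Country" and "Altitude" are assumptions for this example, and the function is an illustration rather than EDG's import logic.

```python
# Spreadsheet column -> Airport property. Header names are partly hypothetical.
COLUMN_TO_PROPERTY = {
    "Name": "label",
    "City": "airport city",
    "IATA": "IATA airport code",
    "Country Code": "airport country",
    "Latitude": "latitude",
    "Longitude": "longitude",
}


def map_row(row: dict, mapping: dict) -> dict:
    """Keep only mapped, non-empty columns and rename them to property names."""
    return {
        prop: row[column]
        for column, prop in mapping.items()
        if row.get(column) not in (None, "")
    }


row = {
    "Name": "Keflavik International Airport",
    "City": "Keflavik",
    "IATA": "KEF",
    "Country Code": "IS",
    "Country": "Iceland",   # not mapped: names come from the country dataset
    "Altitude": "171",      # not mapped: illustrative value, ignored on import
}
print(map_row(row, COLUMN_TO_PROPERTY))
```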

Click the Finish button. After the data is imported, click on the Codes tab to view the reference dataset.

A page appears containing a table with the imported data.

Click on the Columns button to add more columns to the table.

Clicking on a row displays information in a new browser tab.

The table displays 25 rows at a time by default. This default can be changed by resetting the field at the top left corner of the table as shown above.

To save the current configuration of columns as a default for all users, click on the "more" icon to the right of the Columns button and then select Save Default Search.

Reference datasets can be organized in hierarchies as well as in flat lists. If a reference dataset contains hierarchical relationships between codes, these can be viewed and modified by clicking on the tree icon to the right of the "more" icon.

Including other Reference Datasets

As shown in the first screenshot of the reference data, the Airport Country column contains URIs of the countries and not their names or code values. This happens because the reference dataset describing the country codes has not been included in the Airports dataset. We can fix this by clicking on the Settings tab and including the appropriate reference dataset. Click on Includes.

In the pop-up window, select "Country Codes" to include it in the Airports dataset. After selecting, click on Close. Instances of the Country class will now be included in the Airports dataset by reference, meaning the data is not copied.

Referencing another dataset in this manner ensures that reference data for countries is maintained in one place. If a country is renamed (for example, Cape Verde, an island country in West Africa, is renamed to the Republic of Cabo Verde), the update needs to occur in only one place, the ISO Country dataset. All datasets that include ISO Country will see this change immediately. At the same time, you will have access to country names and all other information from any reference dataset that includes country codes. The names and other reference data for countries are stored in the Country Codes dataset.
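The benefit of including data by reference rather than copying it can be shown with a toy model. The dictionaries and function below are assumptions for this sketch and say nothing about how EDG stores data internally; the airport entry is illustrative.

```python
# The country dataset is the single place where country names live.
country_codes = {"CV": {"label": "Cape Verde"}}

# Airports store only a reference (the country code), never the name itself.
airports = {
    "SID": {"label": "Amílcar Cabral International Airport", "airport country": "CV"},
}


def country_label_for(airport_code: str, airports: dict, countries: dict) -> str:
    """Follow the reference from an airport to its country and read the name."""
    country_key = airports[airport_code]["airport country"]
    return countries[country_key]["label"]


print(country_label_for("SID", airports, country_codes))

# Renaming the country in one place is visible through every reference.
country_codes["CV"]["label"] = "Republic of Cabo Verde"
print(country_label_for("SID", airports, country_codes))
```

Had the airport record carried its own copy of the country name, the rename would have required a second update and could have left the two datasets inconsistent.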

Once the reference dataset for countries is included, EDG will automatically match countries to the values of the "airport country" property. Click on the Codes tab. Note that Country codes appear in the Airport Country column instead of URIs as before. These codes come directly from the ISO Country dataset.

Click on any of the rows to see a View/Edit form for the selected airport.

The "airport country" property is now populated with a country code from the ISO Country dataset. Clicking on a country code link will open up a form that will show you other information about the country directly from the ISO Country dataset.

You can change the "focus" of the table from Airports to other data by using the dropdown field to the left of the user name in the header. Currently, 'Airport' is chosen. You can switch the focus to any other class related to the Airport. In our case, the only related class is Country.

Included data, such as the Country Codes data referenced by the Airports dataset, can be viewed and searched, but modifications to included data are not permitted. Included data can only be modified by editing the included reference dataset directly. You will be able to edit only codes for the main entity or one of its subclasses.

Managing Metadata for a Reference Dataset

Reference datasets (and, in general, any asset collection in EDG) can have metadata such as name, description, status, etc. The metadata associated with an asset collection can be viewed/edited on the Settings page. Click on Settings and scroll down to the Metadata section. Expand the Dataset Status and Property Definition sub-sections to see available information about Airports dataset.

We have identified the main entity and entered the short description information earlier in this tutorial. In the Overview sub-section, the related entity value is automatically derived by the reference dataset as any class (entity) that is connected to the main class. Country appears because this is the main class for the Country Codes dataset now included. The last updated field is also automatically recorded.

When a dataset is first created, the status is automatically set to "Under development". This can be edited when the status of the dataset changes.

TopBraid EDG is shipped with some predefined status values. They are configurable if your organization needs a different set of values.

The Property Definitions sub-section shows the description of each property of the main class as it was entered in the ontology. You can also add additional descriptions local to the dataset; these will not be part of the enterprise ontology.

Click the Edit button to see more available fields. You may want to differentiate private (internal) reference data from public (external) such as ISO country codes. Set is external dataset to "true" in the Dataset Status section of the form. IATA codes are maintained by the IATA Association, which publishes updates bi-annually. Change the status code to Approved. Click Save changes at the top of the Metadata section.

Once the status of a reference dataset is approved for use, you will no longer be able to delete codes from the dataset, but you will be able to change information about them.
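The rule just described, where an approved dataset still accepts edits but no longer allows codes to be deleted, can be modelled as a small guard. The class and method names are hypothetical; only the two status values come from this tutorial.

```python
class ReferenceDataset:
    """Toy model of a dataset whose status restricts deletions."""

    def __init__(self):
        self.status = "Under development"
        self.codes = {}

    def add_code(self, code: str, label: str) -> None:
        self.codes[code] = {"label": label}

    def update_label(self, code: str, label: str) -> None:
        # Changing information about a code is allowed in any status.
        self.codes[code]["label"] = label

    def delete_code(self, code: str) -> None:
        # Deleting is refused once the dataset has been approved for use.
        if self.status == "Approved":
            raise PermissionError("codes cannot be deleted from an approved dataset")
        del self.codes[code]


dataset = ReferenceDataset()
dataset.add_code("DOM", "Melville Hall Airport")
dataset.status = "Approved"
dataset.update_label("DOM", "Douglas-Charles Airport")  # still permitted
```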

Documenting a Reference Dataset as an Enterprise Reference Dataset

Your organization may have several reference datasets in EDG that contain codes for a given entity. For example, you may have different existing applications and corresponding sources that already store and use airport codes. The goal of standing up a system for managing reference data is to achieve alignment across your existing reference data and to streamline its management. This alignment takes time. At least initially, in addition to a "master" reference dataset that you want to be the definitive source of reference data for a given entity across all systems, you may have reference datasets that capture what each of your systems is using.

To differentiate between your master reference dataset for airport codes and other "in situ" reference datasets, click on Edit in the Metadata section of Settings and find the is enterprise dataset field. Set this flag to true and click Save.

If another reference dataset is created for the same entity, it could be mapped to the enterprise dataset using Crosswalks. TopBraid EDG can auto-create crosswalks between two datasets. It also offers crosswalk web services to translate between codes.

Creating Reference Data Facts

In the Metadata section of Settings, expand the Reference Dataset Facts section and enter the following "fact":

IATA codes should not be confused with the FAA identifiers of US airports. Most FAA identifiers agree with the corresponding IATA codes, but some do not, such as Saipan whose FAA identifier is GSN and its IATA code SPN, and some coincide with IATA codes of non-US airports.

Note that the text area allows rich text, including hyperlinks. The links in the text above can be recreated by selecting the text to be hyperlinked, such as "Saipan", and clicking the chain link in the icon box. Add the hyperlink to the text box that appears.

Click on the plus + icon to the left of the fact field name to add an additional entry and enter this additional fact there:

Since "Q" is used for international communications, IATA airport codes never begin with "Q".

Save your changes. The fact is now part of the metadata for the dataset and can be referenced, searched, etc.

You can define facts at a dataset level and also specify them for a given code in the reference dataset. If you want to do the latter, you need to include in your reference dataset a pre-built Reference Data Facts ontology. Your EDG administrator can also specify this inclusion as a system-wide setting for all reference data.

Entering Subscription Information for External (public) Reference Data

In the Metadata section of the Settings tab click the Edit button again. Set "is external dataset" to true and save. Edit again and you will see a new sub-section on the form called Subscription; this is used to capture subscription-related information for external reference datasets. Add "IATA Association" to the "sourced from" field. You will only need to type the first few letters of its name, because the reference data knows that only one defined organization begins with those letters. Click the Save Changes button.

For additional information, see Reference Dataset Utilities - Settings > Metadata.

TopBraid EDG is shipped with predefined metadata fields for reference datasets. They are configurable if your organization needs different metadata. EDG is a semantic, model-based solution. Configuration is done using steps similar to those used to modify ontology models to accommodate new reference data.

Assigning Access Privileges to other Users

For any asset collection in EDG, including reference datasets, a user can have one of the following permission roles (see Asset Collection Permissions for more information):

  • Viewer: A Viewer can browse a dataset, viewing all the reference data (as well as any change history associated with that data) and the metadata associated with a dataset. A Viewer can create saved searches and export data. They can create and view tasks, add comments and change the status of a task assigned to them. A Viewer can also start a workflow. The Viewer then becomes the Manager of the working copy that is associated with the workflow. However, changes made in the working copy will not affect the reference dataset until they are approved and committed by a user that has Editor permission for the dataset.
  • Editor: In addition to being able to perform all activities that a Viewer can perform, an Editor can make changes to the dataset's metadata and to the reference data itself.
  • Manager: A Manager has the most capabilities. In addition to all the activities that an Editor can perform, a Manager can delete an entire dataset, change the default columnar view for all users, and control the access privileges that other users have over a particular dataset by assigning Manager, Editor, or Viewer permission roles to them. They can also reassign and change the status of all tasks, even those that are not assigned to them. A person who creates a reference dataset automatically becomes its Manager.
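Because each role includes everything the role below it can do, the three permission roles can be modelled as an ordered enumeration. The action names and the lookup table are assumptions chosen for this sketch, not EDG identifiers.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    EDITOR = 2
    MANAGER = 3


# Lowest role that may perform each (hypothetically named) action.
REQUIRED_ROLE = {
    "browse": Role.VIEWER,
    "export": Role.VIEWER,
    "start_workflow": Role.VIEWER,
    "edit_reference_data": Role.EDITOR,
    "edit_metadata": Role.EDITOR,
    "commit_workflow": Role.EDITOR,
    "delete_dataset": Role.MANAGER,
    "assign_roles": Role.MANAGER,
}


def can(role: Role, action: str) -> bool:
    """A role may perform an action if it ranks at or above the required role."""
    return role >= REQUIRED_ROLE[action]


print(can(Role.VIEWER, "export"), can(Role.VIEWER, "edit_reference_data"))
```

Using an ordered enum keeps the "includes everything below" rule in one comparison instead of repeating it per role.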

To give others access to the dataset, go to the User Roles tab on the dataset's home page.

Permission levels can be set for (1) individual users and (2) user security roles (e.g., from Tomcat or LDAP). The list of users you will see on this tab can include individual users and LDAP roles. A Manager can assign Manager, Editor and Viewer privileges to each user or user group. The User Roles page is also used to set up governance roles (as defined in the Governance model) for individual reference datasets. Governance roles can also be defined at the business area or data subject area that a reference dataset is associated with.

Governance roles provide an alternative approach to assigning permissions. A user who has any governance role for a reference dataset (or any other asset collection), specified either directly for the dataset or indirectly for a subject area the dataset belongs to, will automatically get Viewer permission. You can also assign Editor and Manager permissions to governance roles.

Modifying Reference Data

Dominica's main airport, the Melville Hall Airport, was just renamed to the Douglas-Charles Airport in tribute to its late prime ministers, Rosie Douglas and Pierre Charles. While your next bi-annual update from the IATA Association will reflect this change, you need to make it ahead of receiving the update.

Click on the Codes tab. Search for airports in Dominica by clicking on the Filters button and then selecting the "airport country" field. Start typing "the comm.." to get the match on the official country name - "the Commonwealth of Dominica".

Then, click on the Search icon to get two airports in Dominica.

You can pick any of the available fields or a combination of fields to search on. Auto-complete works for relationships. For attributes, you can simply type in any string.

Search options include regular expression matching, finding items that have no values in a given field, using nested forms, etc. Click on the dropdown to the right of the equals sign to switch to other options. One of these is nested search, which lets you search on values associated with countries (e.g., country code) and add these values as table columns.
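Two of the search options mentioned here, regular expression matching and finding items with no value in a field, behave roughly like the filters below. This is a plain Python sketch over a list of records with illustrative data, not a description of how EDG evaluates searches.

```python
import re


def regex_match(rows, field, pattern):
    """Rows whose value in `field` matches the regular expression."""
    compiled = re.compile(pattern)
    return [r for r in rows if r.get(field) and compiled.search(r[field])]


def has_no_value(rows, field):
    """Rows where `field` is missing or empty."""
    return [r for r in rows if not r.get(field)]


# Illustrative records; the empty city is made up to exercise the second filter.
rows = [
    {"label": "Melville Hall Airport", "IATA airport code": "DOM", "airport city": "Dominica"},
    {"label": "Canefield Airport", "IATA airport code": "DCF", "airport city": ""},
]

print([r["label"] for r in regex_match(rows, "label", r"^Melville")])
print([r["label"] for r in has_no_value(rows, "airport city")])
```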

To get back the unfiltered list of all airports, remove the filter condition and click on the Search icon.

You can now change the airport name by clicking on Melville Hall, then clicking on the Edit button.

When you make the change to rename its label value to "Douglas-Charles Airport", you can check the Enter log message before clicking the Save Changes button if you want to include a log message about your change.

As an alternative to clicking on Edit, you can mouse over the field value and click on the pencil icon that appears to the left of the value. This opens just that field for inline editing.

You will not be able to enter a log message if you use inline editing.

TopBraid EDG keeps a complete audit trail of all changes. Click the "Show History" check box to see the audit trail.

Instead of using Filters, you could have also clicked on Columns button, added the "airport country" column to the table and typed Dominica in the Refine field.

This approach, however, does not search across all reference data. It will only filter within the data that has been loaded into the table. Data loaded into the table can be a subset of the codes in a reference dataset. By default, TopBraid EDG will load 1,000 rows. Our airports dataset has over 5,000 entries. Thus, you may not find the result you are looking for even if data exists. Your EDG administrator can change the default setting. However, this may impact performance for large datasets. Using search filters is always the most reliable approach for large datasets.
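The difference between the two approaches can be made concrete with a small simulation. The row limit of 1,000 comes from the paragraph above; the generated data, field values, and function names are assumptions for this sketch.

```python
LOAD_LIMIT = 1000  # default number of rows loaded into the table

# 5,002 made-up entries; the two Dominica ("DM") airports sit at the very end.
all_codes = [
    {"label": f"Airport {i}", "airport country": "DM" if i >= 5000 else "XX"}
    for i in range(5002)
]


def search_with_filter(rows, field, value, limit=LOAD_LIMIT):
    """Filter first across the whole dataset, then load up to `limit` rows."""
    matches = [r for r in rows if r[field] == value]
    return matches[:limit]


def refine_loaded_rows(rows, field, value, limit=LOAD_LIMIT):
    """Load the first `limit` rows, then filter only within what was loaded."""
    loaded = rows[:limit]
    return [r for r in loaded if r[field] == value]


print(len(search_with_filter(all_codes, "airport country", "DM")))
print(len(refine_loaded_rows(all_codes, "airport country", "DM")))
```

In this simulation the filter search finds both entries, while refining the loaded rows finds none because the matching rows were never loaded.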

Creating New Codes

To create a new airport, click on the New button in the button-row above the airports table.

Export, Collaboration, and other Activities

Some of data stewards' tasks overlap with the tasks of other users. For example, stewards may build exports of reference data, but so do data managers. These overlapping activities, including collaboration between users working with reference data, are covered in the Getting Started Guides for Data Manager and Getting Started for the Business Analyst.

Creating a Crosswalk

Some systems may use a different local set of codes for the same entity - in our case, Airport. In these cases, you will want to map local, in-situ codes to the enterprise reference dataset for airports.

First, let's extend the ontology by creating a new class "Local Airport". Define for it an attribute "local airport code" with the string datatype. Make it a primary key for the class, and specify the start of a URI pattern of your choice.

Now, create a new reference dataset. You can do this from the Governance Areas page as described previously. Alternatively, go to the EDG home page and click on Reference Datasets, located in the left navigation menu under Asset Collections. You will see a page listing all reference datasets you have access to. This page includes a Create New Reference Dataset link. When a dataset is created this way, it will not be associated with any governance area. You can add an association to a governance area later by updating the dataset's metadata under Settings.

Let's assume that it is a dataset used by a hypothetical Flight Tracker application and call it Flight Tracker Airport Codes. Base it on the Enterprise Ontology.

Click on Edit. When asked, set main entity to Local Airport and click on Continue. Create a few New York area airports using data from the table below.


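[Table partially lost in the page extraction. Columns: Local Airport, Local Code. Surviving rows: La Guardia and Westchester County; the local code values were not recoverable.]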






Create a new Crosswalk from the Flight Tracker Airport Codes dataset to the enterprise reference dataset Airports as shown in the image below. Click Finish.

You can now map two sets of airport codes manually or automatically. TopBraid EDG supports many-to-many mappings. Click on Mappings to view the crosswalk. Initially, it has no mappings. To map manually, position your cursor in a row in the target dataset (last column) and start typing the name of an airport.

Autocomplete list will appear. Select your choice from the dropdown and click on the green + button to create the mapping. You can also add a note to describe the mapping if desired.

To auto-map select Generate Mappings button.

TopBraid EDG will generate some suggested mappings for you based on the airport names. Move the confidence level to 50% in the slider to filter out unlikely suggestions.

You can now accept suggestions one by one, or move the confidence level even higher (to, say, 70%), accept all top suggestions and then individually pick any lower-confidence suggestions you want to apply. From the generated list, we want to accept the La Guardia, Newark Liberty and Westchester Co mappings. The official name of the Islip airport on Long Island is Long Island MacArthur, so no match was found for it. Add this mapping manually. Your crosswalk should now look as follows:

To see more information about the mapped airports, including their IATA codes, double-click on a row. The form will open in a separate window. For more on working with crosswalks, see the Crosswalk User Guide pages.
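The idea behind name-based suggestions filtered by a confidence threshold can be sketched in a few lines of Python. This is only an illustration: the source does not describe how TopBraid EDG scores its suggestions, so the similarity measure used here (the standard library's difflib ratio) and the function name suggest_mappings are assumptions made for the example.

```python
from difflib import SequenceMatcher


def suggest_mappings(local_names, enterprise_names, min_confidence=0.5):
    """Return (local, enterprise, score) suggestions at or above min_confidence.

    Scores are difflib similarity ratios on lower-cased names. This is an
    illustrative measure only, not the matching algorithm used by EDG.
    """
    suggestions = []
    for local in local_names:
        for enterprise in enterprise_names:
            score = SequenceMatcher(None, local.lower(), enterprise.lower()).ratio()
            if score >= min_confidence:
                suggestions.append((local, enterprise, round(score, 2)))
    # Highest-confidence suggestions first.
    return sorted(suggestions, key=lambda s: s[2], reverse=True)
```

With the names used in this walkthrough, an exact match such as La Guardia scores 1.0, while Islip stays well below a 0.5 threshold against Long Island MacArthur. That is the kind of case that still needs a manual mapping.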

Documenting the Use of a Reference Dataset

If you are using TopBraid EDG for Metadata Management together with TopBraid EDG-RDM, you can document the use of a reference dataset in your applications catalog, data assets catalog and/or business glossary. See the relevant User Guides for more details.

Getting Started for the Data Manager

While this section can serve as a standalone tutorial, it assumes that all steps described in the Getting Started for the Data Steward section have already been completed: the Airports reference dataset has been created and populated with data, and you have access to it.

Defining Reference Data Export

As a data manager, you may need to distribute reference data for use in your data source. Export is one way of doing this. Reference data can be exported in full or as subsets of data defined through search criteria. After finding the reference dataset you need, click the dataset's Export tab to view the available exports. (Examples in this section use the Airports reference dataset.)

This tab includes an option to export all information available in a dataset. There may also be exports that focus on specific subsets of data; these are accessible from the Export Saved Search or Export using Saved SPARQL Query links. If there is no export that suits you, you can create one.

Creating a SPARQL query for reuse requires knowledge of the SPARQL query language. SPARQL queries can be defined, tested and saved by clicking on the SPARQL Endpoint link. Export formats for these queries include CSV, TSV, Excel and JSON.

Saved searches can be created entirely in the EDG UI. Click the Codes tab, then create and save a new search.
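As a rough illustration of what such a query might look like, the sketch below holds a SELECT query for US airports in a Python string and wraps it in a GET request URL following the W3C SPARQL 1.1 Protocol (the query travels in a "query" parameter). The ex: namespace, the class and property names, and the endpoint address used in any call are placeholders, not the actual terms of the Enterprise Ontology or a documented EDG URL. Check your installation's documentation for whether and where an HTTP endpoint is exposed.

```python
from urllib.parse import urlencode

# Hypothetical query: the ex: namespace, class and property names are
# placeholders and must be replaced with the terms of your own ontology.
US_AIRPORTS_QUERY = """
PREFIX ex: <http://example.org/ontology#>
SELECT ?airport ?iataCode ?city
WHERE {
  ?airport a ex:Airport ;
           ex:iataCode ?iataCode ;
           ex:airportCity ?city ;
           ex:airportCountry ex:USA .
}
ORDER BY ?iataCode
"""


def build_query_url(endpoint_url, query):
    """Build a SPARQL 1.1 Protocol GET URL that carries the query text."""
    return endpoint_url + "?" + urlencode({"query": query.strip()})
```

The same query text can be pasted into the SPARQL Endpoint page to test it interactively before saving it.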

First, add columns for IATA code, airport city and country. Click on Filter and select the airport country column. Start typing "US" in the airport country field and pick "USA" from the autocomplete. Click on the Search button and the results will appear in the grid. Different export formats can be chosen by clicking on the "more" icon. Export formats include TSV, CSV, and JSON.
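Once a search result has been exported, a consuming application typically needs to load it and look codes up. The following is a minimal sketch of that step for a CSV or TSV export, using only the Python standard library. The column headers in SAMPLE_EXPORT are assumed for illustration; the headers in a real export depend on the columns you selected.

```python
import csv
import io

# Assumed layout of an exported file; real headers depend on the chosen columns.
SAMPLE_EXPORT = (
    "IATA code,airport city,airport country\n"
    "JFK,New York,USA\n"
    "LGA,New York,USA\n"
)


def index_by_code(export_text, code_column, delimiter=","):
    """Index the rows of a delimited export by the value in code_column."""
    reader = csv.DictReader(io.StringIO(export_text), delimiter=delimiter)
    index = {}
    for row in reader:
        code = (row.get(code_column) or "").strip()
        if not code:
            continue  # skip rows without a code instead of indexing them under ""
        if code in index:
            raise ValueError(f"duplicate code in export: {code}")
        index[code] = row
    return index
```

For a TSV export, pass delimiter="\t". Raising on duplicates is a design choice for this sketch: a code table used for lookups should not silently keep only one of two rows that share a code.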

If these results fit your needs and you expect to pull this data from the dataset on a periodic basis, save the search by clicking the Save Search option from the "more" menu and giving your search a name such as "US airports". Saved searches are web services that you can use to automate distribution of reference data.

Select the Open Saved Search option from the "more" menu to display a list of saved searches. Selecting one and clicking the Select button under the list displays the search criteria (filters) specified by the selected search. You can then click the Search button to re-run the saved search.

When selecting a saved search from the list, note that above the Select button is a URL that can be used as a RESTful web service call to invoke the search.

See Search Within a Reference Dataset for more information on specifying search criteria.

Viewing Saved Searches

Click on the Export tab if it is not already selected, then select Export Using Saved Search.

Export using Saved Searches lists all saved searches, including a description of the fields they include. The format for exporting the results of a saved search is JSON.

The URL of the saved search is displayed in the Service URL field, including a unique id for the saved search. This URL can be copied as-is and included in any third-party application needing to extract the codes in the saved search.
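A third-party application could call the Service URL along the following lines. This is a minimal sketch using only the Python standard library. The function name fetch_saved_search is made up for the example, authentication is left out because it depends on how your EDG server is configured, and the structure of the returned JSON is not assumed: the payload is simply decoded and returned.

```python
import json
import urllib.request


def fetch_saved_search(service_url, timeout=30):
    """GET the saved search's Service URL and return the decoded JSON payload.

    Authentication is deliberately omitted; add whatever your server requires.
    """
    request = urllib.request.Request(
        service_url, headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Pass the URL exactly as copied from the Service URL field. Scheduling this call (for example, nightly) is one way to automate distribution of the codes to a downstream system.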

Saved searches can be deleted in the Search view for the dataset. Choose Open Saved Search, select the search to remove and click Delete.

Using TopBraid EDG-RDM Web Services

TopBraid EDG includes pre-built services for validating your locally stored reference data against the datasets managed by EDG. It also includes crosswalk services for translating from one set of codes to another set of mapped (cross-walked) codes. See the relevant guides for more details on how to use these services.
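To make the two kinds of service concrete, the sketch below shows what validation and crosswalk translation amount to, written as plain local Python functions. It does not call EDG's services, whose request and response formats are covered in the relevant guides. The function names and the local codes (NY01 and so on) are invented for illustration.

```python
def find_invalid_codes(local_codes, reference_codes):
    """Return the local codes that are absent from the reference set, sorted."""
    return sorted(set(local_codes) - set(reference_codes))


def translate(code, crosswalk):
    """Return every target code mapped to `code` (empty list if unmapped).

    `crosswalk` is a list of (source_code, target_code) pairs, so one source
    code may map to several targets and vice versa (many-to-many).
    """
    return sorted({target for source, target in crosswalk if source == code})


# Hypothetical local codes mapped to IATA codes for the airports in this guide.
CROSSWALK = [("NY01", "LGA"), ("NY02", "EWR"), ("NY03", "HPN"), ("NY04", "ISP")]
```

Representing the crosswalk as pairs rather than a one-to-one dictionary keeps the sketch consistent with the many-to-many mappings that crosswalks allow.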

Getting Started for the Business Analyst

While this section can serve as a standalone tutorial, it assumes that all steps described in the Getting Started for the Data Steward section have already been completed: the Airports reference dataset has been created and populated with data, and you have access to it.

Finding a Reference Dataset

When you click the Reference Datasets link in the left-hand navigator, you will see a table listing the reference datasets you have access to. This list can be long, especially in large organizations with lots of different reference data.

The table of datasets is sortable. Click on any column to sort by the subject area a dataset belongs to, creator, date of creation, user that last updated it, update date, your level of permission, main entity or dataset description.

If you know of collections (e.g., ontologies or reference datasets) in your TopBraid EDG system that do not appear, you might not have the appropriate viewing or editing privileges for them. Each such collection requires a manager to provide access by setting you (or your security role) as a viewer, at least. See the collection type's User Roles utility documentation for details about these steps.

Further, you can filter the table by typing in the Refine field. The text string entered will be matched against the information in all columns.

Finding a Code

To find a specific code, go to Find Code in the left-hand navigator.

You can also use the Search the EDG facility, as described in Getting Started with Business Glossaries.

Viewing Dataset's Metadata

The Settings tab > Metadata section contains descriptive and contextual information about the dataset, grouped into sub-sections. Note that empty sub-sections might not be displayed until Metadata is placed into Edit mode.

Also, some "dependent" (variant) sections appear only when certain conditions apply. For example, setting Dataset Status > is external dataset > true identifies a reference dataset as external/public, which makes available the Subscription section, with fields describing the source of the public reference data and how and when it gets updated. The Property Definition (Semantic Analysis) section has a field-by-field description of each field in the dataset's main entity class.

Using Reference Dataset and Data Facts

As a Business Analyst you may have a report that needs to include a data feed that uses FAA airport identifiers. Reviewing the data, the FAA identifiers in the data seem to match the IATA codes, but you want to double-check that this would correctly integrate with the rest of your data which uses IATA codes.

Expand the Reference Dataset Facts sub-section of the Metadata section. You will learn that while many FAA identifiers are identical to IATA codes, there are also differences. Assuming that these are the same codes would have led to errors in the integrated reports. To integrate the data correctly, you should ask a steward to build a crosswalk between the two sets of codes.
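If you have both identifiers side by side for some airports, a quick check like the one below shows where the assumption "FAA identifier equals IATA code" breaks down. It is a minimal Python sketch; the function name and the sample values are made-up placeholders, not real airport codes.

```python
def differing_identifiers(airports):
    """Return names of airports whose FAA identifier and IATA code are not the same.

    `airports` maps an airport name to a (faa_identifier, iata_code) pair;
    either value may be None when the source has no code of that kind.
    """
    return sorted(
        name
        for name, (faa, iata) in airports.items()
        if faa is None or iata is None or faa.upper() != iata.upper()
    )


# Made-up placeholder values, not real airport codes.
SAMPLE = {
    "Airport A": ("AAA", "AAA"),
    "Airport B": ("B12", "BBB"),
    "Airport C": ("C34", None),
}
```

Any airport this check reports is one where joining data feeds directly on the code would mismatch or drop records, which is the case a crosswalk is meant to handle.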

Creating Tasks and asking Questions about a Code

You may want to ask the reference data governance team to add FAA identifiers to the reference dataset because you believe this information will be useful not only for your immediate task but also for other applications, and should therefore be managed with the rest of the reference data.

Reference datasets let you log requests and questions in the form of Tasks. Tasks can be associated with an individual code or with the entire dataset. To create a task for the entire dataset, go to the Tasks tab for a dataset, click the Create Task link and enter:

"Most of my data is coded with IATA codes, but I am starting to integrate new data feeds that use FAA identifiers. Please expand the dataset to include FAA identifiers."

You can select which user to assign the task to. By default it will be assigned to the dataset's manager. Click the OK button.

The task is now displayed in the tab. Tasks can be filtered by assignee and by status. Clicking on a task opens it in a popup dialog that lets users post responses or ask for additional information by adding Comments.

To create a task for a specific airport, click Codes, select the code you want to associate a task with and click on the Task icon at the top of the form that shows detailed information about the code. Once a task is created, the number (0) in the icon will change to reflect the number of outstanding tasks.

Next Steps

You are now ready to explore the EDG User Guide - Overview to learn more about the many capabilities of TopBraid EDG, including workflows for team collaboration, importing more complex spreadsheets, and more.