Overview of Big Data Assets

Big Data Assets specify the data structures, jobs, nodes and other software and hardware components that make up a big data ecosystem.

Big Data Assets Home

Selecting the Big Data Assets link in the left-navigation pane of TopBraid EDG (Home) lists all of the Big Data Assets Collections currently accessible to the user. It also allows authorized users to create new ones.

Prerequisites: Licensing and Enablement

The availability of any collection type (including Big Data Assets and customer-defined types) is determined by what is (a) licensed and (b) configured under Server Administration. To install a license or to view the currently licensed features, see Setup > Product Registration. To configure which licensed collection types are currently enabled or disabled, see EDG Configuration Parameters > Configure Asset Collection Types. For general licensing information, see the TopQuadrant website, which describes the TopBraid products and the data governance packages that determine the available collection types.

Create New Big Data Assets Collection

The Big Data Assets > Create New Big Data Assets Collection link opens a form with fields used to define the new Big Data Assets Collection. Note that you can also create a Big Data Assets Collection by using a Create link in the Governance Areas page. 

No user will have a link for creating any asset collection until an administrator configures EDG's persistence technology as documented in Server Administration: Teamwork Platform Parameters: Application data storage. In addition, a user will have a create link only if the user or their role has the Create permission for the EDG Repositories project, as documented in EDG Rights Management.

Required and Permitted Includes

Collections often have natural relationships to other collections (e.g., each reference dataset's main entity class comes from an included ontology). Any collection using outside resources must first include the collections that contain them. Some inclusions might be required while others might merely be permitted. For example, taxonomies always include the SKOS ontology, and they may include other taxonomies. As mentioned, each reference dataset must include at least one ontology to define the dataset's entities. Glossaries always include the pre-defined EDG ontology that describes business glossary terms. Catalogs of data assets always include the pre-defined EDG ontology describing data assets and are expected to include definitions of relevant physical datatypes. These requirements can be further configured.

When creating a collection, any required reference to another type of collection will either be handled automatically or be presented for selection. If any required inclusion is omitted at creation, the resulting collection will show red warnings about the missing relationship(s). After creation, included collections can be changed in the utilities view: Settings > Includes. When changing a collection's includes, the selection options are restricted to required and permitted types.
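The distinction between required and permitted includes can be pictured with a small sketch. This is only an illustration of the rule described above, not EDG's configuration format or API: the rule table, the type names, and the function check_includes are all hypothetical.

```python
# Hypothetical rule table (an assumption for illustration only).
# "required" types must be included; "permitted" types may be included.
INCLUDE_RULES = {
    "taxonomy": {"required": {"ontology"}, "permitted": {"taxonomy"}},
    "reference_dataset": {"required": {"ontology"}, "permitted": set()},
}


def check_includes(collection_type, included_types):
    """Return (missing_required, not_allowed) for a proposed set of includes."""
    rules = INCLUDE_RULES[collection_type]
    included = set(included_types)
    # Required types that were left out (these would produce warnings).
    missing = rules["required"] - included
    # Types that are neither required nor permitted for this collection type.
    not_allowed = included - (rules["required"] | rules["permitted"])
    return sorted(missing), sorted(not_allowed)
```

For example, under these assumed rules a reference dataset created with no includes would report the ontology as missing, while a taxonomy that includes an ontology and another taxonomy would report nothing.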

The Create dialog box asks for the Big Data Assets Collection's Label (name), its Default namespace and, optionally, a Description. The default namespace is used to construct URIs (unique identifiers) for the resources in the Big Data Assets Collection. EDG automatically pre-populates the default namespace based on the system-wide, configurable settings; the creator can change it. The recommended practice for all collection types is to end the default namespace with a '/' (slash). For ontologies, it is typical to use '#' (pound sign), although '/' can be used as well.
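To see why the trailing separator matters, consider a minimal sketch of URI construction by simple concatenation. This is an assumption about the general idea only; the way EDG actually derives the local part of a URI is configurable and may differ. The namespace http://example.org/bigdata/, the local name HiveTable_1, and both function names are hypothetical.

```python
def make_uri(default_namespace, local_name):
    """Build a resource URI by appending a local name to the namespace."""
    return default_namespace + local_name


def ends_with_separator(default_namespace):
    """Check that the namespace ends with '/' or '#', as recommended."""
    return default_namespace.endswith(("/", "#"))


# With a trailing slash the namespace and local name stay clearly separated.
print(make_uri("http://example.org/bigdata/", "HiveTable_1"))
# Without it, the two parts run together into one ambiguous path segment.
print(make_uri("http://example.org/bigdata", "HiveTable_1"))
```

A namespace without a trailing '/' or '#' still yields a syntactically valid URI, but the boundary between the collection's namespace and the resource's local name is lost.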

The creator is automatically granted the Manager permission for the new Big Data Assets Collection. When creation starts from the Governance Areas page, the new Big Data Assets Collection is automatically associated with the selected area. When creation starts from the Big Data Assets home page, the new Big Data Assets Collection is not connected to any governance area. To change this after creation, update the subject area in the utilities: Settings > Metadata > Edit > subject area.

Create New Big Data Assets Collection

This creates a new Big Data Assets Collection with yourself as the manager.

If Search the EDG uses Lucene indexing (the default option), the create form offers an option to add this collection to the index. This is the same as selecting the collection in the Search the EDG configuration with the default property selectors.

Listing of Big Data Assets by Manage, Edit, or View

This home view lists all Big Data Assets that you can access in some way. Which ones you can see and what you can do with them depend on each Big Data Assets Collection's permissions settings for your user identity or security role. The listing groups the Big Data Assets according to your assigned permissions as either a manager, an editor, or a viewer:

  • Big Data Assets that you manage
  • Big Data Assets that you can edit
  • Big Data Assets that you can view

You will only see the categories that apply to you. For example, if you do not have manager permissions for any Big Data Assets, you will only see the "Big Data Assets that you can edit" and "Big Data Assets that you can view" groupings.
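The grouping behavior can be sketched as follows. This is an illustrative model only, not EDG code: the function group_by_permission, the permission level strings, and the sample collection names are hypothetical.

```python
# Headings in display order, keyed by a hypothetical permission level name.
HEADINGS = {
    "manager": "Big Data Assets that you manage",
    "editor": "Big Data Assets that you can edit",
    "viewer": "Big Data Assets that you can view",
}


def group_by_permission(collections):
    """Group (label, level) pairs under headings, omitting empty groups."""
    groups = {}
    for level, heading in HEADINGS.items():
        labels = [label for label, lvl in collections if lvl == level]
        if labels:  # only relevant categories are shown
            groups[heading] = labels
    return groups


listing = group_by_permission([("Cluster A", "editor"), ("Lake B", "viewer")])
for heading, labels in listing.items():
    print(heading, labels)
```

Because the sample user manages nothing, the "Big Data Assets that you manage" heading does not appear in the result.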

This page provides a focused, permission-level-oriented view of Big Data Assets. To see all asset collections for which you have a governance role, irrespective of their type, click your User Name in the upper right corner of the page.

If a Big Data Assets Collection is missing from your views or lacks expected features, you or your security role(s) may lack the proper permissions for it. A manager of the Big Data Assets Collection can give you the needed permissions via its utilities' Users settings. For background information, see Asset Collection Permissions: Viewer, Editor, and Manager.

Another possible cause of a missing feature is that it requires administrative setup to become active. See EDG Administration for the relevant within-application settings, and see the other EDG Administrator Guide documents for the relevant external installation and integration setup.

For each collection on the home page, some brief metadata is available, including information about the workflows available to the user. In the image below, the user has an action on a workflow that they have permission for. Having an action means the user is in a role that is allowed to transition the workflow to the next state, such as "committed". If the user does not have an action but does have permission for the workflow, they are presented with a read-only view when accessing that workflow.
