Big Data Assets Collection Utilities

To perform various kinds of operations on a particular Big Data Assets Collection, open its name-link in the Big Data Assets home listing. This shows the Big Data Assets Collection's assets view/edit page, where the menu bar also lists multiple utility categories, each of whose operations are listed in the corresponding subviews.

Dashboard View

A model's dashboard summarizes various kinds of status and quality measurements that are relevant for that model's type.

Completeness and Validity

The Completeness and Validity charts summarize any data-quality issues found by running the collection's Reports > Problems and Suggestions. These include missing values for required properties (incompleteness) and other types of rule violations. (Required properties are set by the cardinality checkboxes of a property form.) To update the dashboard charts or to see the issue details, run the Problems and Suggestions report.


Process summarizes information about tasks and workflows.

Settings View

The Settings view of this collection pertains to its various references (e.g., inclusions, URIs and namespaces, metadata, etc.).

Graph URI

This is an internal identifier for any data asset created by EDG.

External Graph URI

This defines a base URI that automatically maps with EDG's internal Graph URI (see utilities > Settings) during imports and exports. A manager can either edit this manually or set it automatically by importing an RDF file when no other value exists for the external URI. Also, RDF file import automatically redirects any owl:imports statements to the local copies. Thus, a manager can create a new Ontology and then import an RDF file to pre-populate it correctly.

The inverse mapping happens when graphs are exported back to RDF files: their external Graph URI is used instead of the internal urn:x-evn-master:... URIs.
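
The two mapping directions can be sketched as a simple lookup. This is only an illustration, not EDG's implementation: the registry, both function names, and the example URIs (including the part after urn:x-evn-master:) are hypothetical.

```python
# Hypothetical registry: external base URI -> internal EDG graph URI.
EXTERNAL_TO_INTERNAL = {
    "http://example.org/ontologies/geo": "urn:x-evn-master:geo",
}


def to_internal(uri: str) -> str:
    """Import direction: redirect a known external URI to its local copy."""
    return EXTERNAL_TO_INTERNAL.get(uri, uri)


def to_external(uri: str) -> str:
    """Export direction: replace an internal URI with its external counterpart."""
    internal_to_external = {v: k for k, v in EXTERNAL_TO_INTERNAL.items()}
    return internal_to_external.get(uri, uri)
```

URIs that have no registered counterpart pass through unchanged in both directions.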

Namespaces and Prefixes

This view lists the collection graph's namespace prefixes, which can be used in SPARQL queries and elsewhere. The collection's managers can edit them using Turtle notation.

NOTE: Although comments (#) are accepted, they are not preserved.
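
As an illustration of why comments do not survive, the following sketch reads Turtle @prefix declarations into a prefix-to-namespace table. Only that table is kept, so any # comments are lost. The function name and the simplified regular expression are assumptions made for this example; EDG's own Turtle parser is not shown here.

```python
import re

# Simplified pattern for one "@prefix name: <namespace> ." declaration.
PREFIX_LINE = re.compile(r"@prefix\s+([A-Za-z][\w-]*)?:\s*<([^>]*)>\s*\.")


def parse_prefixes(turtle_text: str) -> dict:
    """Collect prefix -> namespace pairs; comment text is simply not retained."""
    prefixes = {}
    for line in turtle_text.splitlines():
        match = PREFIX_LINE.match(line.strip())
        if match:
            prefixes[match.group(1) or ""] = match.group(2)
    return prefixes


example = """# SKOS vocabulary
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .  # trailing note
@prefix : <http://example.org/default#> .
"""
prefix_table = parse_prefixes(example)
```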

Included-By and Includes

These sections show the references (owl:imports) from and to other models. Note that the choice of models that may be included is restricted to types that are either required or permitted for the current model. See the main Big Data Assets page (under Home > Create) for additional documentation.

Includes based on Subject Area

Asset collections that enable this function will automatically include other collections contained in the same governance area and its sub-areas (both business and data-subject areas). By default, each collection has these auto-includes disabled. Auto-includes should only be enabled for governance areas having a reasonably small number of collections.

Default Namespace

The default namespace is used to construct URIs (unique identifiers) for the classes, properties and other resources in the Big Data Assets Collection.


This item only appears if the Big Data Assets Collection is referenced by any Crosswalk mappings. It lists links to the associated Crosswalks.


This section lets you view or edit information about the Big Data Assets Collection. There is a rich selection of metadata fields, and it is easy to configure EDG to include additional fields if required. The metadata is organized into sections. The view mode only shows the sections and fields that contain information, while the edit mode shows all sections and all fields.


This section records descriptive properties of the Big Data Assets Collection, including its official name (if different from the common name or label).

subject area

This is the governance area (business or data-subject) to which this Big Data Assets Collection is assigned (if any).


The Status section records the life cycle stages of the Big Data Assets Collection.

Users View

Big Data Assets Collection Permission Profiles

Managers of a production collection or a workflow can assign its permission profiles (viewer, editor, or manager) to various users, either as individuals or as security roles (e.g., from LDAP). A production manager can also assign permissions on its child workflows. For non-managers, this view is read-only.

Whereas a production collection allows settings for any EDG user, a workflow copy only allows access to viewers (at least) of the parent production collection. Because each collection or workflow assigns its permissions separately, a given user can have different profiles for a particular production copy and one of its child workflows, or for two different workflows. A blank setting excludes access to the user or role. Any user with multiple assignments on a given collection or workflow receives the greatest level assigned. See Workflows - Permissions for Production Collections and Workflow Copies for details.
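
The rule that a user with several assignments receives the greatest level, and that a blank setting grants nothing, can be sketched as follows. The function name and the ranking table are assumptions for illustration; only the three profile names come from the text above.

```python
from typing import Iterable, Optional

# Profiles ordered from least to greatest access, as named in the text.
PROFILE_RANK = {"viewer": 1, "editor": 2, "manager": 3}


def effective_profile(assignments: Iterable[Optional[str]]) -> Optional[str]:
    """Return the greatest profile among a user's assignments, or None for no access."""
    granted = [profile for profile in assignments if profile]  # blanks grant nothing
    if not granted:
        return None
    return max(granted, key=PROFILE_RANK.__getitem__)
```

For example, a user who is a viewer individually and an editor through a security role would be treated as an editor on that collection or workflow.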

Big Data Assets Collection Governance Roles

This section allows managers to assign governance roles to users: as individuals, as security roles (e.g., from LDAP), or as organizations. For details on governance roles, see Governance Model > Governance Areas (and Roles).

Import View

From any Big Data Assets Collection's production or working-copy home page, the Import functions let editors copy graph data into the given Big Data Assets Collection from external sources such as RDF files, spreadsheets, etc.

Import RDF File

Any Big Data Assets Collection can import data from an external RDF file (in a serialized format). The Import > Import RDF File link shows a screen where the Choose File button opens a dialog for picking the external source file.

Choose the source file and identify its format. Decide whether to record new triples in the change history (use with care!) and then click Finish to complete the import. A message will indicate whether the import was successful.

When importing RDF into a working copy, each added triple can be recorded as an entry in the change history, where it will be available to all the relevant reports. When importing into a production copy, the Record each new triple in change history checkbox gives you the option of recording these additions; note that this is not recommended when importing large amounts of data.

Import File using Script (Customization)

EDG can also incorporate custom scripts for importing arbitrary types of text files, including XML, JSON or spreadsheet files. Each such importer must be set up by a power user or Administrator based on a SPARQLMotion script. Once set up, the custom importers appear on the Import view.

The common requirement of these script-based importers is that they take a single text file as input and output new RDF data. Script-based importers can be activated per model type (e.g. only for Ontologies) or even for only one specific model instance. They are a powerful mechanism to simplify repeatable tasks for end users.

This paragraph assumes that the reader is familiar with SPARQLMotion. To create such a script, create a new RDF/SPARQLMotion file with TBC-ME. The file name must end with .sms.ttl. Into that file, import the namespace (from / Also import the XYprojects.ui.ttlx file for the vocabulary type that you want the script to be activated for. For example, if you want to add a script for Ontology vocabularies, add an import to ontologyprojects.ui.ttlx. Next, use Scripts > Create SPARQLMotion Function/Web Service to create a new service. This service must take a single argument of type xsd:string and return a module of type sml:ReturnRDF. This argument will contain the text content that has been uploaded, e.g. the data from an XML or CSV file. The script may access the currently active target graph (vocabulary) using sml:ImportCurrentRDF. The script needs to produce a graph of new triples to be added to the target graph. On the web service instance that you have created (an instance of sm:Function), use the property teamwork:suitableProjectType to specify the vocabulary type(s) that the script should show up for, e.g. ontologyprojects:ProjectType. Alternatively, use teamworkscripts:suitableVocabulary to link the URI of individual vocabularies, starting with urn:x-evn-master: .

Import Spreadsheet using Template

This screen lets you pick a spreadsheet and a template that will be used to convert the spreadsheet data into reference data. The template may be created using the mapping process explained in the ...using Pattern section, below.

The mapping can also be created using TopBraid Composer when the simple mapping described above is insufficient and you need to perform more complex transformations, for example, concatenation of values. TopBraid Composer's SPINMAP tool provides a drag-and-drop interface that makes it especially easy to create more complex mappings.

Templates developed with TopBraid Composer must be stored in files with ".tablemap." in their name (for example, myMapping.tablemap.ttl) and be uploaded to the EDG server to be available to EDG users. A spreadsheet imported using a template must have exactly the same structure as the spreadsheet used to develop the template. The names and order of the columns must be exactly the same. If multiple worksheets are used, the order and structure of each worksheet (even for worksheets that are not imported) must be the same.

Import Spreadsheet using Pattern

This link shows the following screen:

Click Choose File to pick the external spreadsheet file whose data you want to import into the Big Data Assets Collection. This may be an Excel file (file-type extensions: .xls or .xlsx), a tab-separated value (.tsv) file, or a comma-separated value (.csv) file. The file-name should have the expected extension. Because an Excel file may have more than one sheet of data, this screen lets you specify a sheet index value to identify which sheet to read in. The default is 1, for the first sheet.

The sheet index counts all sheets in an Excel workbook, including hidden ones. For example, if you enter a 3 here and EDG seems to import the second sheet, there may be a hidden one between the first and second sheet that made the third one look like it was the second one when Excel was displaying the workbook. The Excel online help explains how to check for the existence of hidden sheets.
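
The effect of hidden sheets on the index can be sketched as below. The workbook is modelled as a plain list of (name, hidden) pairs, and the sheet names are hypothetical; no spreadsheet library is involved.

```python
def sheet_for_index(sheets: list, index: int) -> str:
    """Return the sheet name for a 1-based index that counts hidden sheets too."""
    if not 1 <= index <= len(sheets):
        raise IndexError(f"sheet index {index} is out of range 1..{len(sheets)}")
    name, _hidden = sheets[index - 1]
    return name


# Hypothetical workbook in which the second sheet is hidden.
workbook = [("Codes", False), ("Lookup", True), ("Regions", False)]
visible = [name for name, hidden in workbook if not hidden]
```

Here index 3 selects "Regions", even though Excel displays it as the second tab because "Lookup" is hidden.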

The Entity type for the imported data field identifies which class (available to the current collection based on its type and includes) to use as a schema for mapping the spreadsheet's row data. In other words, the chosen class's structure (e.g., its datatype and object properties) will act as a template for mapping each data element of each spreadsheet row into the current collection's assets (new or existing instances). Make the (1) file, (2) sheet index, and (3) entity type (class) selections and click Next.

Select Spreadsheet Type

This view enumerates five possible (columnwise) layout patterns for the spreadsheet's row-item data, showing an example of each pattern. For data explicitly structured into a hierarchy, like a taxonomy, there are four layout options. For all other data, there is the No Hierarchy layout (#1). In the hierarchical layouts, each row item also indicates its hierarchical path, either explicitly (absolute path, #2, #3, #4) or implicitly (recursive path, #4, #5; note that lighter text in the layout patterns indicates optional data).


Note the header row of column labels in every layout. The imported spreadsheet should have such a header row.

Below the five layout options, the view also shows a sample of the spreadsheet's actual data. The following image shows a spreadsheet of airport codes.

Select the layout title link that most correctly corresponds to the chosen spreadsheet's structure.

Import Spreadsheet

This view defines the data-mapping rules from the spreadsheet's columns into the class and optional hierarchy structures chosen for the collection data. It also allows users to (1) preview the import, (2) save its settings as a template for future imports, and (3) initiate the import.

Column Mapping

The Column Mapping settings specify which spreadsheet columns will correspond to which properties in the target Big Data Assets Collection. Typically, the mapped properties belong to the target entity type (class), as either attribute/datatype or relationship/object properties. The mapping also supports inverse relationship properties, which belong to other classes that have relationships to the imported entities. Any unmapped properties will be ignored during the import, leaving those property triples unchanged on target instances in the graph.

The following example shows a mapping for the "No Hierarchy" layout.

For target datatype properties, the corresponding spreadsheet column should contain row-values of the matching datatype. The target datatype may also specify an optional Language field setting, which will add the selected language tag to each imported value.

For target relationship (object) properties, the corresponding spreadsheet column's row-value is used as a reference value that matches a property of one instance of the related class. The reference values may either be the corresponding property values of the referenced instances or they may be the referenced instances' URIs. If all of the values in such a column are recognized as valid URLs, then they will be mapped as URIs, as indicated by the associated Use values as URIs label. Otherwise, the relationship values will be normal property values of the referenced class. If the related class has a designated primary key (PK) property, then this is automatically assumed to be the referenced property, and the spreadsheet data must have a column whose values map to this PK property. If the related class lacks a PK designation, then the user must indicate the referenced property by selecting it from the property drop-down list for the related class.

For inverse relationships, the spreadsheet column represents some other class's reference to the imported entities. Similar to forward (non-PK) relationships, if an inverse relationship is the chosen mapping, then there is a further choice of which referring-class property the spreadsheet provides as values for identifying the referring instances.

Even if the referenced property is not designated as a primary key (PK), it is still assumed that all of the corresponding property-values are unique across all referenceable instances. If duplicate values exist, then the referenced instance will be assigned arbitrarily.

Also note that if the related class designates a primary key property, then imported rows will always construct a reference, regardless of whether such an instance exists. On the other hand, if there is no such PK designation, then imported rows will construct a reference only if a matching instance exists.
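
The "all values are valid URLs" test for relationship columns can be approximated as below. How EDG actually decides that a value is a valid URL is not documented here, so the scheme-plus-host check and both function names are assumptions for illustration only.

```python
from urllib.parse import urlparse


def looks_like_url(value: str) -> bool:
    """Assumed test: a value counts as a URL if it has a scheme and a host part."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


def use_values_as_uris(column_values) -> bool:
    """Treat the column as URIs only if every value in it looks like a URL."""
    values = list(column_values)
    return bool(values) and all(looks_like_url(value) for value in values)
```

A single non-URL value in the column is enough for the whole column to be matched against property values of the referenced class instead.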

As explained below, if the target of a relationship has the same entity type as the entity type for the imported data AND you are using matching on the property values to build a relationship, the Override existing values option must be unchecked. Otherwise, the relationship will not be created.

If the imported rows will generate any new instances, as opposed to only adding data to existing ones, then some column should map to the target label property, which determines the name or display field for an instance. When importing into reference datasets, one spreadsheet column must map to the primary-key property that is designated for the dataset's main entity class. For example, the screen image above identifies this field as the 2-letter alphabetic country code.

If the imported rows are adding new data values to existing instances and/or adding new instances, it is best to uncheck the Override existing values option. Checking this option has the following consequences:

  • If an instance already exists and has a value for any of the mapped columns, the value will be replaced with data coming from a spreadsheet.
  • Relationships between instances of the same type that rely on matching of values will not be created (because these values may be overridden as part of the processing).
  • When working with Taxonomies, a combination of checked Override existing values and the No Hierarchy pattern will always make imported instances top concepts of a new Concept Scheme, even if they already exist in the Taxonomy and have parent concepts.
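
The first consequence in the list above (replacing rather than adding values) can be sketched with plain sets of property values. The function name and the sample labels are hypothetical.

```python
def import_value(existing: set, new_value: str, override: bool) -> set:
    """Return an instance's values for one mapped property after importing a cell."""
    if override:
        return {new_value}            # existing values are replaced
    return existing | {new_value}     # the new value is added alongside existing ones
```

With the option unchecked, an instance that already has the value "Korea" and receives "Republic of Korea" from the spreadsheet ends up with both values.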

Hierarchical spreadsheet types

If a hierarchical pattern was selected, then there will also be Hierarchy settings that specify how the spreadsheet represents the hierarchical relationships of its data items.

Note that one could still use the No Hierarchy import pattern for a taxonomy. This would import spreadsheet terms as separate concepts without connecting them hierarchically. The taxonomy's editors could then use the Concept Hierarchy view to manually arrange the imported nodes in the taxonomy's concept tree.

  • For Path with Separator spreadsheets, make sure to assign at least one column under Column Mapping as the preferred label. For other types, the importer can often infer the preferred label column. (See below about the Preview button.)
  • For Column-based Trees and Column-Pair Based Tree spreadsheets, specify the top and bottom levels of the hierarchy by picking the first and last column names.
  • For Path with fixed-length Segments spreadsheets, specify the column with the path values used as IDs and the length of the segments within the path IDs. In the Path with fixed-length Segments sample layout on the Select Spreadsheet Type screen, the Id column has the path values, and each two-digit segment of these values indicates a step of the hierarchy; removing the last two digits of any of those Id values shows the Id value for that term's parent. For example, Australia has an Id value of 010201, which has a parent value of Pacific (Id value 0102), which has a parent of World (Id value 01).
  • For Path with Separator spreadsheets, in which a spreadsheet entry such as "World > Europe > France" indicates the hierarchical structure above the term "France", specify the column storing these values using the Column containing the paths field and the Path separator character in the field with that name. If your spreadsheet also includes an ID column, the Hierarchy section includes a dropdown field to indicate this.
  • For Self-Join spreadsheets, there are columns to specify the Column containing the parent ids and the Column containing the child ids. In the Self-Join sample on the Select Spreadsheet Type screen, these would be the Parent and Term columns, respectively. A Hierarchy Property field also lets you set whether you want the "has broader" property used to identify relationships in the displayed hierarchy or something else.
  • The Generate in inverse direction checkbox will reverse the direction of how the property specified in Hierarchy property is applied.
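
Two of the hierarchy encodings described above can be sketched in a few lines: the parent of a fixed-length-segment Id is the Id with its last segment removed, and a separator path splits into its levels. The function names are assumptions for this example; the sample values come from the text above.

```python
from typing import List, Optional


def parent_id(path_id: str, segment_length: int) -> Optional[str]:
    """Parent of a fixed-length-segment path Id, or None at the top level."""
    if len(path_id) <= segment_length:
        return None
    return path_id[:-segment_length]


def split_path(path: str, separator: str = ">") -> List[str]:
    """Split a separator-based path into its hierarchy levels, top level first."""
    return [part.strip() for part in path.split(separator)]
```

For the sample values, parent_id("010201", 2) gives "0102" (Pacific), and split_path("World > Europe > France") gives the three levels with "France" last.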

Unique Identifiers

This section defines the URI for each imported row. Selecting Overwrite existing values will delete an existing value for a mapped property before adding its new (different) value; otherwise, new values will be added to existing ones. Selecting Record each new triple in change history (warning: not recommended for large files) makes EDG record the addition of each new triple in the change history.

The Preview button on the Import Spreadsheet form lets you see the RDF triples that would be generated with the currently configured settings. The browser's Back button returns to the form.

The Optional: Make this a reusable mapping template field lets you save all of the settings on this form so that you can later import other spreadsheets with a similar layout without filling out this form. Once you use this field to name a template of settings, if you later pick Import Spreadsheet using Template on the Import tab instead of Import Spreadsheet using Pattern, you'll see a drop-down list of the saved template names, a Browse button to pick a spreadsheet, and a Finish button to perform the actual conversion.

When you are satisfied with the sample data shown on the preview screen, click the Finish button. EDG will display a message about successful importing of the data along with a set of data that you can use as a mapping file for imports of spreadsheets with a similar structure in the future.

Transform View

Execute Rules

If any SHACL (or SPIN) rules have been defined in the current collection or any of its inclusions, they will be listed here. Editors can select rules for execution to generate inferred triples that are added to the current collection's graph.

Export View

Export Big Data Assets Collection as a Graph

Any viewer of a collection can export its data in a standard RDF serialization format.

Creating reports and exporting data are available to anyone who has read access, for both production and working copies of reference data.

To export subsets of Big Data Assets Collection data according to custom criteria and sorting, note that EDG's Search screen provides fine-grained control over the data to display on the Search Results area. That form's gear menu offers several choices to export the results into spreadsheet-compatible formats (e.g., for Excel). See Big Data Assets Collection View or Edit for details on searching.

Select JSON-LD, N-Triples, RDF/XML, or Turtle under the Export Big Data Assets Collection as a Graph header of a production or working copy's Export view to generate an RDF file representation of the reference data in that format. Different browsers may display the result in different ways, or perhaps not display anything at all in the standard browser window; selecting your browser's equivalent of the View Source command will display the actual RDF data. When viewing the source, you can also pick Save As or Save Page As from your browser's File menu to save the RDF data as a disk file.

Instead of clicking one of these links, an alternative is to right-click it and then pick Save Target As or Save Link As, depending on your browser, to save the RDF representation of the data to a local file. A dialog box will then prompt you for the name and location of the file.

SPARQL Endpoint

This allows users of the collection to run new or saved SPARQL queries on it and to optionally save queries for others. Saved queries can be deleted by their creators and by collection managers. If SPARQL updates have been enabled by an administrator, editors (and managers) can run them, but viewers cannot. Note that the Pivot Table and Geo functions can be slow on some platforms and are not supported for Internet Explorer.

Saved SPARQL Queries

This lists the SPARQL queries that have been saved for the collection. For each query, it provides a URL that will run it, along with an Export Query button that runs it and shows the results.

Publish Big Data Assets Collection for Explorer Users

On an EDG server that is paired with a TopBraid Explorer server (for read-only access), managers of an editable Big Data Assets Collection can publish it to the Explorer for viewing. Note that the working copies of all published asset collections might or might not be viewable, depending on the Explorer's administrative configuration.

Any manager of an asset collection can control its Explorer publication status by selecting Export > Publish Big Data Assets Collection for Explorer Users. The view shows a Status drop-down for the asset collection, which indicates whether the asset collection was ever Published or not (Unpublished).

It also lists any included asset collections that might also require publication.

Ensure that all included graphs are either already present on the Explorer server or published along with the asset collection.

Changing the status causes the following action.

Current Status | Chosen Option | Result
Unpublished | Published | Sends a copy of the asset collection and selected includes to the Explorer server. Changes the source collection's status to Published.
Published | Update Published Copy | Re-sends a current copy and selected includes to the Explorer server, overwriting the previous version(s). Keeps the source collection's status as Published.
Published | Unpublished | Deletes the asset collection on the Explorer server. Changes the source collection's status to Unpublished.
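
The table above amounts to a small state machine, sketched here as a lookup from (current status, chosen option) to (action, resulting status). The action labels and the function name are shorthand invented for this example, not EDG identifiers.

```python
# (current status, chosen option) -> (action on the Explorer server, new status)
TRANSITIONS = {
    ("Unpublished", "Published"): ("send copy", "Published"),
    ("Published", "Update Published Copy"): ("re-send copy", "Published"),
    ("Published", "Unpublished"): ("delete remote copy", "Unpublished"),
}


def change_status(current: str, chosen: str) -> tuple:
    """Look up the action and resulting status for a status change."""
    try:
        return TRANSITIONS[(current, chosen)]
    except KeyError:
        raise ValueError(f"No action defined for {current!r} -> {chosen!r}") from None
```

Combinations that the table does not list, such as updating the published copy of an unpublished collection, are rejected in this sketch.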

Export Saved Search

The Saved Search link shows a screen listing your saved searches. These are searches that you have saved using the Save current search button in the search form of the Editor page.


After setting the Result Format for a given search, clicking its Export button will download the search results in that format. Your saved searches are web services, so they can also be used as APIs by other systems.

GraphQL Queries

This allows users of the collection to run new or saved GraphQL queries on it. For users unfamiliar with GraphQL, there is an included link to a tutorial inside TopBraid EDG and TopBraid Composer. For more information on GraphQL visit

Reports View

Anyone with read access to a production Big Data Assets Collection or working copy can generate various standard reports for it. Custom reports are also possible.

Problems and Suggestions Report

For any Big Data Assets Collection (production copy or working copy managed by a workflow), Reports > Problems and Suggestions checks the current state of the Big Data Assets Collection against all of its applicable quality rules (i.e., its shapes and the validity constraints they define) and enrichment rules. A message box shows the rule-processing progress and then shows the report. Note that the report results are also reflected in the Dashboard > Completeness and Validity display.

Users can also enable validity checking when they are viewing individual resources in the form. This setting then applies across all of the asset collections the user works with. See the View or Edit documentation for details.

To develop custom extensions to this feature, see EDG Developer Guide > Extending the Problems and Suggestions Reports.

View Shapes and Constraints

This link lists all of the SHACL shapes and constraints that are currently applicable for the given Big Data Assets Collection. Editors of the Big Data Assets Collection can individually disable them (cf. internally using sh:deactivated=true). They will then be disabled for the asset collection in which the change was made and for any asset collections that include it. To disable them more globally, e.g., for all Big Data Assets, use the Ontology modeling features of EDG.

Note that shapes not only define rules about valid values for properties; they also specify that a given property is available for a given type of asset. When you use the View Shapes and Constraints page to disable a shape, you are disabling the field itself. The Ontology modeling features of EDG give you finer control: you can keep the field but disable some of the constraints defined for it.

View Change History

Click Reports > View Change History to show the Change History view. For a production copy, this shows all the changes made since it has been created. For a workflow, this shows only changes made within the working copy managed by the workflow.

Clicking the Search button on the Change History screen displays a time-stamped list of the saves made in the Matching Changes panel, and clicking one of those lines displays details about what changes were made as part of that save operation in the Details of Selected Change panel. Below, the change made on July 30th has just been clicked, showing that three values were added and one was deleted as part of the change made with a particular save operation.

If you are logged in as a user who is editor or manager of the vocabulary/asset collection or a workflow where the change was made, then a link Revert this Change will appear in the bottom panel. Click on this link to undo this operation. This will in fact create a new "forward" edit in the change history, with yourself as author. Note that this feature should be used with care, because reverting some steps from the middle of the change history may lead to orphan resources in your model.
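
The "forward edit" behaviour of a revert can be sketched with a simplified change record: reverting appends a new change that swaps the added and deleted triples of the original, authored by the reverting user. The Change class and the revert function are illustrative assumptions and not EDG's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    author: str
    added: frozenset
    deleted: frozenset


def revert(history: list, index: int, author: str) -> Change:
    """Append a new change that undoes history[index]; the original entry stays."""
    original = history[index]
    inverse = Change(author=author, added=original.deleted, deleted=original.added)
    history.append(inverse)
    return inverse
```

Because the original entry is never removed, the history keeps growing, and any later changes that depended on the reverted triples are left as they were, which is how orphan resources can arise.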

If you are logged in as a user who is editor or manager of an asset collection and look at a change performed in a working copy as part of a workflow, then a link Commit this Change to production will appear in the bottom panel. You can click on this link to move the change history entry (in the example above, the three additions and the deletion) out of the workflow copy and into the production copy, essentially cherry-picking which change from a workflow copy you want to accept. As with the Revert feature mentioned above, this feature should be used with care, because committing some steps from the middle of the change history may lead to creating data statements that are disconnected from the rest of the information. For example, when you commit a change that has modified some attribute of a newly created code, then you should also make sure that the change that created the code in the first place has also been committed.

Before you click the Search button, you can narrow the scope of the search by filling out any or all of the fields at the top of the form:

  • creator Enter the name of a particular EDG user to only see changes by that user. This field uses typeahead, so that if you have users named "Joe" and "Joan" and only type in "Jo", these two names will appear in a drop-down list for you to pick from.

  • date Enter a date in the first date field to see all changes after that date, a date in the second field to see all changes before that date, or in both fields to see the changes within a particular date range. (There's no need to actually type in the date value; clicking in either field displays a calendar where you can then click on the date you want to enter.)

  • status Enter "committed" or "uncommitted" to only list changes with one of these status values.
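
The three search fields combine as optional filters, which can be sketched as follows. The record layout (a dict with creator, date, and status keys) and the function name are hypothetical, and treating the date bounds as inclusive is an assumption.

```python
from datetime import date
from typing import Optional


def matches(change: dict, creator: Optional[str] = None,
            after: Optional[date] = None, before: Optional[date] = None,
            status: Optional[str] = None) -> bool:
    """Return True if a change record passes every filter that was filled in."""
    if creator is not None and change["creator"] != creator:
        return False
    if after is not None and change["date"] < after:    # assumed inclusive bound
        return False
    if before is not None and change["date"] > before:  # assumed inclusive bound
        return False
    if status is not None and change["status"] != status:
        return False
    return True
```

Leaving every field empty matches all changes, which corresponds to clicking Search without narrowing the scope.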

Comparison Report

For a production Big Data Assets Collection, this report shows its differences from another, user-selected Big Data Assets Collection. For a working copy, it shows the differences from its parent production version. Note that differences do not extend to the contents of included asset collections. The report lists each changed asset and the properties that were changed, showing the changed values. If a value was added, it is shown in green. If it was deleted, it is shown in pink.

For example, the following shows what happens after the preferred label property for "South Korea" is edited, an alternative label is added, and "Seoul" is added as a narrower value of the "South Korea" resource (renamed to "Republic of Korea").
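
Conceptually, such a comparison is a set difference over triples. The sketch below models the South Korea example with hypothetical identifiers (ex:KR, ex:Seoul) and a made-up alternative label; it illustrates the idea and is not how EDG computes the report.

```python
def compare(production: set, working: set) -> dict:
    """Return the triples added to and deleted from production in the working copy."""
    return {"added": working - production, "deleted": production - working}


production = {("ex:KR", "skos:prefLabel", "South Korea")}
working = {
    ("ex:KR", "skos:prefLabel", "Republic of Korea"),  # edited preferred label
    ("ex:KR", "skos:altLabel", "Korea (South)"),       # hypothetical alternative label
    ("ex:KR", "skos:narrower", "ex:Seoul"),            # new narrower value
}
report = compare(production, working)
```

The edited label shows up twice: the old value among the deleted triples and the new value among the added ones, matching the pink and green entries of the report.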

The right hand side of each change contains a link View Change that displays a dialog box with details of the change log entry that caused that particular change. Depending on your permissions, you can revert or commit the change in that dialog box. See View Change History for further information on reverting and committing individual changes.


For any Big Data Assets Collection (production or working copy), Reports > Graph Statistics displays details about the Big Data Assets Collection's node distribution. The following shows the statistics for the sample Reference Dataset: Country Codes.

Workflows View

This view allows users to start workflows, and it lists both the active and completed workflows, if any. (For general information about workflows, see Workflows).

Start new Workflow

This button opens a form for starting a workflow. If multiple workflow templates (types) are available, select the appropriate one. The new workflow requires a name and allows you to enter an optional description, both of which remain editable by managers. For more information, see Workflows Utilities and related pages.

Users can also create a workflow pertaining to a selected asset (see View or Edit > Actions > Additional asset actions). Such workflows record the identity of the selected asset but are otherwise ordinary.

Workflows in Progress

This section lists any active (uncommitted) workflows of the collection. To access a particular workflow, select its row and click Go to Workflow. You will see a page showing you the status of the workflow and, depending on the workflow's status and your role, allowing you to move the workflow to the next state.

Also depending on the workflow's status and your role, you can view or edit the workflow and view or execute various utility actions on it. A workflow can be used to process changes to multiple assets or changes to one specific asset. Each workflow isolates its changes in its own workflow copy, which does not affect other workflows or the production version unless and until the workflow is committed back into production.

If the workflow was created for a specific asset, its name will appear in the row. Selecting the row and clicking Go to Asset will open the asset's details view, which workflow editors can also modify.

Completed Workflows

This table works similarly to Workflows in Progress except that it lists the workflows that have reached a terminal state. Typically, this means that changes have been finalized and committed to the asset collection. Users can view the history of workflow transitions. Each completed workflow shows its number of changed statements (triples), which indicates the volume of changes made as part of the workflow.

For completed workflows with extensive changes, preserving the history of changed triples might occupy considerable space. Therefore, asset collection managers can select a completed workflow and use the Archive action to remove the audit trail from the change history. The change records are copied into a file in a new project that an administrator can access if the change history details are ever needed again. To browse these files, use the Base URI Management page in the Server Administration area. The files are located in a project (or folder) called "Archive". If they are no longer needed, you can move them off the server.

Tasks View

The Tasks feature allows users to associate tasks with asset collection resources. If the Tasks item is not listed in asset collections' main utility (operations) view, then see EDG Configuration Parameters for how an administrator can enable the Tasks activated configuration parameter.

Tasks for [Big Data Assets Collection NAME]

When this feature is active, tasks can be associated with either a top-level asset collection or with a resource it contains, such as a class, property, or individual code. The Tasks management view of an asset collection shows the tasks associated with it at any level. At the bottom of the view, the Create Task link displays a dialog box where you can add a new task's description and user assignment. Once you click this dialog box's OK button, EDG adds the task to the list for this data asset, where you can reset its status or who it's assigned to with that form's drop-down lists. You can then filter the list display by these values.

When editing a particular asset resource within a collection, the Tasks button on its details edit form allows viewing and creating tasks about the resource. The button also indicates the number of tasks assigned to the resource.

Newly created tasks are, by default, assigned to the manager of the asset collection, who can then reassign tasks to other users. A user assigned to a task can change its status and enter comments about tasks.

Administrators can activate a feature to Send task emails in the EDG Configuration Parameters. When activated, users with an email address (e.g. via LDAP) will receive emails whenever a task gets assigned to them, or if a property of an assigned task has changed.

Comments View

The Comments feature allows users to associate comments with asset collections and asset resources. If the Comments item is not listed in asset collections' main utility (operations) view, then see EDG Configuration Parameters for how an administrator can enable the Comments activated configuration parameter.

Recent Comments

When viewing or editing a resource such as a class, instance, or taxonomy concept, the Comments button in the lower-right shows how many comments have been added to the selected resource for this production or workflow copy. Clicking the button displays a dialog box where you can see previous comments and add your own under the "Add Comment" title; click the OK button when you are finished.

Comments have a status such as "open," "declined" or "resolved." To change a comment's status, use the drop-down list to the right of the comment entry. If you also have the TopBraid Explorer (Viewer) application, the display can also include comments from those viewers, marked with (via TopBraid Explorer).

To get a list of the 100 most recent comments for a production or workflow copy, select its Comments management view. These comments can be filtered by status, for example, to display only the "open" comments.

When resources such as concepts, classes, or instances are deleted, their comments are not automatically deleted with them. These are known as "orphan comments." If there are any orphan comments associated with a given asset collection, the Comments view will include a hypertext link saying "Delete the X orphan comments about entities that no longer exist," where X is the number of orphan comments associated with this asset collection. Clicking this link will delete these comments.

Manage View

Each collection's Manage view is only available to its managers.

Create a Cloned Version

Managers of a particular Big Data Assets Collection can use the Create a Cloned Version function to create one or more named clones of the Big Data Assets Collection. A new clone will have the same content and user permission settings as the original production instance. However, neither the change history nor the working copies will be cloned.

Cloning is often used to "branch off" a version of the Big Data Assets Collection, so that it can be referenced and imported separately from the current version. For example, one could start with a Big Data Assets Collection called "People." Then, on reaching a milestone, one could create a clone and call it "People 1.0." Now, any other Big Data Assets Collection that explicitly should only use terms from version 1.0 could change its includes to that version only, while the ongoing work towards version 2.0 will continue on the main "People" Big Data Assets Collection.


Managers can Clear a particular Big Data Assets Collection, which deletes all of its content, history, working copies, comments, and tasks. The empty Big Data Assets Collection itself and its user permission settings will be preserved. This feature can be used prior to file imports, to replace the whole content with an externally generated version.


Managers can delete a Big Data Assets Collection via its Delete link, which raises a message box to confirm the deletion. Clicking OK will delete the Big Data Assets Collection production instance and any working copies and history data.

A deleted Big Data Assets Collection is not recoverable.

Configure Notifications

For each Big Data Assets Collection, EDG can send notification messages to users in selected roles when certain kinds of changes happen to it. For users to receive email notifications, the SMTP parameters in the Server Configuration must be configured. The Manage > Configure Notifications link provides a page listing all available Notification Events together with check-boxes to select the governance roles that will be notified.

The association of users with the governance roles for this collection is configured via governance areas. The user settings can be specified directly as individual users or indirectly as either user security roles or job titles. See Governance Model Overview for a discussion.

JIRA Project Key

Note that this item only appears if an administrator has set up the EDG Administration: JIRA Integration Parameters.

JIRA is Atlassian's web application for team issue-tracking. EDG's JIRA launch-in-context (LiC) feature allows users who are working in both EDG and JIRA to launch from editing particular EDG asset items into related JIRA searches and new items.

If the EDG JIRA feature has been set up by an administrator, then each collection manager can optionally set a JIRA project key string for the asset collection, where the key identifies a specific project in the JIRA application. Setting the project key then enables JIRA LiC functions for collection editors. When editing any asset item, editors can use its gear menu to create or search for related JIRA issues. See Big Data Assets Collection View or Edit – Manage > JIRA Launch-in-Context for details.

Setting the project key also adds a JIRA link to the collection's utility view header, which launches into JIRA to show the configured project's open items.

Record Triple Counts only

Selecting this option disables retention of change history at the level of individual triples (for production graphs). Instead, only summary counts of the triples added or deleted are recorded. This significantly reduces storage and memory impacts at the cost of losing detailed change information and the ability to undo (revert) the changes. Working copy graphs and existing change histories are not affected by this setting.
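The trade-off can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not EDG's internal data model: the `DetailedChange` and `CountOnlyChange` classes and the sample triples are hypothetical. A record that retains individual triples can be reverted, while a count-only record cannot.

```python
# Hypothetical illustration of detailed versus count-only change records.
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class DetailedChange:
    """Keeps every added and deleted triple, so the change can be undone."""
    added: List[Triple] = field(default_factory=list)
    deleted: List[Triple] = field(default_factory=list)

    def revert(self, graph: Set[Triple]) -> None:
        # Undo is possible only because the individual triples were retained.
        graph.difference_update(self.added)
        graph.update(self.deleted)


@dataclass
class CountOnlyChange:
    """Keeps only summary counts; there is nothing left to revert from."""
    added_count: int = 0
    deleted_count: int = 0


def summarize(change: DetailedChange) -> CountOnlyChange:
    """Reduce a detailed record to the counts a count-only history would keep."""
    return CountOnlyChange(len(change.added), len(change.deleted))


# Assumed sample data: the graph after one label edit, plus its change record.
graph = {("ex:KR", "rdfs:label", "Republic of Korea")}
change = DetailedChange(
    added=[("ex:KR", "rdfs:label", "Republic of Korea")],
    deleted=[("ex:KR", "rdfs:label", "South Korea")],
)

summary = summarize(change)  # all that a count-only history would retain
change.revert(graph)         # works only for the detailed record
```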

Root Class (of Hierarchy)

For a given Big Data Assets Collection, Manage > Root Class of Hierarchy lets one reset which class will be the root of its Class Hierarchy whenever someone edits the Big Data Assets Collection (for example, if the Big Data Assets Collection specializes a standard class, perhaps its custom class's ancestors should not show).


Configure Solr

Apache Solr is an open-source enterprise search platform that can be used in conjunction with TopBraid. In a nutshell, TopBraid can be configured to automatically update a Solr search index whenever changes to a vocabulary are made. The resulting Solr index can then be queried either from external applications or via TopBraid's own SWA Faceted Search components. In addition to powering faceted search, Solr can also improve string and text searches, speeding searches over large vocabularies.

Setting up a Solr Integration

This section applies to Solr 6.1 and later.

Launch Solr with the following command (shown with a Windows path; on Linux or macOS, use bin/solr):

bin\solr start -e schemaless

and then use the following URL when configuring a Solr server: http://localhost:8983/solr/gettingstarted

After making any changes to those settings, you may want to (re)initialize the Solr index by clicking Rebuild Solr Index in the Export section of the vocabulary page.

If Solr has been activated as above, any regular edit to the production (master) vocabulary will be automatically sent to the Solr index. For example, if a user adds a new description to a vocabulary concept, the Solr index will contain a Solr document with the URI of the concept as its id and Solr fields for each of the relevant properties of the concept.

Manual Re-indexing of a Solr Index

After certain batch operations, the automatic indexing may not be triggered, and the Solr index needs to be recreated by hand. In particular, the index should be rebuilt if the imports of a vocabulary have been changed, because the Solr index will cover both the main vocabulary and its imports. Use Rebuild Solr Index from the Export section of the vocabulary page to trigger a rebuild. Note that this will completely wipe out the associated index.

Using a Solr Index

The Solr index can be accessed by any Solr-compliant third party application. In addition, the SWA Faceted Search interface will automatically make use of the Solr index if one has been configured for the vocabulary being queried.
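As an illustration of access from an external application, the following Python sketch builds a request against Solr's standard /select handler and reads the matching documents from the JSON response. It assumes the example server URL given above. The search term is hypothetical, and whether a bare term matches anything depends on the default search field of your Solr configuration.

```python
# Minimal sketch of querying a Solr index over HTTP with the standard library.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the example core URL from the setup section above.
SOLR_CORE_URL = "http://localhost:8983/solr/gettingstarted"


def build_select_url(core_url: str, query: str, rows: int = 10) -> str:
    """Build a URL for Solr's /select handler that asks for a JSON response."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{core_url}/select?{params}"


def search(core_url: str, query: str, rows: int = 10) -> list:
    """Run the query and return the list of matching Solr documents."""
    with urlopen(build_select_url(core_url, query, rows)) as response:
        payload = json.load(response)
    return payload["response"]["docs"]


# Building the URL needs no running server; calling search() does.
url = build_select_url(SOLR_CORE_URL, "Korea", rows=5)
```

With a running Solr instance, `search(SOLR_CORE_URL, "Korea")` would return the matching documents, each carrying the resource's URI in its `id` field as described in the setup section.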

The faceted search component can be accessed from a menu item in the drop-down menu under the search form: Open Faceted Search Window... for the currently selected class. Furthermore, you can define a custom teamwork project type that uses an editor that includes the faceted search. Check the TopBraid Teamwork Platform Help to learn more.

Enable Per-Asset Governance Roles

When checked, this option allows editors to assign governance roles at the level of individual assets. When enabled, editing a selected asset shows a section Governance Roles for this Asset, which lists available governance roles, each of which is multiply assignable to users and security roles. The roles set on an asset will pertain to any workflows directly spawned from the asset (whereas collection-spawned workflows use the collection-level role settings).
