Workflow Utilities

Viewers of any production collection can access any child workflow in either of two ways:

  • select link: [collection type] > [production-collection listing] > [workflow name-link] or
  • select table row: [production-collection page] > Workflows > Workflows in Progress [table] [workflow-name row], and then click the Go to Workflow button.

Accessing a workflow requires at least viewer permissions on both the production and workflow copies.

A workflow's view has an active header section and multiple categories of utilities (subviews).

Name and Description

The Workflow's main view (gear tab) shows its name (menu bar) and description (top of tab-pane). Managers can edit either one by clicking on the text, editing, and confirming by clicking on the edit-box's check-mark.

Edit Working Copy

The Assets item opens the Workflow's editing view. (See Workflows for details on the difference between using workflows and directly editing production copies.) For details on editing the workflow, see the corresponding collection type's Editing page.

Utility Groups: [Gear], Dashboard, Settings, Users, etc.

Various utility groups provide operations related by functional area and/or user role.

Workflow View [Gear]

This view shows the workflow's current status (state) and available actions (transitions), according to the user's role.

The Basic Workflow

The standard EDG workflow template includes the Basic workflow, which supports optional review cycles and staging before committing changes to production.

The Basic workflow's states and actions (transitions)

Basic WF Status | Action | Role(s): Graph(s) * | Description | Resulting Status
ANY | Refresh Working Copy | Viewer: WF | Incorporates other users' changes into this view | NO CHANGE
ANY | Cancel this workflow | Manager: either | Irrevocably deletes the entire WF | N/A
Uncommitted (editable) | Commit changes to production | Editor: Production | Applies all WF changes into production | Complete
Uncommitted (editable) | Freeze for review | Manager: WF | Pauses WF editability | Frozen for Review
Frozen for Review | Request further changes | Editor: WF | Resumes WF editability | Uncommitted
Frozen for Review | Reject changes | Editor: WF | Blocks applying any WF changes into production | Rejected
Frozen for Review | Approve changes | Editor: Production | Ends WF editability (staged for production) | Approved
Rejected | Allow further changes | Editor: Prod. & Manager: WF | Resumes WF editability | Uncommitted
Approved | Accept changes to production | Editor: Production | Applies all WF changes into production | Complete
Complete | (none) | | | Complete

* All actions require at least viewer permissions on both the production and WF copies.
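
The Basic workflow's statuses and actions can be read as a small state machine. The following Python sketch only transcribes the transitions listed above; it is not EDG code, the function name next_status is hypothetical, and it ignores the role checks.

```python
# Transition table transcribed from the Basic workflow description above.
# Keys are (current status, action); values are the resulting status.
TRANSITIONS = {
    ("Uncommitted", "Commit changes to production"): "Complete",
    ("Uncommitted", "Freeze for review"): "Frozen for Review",
    ("Frozen for Review", "Request further changes"): "Uncommitted",
    ("Frozen for Review", "Reject changes"): "Rejected",
    ("Frozen for Review", "Approve changes"): "Approved",
    ("Rejected", "Allow further changes"): "Uncommitted",
    ("Approved", "Accept changes to production"): "Complete",
}


def next_status(status, action):
    """Return the status after an action, or None if the workflow is deleted."""
    if action == "Refresh Working Copy":
        return status  # listed for ANY status; the status does not change
    if action == "Cancel this workflow":
        return None  # the workflow is irrevocably deleted
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"Action {action!r} is not available in status {status!r}")
```

For example, next_status("Frozen for Review", "Approve changes") returns "Approved", while an action that is not listed for the current status raises ValueError.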

Settings View

The Settings view of this collection pertains to its various references (e.g., inclusions, URIs and namespaces, metadata, etc.).

NOTE: Workflow Settings are read-only. They can be modified on the production version by editors.

Included-By and Includes

These sections show the references (owl:imports) from and to other collections. Editors can modify the includes of production collections, where the choice of collections-to-include is restricted to collection types that are either required or permitted for the current production collection. See the main Workflows page (under Home > Create) for additional information.

Default Namespace

The default namespace is used to construct URIs (unique identifiers) for the classes, properties and other resources in the Workflow.

Graph URI

This is an internal identifier for any data asset created by EDG.

External Graph URI

This defines a base URI that is automatically mapped to EDG's internal Graph URI (see utilities > Settings) during imports and exports. A manager can either edit this manually or set it automatically by importing an RDF file when no other value exists for the external URI. In addition, an RDF file import automatically redirects any owl:imports statements to the local copies. Thus, a manager can create a new ontology and then import an RDF file to pre-populate it correctly.

The inverse mapping happens when graphs are exported back to RDF files: their external Graph URI is used instead of the internal urn:x-evn-master:... URIs.

Metadata

This section lets you view or edit information about the Workflow. There is a rich selection of metadata fields, and it is easy to configure EDG to include additional fields if required. The metadata is organized into sections. The view mode only shows the sections and fields that contain information, while the edit mode shows all sections and all fields.

Overview

This section records descriptive properties of the Workflow, including its official name (if different from the common name or label).

Subject Area

This is the governance area (business or data-subject) to which this Workflow is assigned (if any).

Status

The Status section records the life cycle stages of the Workflow.


Users View

Workflow Permissions

Managers of a production collection or a workflow can assign its permission profiles (viewer, editor, or manager) to various users, either as individuals or as security roles (e.g., from LDAP). A production manager can also assign permissions on its child workflows. For non-managers, this view is read-only.

Whereas a production collection allows settings for any EDG user, a workflow copy only allows access to viewers (at least) of the parent production collection. Because each collection or workflow assigns its permissions separately, a given user can have different profiles for a particular production copy and one of its child workflows, or for two different workflows. A blank setting excludes access to the user or role. Any user with multiple assignments on a given collection or workflow receives the greatest level assigned. See Workflows - Permissions for Production Collections and Workflow Copies for details.
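
The "greatest level assigned" rule can be illustrated with a short Python sketch. This is an assumption-laden illustration, not EDG's implementation: the function name effective_profile and the representation of assignments as a list of profile names are hypothetical.

```python
# Profiles ordered from least to greatest, as named in the text above.
LEVELS = {"viewer": 1, "editor": 2, "manager": 3}


def effective_profile(assignments):
    """Return the greatest profile among a user's direct and role-based assignments.

    `assignments` is a list of profile names; blank entries ("" or None)
    grant nothing. Returns None when no assignment grants access.
    """
    granted = [a for a in assignments if a in LEVELS]
    if not granted:
        return None  # a blank setting excludes access
    return max(granted, key=LEVELS.get)
```

A user who is a viewer individually and an editor through a security role would resolve to "editor"; a user with only blank settings resolves to None.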

Import View

From any Workflow's production or working-copy home page, the Import functions let editors copy graph data into the given Workflow from external sources such as RDF files, spreadsheets, etc.

Import RDF File

Any Workflow can import data from an external RDF file (in a serialized format). The Import > Import RDF File link shows a screen where the Choose File button opens a dialog for picking the external source file.

Choose the source file and identify its format. Decide whether to record new triples in the change history (use with care!) and then click Finish to complete the import. A message will indicate whether the import was successful.

When importing RDF into a working copy, each added triple can be recorded as an entry in the change history, where it will be available to all the relevant reports. When importing into a production copy, the Record each new triple in change history checkbox gives you the option of adding these triples to the change history; note that this is not recommended when importing large amounts of data.

Import Spreadsheet using Template

This screen lets you pick a spreadsheet and a template that will be used to convert the spreadsheet data into reference data. The template may be created using the mapping process explained in the ...using Pattern section, below.

The mapping can also be created using TopBraid Composer when the simple pattern-based mapping is insufficient and you need to perform more complex transformations, for example, concatenation of values. TopBraid Composer's SPINMAP tool provides a drag-and-drop interface that makes it especially easy to create more complex mappings.

Templates developed with TopBraid Composer must be stored in files with ".tablemap." in their name (for example, myMapping.tablemap.ttl) and be uploaded to the EDG server to be available to EDG users. A spreadsheet imported using a template must have exactly the same structure as the spreadsheet used to develop the template: the names and order of the columns must be exactly the same. If multiple worksheets are used, the order and structure of each worksheet (even for worksheets that are not imported) must be the same.
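
Both requirements can be checked before uploading. The following Python sketch is only a local pre-check under stated assumptions: the helper names are hypothetical, and the column headers are assumed to have already been read into lists.

```python
def is_tablemap_filename(name):
    """True if the file name contains ".tablemap.", as required for templates."""
    return ".tablemap." in name


def same_structure(template_headers, spreadsheet_headers):
    """True only if column names AND their order are exactly the same."""
    return list(template_headers) == list(spreadsheet_headers)
```

For example, is_tablemap_filename("myMapping.tablemap.ttl") is true, and two header rows with the same names in a different order do not count as the same structure.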

Import Spreadsheet using Pattern

This link shows the following screen:

Click Choose File to pick the external spreadsheet file whose data you want to import into the Workflow. This may be an Excel file (file-type extensions: .xls or .xlsx), a tab-separated value (.tsv) file, or a comma-separated value (.csv) file. The file-name should have the expected extension. Because an Excel file may have more than one sheet of data, this screen lets you specify a sheet index value to identify which sheet to read in. The default is 1, for the first sheet.

The sheet index counts all sheets in an Excel workbook, including hidden ones. For example, if you enter a 3 here and EDG seems to import the second sheet, there may be a hidden one between the first and second sheet that made the third one look like it was the second one when Excel was displaying the workbook. The Excel online help explains how to check for the existence of hidden sheets.
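
The effect of hidden sheets on the sheet index can be sketched in Python. This is a minimal illustration, assuming the workbook is represented as a list of (name, hidden) pairs in workbook order; the sheet names and function names are hypothetical.

```python
def sheet_at_index(sheets, index):
    """Return the sheet name at a 1-based index that counts ALL sheets."""
    if not 1 <= index <= len(sheets):
        raise IndexError(f"No sheet at index {index}")
    return sheets[index - 1][0]


def index_including_hidden(sheets, visible_position):
    """Translate a 1-based position among visible sheets into the
    index that counts hidden sheets as well."""
    seen = 0
    for i, (_name, hidden) in enumerate(sheets, start=1):
        if not hidden:
            seen += 1
            if seen == visible_position:
                return i
    raise IndexError(f"No visible sheet at position {visible_position}")


# A hidden sheet sits between the first and second visible sheets.
workbook = [("Codes", False), ("Scratch", True), ("Regions", False)]
```

Here the sheet that Excel displays second ("Regions") has sheet index 3, which is the value to enter on the import screen.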

The Entity type for the imported data field identifies which class (available to the current collection based on its type and includes) to use as a schema for mapping the spreadsheet's row data. In other words, the chosen class's structure (e.g., its datatype and object properties) will act as a template for mapping each data element of each spreadsheet row into the current collection's assets (new or existing instances). Make the (1) file, (2) sheet index, and (3) entity type (class) selections and click Next.

Select Spreadsheet Type

This view enumerates five possible (columnwise) layout patterns for the spreadsheet's row-item data, showing an example of each pattern. For data explicitly structured into a hierarchy, like a taxonomy, there are four layout options. For all other data, there is the No Hierarchy layout (#1). In the hierarchical layouts, each row item also indicates its hierarchical path, either explicitly (absolute path, #2, #3, #4) or implicitly (recursive path, #4, #5; note that lighter text in the layout patterns indicates optional data).

 

Note the header row of column labels in every layout. The imported spreadsheet should have such a header row.

Below the five layout options, the view also shows a sample of the spreadsheet's actual data. The following image shows a spreadsheet of airport codes.

Select the layout title link that most correctly corresponds to the chosen spreadsheet's structure.

Import Spreadsheet

This view defines the data-mapping rules from the spreadsheet's columns into the class and optional hierarchy structures chosen for the collection data. It also allows users to (1) preview the import, (2) save its settings as a template for future imports, and (3) initiate the import.

Column Mapping

The Column Mapping settings specify which spreadsheet columns will correspond to which properties in the target Workflow. Typically, the mapped properties belong to the target entity type (class), as either attribute/datatype or relationship/object properties. The mapping also supports inverse relationship properties, which belong to other classes that have relationships to the imported entities. Any unmapped properties will be ignored during the import, leaving those property triples unchanged on target instances in the graph.

The following example shows a mapping for the "No Hierarchy" layout.

For target datatype properties, the corresponding spreadsheet column should contain row-values of the matching datatype. The target datatype may also specify an optional Language field setting, which will add the selected language tag to each imported value.

For target relationship (object) properties, the corresponding spreadsheet column's row-value is used as a reference value that matches a property of one instance of the related class. The reference values may either be the corresponding property values of the referenced instances or they may be the referenced instances' URIs. If all of the values in such a column are recognized as valid URLs, then they will be mapped as URIs, as indicated by the associated Use values as URIs label. Otherwise, the relationship values will be normal property values of the referenced class. If the related class has a designated primary key (PK) property, then this is automatically assumed to be the referenced property, and the spreadsheet data must have a column whose values map to this PK property. If the related class lacks a PK designation, then the user must indicate the referenced property by selecting it from the property drop-down list for the related class.
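
The all-or-nothing decision for relationship columns can be sketched in Python. EDG's actual URL validation rules are not documented here, so this sketch assumes that a value counts as a URL when it has both a scheme and a host, and that blank cells are ignored; the function names are hypothetical.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Assumed test: a value is a URL if it has a scheme and a network location."""
    parts = urlparse(value.strip())
    return bool(parts.scheme) and bool(parts.netloc)


def use_values_as_uris(column_values):
    """True only if every non-blank value in the column looks like a URL."""
    values = [v for v in column_values if v and v.strip()]
    return bool(values) and all(looks_like_url(v) for v in values)
```

A single non-URL value such as "France" in the column is enough for the whole column to be treated as ordinary property values.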

For inverse relationships, the spreadsheet column represents some other class's reference to the imported entities. Similar to forward (non-PK) relationships, if an inverse relationship is the chosen mapping, then there is a further choice of which referring-class property the spreadsheet provides as values for identifying the referring instances.

Even if the referenced property is not designated as a primary key (PK), it is still assumed that all of the corresponding property-values are unique across all referenceable instances. If duplicate values exist, then the referenced instance will be assigned arbitrarily.

Also note that if the related class designates a primary key property, then imported rows will always construct a reference, regardless of whether such an instance exists. On the other hand, if there is no such PK designation, then imported rows will construct a reference only if a matching instance exists.

As explained below, if the target of a relationship has the same entity type as the entity type for the imported data AND you are using matching on the property values to build a relationship, the Override existing values option must be unchecked. Otherwise, the relationship will not be created.

If the imported rows will generate any new instances, as opposed to only adding data to existing ones, then some column should map to the target label property, which determines the name or display field for an instance. When importing into reference datasets, one spreadsheet column must map to the primary-key property that is designated for the dataset's main entity class. For example, the screen image above identifies this field as the 2-letter alphabetic country code.

If the imported rows are adding new data values to existing instances and/or adding new instances, it is best to uncheck the Override existing values option. Checking this option has the following consequences:

  • If an instance already exists and has a value for any of the mapped columns, the value will be replaced with data coming from a spreadsheet.
  • Relationships between instances of the same type that rely on matching of values will not be created (because these values may be overridden as part of the processing).
  • When working with Taxonomies, a combination of checked Override existing values and the No Hierarchy pattern will always make imported instances top concepts of a new Concept Scheme, even if they already exist in the Taxonomy and have parent concepts.

Hierarchical spreadsheet types

If a hierarchical pattern was selected, then there will also be Hierarchy settings that specify how the spreadsheet represents the hierarchical relationships of its data items.

Note that one could still use the No Hierarchy import pattern for a taxonomy. This would import spreadsheet terms as separate concepts without connecting them hierarchically. The taxonomy's editors could then use the Concept Hierarchy view to manually arrange the imported nodes in the taxonomy's concept tree.

  • For Path with Separator spreadsheets, make sure to assign at least one column under Column Mapping as the preferred label. For other types, the importer can often infer the preferred label column. (See below about the Preview button.)
  • For Column-based Trees and Column-Pair Based Tree spreadsheets, specify the top and bottom levels of the hierarchy by picking the first and last column names.
  • For Path with fixed-length Segments spreadsheets, specify the column with the path values used as IDs and the length of the segments within the path IDs. In the Path with fixed-length Segments sample layout on the Select Spreadsheet Type screen, the Id column has the path values, and each two-digit segment of these values indicates a step of the hierarchy; removing the last two digits of any of those Id values shows the Id value for that term's parent. For example, Australia has an Id value of 010201, which has a parent value of Pacific (Id value 0102), which has a parent of World (Id value 01).
  • For Path with Separator spreadsheets, in which a spreadsheet entry such as "World > Europe > France" indicates the hierarchical structure above the term "France", specify the column storing these values using the Column containing the paths field and the Path separator character in the field with that name. If your spreadsheet also includes an ID column, the Hierarchy section includes a dropdown field to indicate this.
  • For Self-Join spreadsheets, there are columns to specify the Column containing the parent ids and the Column containing the child ids. In the Self-Join sample on the Select Spreadsheet Type screen, these would be the Parent and Term columns, respectively. A Hierarchy Property field also lets you set whether you want the "has broader" property used to identify relationships in the displayed hierarchy or something else.
  • The Generate in inverse direction checkbox will reverse the direction of how the property specified in Hierarchy property is applied.
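
The two path-based layouts can be illustrated with a short Python sketch. It only mirrors the examples given above (the Id value 010201 and the path "World > Europe > France"); the function names are hypothetical and this is not EDG's importer.

```python
def parent_id(path_id, segment_length):
    """Path with fixed-length Segments: drop the last segment to get the parent Id.

    Returns None for a top-level Id (a single segment).
    """
    if len(path_id) % segment_length != 0:
        raise ValueError(f"{path_id!r} is not a whole number of segments")
    if len(path_id) <= segment_length:
        return None
    return path_id[:-segment_length]


def split_path(path, separator):
    """Path with Separator: split a path into its terms, top level first."""
    return [term.strip() for term in path.split(separator)]
```

With two-digit segments, parent_id("010201", 2) gives "0102" (Pacific), whose parent is "01" (World); split_path("World > Europe > France", ">") gives the three terms with "France" last.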

Unique Identifiers

This section defines the URI for each imported row. Selecting Override existing values will delete an existing value for a mapped property before adding its new (different) value; otherwise, new values will be added to existing ones. Selecting Record each new triple in change history (warning: not recommended for large files) makes EDG record the addition of each new triple in the change history.
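
The replace-versus-add behavior for a mapped property can be sketched in Python. This is a simplified model, assuming a property's values on one instance are held as a set; the function name apply_value is hypothetical.

```python
def apply_value(existing, new_value, override):
    """Return the property's values after importing one new value.

    existing: set of values the instance already has for the mapped property.
    override: True replaces the existing values; False adds to them.
    """
    if override:
        return {new_value}
    return set(existing) | {new_value}
```

With the option selected, importing "Republic of Korea" over an existing "South Korea" leaves only the new value; with it cleared, the instance keeps both.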

The Preview button on the Import Spreadsheet form lets you see the RDF triples that would be generated with the currently configured settings. The browser's Back button returns to the form.

The Optional: Make this a reusable mapping template field lets you save all of the settings on this form so that you can later import other spreadsheets with a similar layout without filling out this form. Once you use this field to name a template of settings, if you later pick Import Spreadsheet using Template on the Import tab instead of Import Spreadsheet using Pattern, you'll see a drop-down list of the saved template names, a Browse button to pick a spreadsheet, and a Finish button to perform the actual conversion.

When you are satisfied with the sample data shown on the preview screen, click the Finish button. EDG will display a message about successful importing of the data along with a set of data that you can use as a mapping file for imports of spreadsheets with a similar structure in the future.

Export View

Export Workflow as a Graph

Any viewer of a collection can export its data in a standard RDF serialization format.

Creating reports and exporting data are available to anyone who has read access, for both production and working copies of reference data.

To export subsets of Workflow data according to custom criteria and sorting, note that EDG's Search screen provides fine-grained control over the data to display on the Search Results area. That form's gear menu offers several choices to export the results into spreadsheet-compatible formats (e.g., for Excel). See Workflow View or Edit for details on searching.

For any collection, to generate an RDF representation of the graph data, select Export > Export Workflow as a Graph and then an RDF format: JSON-LD, N-Triples, RDF/XML, Turtle or TriG, with or without inferences. The "with inferences" option adds a dedicated graph named urn:x-topbraid:inferences, which holds any triples inferred via SHACL or SPIN rules; note that this is computed on the fly and might be very slow.

Browser interactions might vary: displaying the data directly or via a kind of view source command. Alternatively, instead of directly clicking a format link, the browser might provide (on the link) a right-click menu option to save the link target to a file (i.e., without explicitly displaying the link result in the browser). A dialog box will prompt for the file location and name.

SPARQL Endpoint

This allows users of the collection to run new or saved SPARQL queries on it and to optionally save queries for others. Saved queries can be deleted by their creators and by collection managers. If SPARQL updates have been enabled by an administrator, editors (and managers) can run them, but viewers cannot. Note that the Pivot Table and Geo functions can be slow on some platforms and are not supported for Internet Explorer.

Export using Saved SPARQL Query

This lists the SPARQL queries that have been saved for the collection. For each query, it provides a URL that will run it, along with an Export Query button that runs it and shows the results.

Export using Saved Search

The Saved Search link shows a screen listing your saved searches. These are searches that you have saved using the Save current search button in the search form of the Editor page.

   

After setting the Result Format for a given search, clicking its Export button will download the search results in that format. Your saved searches are web services, so they can also be used as APIs by other systems.

Reports View

Anyone with read access to a production Workflow or working copy can generate various standard reports for it. Custom reports are also possible.

Problems and Suggestions Report

For any Workflow (production copy or working copy managed by a workflow), Reports > Problems and Suggestions checks the current state of the Workflow against all of its applicable quality rules (i.e., its shapes and the validity constraints they define) and enrichment rules. A message box shows the rule-processing progress and then shows the report. Note that the report results are also reflected in the Dashboard > Completeness and Validity display.

Users can also enable validity checking when they are viewing individual resources in the form. This setting then applies across all asset collections the user works with. See the View or Edit documentation for details.

To develop custom extensions to this feature, see EDG Developer Guide > Extending the Problems and Suggestions Reports.

View Change History

Click Reports > View Change History to show the Change History view. For a production copy, this shows all the changes made since it was created. For a workflow, this shows only changes made within the working copy managed by the workflow.

Clicking the Search button on the Change History screen displays a time-stamped list of the saves made in the Matching Changes panel, and clicking one of those lines displays details about what changes were made as part of that save operation in the Details of Selected Change panel. Below, the change made on July 30th has just been clicked, showing that three values were added and one was deleted as part of the change made with a particular save operation.

If you are logged in as a user who is editor or manager of the vocabulary/asset collection or a workflow where the change was made, then a link Revert this Change will appear in the bottom panel. Click on this link to undo this operation. This will in fact create a new "forward" edit in the change history, with yourself as author. Note that this feature should be used with care, because reverting some steps from the middle of the change history may lead to orphan resources in your model.

If you are logged in as a user who is editor or manager of an asset collection and look at a change performed in a working copy as part of a workflow, then a link Commit this Change to production will appear in the bottom panel. You can click on this link to move the change history entry (in the example above, the three additions and the deletion) out of the workflow copy and into the production copy, essentially cherry-picking which change from a workflow copy you want to accept. As with the Revert feature mentioned above, this feature should be used with care, because committing some steps from the middle of the change history may lead to creating data statements that are disconnected from the rest of the information. For example, when you commit a change that has modified some attribute of a newly created code, then you should also make sure that the change that created the code in the first place has also been committed.

Before you click the Search button, you can narrow the scope of the search by filling out any or all of the fields at the top of the form:

  • creator Enter the name of a particular EDG user to only see changes by that user. This field uses typeahead, so that if you have users named "Joe" and "Joan" and only type in "Jo", these two names will appear in a drop-down list for you to pick from.

  • date Enter a date in the first date field to see all changes after that date, a date in the second field to see all changes before that date, or in both fields to see the changes within a particular date range. (There's no need to actually type in the date value; clicking in either field displays a calendar where you can then click on the date you want to enter.)

  • status Enter "committed" or "uncommitted" to only list changes with one of these status values.
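
The way the three filters combine can be sketched in Python. This is only an illustration of the search semantics described above: the Change record and the matching_changes function are hypothetical, and treating the date bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Change:
    creator: str
    when: date
    status: str  # "committed" or "uncommitted"


def matching_changes(changes, creator=None, after=None, before=None, status=None):
    """Return the changes that satisfy every filter that was filled in.

    A filter left as None is ignored, like an empty field on the form.
    """
    result = []
    for change in changes:
        if creator is not None and change.creator != creator:
            continue
        if after is not None and change.when < after:  # assumed inclusive bound
            continue
        if before is not None and change.when > before:  # assumed inclusive bound
            continue
        if status is not None and change.status != status:
            continue
        result.append(change)
    return result
```

Filling in both date fields narrows the list to a date range, and adding a creator or status narrows it further, since all filled-in fields must match.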

Comparison Report

For a production Workflow, this report shows its differences with another, user-selected Workflow. For a working copy, it shows the differences from its parent production version. Note that differences do not extend to the contents of included asset collections. The report lists each changed asset and the properties that were changed, showing the changed values. If a value was added, it is shown in green. If it was deleted, it is shown in pink.

For example, the following shows what happens after the preferred label property for "South Korea" is edited (renaming the resource to "Republic of Korea"), an alternative label is added, and "Seoul" is added as a narrower value of the resource.

The right-hand side of each change contains a View Change link that displays a dialog box with details of the change log entry that caused that particular change. Depending on your permissions, you can revert or commit the change in that dialog box. See View Change History for further information on reverting and committing individual changes.

Property Value Rules

This shows the collection's property values that are inferred through SHACL: sh:defaultValue or sh:values (see Inferring Data with SHACL Property Value Rules for details). The inferred properties are depicted diagrammatically.

Statistics

For any Workflow (production or working copy), Reports > Graph Statistics displays details about the Workflow's node distribution. The following shows the statistics for the sample Reference Dataset: Country Codes.

View Transition History

This lists the temporal sequence of the workflow's status transitions, executed by various users.

Comments

This lists all user comments associated with the workflow. Workflow-level comments can be created here directly. Asset-level (resource instance level) comments created while viewing or editing the workflow also appear here. A comment's status can be changed by any workflow editor (or manager) and by the comment's creator.
