Taverna PROV data bundle (Taverna 2.x)

Structure of exported provenance

The file is an RO bundle, a format that specifies a structured ZIP file with a manifest (.ro/manifest.json). You can explore the bundle by unzipping it or by browsing it with a program like 7-Zip.

This source includes an example bundle, both as a bundle file and unzipped as a folder. The data bundle was saved after running a simple hello world workflow.

The remaining text of this section describes the content of the RO bundle as if it were unpacked to a folder. Note that many programming frameworks include support for working with ZIP files, so complete unpacking might not be necessary for your application. For Java, the Data bundle API gives a programmatic way to inspect and generate data bundles.
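As an illustration of working with the bundle without unpacking it, the following Python sketch lists the entries of a bundle and parses its manifest using only the standard library. The function name read_bundle and the file name in the usage note are hypothetical; the only detail taken from the text above is the manifest location .ro/manifest.json. No assumption is made about the keys inside the manifest.

```python
import json
import zipfile


def read_bundle(path):
    """Return (sorted entry names, parsed manifest) for an RO bundle ZIP file."""
    with zipfile.ZipFile(path) as bundle:
        names = sorted(bundle.namelist())
        # The manifest location is the one named in the text: .ro/manifest.json
        with bundle.open(".ro/manifest.json") as fh:
            manifest = json.load(fh)
    return names, manifest
```

A call such as read_bundle("helloanyone.bundle.zip") (hypothetical file name) would return the entry names together with the manifest as a Python dictionary, which can then be inspected interactively.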

Inputs and outputs

The folders inputs/ and outputs/ contain files and folders corresponding to the input and output values of the executed workflow. Ports with multiple values are stored as a folder with numbered values, starting from 0. Values representing errors have the extension .err; other values have an extension guessed by inspecting the value structure, e.g. .png. External references have the extension .url. These files can often be opened as an "Internet shortcut" or similar, depending on your operating system.

Example listing:

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>ls
inputs  intermediates  mimetype  outputs  workflow.wfbundle  workflowrun.prov.ttl

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>ls outputs
greeting.txt

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>cat outputs/greeting.txt
Hello, John Doe
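The naming conventions for inputs/ and outputs/ can be sketched as a small Python helper that maps a bundle entry to its port, list position and kind of value. This is only an illustration: the function name classify_value is hypothetical, and it assumes that nested lists are stored as nested numbered folders, which the description above suggests but does not state.

```python
from pathlib import PurePosixPath


def classify_value(entry):
    """Classify one bundle entry under inputs/ or outputs/.

    Returns (port, index, kind). index is a tuple of list positions
    (empty for a single value); kind is "error", "reference" or "value".
    """
    parts = PurePosixPath(entry).parts
    if len(parts) < 2 or parts[0] not in ("inputs", "outputs"):
        raise ValueError("not an input or output value: " + entry)
    suffix = PurePosixPath(parts[-1]).suffix
    kind = {".err": "error", ".url": "reference"}.get(suffix, "value")
    if len(parts) == 2:
        # Single value, e.g. outputs/greeting.txt -> port "greeting"
        return PurePosixPath(parts[1]).stem, (), kind
    # Multiple values, e.g. outputs/names/0.txt -> port "names", index (0,)
    # Assumption: deeper lists nest further numbered folders.
    index = tuple(int(PurePosixPath(p).stem) for p in parts[2:])
    return parts[1], index, kind
```

For the example bundle, classify_value("outputs/greeting.txt") yields the port name greeting with an empty index and the kind "value".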

Workflow run provenance

The file workflowrun.prov.ttl contains the PROV-O export of the workflow run provenance (including nested workflows) in RDF Turtle format.

This log details every intermediate processor invocation in the workflow execution, and relates them to inputs, outputs and intermediate values.

Example listing:

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>cat workflowrun.prov.ttl | head -n 40 | tail -n 8
<#taverna-prov-export>
    rdf:type prov:Activity ;
    prov:startedAtTime "2013-11-22T14:01:02.436Z"^^xsd:dateTime ;
    prov:qualifiedCommunication _:b1 ;
    prov:endedAtTime "2013-11-22T14:01:03.223Z"^^xsd:dateTime ;
    rdfs:label "taverna-prov export of workflow run provenance"@en ;
    prov:wasInformedBy <> ;

See the provenance graph for a complete example. The provenance uses the vocabularies PROV-O, wfprov and tavernaprov.

Intermediate values

Intermediate values are stored in the intermediates/ folder and referenced from workflowrun.prov.ttl.

Intermediate value from the example provenance:

<>
    tavernaprov:content <intermediates/d5/d588f6ab-122e-4788-ab12-8b6b66a67354.txt> ;
    wfprov:describedByParameter <> ;
    wfprov:describedByParameter <> ;
    wfprov:wasOutputFrom <> .

Here we see that the bundle file intermediates/d5/d588f6ab-122e-4788-ab12-8b6b66a67354.txt contains the output from the "hello" processor, which was also the input to the "Concatenate_two_strings" processor. Details about processor, ports and parameters can be found in the workflow definition.

Example listing:

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>ls intermediates/d5
d588f6ab-122e-4788-ab12-8b6b66a67354.txt

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>cat intermediates/d5/d58*
Hello,

Note that "small" textual values are also included as cnt:chars in the graph, while the referenced intermediate file within the workflow bundle is always present.

<intermediates/d5/d588f6ab-122e-4788-ab12-8b6b66a67354.txt>
    rdf:type cnt:ContentAsText ;
    cnt:characterEncoding "UTF-8"^^xsd:string ;
    cnt:chars "Hello, "^^xsd:string ;
    tavernaprov:byteCount "7"^^xsd:long ;
    tavernaprov:sha512 "cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e"^^xsd:string ;
    tavernaprov:sha1 "f52ab57fa51dfa714505294444463ae5a009ae34"^^xsd:string ;
    rdf:type tavernaprov:Content .
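The size and checksum properties shown above can be recomputed for any file read from the bundle. The Python sketch below does this with the standard hashlib module. The function names describe_content and matches are hypothetical, and the sketch assumes that tavernaprov:sha1 and tavernaprov:sha512 are lowercase hexadecimal digests over the raw bytes of the file, which the listing suggests but this page does not state.

```python
import hashlib


def describe_content(data):
    """Compute byte count and hex digests for the raw bytes of a value."""
    return {
        "byteCount": len(data),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
    }


def matches(data, byte_count, sha1):
    """Compare bytes read from the bundle with values copied from the graph."""
    found = describe_content(data)
    return found["byteCount"] == byte_count and found["sha1"] == sha1
```

To use it, read the bytes of an intermediate file (from the unpacked folder or directly from the ZIP) and pass them to matches together with the tavernaprov:byteCount and tavernaprov:sha1 values taken from workflowrun.prov.ttl.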

Workflow definition

The file workflow.wfbundle is a copy of the executed workflow in SCUFL2 workflow bundle format. This is the format which will be used by Taverna 3.

You can use the SCUFL2 API to inspect the workflow definition in detail.

The file .ro/annotations/workflow.wfdesc.ttl contains the abstract structure (but not all the implementation details) of the executed workflow, in RDF Turtle according to the wfdesc ontology.

c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>cat .ro/annotations/workflow.wfdesc.ttl | head -n 20
@base <> .
@prefix rdfs: <> .
@prefix xsd: <> .
@prefix owl: <> .
@prefix prov: <> .
@prefix wfdesc: <> .
@prefix wf4ever: <> .
@prefix roterms: <> .
@prefix dc: <> .
@prefix dcterms: <> .
@prefix comp: <> .
@prefix dep: <> .
@prefix biocat: <> .
@prefix : <#> .

<processor/Concatenate_two_strings/> a wfdesc:Process , wfdesc:Description , owl:Thing , wf4ever:BeanshellScript ;
    rdfs:label "Concatenate_two_strings" ;
    wfdesc:hasInput <processor/Concatenate_two_strings/in/string1> , <processor/Concatenate_two_strings/in/string2> ;
    wfdesc:hasOutput <processor/Concatenate_two_strings/out/output> ;
    wf4ever:script "output = string1 + string2;" .

Taverna 3 Data bundle

