Taverna PROV Data Bundle (Taverna 2.x)
Taverna 2.4 with the Taverna-PROV plugin 2.1.5 or later can export Taverna workflow runs as a Data Bundle. The bundle can be saved from within the Workbench results (Save All) or from the command line. The Data Bundle contains the workflow input and output values, intermediate values, a provenance trace and a copy of the executed workflow definition.
Structure of exported provenance
The .bundle.zip file is an RO bundle, which specifies a structured ZIP file with a manifest (
You can explore the bundle by unzipping it or by browsing it with a program like 7-Zip.
The remaining text of this section describes the content of the RO bundle as if it were unpacked to a folder. Note that many programming frameworks include support for working with ZIP files, so complete unpacking might not be necessary for your application. For Java, the Data bundle API gives a programmatic way to inspect and generate data bundles.
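As a minimal sketch of reading a bundle without unpacking it, the following uses Python's standard zipfile module. The function names and the entry name outputs/greeting.txt are illustrative assumptions, not part of any Taverna API.

```python
import zipfile


def list_bundle_entries(bundle_path):
    """Return the names of all entries in the bundle ZIP, in archive order."""
    with zipfile.ZipFile(bundle_path) as bundle:
        return bundle.namelist()


def read_bundle_text(bundle_path, entry_name, encoding="utf-8"):
    """Read a single entry as text without unpacking the rest of the bundle."""
    with zipfile.ZipFile(bundle_path) as bundle:
        with bundle.open(entry_name) as entry:
            return entry.read().decode(encoding)


if __name__ == "__main__":
    import sys

    # Usage: python inspect_bundle.py helloanyone.bundle.zip
    for name in list_bundle_entries(sys.argv[1]):
        print(name)
```

Calling read_bundle_text(path, "outputs/greeting.txt") would return the text of that output value, provided an entry with that (hypothetical) name exists in the bundle.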
inputs/ and outputs/ contain files and folders corresponding to the input and output values of the executed workflow. A port with multiple values is stored as a folder of numbered items, starting from 0. Values representing errors have the extension .err; other values have an extension guessed by inspecting the value structure, e.g. .png. External references have the extension .url; these files can often be opened as an "Internet shortcut" or similar, depending on your operating system.
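The naming conventions above can be sketched as a small helper that walks an unpacked port and classifies each value by its extension. This is an illustration only: the function names and the category labels are assumptions, and the sketch handles one level of numbered items rather than nested lists.

```python
from pathlib import Path


def classify_value(path):
    """Classify one stored value by its file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".err":
        return "error"
    if suffix == ".url":
        return "reference"
    return "value"


def port_values(port_path):
    """Return the files of one port: a single file, or the numbered items of a folder."""
    port = Path(port_path)
    if port.is_file():
        return [port]

    def sort_key(item):
        # Items are numbered from 0; sort numerically so that 10 follows 9, not 1.
        index = int(item.stem) if item.stem.isdigit() else float("inf")
        return (index, item.name)

    return sorted(port.iterdir(), key=sort_key)
```

For a port folder holding 0.txt, 1.err and 2.url, port_values returns the three paths in numeric order and classify_value labels them "value", "error" and "reference" respectively.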
inputs intermediates mimetype outputs workflow.wfbundle workflowrun.prov.ttl
Hello, John Doe
The provenance log workflowrun.prov.ttl details every intermediate processor invocation in the workflow execution, and relates each invocation to its inputs, outputs and intermediate values.
c:\Users\stain\workspace\taverna-prov\example\helloanyone.bundle>cat workflowrun.prov.ttl | head -n 40 | tail -n 8
rdf:type prov:Activity ;
prov:startedAtTime "2013-11-22T14:01:02.436Z"^^xsd:dateTime ;
prov:qualifiedCommunication _:b1 ;
prov:endedAtTime "2013-11-22T14:01:03.223Z"^^xsd:dateTime ;
rdfs:label "taverna-prov export of workflow run provenance"@en ;
prov:wasInformedBy <http://ns.taverna.org.uk/2011/run/385c794c-ba11-4007-a5b5-502ba8d14263/> ;
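As a rough illustration of using the prov:startedAtTime and prov:endedAtTime values in the excerpt, the sketch below pulls the two timestamps out with a regular expression and computes the elapsed time. The function name is hypothetical, and the regular expression is a shortcut for this excerpt only; a proper RDF/Turtle parser would be the more robust choice for real provenance files.

```python
import re
from datetime import datetime

# Matches lines such as: prov:startedAtTime "2013-11-22T14:01:02.436Z"^^xsd:dateTime ;
TIMESTAMP = re.compile(
    r'prov:(startedAtTime|endedAtTime)\s+"([^"]+)"\^\^xsd:dateTime'
)


def activity_times(turtle_text):
    """Return the first startedAtTime and endedAtTime found, keyed by property name."""
    times = {}
    for name, value in TIMESTAMP.findall(turtle_text):
        # Assumes UTC timestamps with fractional seconds, as in the excerpt.
        times.setdefault(name, datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ"))
    return times


excerpt = '''
prov:startedAtTime "2013-11-22T14:01:02.436Z"^^xsd:dateTime ;
prov:endedAtTime "2013-11-22T14:01:03.223Z"^^xsd:dateTime ;
'''
times = activity_times(excerpt)
elapsed = times["endedAtTime"] - times["startedAtTime"]
print(elapsed.total_seconds())  # 0.787
```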
Note that the URIs starting with
Intermediate values are stored in the
intermediates/ folder and referenced from
Intermediate value from the example provenance:
Here we see that the bundle file intermediates/d5/d588f6ab-122e-4788-ab12-8b6b66a67354.txt contains the output from the "hello" processor, which was also the input to the "Concatenate_two_strings" processor. Details about processor, ports and parameters can be found in the workflow definition.
Note that "small" textual values are also included as cnt:chars in the graph, while the referenced intermediate file within the workflow bundle is always present.
rdf:type cnt:ContentAsText ;
cnt:characterEncoding "UTF-8"^^xsd:string ;
cnt:chars "Hello, "^^xsd:string ;
tavernaprov:byteCount "7"^^xsd:long ;
tavernaprov:sha512 "cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e"^^xsd:string ;
tavernaprov:sha1 "f52ab57fa51dfa714505294444463ae5a009ae34"^^xsd:string ;
rdf:type tavernaprov:Content .
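The tavernaprov:byteCount, tavernaprov:sha1 and tavernaprov:sha512 properties make it possible to check a value file against the provenance graph. The following is a minimal sketch using Python's hashlib; the function name and its parameters are assumptions for illustration, not part of Taverna-PROV.

```python
import hashlib


def content_matches(data, byte_count, sha1_hex=None, sha512_hex=None):
    """Check raw bytes against a recorded byte count and any supplied digests."""
    if len(data) != byte_count:
        return False
    if sha1_hex is not None and hashlib.sha1(data).hexdigest() != sha1_hex.lower():
        return False
    if sha512_hex is not None and hashlib.sha512(data).hexdigest() != sha512_hex.lower():
        return False
    return True


# The cnt:chars value from the excerpt, encoded as declared by cnt:characterEncoding.
data = "Hello, ".encode("utf-8")
print(len(data))  # 7
```

In practice, data would be the bytes read from the referenced file under intermediates/, and the expected values would be taken from the corresponding statements in workflowrun.prov.ttl.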
workflow.wfbundle is a copy of the executed workflow in SCUFL2 workflow bundle format. This is the format which will be used by Taverna 3.
You can use the SCUFL2 API to inspect the workflow definition in detail.
Taverna 3 Data bundle
Taverna 3 uses the same Data Bundle format as the Taverna-PROV plugin. Currently there are some differences because the two use different implementations for capturing provenance.
Taverna 3 does not yet export provenance trace to
Taverna 3 introduces a new resource in the data bundle, workflowrun.json, which is much more Taverna-centric and mirrors the actual execution state while running a workflow. This example excerpt shows the structure. (See also the full workflowrun.json)
subject refers to the executed workflow/processor/activity, as identified within the SCUFL2