This page is for discussion of the model and XML syntax of SCUFL2 which will replace the current t2flow format.
Compare this with the t2flow XML schema - which has documentation about workflow elements as currently serialized.
Scufl2 has moved to Apache (incubator)
Information in this section is out of date!
SCUFL2 is the proposed new mechanism for specifying Taverna workflows. SCUFL2 defines a model, a workflow bundle file format (.wfbundle), and a Java API for working with workflow structures. SCUFL2 is the workflow language for Taverna 3, and replaces Taverna 2's t2flow format.
SCUFL2 adopts Linked Data technology and preservation methodologies to create a platform-independent workflow language that can be inspected, modified, created and executed.
SCUFL2 comes with a Java API that can be used for programmatic access to read and write SCUFL2 workflow bundles. A workflow bundle is a structured ZIP file with the workflow definitions included as XML documents. Those workflow documents are described by an XML Schema and are also valid RDF/XML. The XML Schema allows tools to read and write SCUFL2 workflow definitions as regular structured XML. The RDF allows RDF-enabled tools to link workflow definitions with external resources.
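Because a workflow bundle is a ZIP file, its contents can be listed with nothing more than the standard `java.util.zip` classes. The following is a minimal sketch, not the SCUFL2 API: the class name `BundleLister` and the entry names `workflow/hello.xml` and `notes/readme.txt` are made up for illustration, and the real layout of a bundle is not specified on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class BundleLister {

    /** Lists the names of all non-directory ZIP entries ending in the given suffix. */
    public static List<String> listEntries(byte[] bundle, String suffix) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(bundle))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().endsWith(suffix)) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a tiny in-memory ZIP; the entry names are hypothetical, for this demo only. */
    public static byte[] sampleBundle() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            addEntry(zip, "workflow/hello.xml", "<workflow/>");
            addEntry(zip, "notes/readme.txt", "not a workflow document");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void addEntry(ZipOutputStream zip, String name, String content)
            throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) {
        // Print every XML document found in the sample bundle.
        for (String name : listEntries(sampleBundle(), ".xml")) {
            System.out.println(name);
        }
    }
}
```

A tool that only needs the XML view of a workflow could hand each listed entry to an ordinary XML parser, while an RDF-enabled tool could load the same bytes as RDF/XML.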
The workflow structure is defined using an OWL ontology and annotated with URIs so that third parties can form semantic statements about any component of a Scufl2 workflow, for example to state that a particular service produces outputs of a certain type, or that a data link was added by a specific researcher.
Semantic annotations and a manifest for the bundle declare the purpose of, and the links between, the different components forming a workflow. This allows third parties to extract and append annotations about data and services used by the workflow.
The t2flow serialization format suffers from being very close to the Java object model, and contains various items that are simply Java beans serialized using XMLBeans. As the t2flow format is very verbose, it can be difficult for third party software to do inspection ("Which services does this workflow use?"), modification ("Change all calls to http://broken.com/ to http://fixed.com/") and generation ("Build a custom workflow from a button").
We have therefore decided to form a new serialisation format for workflows, called SCUFL2. This format will be accompanied by a UML model, with XML as the primary serialisation format and JSON and RDF as possible secondary serialisations, all following the UML model. This model will also be reflected in a lightweight API, which can deserialize and serialize these formats as well as .t2flow, and which more easily allows inspection, modification and generation of workflow structures.
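To make the three tasks named above (inspection, modification, generation) concrete, here is how they might look against a lightweight object model. This is a sketch only: the `Workflow` and `Service` classes and all of their methods are hypothetical and are not the SCUFL2 API, whose shape is still undecided on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class WorkflowSketch {

    /** Hypothetical service call: a type such as "WSDL" plus the endpoint it uses. */
    static final class Service {
        final String type;
        String endpoint;

        Service(String type, String endpoint) {
            this.type = type;
            this.endpoint = endpoint;
        }
    }

    /** Hypothetical workflow: a name plus the services it calls. */
    static final class Workflow {
        final String name;
        final List<Service> services = new ArrayList<>();

        Workflow(String name) {
            this.name = name;
        }

        /** Generation: add a service call and return this workflow for chaining. */
        Workflow call(String type, String endpoint) {
            services.add(new Service(type, endpoint));
            return this;
        }

        /** Inspection: "Which services does this workflow use?" (sorted, no duplicates). */
        Set<String> endpoints() {
            Set<String> result = new TreeSet<>();
            for (Service s : services) {
                result.add(s.endpoint);
            }
            return result;
        }

        /** Modification: rewrite every endpoint starting with oldPrefix; returns the count. */
        int rewriteEndpoints(String oldPrefix, String newPrefix) {
            int changed = 0;
            for (Service s : services) {
                if (s.endpoint.startsWith(oldPrefix)) {
                    s.endpoint = newPrefix + s.endpoint.substring(oldPrefix.length());
                    changed++;
                }
            }
            return changed;
        }
    }

    public static void main(String[] args) {
        Workflow wf = new Workflow("demo")
                .call("WSDL", "http://broken.com/fish")
                .call("REST", "http://example.org/chips");
        System.out.println(wf.endpoints());
        int changed = wf.rewriteEndpoints("http://broken.com/", "http://fixed.com/");
        System.out.println(changed + " endpoint(s) rewritten: " + wf.endpoints());
    }
}
```

The point of the sketch is that none of these operations needs the Taverna runtime: a plain object model that mirrors the UML model is enough.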
As detailed in the Taverna roadmap, myGrid will be working on SCUFL2 during summer 2010 (subject to change):
June 2010 - SCUFL2 language specification draft
July 2010 - SCUFL2 tools Beta
September 2010 - SCUFL2 tools
See planned SCUFL2 tasks in myGrid's Jira.
This page reflects preliminary work, and these specifications are not yet at alpha level. Do not write any applications assuming the SCUFL2 format will stay as discussed on this page.
- native zargo file
- nice pictures of the class diagrams
This has been produced using the early scufl2 code from http://taverna.googlecode.com/svn/unsorted/scufl2/trunk/
Suggestion for identifiers in Taverna URI templates.
From: Stian Soiland-Reyes <firstname.lastname@example.org>
To: List for general discussion and hacking of the Taverna project <email@example.com>
Date: Wed, 6 Jan 2010 10:47:34 +0000
Subject: Scufl 2 workflow language
We're working on making the new SCUFL2 workflow language.
This will be a simplification of the current .t2flow serialisation, but will also come with an API.
We're basing this new workflow definition language on what we have learnt are the best features of Scufl (from Taverna 1) and .t2flow. Scufl was quite easy for third party suppliers to generate or parse (for instance, myExperiment generates the Taverna 1 diagrams from scratch using Ruby code that parses the Scufl), while .t2flow made it possible to specify all the finer-grained details of the new Taverna 2 engine, but this also made it a bit too verbose.
These are early days, so we'll figure out what the language should be and what the API should look like. Paolo Missier has done good work in making a proposed UML model of the new language, which we can then use as the basis for figuring out the XML serialisation, but also the Java beans of the API, and possibly also RDF and JSON versions.
In my spare time I've tried to tie together some simple Java beans implementing this UML model, and I've now checked this into Subversion. These beans are not complete yet and have no integration with Taverna code, and the API can currently only serialise to RDF (to test out Sesame/Elmo annotations on beans).
I might come back with code examples so we can discuss what the API should look like. The current tests only build a workflow from scratch. Also note that scufl2-rdf is not yet connected to scufl2-api and can be considered an early version of the scufl2-api. (API-wise there are pulls in different directions: for instance, we want to make it easy to inspect a workflow, but also to construct one. If more information is needed for inspection, this could make it more tricky to construct.)
The API should minimally be able to:
- Work independently, without any Taverna dependencies, runtime or plugin system
- Load .t2flows
- Save as .scufl2 (undetermined yet what this format is - most likely XML and/or RDF inside a Research Object .zip)
- Inspect an existing workflow to tell:
b) Connection between processor/workflow ports (and conditional links)
c) Activities/Services (ie. 'WSDL' method 'fish' from endpoint 'http://asdkljasdkjasdkj')
- Allow modification and creation from scratch of such workflows
The API should be rich enough that you could use it to generate the workflow diagram, i.e. what myExperiment already does in Ruby.
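As an illustration of the diagram requirement, processor names and the links between them are already enough to emit a Graphviz DOT graph. This is a sketch under assumptions: the `Link` class and the `toDot` and `quote` methods are hypothetical, Graphviz is only one possible diagram target, and this is not how myExperiment draws its diagrams.

```java
import java.util.Arrays;
import java.util.List;

public class DiagramSketch {

    /** Hypothetical data link between two named processors. */
    static final class Link {
        final String from;
        final String to;

        Link(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Quotes a name as a DOT string, escaping backslashes and double quotes. */
    static String quote(String name) {
        return "\"" + name.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders processors as nodes and links as directed edges in DOT syntax. */
    static String toDot(String workflowName, List<String> processors, List<Link> links) {
        StringBuilder dot = new StringBuilder();
        dot.append("digraph ").append(quote(workflowName)).append(" {\n");
        for (String processor : processors) {
            dot.append("  ").append(quote(processor)).append(";\n");
        }
        for (Link link : links) {
            dot.append("  ").append(quote(link.from))
               .append(" -> ").append(quote(link.to)).append(";\n");
        }
        dot.append("}\n");
        return dot.toString();
    }

    public static void main(String[] args) {
        List<String> processors = Arrays.asList("fetch", "align");
        List<Link> links = Arrays.asList(new Link("fetch", "align"));
        System.out.print(toDot("demo", processors, links));
    }
}
```

The resulting text can be passed to the Graphviz `dot` tool to produce an image; a real diagram would also need ports, nested workflows and conditional links.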
- Load scufl 1 .xml from Taverna 1
- Save as backwards compatible .t2flow or even scufl 1 if possible
- Exposed as a RESTful service
However, the API should also be lightweight, so it will not do tasks better done by the Taverna engine (t2core):
- Determining if a workflow definition is valid (checking for loops, invalid iteration strategies etc)
- Perform the actual execution of the workflow
Other tasks are also better suited for the main Taverna code base, as they require various plugins or other considerations:
- Discovering available services/methods
- Find input/output ports of a given service definition
- Determining what configuration can be done for a given service
- Merging workflows
If you talk about a client/server architecture, you can picture these (RESTful?) services:
- Taverna engine: execute workflow and manage data/provenance
- Taverna inspection: check workflow definition validity, calculate depths, etc
- Taverna service descriptions: Find available services, specify possible service definition, determine ports for service definition
- Taverna editing: Workbench-type activities, Undo/redo, merge workflows, workflow refactoring
- Taverna diagram: Generate workflow diagram in various formats and configurations
(The last two of these should be possible to implement using mainly the Scufl2 API.)
A client could then use the Scufl2 API and a selection of these services - and still be able to implement what would look like the current Taverna workbench. The client could be written in a non-Java language, and use the Scufl2 serialisation schema/ontology directly with the help of whatever XML/RDF/JSON support is available for its language - this should give the same functionality but without a few convenience methods.
We're very interested in hearing about potential use cases for such a SCUFL2 language and API. Feel free to add your comments!
SCUFL2 consists of: