
This page is for discussion of the model and XML syntax of SCUFL2 which will replace the current t2flow format.

Compare this with the t2flow XML schema - which has documentation about workflow elements as currently serialized.

The t2flow serialization format suffers from being very close to the Java object model, and contains various items that are simply Java beans serialized using XMLBeans. Because the t2flow format is very verbose, it is difficult for third-party software to perform inspection ("Which services does this workflow use?"), modification ("Change all calls to http://broken.com/ to http://fixed.com/") and generation ("Build a custom workflow from a button").

Developers have informed us that the old SCUFL format of Taverna 1 was significantly easier to work with. However, that format has its own drawbacks: it has no schema, no defined way to extend service definitions for Taverna plugins, and no support for various new features of the Taverna 2 engine.

We have therefore decided to create a new serialisation format for workflows, called SCUFL2. This format will be accompanied by a UML model, with XML as the primary serialisation and JSON and RDF as possible secondary serialisations, all following the UML model. The model will also be reflected in a lightweight API that can deserialize and serialize these formats, in addition to .scufl and .t2flow, and that makes inspection, modification and generation of workflow structures easier.

Material

Preliminary work

This page reflects preliminary work, and these specifications are not yet at alpha level. Do not write any applications that assume the SCUFL2 format will stay as discussed on this page.

The code and definitions for SCUFL2 are kept in GitHub. Rough overview:

  • scufl2-api: Java Beans for SCUFL2 objects (and currently XML import/export)
  • scufl2-t2flow: .t2flow import (and later export)
  • scufl2-rdf: RDF export (and later import)
  • scufl2-usecases: Example code covering SCUFL2 use cases

Material for iteration 2 of the UML Scufl model includes:

  • native zargo file
  • nice pictures of the class diagrams

Here is an attempt at demonstrating the new proposed XML syntax for Scufl2: as.scufl2.xml - a translation of as.t2flow

Specification of identifiers in Taverna URI templates.

Introduction

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
To: List for general discussion and hacking of the Taverna project <taverna-hackers@lists.sourceforge.net>
Date: Wed, 6 Jan 2010 10:47:34 +0000
Subject: Scufl 2 workflow language
Message-ID: <a20e6fb11001060247k72e422b5v75f0a471ffa6896@mail.gmail.com>

Hi!

We're working on making the new SCUFL2 workflow language.

This will be a simplification of the current .t2flow serialisation, but will also come with an API.

We're basing this new workflow definition language on what we have learnt are the best features of Scufl (from Taverna 1) and .t2flow. Scufl was quite easy for third party suppliers to generate or parse (for instance, myExperiment generates the Taverna 1 diagrams from scratch using Ruby code that parses the scufl), while .t2flow made it possible to specify all the finer-grained details available in the new Taverna 2 engine, but this also made it a bit too verbose.

These are early days, so we'll figure out what the language should be and what the API should look like. Paolo Missier has done good work in making a proposed UML model of the new language, which we can then use as the basis for figuring out the XML serialisation, but also the Java beans of the API, and possibly also RDF and JSON versions.

In my spare time I've tried to tie together some simple Java beans implementing this UML model, and I've now checked this into Subversion. These beans are not complete yet, have no integration with Taverna code, and the API can currently only serialise to RDF (to test out Sesame/Elmo annotations on beans).

I might come back with code examples so we can discuss what the API should look like. The current tests only build a workflow from scratch. Also note that scufl2-rdf is not yet connected to scufl2-api and can be considered an early version of the scufl2-api. (API-wise there are pulls in different directions: for instance, we want to make it easy to inspect a workflow, but also to construct one. If more information is needed for inspection, this could make it more tricky to construct.)

The API should minimally be able to:

  • Work independently, without any Taverna dependencies, runtime or plugin system
  • Load .t2flows
  • Save as .scufl2 (this format is not yet determined; most likely XML and/or RDF inside a Research Object .zip)
  • Inspect an existing workflow to tell:
    a) Processors
    b) Connections between processor/workflow ports (and conditional links)
    c) Activities/Services (i.e. 'WSDL' method 'fish' from endpoint 'http://asdkljasdkjasdkj')
    d) Annotations
  • Allow modification and creation from scratch of such workflows
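
As a rough illustration of the inspection, modification and creation requirements above, here is a minimal sketch in plain Java with no Taverna dependencies. All class, field and method names (Scufl2Sketch, Activity, Processor, DataLink, Workflow, endpoints, replaceEndpoint) are hypothetical and are not taken from scufl2-api; annotations and conditional links are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified beans for illustration; NOT the real scufl2-api classes. */
final class Scufl2Sketch {

    /** A service binding, e.g. WSDL operation 'fish' at some endpoint. */
    static final class Activity {
        final String type;
        final String operation;
        String endpoint; // mutable so a caller can repoint the service

        Activity(String type, String operation, String endpoint) {
            this.type = type;
            this.operation = operation;
            this.endpoint = endpoint;
        }
    }

    static final class Processor {
        final String name;
        final Activity activity;

        Processor(String name, Activity activity) {
            this.name = name;
            this.activity = activity;
        }
    }

    /** A connection between two ports, each written as "processor.port". */
    static final class DataLink {
        final String from;
        final String to;

        DataLink(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    static final class Workflow {
        final String name;
        final List<Processor> processors = new ArrayList<>();
        final List<DataLink> links = new ArrayList<>();

        Workflow(String name) {
            this.name = name;
        }
    }

    /** Creation from scratch: two processors joined by one data link. */
    static Workflow example() {
        Workflow wf = new Workflow("example");
        wf.processors.add(new Processor("fetch", new Activity("WSDL", "fish", "http://broken.com/")));
        wf.processors.add(new Processor("align", new Activity("WSDL", "align", "http://example.org/")));
        wf.links.add(new DataLink("fetch.out", "align.in"));
        return wf;
    }

    /** Inspection: the distinct service endpoints used, in sorted order. */
    static Set<String> endpoints(Workflow wf) {
        Set<String> result = new TreeSet<>();
        for (Processor p : wf.processors) {
            result.add(p.activity.endpoint);
        }
        return result;
    }

    /** Modification: repoint matching activities; returns how many changed. */
    static int replaceEndpoint(Workflow wf, String oldUrl, String newUrl) {
        int changed = 0;
        for (Processor p : wf.processors) {
            if (p.activity.endpoint.equals(oldUrl)) {
                p.activity.endpoint = newUrl;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Workflow wf = example();
        System.out.println(endpoints(wf));
        System.out.println(replaceEndpoint(wf, "http://broken.com/", "http://fixed.com/"));
        System.out.println(endpoints(wf));
    }
}
```

The sketch deliberately does no validation (loops, iteration strategies), in line with leaving such checks to the Taverna engine.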

The API should be rich enough that you could use it to generate the workflow diagram, i.e. what myExperiment already does in Ruby.
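
To make the diagram requirement concrete, here is a small sketch that turns a list of processor names and processor-to-processor links into Graphviz DOT text. It is an assumption that a coarse diagram needs only these two pieces of information; the names DiagramSketch, Link and toDot are hypothetical, and this is not how myExperiment or Taverna actually render diagrams.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: render processors and data links as Graphviz DOT text. */
final class DiagramSketch {

    /** One link between two processors; ports are ignored in this coarse view. */
    static final class Link {
        final String fromProcessor;
        final String toProcessor;

        Link(String fromProcessor, String toProcessor) {
            this.fromProcessor = fromProcessor;
            this.toProcessor = toProcessor;
        }
    }

    /** Wraps a name in double quotes, escaping backslashes and embedded quotes. */
    static String quote(String name) {
        return "\"" + name.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Emits one box-shaped node per processor and one edge per link. */
    static String toDot(String workflowName, List<String> processors, List<Link> links) {
        StringBuilder dot = new StringBuilder();
        dot.append("digraph ").append(quote(workflowName)).append(" {\n");
        for (String processor : processors) {
            dot.append("  ").append(quote(processor)).append(" [shape=box];\n");
        }
        for (Link link : links) {
            dot.append("  ").append(quote(link.fromProcessor))
               .append(" -> ").append(quote(link.toProcessor)).append(";\n");
        }
        dot.append("}\n");
        return dot.toString();
    }

    public static void main(String[] args) {
        System.out.print(toDot("example",
                Arrays.asList("fetch", "align"),
                Arrays.asList(new Link("fetch", "align"))));
    }
}
```

The resulting text can be passed to the Graphviz dot tool to produce an image; layout and styling options are left out here.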

Optionally:

  • Load scufl 1 .xml from Taverna 1
  • Save as backwards compatible .t2flow or even scufl 1 if possible
  • Exposed as a RESTful service

However, the API should also be lightweight, so it will not do tasks better done by the Taverna engine (t2core):

  • Determining if a workflow definition is valid (checking for loops, invalid iteration strategies etc)
  • Performing the actual execution of the workflow

Other tasks are also better suited for the main Taverna code base, as they require various plugins or other considerations:

  • Discovering available services/methods
  • Finding input/output ports of a given service definition
  • Determining what configuration can be done for a given service
  • Merging workflows

If you think in terms of a client/server architecture, you can picture these (RESTful?) services:

  • Taverna engine: execute workflow and manage data/provenance
  • Taverna inspection: check workflow definition validity, calculate depths, etc
  • Taverna service descriptions: find available services, specify possible service definition, determine ports for service definition
  • Taverna editing: workbench-type activities, undo/redo, merge workflows, workflow refactoring
  • Taverna diagram: generate workflow diagram in various formats and configurations

(The last two of these should be possible to implement using mainly the Scufl2 API.)

A client could then use the Scufl2 API and a selection of these services - and still be able to implement what would look like the current Taverna workbench. The client could be written in a non-Java language, and use the Scufl2 serialisation schema/ontology directly with the help of whatever XML/RDF/JSON support is available for its language - this should give the same functionality but without a few convenience methods.
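
As a sketch of working against the serialisation directly, without the Scufl2 API, the following uses only the generic XML and XPath support in the JDK. It is written in Java to stay consistent with the rest of this page, but the same XPath query could be issued from any language with XML tooling. The XML element and attribute names (workflow, processor, activity, endpoint) are invented for illustration, since the SCUFL2 XML syntax is not yet settled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: query a workflow document with plain XPath, no Scufl2 API. */
final class DirectXmlSketch {

    // Invented example document; NOT the actual SCUFL2 XML syntax.
    static final String EXAMPLE =
            "<workflow name='example'>"
          + "<processor name='fetch'>"
          + "<activity type='wsdl' operation='fish' endpoint='http://broken.com/'/>"
          + "</processor>"
          + "<processor name='align'>"
          + "<activity type='wsdl' operation='align' endpoint='http://example.org/'/>"
          + "</processor>"
          + "</workflow>";

    /** Answers "Which services does this workflow use?" with a single XPath query. */
    static List<String> endpoints(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//activity/@endpoint",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (XPathExpressionException e) {
            // Malformed XML or a bad expression; rethrown unchecked to keep callers simple.
            throw new IllegalStateException("Could not query workflow XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(endpoints(EXAMPLE));
    }
}
```

What such a client gives up, compared with using the API, is mainly the typed beans and helper methods; the query itself stays short as long as the serialisation is simple.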

We're very interested in hearing about potential use cases for such a SCUFL2 language and API. Feel free to add your comments!
