Here are a few suggested use cases that should be possible to build as examples of how SCUFL2 can be used.
myGrid does not intend to implement all of these use cases. The list is meant as an extensive set of suggestions of what might be possible with SCUFL2, and hence feeds into the requirements for the formats and APIs.
The suggestion is to implement many of these use cases in at least two languages, like Java, Ruby, Python, XSLT, SPARQL or even Taverna workflows.
- What are the names of the processors in a given workflow? (Java)
  - Show a tree, including processors in nested workflows. (Java)
- What input ports does a workflow require, and what output ports would it produce?
- Are there any example inputs or descriptions?
- Which WSDL services does this workflow use?
- Are they registered on BioCatalogue?
- Given all public Taverna workflows on myExperiment (Taverna 1, .t2flow and scufl2), how many different services have been used?
- How many different beanshell scripts?
- Given all public Taverna workflows in myExperiment and a chosen service, suggest possible input and output services
- Generate a textual, script-like (say Python) overview of a workflow
- Generate workflow diagram using GraphViz
- Check if a workflow definition has invalid loops like A->B->C->A
- Check if a workflow has invalid depth combinations, like a merge of depth 1 and 2.
- Check if service bindings match port names and depths of processors
- Check by service annotations if a connection is syntactically valid, for instance passing string to service expecting XML, or gene when expecting
- Change any web service calling http://oldservice/old.wsdl to call http://newservice/new.wsdl
- Remove unconnected processors and input ports (clean-up)
- Wrap a selection of processors (for instance a WSDL service and shim splitters) and their links as a new nested workflow, replacing old processors
- Execute workflow using the command line tool, and annotate workflow with example outputs
- Using provenance, annotate workflow with how long each service invocation takes, and how many iterations are typical
- Look up web services in BioCatalogue and annotate each processor with service descriptions
- Extract per-processor semantic annotations and return as a single RDF document with URIs for each processor
- Adding annotation of licence, creator, credits from myExperiment round-trip.
- Automatic credits from common ancestor?
- Persist annotations across edits, i.e. even if the workflow ID changes
- Given a workflow with unconnected web services and their splitters, auto-connect compatible ports (based on their XSD datatype and/or port name)
- Given a simple input format (Makefile?) of shell script file names, generate a workflow which executes those shell scripts using either (configurable) the command line local worker or the use case activity
- Create an abstract workflow with no service bindings
- Perform dynamic service bindings based on type annotations
- Given a particular dataflow from a semi-compatible workflow system (Knime/RapidMiner/Kepler/etc), generate an abstract workflow (no bindings) with the same structure.
- Create Taverna workflow based on data playground usage in Galaxy
- Include different bindings for workbench, command line tool and server. For instance, for user interaction the workbench binding could pop up a UI, the command line tool could ask on the console, and the server could send an email
- Include information about plugins used in workflow, offering alternative (lower-priority) bindings if plugin is not installed. For example: Use-case activity with fall-back to local worker
- Include Taverna version information in binding, for instance warn if service definition is for a newer version with an unsupported property
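The loop check listed above (a workflow with links A->B->C->A) can be sketched as a depth-first search over the processor links. This is only an illustration and does not use the SCUFL2 API: it assumes the links have already been extracted as plain (source, target) pairs of processor names, and the function name `find_cycle` is made up for this example.

```python
from collections import defaultdict


def find_cycle(links):
    """Return one cycle as a list of processor names, or None if there is none.

    `links` is an iterable of (source, target) processor-name pairs.
    This flat representation is an assumption of the sketch, not a SCUFL2 type.
    """
    graph = defaultdict(list)
    for source, target in links:
        graph[source].append(target)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = defaultdict(int)      # every node starts out WHITE
    path = []                      # processors on the current DFS path

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                # Reached a processor that is already on the current path:
                # report the loop, repeating its first node at the end.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    # Iterate over a copy, because visit() may add keys to the defaultdict.
    for node in list(graph):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_cycle([("A", "B"), ("B", "C"), ("C", "A")]))
    print(find_cycle([("A", "B"), ("B", "C")]))
```

The search is recursive, which is fine for workflows of typical size; a very deep chain of processors would need an explicit stack instead. Nested workflows are not handled here and would have to be flattened or checked one level at a time.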
Taverna research objects
- Bundle workflow representations in various formats (XML, RDF/N3, JSON); must define which format takes precedence when loading
- Include workflow diagram when saving from Taverna
- Include cached web service definitions (WSDLs)
- Include 'calculated' extra information from Taverna, such as resolved depth and all available (but not connected) activity ports, dispatch stack details, etc
- Bundle provenance and data for a successful example run (ODF export?)
- Include required local tools, like JAR dependencies, shell scripts, Taverna plugins
- Load scufl 1
- Load t2flow
- Load RDF
- Load JSON
- Save scufl 1
- Save t2flow
- Save RDF
- Save JSON
- Load/save plugin for Taverna 2.2
- Formats should be easy to read and write from other software without access to the SCUFL2 API
- Sign workflow definition (XML-DSIG etc) - so that Taverna Server could execute workflows built by 'trusted sources'.
- Sign activity bindings - allow arbitrary users to execute workflows from a pre-approved list of service bindings
- Tracking of ownership, attribution, workflow reuse
- URIs should be Linked Data compliant (when possible). For instance have a lookup service at http://ns.taverna.org.uk/ which redirects to http://scufl2.taverna.org.uk/
- Identifiers for workflows and workflow components (in addition to the workflow ontology itself) should be globally unique Non-Information Resources
  - A resolution service that gives basic information in RDF about (known) workflows?
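One possible way to meet the requirement for globally unique identifiers is to mint a random UUID per workflow and derive component URIs beneath it. The sketch below is an assumption for illustration only: the path layout under http://ns.taverna.org.uk/ and the function names `mint_workflow_uri` and `component_uri` are hypothetical and not a defined SCUFL2 scheme.

```python
import uuid
from urllib.parse import quote

# Hypothetical base path; only the host name comes from the list above.
BASE = "http://ns.taverna.org.uk/2010/workflow/"


def mint_workflow_uri(base=BASE):
    """Mint a new workflow URI from a random (version 4) UUID."""
    return f"{base}{uuid.uuid4()}/"


def component_uri(workflow_uri, kind, name):
    """Build a URI for a named component (e.g. a processor) of a workflow.

    The name is percent-encoded so that spaces and slashes stay inside
    a single path segment.
    """
    return f"{workflow_uri}{kind}/{quote(name, safe='')}/"


if __name__ == "__main__":
    wf = mint_workflow_uri()
    print(wf)
    print(component_uri(wf, "processor", "Get sequence"))
```

Because the identifier is random rather than derived from the workflow name or content, it can be assigned offline without a central registry; a lookup service at the base URI would only be needed for resolution, not for minting.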