Tag Archives: t2

Stand up and be counted

Taverna workflows are full of shims. That’s a fact. Shims are the little adapter services, mostly Beanshell scripts, which convert the outputs of one workflow processor before sending them to the input of another. They are always being re-invented, and 90% of the time they do the same things: concatenate strings, swap things around…
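
To make that concrete, here is a minimal sketch of two typical shims. It is written in Python for brevity rather than as Beanshell scripts, and the names `concat_shim` and `swap_shim` are made up for illustration; they are not part of Taverna.

```python
def concat_shim(parts, separator=""):
    """Join a list of values into one string (hypothetical shim)."""
    return separator.join(str(p) for p in parts)


def swap_shim(pair):
    """Swap the two elements of a pair (hypothetical shim)."""
    first, second = pair
    return (second, first)


# Illustrative scenario: an upstream processor emits a list of identifiers,
# but the downstream processor expects one comma-separated string.
upstream_output = ["P12345", "Q67890"]
downstream_input = concat_shim(upstream_output, separator=",")
```

Neither function says anything about what it does to the data beyond its name, which is exactly the situation most real shims are in.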

The problem is that these shims are designed to be used once, almost as throwaways, so they are not annotated with any sort of helpful information. This becomes a problem for the scientist’s greatest challenge: provenance and data lineage (OK, I might be being a bit melodramatic about it being the greatest challenge, but it is up there somewhere, maybe nearer to the challenge of making the perfect cup of tea, which is no mean feat). Data goes into a shim and comes out the other side, but what is happening inside? Shims are not the only processors guilty of this; there are plenty of black box services out there in the world. So, how can we address this problem? Well, we have started to collect the shims that people actually use in a myExperiment group. We will then figure out all the similarities and come up with an annotated set which we can all use from Taverna 2 (T2). The current idea is for the T2 workbench to have an intelligent workflow designer which will recognise that you are trying to do some shim magic and suggest one to use. Maybe we will need a Taverna Clippy-style pop-up (think Word etc.). Ideas on a postcard…
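
As a thought experiment only, an annotated shim set with a "suggest one to use" lookup might look something like the sketch below. Everything here is an assumption for illustration: the `AnnotatedShim` record, the type labels, and `suggest_shims` are invented names, not the T2 design.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AnnotatedShim:
    """A shim plus the metadata a one-off script usually lacks (hypothetical)."""
    name: str
    description: str
    input_type: str
    output_type: str
    run: Callable


# A tiny made-up catalogue; a real one would come from the collected shims.
CATALOGUE = [
    AnnotatedShim("join_strings", "Concatenate a list of strings",
                  "list[str]", "str", lambda parts: "".join(parts)),
    AnnotatedShim("split_lines", "Split text into a list of lines",
                  "str", "list[str]", lambda text: text.splitlines()),
]


def suggest_shims(produced: str, expected: str,
                  catalogue: List[AnnotatedShim] = CATALOGUE) -> List[AnnotatedShim]:
    """Return shims that turn what an output produces into what an input expects."""
    return [s for s in catalogue
            if s.input_type == produced and s.output_type == expected]


suggestions = suggest_shims("list[str]", "str")
```

Because each shim carries a description and declared types, a provenance record could say what happened to the data rather than just that it passed through a black box.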

So, if you are a shim it’s time to stand up and announce to the world – “I’m a shim and I’m proud of it”.

Code freeze of Taverna 1.7.1

We’re hopefully doing our code freeze for Taverna 1.7.1 today, 2008-03-28.

There are a few bug fixes, in addition to quite a bit of work on t2. There will also be a drag-and-drop, interactive workflow editor: basically, you can edit the workflow by connecting lines between the processors. Those of you who remember Taverna 1.4 might notice that this is something we’ve had hidden on the shelves for a bit; we hope that as we bring it back this time it will be a bit more usable as an alternative way to build workflows. You will still be able to flip to the “Graphical” tab for the classic non-interactive (but usually prettier) diagram.

The t2 plugin will now be a bit more usable, with support for BioMoby. We’ve also made the BioMart support use streaming, so that as soon as the first row of the BioMart result set has been received, it is immediately pushed down to the next step in the workflow. Further on, the results of those operations are also streamed onwards immediately, before the full list has been processed. For large datasets this should significantly improve the execution time of the workflow.
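
The effect of streaming can be sketched with Python generators. This is an illustration of the idea, not the t2 enactor’s code: `query_rows` and `annotate` are hypothetical stand-ins for a BioMart query and a downstream processor, and generators pull items rather than push them, although the resulting interleaving is the same.

```python
def query_rows(rows, log):
    """Stand-in for a service that delivers its result set row by row."""
    for row in rows:
        log.append(f"received {row}")
        yield row


def annotate(rows, log):
    """Stand-in for the next workflow step; handles each row as it arrives."""
    for row in rows:
        log.append(f"processed {row}")
        yield row.upper()


log = []
results = list(annotate(query_rows(["a", "b", "c"], log), log))
```

The log interleaves "received" and "processed" entries row by row, so the second step starts work on the first row without waiting for the whole result set.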

The t2 plugin also now has a graphical representation of the workflow as it’s running, with progress bars. Note that when streaming, the progress bar can look a bit weird as results are coming in. Say processor 2 has initially finished 4 of 5 items, but 5 new ones come in from above: the progress bar will then jump from 80% (4/5) to 40% (4/10). But the cool thing is that you will almost be able to see the data as it’s flowing through, a bit like pipes with pumps between them. We’re planning to add more features to this view, say to let you click on a processor as it’s running and have a look, perhaps tweak some parameters or go into deeper detail.
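
The arithmetic behind that jump is simple enough to write down. The helper below is a hypothetical sketch, assuming progress is the number of finished items divided by the number of items known so far:

```python
def progress_percent(finished, known_total):
    """Share of the items known so far that have finished, as a percentage."""
    if known_total == 0:
        return 0.0  # nothing has arrived yet, so there is nothing to report
    return 100.0 * finished / known_total


before = progress_percent(4, 5)   # 4 of 5 items done
after = progress_percent(4, 10)   # 5 more items arrive from upstream
```

The bar moves backwards because the denominator grows while the numerator stays the same; no work has been lost.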

After the 1.7.1 release, which we’re predicting will be out in about a week (2008-04-07), we’ll focus fully on the 2.0 release for June 2008. The 2.0 release will feature the t2 enactor as the core engine, and some graphical updates as well. We’ll try to post more on the progress for 2.0 here as we go along.