- Cross and dot product
- Changing iteration order of cross product
- Pairing up values with dot product
- Complex list handling
- Taverna tutorial 2014: Advanced Taverna features includes a tutorial on List handling
Cross and dot product
The default list handling when Taverna is doing implicit iteration is to do a cross product of all the ports, in the order they've been connected. This means that the ports are combined all-to-all.
The other basic list handling is a dot product, meaning that the port values are matched one-to-one.
In computer science terms, this dot product is called convolution and is comparable to the
zip function found in functional programming. Note that Taverna stops at the shortest list: once it is exhausted, there is no value left to combine with the remaining items of the longer list.
|Iteration strategy and implicit cross products|
List handling used to be called Iteration strategy in older versions of Taverna.
Changing iteration order of cross product
In some cases, Taverna's implicit iteration might not be combining the ports in the desired way.
As an example, you can load this attached workflow:
If you run this workflow with the list values
[a, b, c, d] on list1 and
[1, 2, 3] on list2, you would get a list of four lists, each with 3 values. This means that list2 is iterated over several times inside the outer iteration of list1.
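The nesting produced by the default cross product can be sketched in plain Python. This only illustrates the iteration order and is not Taverna code; the helper `cross_product` and the string concatenation standing in for the service are assumptions made for the example.

```python
def cross_product(outer, inner):
    """Combine all-to-all: one inner list per item of the outer list."""
    return [[o + i for i in inner] for o in outer]


list1 = ["a", "b", "c", "d"]
list2 = ["1", "2", "3"]

# list1 is iterated outermost, list2 innermost
result = cross_product(list1, list2)
print(result)
# [['a1', 'a2', 'a3'], ['b1', 'b2', 'b3'], ['c1', 'c2', 'c3'], ['d1', 'd2', 'd3']]
```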
Assume the next service is to receive a list of values, but you want it to receive
[a1, b1, c1, d1], not
[a1, a2, a3].
To flip the iteration order, go to Details -> List handling on the service.
In Taverna 2.4 and later, list handling can also be accessed by right-clicking on a service and selecting Configure running -> List handling.
If you expand the Cross product, you should see that
string1 is above
string2 - meaning that
string1 is done outermost.
|Using the Move Up button|
This drag-dropping can be a bit tricky to get used to. A workaround for moving ports upwards is to use the Move up button.
You can verify the new list handling in the Details view.
Running the workflow again with the new list handling should reveal 3 lists
[a1, b1, c1, d1],
[a2, b2, c2, d2],
[a3, b3, c3, d3].
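As a rough Python analogue of the flipped order (an illustration only, with string concatenation assumed as the service), swapping which list drives the outer loop changes the grouping but not the individual combinations:

```python
list1 = ["a", "b", "c", "d"]
list2 = ["1", "2", "3"]

# list2 is now iterated outermost and list1 innermost;
# each value is still built as <list1 item> + <list2 item>
flipped = [[s1 + s2 for s1 in list1] for s2 in list2]
print(flipped)
# [['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2'], ['a3', 'b3', 'c3', 'd3']]
```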
Pairing up values with dot product
Assume that, instead of combining every
list1 item with every
list2 item, we want to combine the lists by pairing up item 1 from both lists, item 2 from both lists, etc., giving
[a1, b2, c3].
The list handling should now have changed to a Dot product of
The Change to Cross Product button would reverse the operation.
Running the workflow this time should produce
[a1, b2, c3]. Notice how
d is lost as
list2 did not contain more than 3 elements.
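This pairing behaves like Python's built-in `zip`, which also stops at the shortest input. A minimal sketch, assuming the service simply concatenates its two inputs:

```python
list1 = ["a", "b", "c", "d"]
list2 = ["1", "2", "3"]

# zip pairs items one-to-one and stops when the shorter list runs out,
# so "d" never gets a partner
paired = [s1 + s2 for s1, s2 in zip(list1, list2)]
print(paired)
# ['a1', 'b2', 'c3']
```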
|Order of dot product not important|
Note that as a dot product will combine the lists item by item, and there won't be any "outer" or "inner" iterations, the order of the ports below Dot product is not important.
You can do a dot product of several ports, which would pick the first item from each of the selected ports, then the second, and so on, stopping when the shortest list has been exhausted.
|The empty list|
This means that if you do a dot product with an empty list on one of the ports, no iterations will be performed, and a single empty list is returned.
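The same behaviour can be sketched for any number of ports. The helper `dot_product` below is hypothetical and only mimics the pairing; it is not part of Taverna.

```python
def dot_product(*ports):
    """Pair items position by position, stopping at the shortest list."""
    return ["".join(items) for items in zip(*ports)]


three_ports = dot_product(["a", "b", "c", "d"], ["1", "2", "3"], ["X", "Y", "Z"])
print(three_ports)
# ['a1X', 'b2Y', 'c3Z']

# An empty list on any port means there is nothing to pair up
with_empty = dot_product(["a", "b"], [], ["X"])
print(with_empty)
# []
```

Reordering the arguments changes how each combined string is spelled in this sketch, but not how many pairings are made.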
Complex list handling
To show how to solve more complex list handling, load the attached workflow.
This workflow has a service
Concatenate_three_strings which has a single cross product of all three ports.
Running the workflow produces a single string combining every value from every port:
If you configure the list handling to change the cross product to a dot product, you will instead get:
This should be a natural expectation based on how cross and dot products work with two ports. Again one can reorder the ports within the cross product, while within the dot product the order is irrelevant.
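To get a feel for the difference in size between the two strategies on three ports, here is a small Python sketch. The port values and the concatenation order are assumptions made for illustration, and the cross product is flattened for brevity, whereas the two-port example earlier on this page shows Taverna returning nested lists.

```python
from itertools import product

# Assumed example values for the three ports
letters = ["a", "b", "c", "d"]
numbers = ["1", "2", "3", "4"]
upper = ["X", "Y", "Z"]

# Cross product: every value of every port combined all-to-all
cross = ["".join(combo) for combo in product(letters, numbers, upper)]
print(len(cross))   # 48
print(cross[:3])    # ['a1X', 'a1Y', 'a1Z']

# Dot product: position-wise pairing, limited by the shortest list
dot = ["".join(combo) for combo in zip(letters, numbers, upper)]
print(dot)          # ['a1X', 'b2Y', 'c3Z']
```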
Visualised, a cross product of three ports becomes slightly more complicated, as you can consider
1234 being combined both with
ab and with
XYZ. In the dot product, on the other hand, excess values in the longer lists are simply skipped. (
d have been removed from this diagram for simplicity)
Combined cross product and dot product
Now, what if we want to iterate over every
abcd for every pair combination of
XYZ? This can be achieved by doing a cross product of
abcd with a dot product of the other two ports.
Finally you should then have something like
Running the workflow now will produce:
Meaning we've done a cross product of
a, b, c, d with the dot products
1X, 2Y, 3Z.
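A combined strategy of this kind can be pictured in Python as a loop over one list wrapped around a `zip` of the other two. The values, the concatenation order, and the choice of which port is outermost are assumptions made for the example.

```python
letters = ["a", "b", "c", "d"]
numbers = ["1", "2", "3", "4", "5"]
upper = ["X", "Y", "Z"]

# Dot product of numbers and upper (1X, 2Y, 3Z),
# repeated for every letter by the surrounding cross product
combined = [
    [letter + number + up for number, up in zip(numbers, upper)]
    for letter in letters
]
print(combined[0])
# ['a1X', 'a2Y', 'a3Z']
print(len(combined))
# 4
```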
Visually this can be pictured like this (Removing
5, c, d for clarity):
Dot product matching of lists with values
If your service has two ports, expecting a list and an associated single value, you can also use the dot product to set up the expected list handling.
See this attached workflow:
list at depth 2, so a list of lists.
1234 outputs a single list of numbers. When run, the workflow will output three values,
a1 b1 c1 d1,
e2 f2 g2 h2 and
i3 j3 k3 l3 - the list
abc has been combined with
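The pairing of inner lists with single values can be sketched as follows. The function `service` is a hypothetical stand-in whose behaviour (appending the value to each item and joining with spaces) is inferred from the outputs quoted above; the input values are likewise assumed.

```python
def service(in1, in2):
    """Hypothetical service: expects a list (depth 1) and a single value (depth 0)."""
    return " ".join(item + in2 for item in in1)


list_of_lists = [["a", "b", "c", "d"], ["e", "f", "g", "h"], ["i", "j", "k", "l"]]
numbers = ["1", "2", "3", "4"]

# Dot product: each inner list is matched with the number at the same position;
# the fourth number has no partner and is skipped
outputs = [service(inner, number) for inner, number in zip(list_of_lists, numbers)]
print(outputs)
# ['a1 b1 c1 d1', 'e2 f2 g2 h2', 'i3 j3 k3 l3']
```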
Making sure the port depths match up
Now remove the link from
list and instead connect
list1 contains the first list (depth 1) contained in the list of lists in
list (depth 2).
Taverna will warn you that you now have an invalid list handling, by placing red errors in the Workflow explorer:
If you click Validation report (or right-click on
Beanshell and select Show validation report) you should see details about what's wrong:
The error Mismatch of input list depths is shown, with the explanation There is possibly a problem with the service's list handling. (See validation report for details)
The reason for this particular mismatch can be found if we look at Details->List handling and Predicted behaviour.
What we have asked for here is to match up each
in2 with each
in1, but the service expects
in1 at depth 1 and
in2 at depth 0. If we are to combine each value at
in2 with each value at
in1, both ports will dig down to depth 0 - but we need depth 1 on
The dot product can only be used on ports with implicit iteration, and the difference in depth must match the requirements of the service. So if the service expects depths
1,0 only a dot-product of incoming depths
2,1 would work.
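One way to read this rule is that every port in a dot product must be iterated over the same number of levels. The check below is a sketch of that reading, not Taverna's actual validation logic; both function names are hypothetical.

```python
def depth(value):
    """Depth of a nested list: 0 for a single value, 1 for a flat list, and so on."""
    levels = 0
    while isinstance(value, list):
        levels += 1
        value = value[0] if value else None
    return levels


def dot_product_is_valid(expected_depths, incoming_depths):
    """True if every port needs the same, non-zero number of iteration levels."""
    extra = {inc - exp for exp, inc in zip(expected_depths, incoming_depths)}
    return len(extra) == 1 and min(extra) > 0


# Service expects depths 1,0; incoming 2,1 iterates one level on both ports
print(dot_product_is_valid((1, 0), (2, 1)))  # True
# Incoming 1,1: in1 needs no iteration but in2 needs one level
print(dot_product_is_valid((1, 0), (1, 1)))  # False
```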
The solution in this case is to change the list strategy back to Cross product, as a cross product between the single list
in1 and the individual values at
in2 should give you a valid workflow, combining
a,b,c,d while iterating over each of
Managing list depth using shims
Occasionally you might find that you need to add shims to the workflow in order to make lists compatible for a dot product.
For instance, you can use the local worker Echo list to force a value from depth 0 to depth 1, or Flatten list to merge a list from depth 2 to depth 1. You can also add your own Beanshell scripts, like this list wrapper which through implicit iteration can bump from depth 1 to depth 2:
or this script, which picks only the first element of a list, going from depth 1 to 0:
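As a plain-Python analogue (not the Beanshell scripts or the Taverna local workers themselves), the depth changes described in this section can be sketched with three small hypothetical helpers:

```python
def wrap(value):
    """Raise the depth by one: a single value becomes a one-item list."""
    return [value]


def flatten(list_of_lists):
    """Lower the depth by one: merge a list of lists into a single list."""
    return [item for inner in list_of_lists for item in inner]


def first(values):
    """Lower the depth by one by keeping only the first element."""
    return values[0]


print(wrap("a"))                          # ['a']
print([wrap(v) for v in ["a", "b"]])      # [['a'], ['b']]  (depth 1 to depth 2)
print(flatten([["a", "b"], ["c"]]))       # ['a', 'b', 'c']
print(first(["a", "b", "c"]))             # a
```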
Using nested workflows to tweak list handling
Imagine you have two incoming links,
A, a list of lists of values, and
B, a list of values. Your service expects single values on both
inB, but you do not want a complete cross product. Assume instead you want to do a dot product, matching the Bs against the inner list of
A, repeated (cross product) for each of those lists in the outer list of A.
So we want to combine
A1, B1, C1, then
D2, E2, F2, etc.
See this attached workflow, which should give errors due to the invalid list handling.
In this case we can't change the list strategy to Cross Product, as this would also combine
We can however insert a new nested workflow, and move in the service
You should now get a list with 3 inner lists, containing
[A1, B1, C1],
[D2, E2, F2],
[G3, H3, I3].
What happens here is that the outer dot product is performed over the nested workflow service, which expects a list and a value. The nested workflow will therefore receive the inner list
[ A, B, C] together with the corresponding single value
[D, E, F] with the single value
2, and finally
[G, H, I] with
3. The list with
[J, K, L] is discarded as there is no corresponding value at
Inside each invocation of the nested workflow, the regular cross product on
Concatenate_two_strings (which expects two values at depth 0) will combine say
[ A, B, C ] with
1 to form the single list of
[A1, B1, C1]. The reason this cross product does not add
3 is that those are not provided to that invocation of the nested workflow.
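The division of labour between the outer and inner workflow can be sketched in Python. The function `nested_workflow` is a hypothetical stand-in for the nested workflow containing Concatenate_two_strings, and the input values are assumed from the results quoted above.

```python
def nested_workflow(strings, value):
    """Inner workflow: cross product of one list (depth 1) with one value (depth 0)."""
    return [s + value for s in strings]


multilist = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"], ["J", "K", "L"]]
numbers = ["1", "2", "3"]

# Outer dot product: each inner list is paired with the number at the same
# position; the fourth inner list has no matching number and is dropped
result = [nested_workflow(inner, number) for inner, number in zip(multilist, numbers)]
print(result)
# [['A1', 'B1', 'C1'], ['D2', 'E2', 'F2'], ['G3', 'H3', 'I3']]
```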
Flipping it - outer cross product and inner dot product
Assume you would rather want 4 lists, containing
[ [A1, B2, C3], [D1, E2, F3], [G1, H2, I3], [J1, K2, L3] ] .
You can't simply use the standalone
Concatenate_two_strings with a cross product, because it would give you
[ [A1, B1, C1], [A2, B2, C2], ...].
Instead, modify the setup from above:
You should get the desired 4 lists,
[ [A1, B2, C3], [D1, E2, F3], [G1, H2, I3], [J1, K2, L3] ] .
What happens here is that the cross product is performed over the nested workflow service in the outer workflow. The nested workflow expects two lists of depth 1. Taverna will therefore iterate over the list of lists from
multilist, combining each inner list
[A, B, C] with the same single
[1, 2, 3] from
In the nested workflow the inner service is receiving two lists (as both workflow input ports have declared depth 1), but the service
Concatenate_two_strings is expecting two values of depth 0. Here it iterates using a dot product, therefore matching each element of
string1 with the corresponding element of
string2, outputting lists of depth 1.
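The flipped arrangement can be sketched the same way, again with a hypothetical `nested_workflow` function and assumed input values: the outer level loops over every inner list, while the inner level pairs items position by position.

```python
def nested_workflow(strings1, strings2):
    """Inner workflow: dot product of two lists of depth 1."""
    return [s1 + s2 for s1, s2 in zip(strings1, strings2)]


multilist = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"], ["J", "K", "L"]]
numbers = ["1", "2", "3"]

# Outer cross product: every inner list is combined with the same numbers list
result = [nested_workflow(inner, numbers) for inner in multilist]
print(result)
# [['A1', 'B2', 'C3'], ['D1', 'E2', 'F3'], ['G1', 'H2', 'I3'], ['J1', 'K2', 'L3']]
```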
A good guide to getting such iteration cases right is to think of each service as "consuming" its desired depth, with the List Handling massaging the actual inputs down to fit the desired depths. Use the Intermediate Values in the Results perspective to check what goes in and out of each iteration. For testing list handling cases, you can use a pass-through nested workflow with no actual services.
Wrapping a service in a nested workflow can be used to "push" the desired depths up or down and perform different list handling inside, which is necessary for "partial dot product" cases such as those described above.