By customizing list handling, one can control the implicit iteration, such as pairing up values in equally sized lists.

Tutorial

Cross and dot product

The default list handling when Taverna is doing implicit iteration is to do a cross product of all the ports, in the order they've been connected. This means that the ports are combined all-to-all.

The other basic list handling is a dot product, meaning that the port values are matched one-to-one.

In Computer Science terms, this dot product is called convolution and is comparable to the zip function in functional programming. Note that Taverna stops at the end of the shortest list: in the example above there is no value to combine 4 with.
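As a rough analogy outside Taverna, the two strategies can be sketched in Python, with itertools.product standing in for the cross product and zip for the dot product. The list contents and variable names are assumptions made for this sketch, and the cross product is shown as a flat list for brevity, whereas Taverna would return nested lists.

```python
from itertools import product

# Illustrative inputs (assumed, not taken from a Taverna workflow).
letters = ["a", "b", "c"]
numbers = ["1", "2", "3", "4"]

# Cross product: every letter is combined with every number (all-to-all).
cross = [x + y for x, y in product(letters, numbers)]

# Dot product: items are paired by position (one-to-one). zip stops at the
# end of the shortest list, so "4" is never used.
dot = [x + y for x, y in zip(letters, numbers)]

print(cross)
print(dot)
```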

Iteration strategy and implicit cross products

List handling used to be called Iteration strategy in older versions of Taverna.

Changing iteration order of cross product

In some cases, Taverna's implicit iteration might not be combining the ports in the desired way.

As an example, you can load this attached workflow:

If you run this workflow with the list values [a, b, c, d] on list1 and [1, 2, 3] on list2, you would get a list of four lists, each with 3 values. This means that list2 is iterated over several times inside the outer iteration of list1.

Assume the next service is to receive a list of values, but you want it to receive [a1, b1, c1, d1], not [a1, a2, a3].

To flip the iteration order, go to Details -> List handling on the service.

In Taverna 2.4 and later, list handling can also be accessed by right-clicking on a service and selecting Configure running -> List handling.

If you expand the Cross product, you should see that string1 is above string2 - meaning that string1 is done outermost.

To modify the order of the cross product, so that string2 is the outermost iteration, and string1 the innermost:

  1. Go to Details -> List handling for the selected service.
  2. Click Configure to modify the list handling
  3. In the List handling window that appears, expand Cross product
  4. Click to drag string2
  5. Drop the item so that the gray line is above string1
  6. Click OK to apply and close the window.

Using the Move Up button

This drag-dropping can be a bit tricky to get used to. A workaround for moving ports upwards is to use the Move up button.

To move the port using the toolbar:

  1. Resize the window so you can see the Move up button on the toolbar
  2. Select string2
  3. Click Move up
  4. The order should now be string2, string1.
  5. Click OK to apply and close the window.

You can verify the new list handling in the Details view.

Running the workflow again with the new list handling should reveal 3 lists [a1, b1, c1, d1], [a2, b2, c2, d2], [a3, b3, c3, d3].
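The effect of the port order can be sketched in Python as a pair of nested loops, where the first port in the cross product is the outer loop. The helper cross_product is hypothetical and only models the nesting; it is not how Taverna implements iteration.

```python
list1 = ["a", "b", "c", "d"]  # connected to string1
list2 = ["1", "2", "3"]       # connected to string2


def cross_product(outer, inner, combine):
    """Iterate `outer` outermost and `inner` innermost; return a list of lists."""
    return [[combine(o, i) for i in inner] for o in outer]


# string1 outermost (the default): four lists of three values each.
default_order = cross_product(list1, list2, lambda s1, s2: s1 + s2)

# string2 outermost (after moving it up): three lists of four values each.
flipped_order = cross_product(list2, list1, lambda s2, s1: s1 + s2)

print(default_order)
print(flipped_order)
```

Only the nesting changes: both orders produce the same twelve combined values.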

Pairing up values with dot product

Assume that, instead of combining every list1 item with every list2 item, we want to combine the lists by pairing up item 1 from both lists, item 2 from both lists, etc., giving [a1, b2, c3].

To change the list handling to perform a dot product instead of a cross product:

  1. Go to Details -> List handling for the selected service.
  2. Click Configure to modify the list handling
  3. In the List handling window that appears, select Cross product
  4. Click Change to dot product on the toolbar. (You might need to resize the window on some operating systems).
  5. Click OK to apply and close the window.

The list handling should now have changed to a Dot product of string1 and string2.

The button Change to Cross Product would reverse the operation.

Running the workflow this time should produce [a1, b2, c3]. Notice how d is lost as list2 did not contain more than 3 elements.

Order of dot product not important

Note that as a dot product will combine the lists item by item, and there won't be any "outer" or "inner" iterations, the order of the ports below Dot product is not important.

You can do a dot product of several ports, which would pick the first item from each of the selected ports, then the second item from each, and so on, stopping when the shortest list has been exhausted.

The empty list

Because a dot product stops as soon as the shortest list is exhausted, a dot product with an empty list on one of the ports performs no iterations, and a single empty list is returned.
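A minimal Python sketch of this behaviour, using zip as a stand-in for the dot product. The helper dot_product and the third list are assumptions made for illustration.

```python
def dot_product(*ports):
    """Pair up items by position across all ports, stopping at the shortest."""
    return ["".join(items) for items in zip(*ports)]


list1 = ["a", "b", "c", "d"]
list2 = ["1", "2", "3"]
list3 = ["X", "Y"]  # hypothetical third port

two_ports = dot_product(list1, list2)           # "d" is dropped
three_ports = dot_product(list1, list2, list3)  # stops after two items
with_empty = dot_product(list1, [])             # no iterations at all

print(two_ports)
print(three_ports)
print(with_empty)
```

In this sketch the argument order only affects how the strings are joined; the pairing itself is the same whichever port is listed first.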

Complex list handling

To show how to solve more complex list handling, load the attached workflow.

This workflow has a service Concatenate_three_strings which has a single cross product of all three ports.

Running the workflow produces a single string combining every value from every port:

If you configure the list handling to change the cross product to a dot product, you will instead get:

This should be a natural expectation based on how cross and dot products work with two ports. Again, one can reorder the ports within the cross product, while within the dot product the order is irrelevant.
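The two behaviours for three ports can be sketched in Python as follows. The port values follow the lists used in this section; the nested shape of the cross product result and the plain string concatenation are assumptions of this sketch, not the exact output of Concatenate_three_strings.

```python
string1 = ["a", "b", "c", "d"]
string2 = ["1", "2", "3", "4", "5"]
string3 = ["X", "Y", "Z"]

# Cross product of all three ports: string1 outermost, string3 innermost.
cross = [
    [[s1 + s2 + s3 for s3 in string3] for s2 in string2]
    for s1 in string1
]

# Dot product of all three ports: paired by position, stopping at the
# shortest list (string3), so "d", "4" and "5" are skipped.
dot = [s1 + s2 + s3 for s1, s2, s3 in zip(string1, string2, string3)]

total = sum(len(inner) for outer in cross for inner in outer)
print(total)
print(dot)
```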

Visualised, a cross product of three ports becomes slightly more complicated, as you can consider 1234 being combined both with ab and XYZ, and ab both with 1234 and XYZ. In the dot-product, on the other hand, excess values in the larger lists are simply skipped. (5, c and d have been removed from this diagram for simplicity)

Combined cross product and dot product

Now, what if we want to iterate over every abcd for every pair combination of 12345 and XYZ? This can be achieved by doing a cross product of abcd with a dot product of the other two ports.

To make a combined list handling for three (or more) ports string1, string2, string3:

  1. Go to Details -> List handling for the selected service.
  2. Click Configure to modify the list handling
  3. In the List handling window that appears, select Cross product. (If it is a Dot product, Change to Cross product)
  4. Click Add Dot. A new node called Dot product should be added below the last port.
  5. Drag the port string2
  6. Drop the port below - and to the right - of Dot product. (Imagine drawing an L). A gray line should appear below Dot product, together with a right-pointing arrow. string2 should now be a child of Dot product.
  7. Repeat for the port string3
  8. If the port ends up below Dot product, but not as a child, try dragging it again until a right-arrow appears.

Finally you should then have something like string1×(string2·string3):

Running the workflow now will produce:

Meaning we've done a cross product of a, b, c, d with the dot products 1X, 2Y, 3Z.
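As a sketch in Python, the combined strategy string1×(string2·string3) amounts to an outer loop over string1 around a zip of the other two ports. The nested output shape is an assumption of this illustration.

```python
string1 = ["a", "b", "c", "d"]
string2 = ["1", "2", "3", "4", "5"]
string3 = ["X", "Y", "Z"]

# Inner dot product: pair string2 and string3 by position.
# "4" and "5" have no partner in string3 and are dropped.
pairs = list(zip(string2, string3))

# Outer cross product: every string1 item is combined with every pair.
result = [[s1 + s2 + s3 for s2, s3 in pairs] for s1 in string1]

for row in result:
    print(row)
```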

Visually this can be pictured like this (Removing 5, c, d for clarity):

Dot product matching of lists with values

If your service has two ports, expecting a list and an associated single value, you can also use the dot product to set up the expected list handling.

See this attached workflow:

The service multilist outputs a list at depth 2, so a list of lists. 1234 outputs a single list of numbers. When run, the workflow will output three values, a1 b1 c1 d1, e2 f2 g2 h2 and i3 j3 k3 l3 - the list abcd has been combined with 1, efgh with 2, etc.

Making sure the port depths match up

Now remove the link from list and instead connect list1 to in1. list1 contains the first list (depth 1) contained in the list of lists in list (depth 2).

Taverna will warn you that you now have an invalid list handling, by placing red errors in the Workflow explorer:

If you click Validation report (or right-click on Beanshell and select Show validation report) you should see details about what's wrong:

The report shows "Mismatch of input list depths" with the explanation "There is possibly a problem with the service's list handling". (See validation report for details)

The reason for this particular mismatch can be found if we look at Details->List handling and Predicted behaviour.

What we have asked for here is to match up each in2 with each in1, but the service expects in1 at depth 1 and in2 at depth 0. If we are to combine each value at in2 with each value at in1, both ports will dig down to depth 0 - but we need depth 1 on in1.

The dot product can only be used on ports with implicit iteration, and the difference in depth must match the requirements of the service. So if the service expects depths 1,0 only a dot-product of incoming depths 2,1 would work.

The solution in this case is to change the list strategy back to Cross product, as a cross product between the single list in1 and the individual values at in2 should give you a valid workflow, combining a,b,c,d while iterating over each of 1,2,3,4.
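The depth reasoning above can be written down as a small check. This is a simplified model of the rule as described in this section, not Taverna's actual validation code; all function names are hypothetical.

```python
def iteration_depths(incoming, expected):
    """Levels of list each port must iterate over: incoming minus expected."""
    return [i - e for i, e in zip(incoming, expected)]


def dot_product_is_valid(incoming, expected):
    """Assumed rule: every port iterates, and all by the same number of levels."""
    extra = iteration_depths(incoming, expected)
    return all(d > 0 for d in extra) and len(set(extra)) == 1


def cross_product_iteration_depth(incoming, expected):
    """Assumed rule: a cross product nests the iterations, so the levels add up."""
    return sum(max(d, 0) for d in iteration_depths(incoming, expected))


expected = [1, 0]  # in1 wants a list, in2 wants a single value

print(dot_product_is_valid([2, 1], expected))           # list of lists + list
print(dot_product_is_valid([1, 1], expected))           # list + list: mismatch
print(cross_product_iteration_depth([1, 1], expected))  # cross product is fine
```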

Managing list depth using shims

Occasionally you might find that you need to add shims to the workflow in order to make lists compatible for a dot product.

For instance, you can use the local worker Echo list to force a value from depth 0 to depth 1, or Flatten list to merge a list from depth 2 to depth 1. You can also add your own Beanshell scripts, like this list wrapper which through implicit iteration can bump from depth 1 to depth 2:

or this script, which picks only the first element of a list, going from depth 1 to 0:
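The depth changes performed by such shims can be sketched in Python. This illustrates the idea only; it is not the Beanshell code used in Taverna, and the function names are made up for this sketch.

```python
def echo_list(value):
    """Wrap a single value in a list: depth 0 -> depth 1."""
    return [value]


def flatten_list(list_of_lists):
    """Merge a list of lists into one list: depth 2 -> depth 1."""
    return [item for inner in list_of_lists for item in inner]


def wrap_each(values):
    """Wrap every item separately, as iterating a wrapper would: depth 1 -> 2."""
    return [[v] for v in values]


def first_element(values):
    """Pick only the first element of a list: depth 1 -> depth 0."""
    return values[0]


print(echo_list("a"))
print(flatten_list([["a", "b"], ["c"]]))
print(wrap_each(["a", "b"]))
print(first_element(["a", "b"]))
```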

Using nested workflows to tweak list handling

Imagine you have two incoming links, A, a list of lists of values, and B, a list of values. Your service expects single values on both inA and inB, but you do not want a complete cross product. Assume instead you want to do a dot product, matching the Bs against the inner list of A, repeated (cross product) for each of those lists in the outer list of A.

So we want to combine ABC with 1, getting A1, B1, C1, then D2, E2, F2, etc.

See this attached workflow, which should give errors due to the invalid list handling.

In this case we can't change the list strategy to Cross Product, as this would also combine A3, D1, etc.

We can however insert a new nested workflow, and move in the service Concatenate_two_strings.

  1. Select the service to move (Concatenate_two_strings)
  2. Perform Edit->Cut in the menu.
  3. Click Insert->Nested workflow in the menu.
  4. Select New workflow and click Import workflow.
  5. Perform Edit->Paste in the menu.
  6. In the diagram toolbar, click to Display all service ports
  7. For each service input and output port, right click and do Connect with.. -> New workflow port..
  8. Name the ports as in the service, here string1, string2, output.
  9. For workflow input port string1, set the depth to 1 instead of 0 (passing through the inner lists like [A, B, C])
  10. For workflow input port string2, leave the depth at 0 - as we want to "freeze" this input for the inner iteration
  11. Confirm that the list handling of the service inside the nested workflow (e.g. Concatenate_two_strings) is set to Cross product of string1 and string2.
  12. Click File->Save and File->Close to save the nested workflow into the parent workflow.
  13. Reconnect the ports, in this case multiport to string1 and list to string2, in addition to the workflow output port to output.
  14. Select the nested workflow, and rename it to match the wrapped service, Concatenate_two_strings_wrapped
  15. Check that the list handling for the nested workflow is set to Dot product of string1 and string2
  16. Check that the workflow Validation Report is now valid
  17. Run the modified workflow

You should now get a list with 3 inner lists, containing [A1, B1, C1], [D2, E2, F2], [G3, H3, I3].

What happens here is that the outer dot product is performed over the nested workflow service, which expects a list and a value. The nested workflow will therefore receive the inner list [A, B, C] together with the corresponding single value 1, then [D, E, F] with the single value 2, and finally [G, H, I] with 3. The list with [J, K, L] is discarded as there is no corresponding value at string2.

Inside each invocation of the nested workflow, the regular cross product on Concatenate_two_strings (which expects two values at depth 0) will combine say [ A, B, C ] with 1 to form the single list of [A1, B1, C1]. The reason this cross product does not add 2 and 3 is that those are not provided to that invocation of the nested workflow.
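The two levels of iteration can be sketched in Python, with one function standing in for the nested workflow. The input values follow the lists used in this section; the function names are assumptions of the sketch.

```python
multilist = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"], ["J", "K", "L"]]
numbers = ["1", "2", "3"]


def concatenate_two_strings(string1, string2):
    """The wrapped service: expects two single values (depth 0)."""
    return string1 + string2


def nested_workflow(string1_list, string2_value):
    """Inner cross product: string1 is a list (depth 1), string2 one value."""
    return [concatenate_two_strings(s1, string2_value) for s1 in string1_list]


# Outer dot product: pair each inner list with the number at the same
# position. zip stops after three pairs, so ["J", "K", "L"] is discarded.
result = [nested_workflow(inner, n) for inner, n in zip(multilist, numbers)]

print(result)
```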

Flipping it - outer cross product and inner dot product

Assume you would rather want 4 lists, containing [ [A1, B2, C3], [D1, E2, F3], [G1, H2, I3], [J1, K2, L3] ] .

You can't simply use the standalone Concatenate_two_strings with a cross product, because it would give you [ [A1, B1, C1], [A2, B2, C2], ...].

Instead, modify the setup from above:

  1. Modify the inner nested workflow so that both its input ports, string1 and string2, are at depth 1
  2. Modify the list handling of the inner Concatenate_two_strings to be a Dot product
  3. Save and close the nested workflow
  4. Reconnect list from 123 to string2 of the nested workflow
  5. Modify the list handling of the outer nested workflow to be a Cross product
  6. Check that the outer workflow Validation Report is now valid
  7. Run the new workflow

You should get the desired 4 lists, [ [A1, B2, C3], [D1, E2, F3], [G1, H2, I3], [J1, K2, L3] ] .

What happens here is that the cross product is performed over the nested workflow service in the outer workflow. The nested workflow expects two lists of depth 1. Taverna will therefore iterate over the list of lists from multilist, combining each inner list [A, B, C] with the same single list [1, 2, 3] from 123.

In the nested workflow, the inner service receives two lists (as both workflow input ports have declared depth 1), but the service Concatenate_two_strings expects two values of depth 0. It therefore iterates using the dot product, matching each element of string1 with the corresponding element of string2 and outputting lists of depth 1.
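The flipped arrangement can be sketched in Python in the same way, again with a hypothetical function standing in for the nested workflow and input values taken from the lists used in this section.

```python
multilist = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"], ["J", "K", "L"]]
numbers = ["1", "2", "3"]


def nested_workflow(string1_list, string2_list):
    """Inner dot product: both inputs are lists (depth 1), paired by position."""
    return [s1 + s2 for s1, s2 in zip(string1_list, string2_list)]


# Outer cross product: the nested workflow consumes depth 1 on both ports,
# so only multilist (depth 2) is iterated; the whole numbers list is passed
# unchanged to every invocation.
result = [nested_workflow(inner, numbers) for inner in multilist]

for row in result:
    print(row)
```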

Conclusion

A good guide to getting such iteration cases right is to think of each service as "consuming" its desired depth, while the List Handling deals with massaging the actual inputs down to fit the desired depths. Use the Intermediate Values in the Results perspective to check what goes in and out of each iteration - you can use a pass-through nested workflow with no actual services for testing list handling cases.

Wrapping a service in a nested workflow can be used to "push" the desired depths up or down and perform different list handling inside, which is necessary for "partial dot product" cases as described above.
