The Problem

List-handling processors such as Flatten List are used in many workflows, but their implementation is inefficient: each list element is dereferenced, the local worker creates a new list, and each element of the new list is registered as a new value with the reference service.

BeanshellActivity
Flatten List Localworker
  • This is time-inefficient because the dereference and re-register operations are not required, and pipelining is not possible.
  • It is space-inefficient because new references are created (and, if running in memory, the values are duplicated).
  • Lineage tracing is also affected because the link between input and output is lost.
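
As a rough illustration of what avoiding the dereference step could look like, the sketch below removes one level of nesting by moving the existing references into a new list. `Ref`, `ValueRef` and `ListRef` are hypothetical stand-ins invented for this example, not reference service types, and flattening by exactly one level is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlattenRefs {
    // Hypothetical stand-ins for reference identifiers; not the real reference service API.
    interface Ref {}

    static final class ValueRef implements Ref {
        final String id;
        ValueRef(String id) { this.id = id; }
    }

    static final class ListRef implements Ref {
        final List<Ref> children;
        ListRef(List<Ref> children) { this.children = children; }
    }

    // Removes one level of nesting. Child references are reused as they are:
    // nothing is resolved to a value and nothing is registered again.
    static List<Ref> flattenOneLevel(ListRef outer) {
        List<Ref> out = new ArrayList<>();
        for (Ref child : outer.children) {
            if (child instanceof ListRef) {
                out.addAll(((ListRef) child).children);
            } else {
                out.add(child);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ValueRef a = new ValueRef("a");
        ValueRef b = new ValueRef("b");
        ValueRef c = new ValueRef("c");
        ListRef nested = new ListRef(Arrays.asList(
                new ListRef(Arrays.asList(a, b)),
                new ListRef(Arrays.asList(c))));
        List<Ref> flat = flattenOneLevel(nested);
        // The output holds the very same reference objects as the input.
        System.out.println(flat.size() + " references, first is original: " + (flat.get(0) == a));
    }
}
```

Because the output elements are the input references themselves, the link between input and output is kept, which is the property lineage tracing needs.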

Solutions

Consider the case where a local service (currently a subclass of Beanshell) receives a list of values:

  1. Beanshell receives a list of references plus an object that allows the references to be resolved
  2. Beanshell receives a top-level reference plus a resolver object as for (1). The reference could be to a list, a singleton value or even an error
  3. Separate type of special activity. Would be difficult for users to extend, but could allow developers more flexibility
  4. Separate workflow object, like merges. Not really distinct from (3), as it is just a matter of presentation
  5. Part of processor customization e.g. as a dispatch layer that allows flattening of lists
    1. could potentially stream out the results
    2. raises the issue of how users would see it, as it would be an invisible shim; note that the same issue currently applies to dot and cross products
  6. Add magic to reference service.  For example:
    1. List with dynamic lookup and getRef
    2. WeakHashMap that recognizes known strings

Note that some of these overlap with issues relating to error handling.

Perhaps allow a service to accept errors, rather than always bouncing them. This could be a configuration option of the error bounce layer: when a flag is set, the layer becomes "pass through".
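
A minimal sketch of such a flag, assuming a much simplified model in which the layer inspects a list of inputs and either bounces the first error it finds or lets everything through. `ErrorBounceSketch`, `ErrorValue` and `errorToBounce` are hypothetical names for this example and do not reflect the actual dispatch layer interfaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ErrorBounceSketch {
    // Hypothetical marker for an error value travelling through the workflow.
    static final class ErrorValue {
        final String message;
        ErrorValue(String message) { this.message = message; }
    }

    // Returns the error to bounce, or empty when the inputs should reach the service.
    // With passThrough set, errors are never bounced and the service sees them.
    static Optional<ErrorValue> errorToBounce(List<?> inputs, boolean passThrough) {
        if (passThrough) {
            return Optional.empty();
        }
        for (Object input : inputs) {
            if (input instanceof ErrorValue) {
                return Optional.of((ErrorValue) input);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Object> inputs = Arrays.asList("ok", new ErrorValue("upstream failure"));
        System.out.println("bounced by default: " + errorToBounce(inputs, false).isPresent());
        System.out.println("bounced with flag set: " + errorToBounce(inputs, true).isPresent());
    }
}
```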

David will arrange a review of the local services.
