Long-running Workflows

Q: Would it be reasonable to use Taverna Server to execute workflows that have long-running steps (a remote task could take a few hours to complete)?

A: Yes! It is entirely reasonable to have steps that take a few hours. If you have steps that take a few days, it is probably better to find a different way; networks too commonly have problems on that sort of timescale, and those problems can throw things off. Such a “different way” might be one workflow that prepares and starts the long processing, and another workflow that later picks up and post-processes the results.
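The split described above can be sketched as a submit-then-poll pattern. Everything here is illustrative: `FakeBatchService` stands in for the real remote service, and none of these names are part of Taverna Server's API.

```python
# Sketch of the "different way": one workflow submits the long job and
# records a ticket; a second workflow, run later, collects the results.
# No long-lived network connection has to survive the days in between.

class FakeBatchService:
    """Stands in for a remote service whose jobs take days to finish."""
    def __init__(self):
        self._jobs = {}

    def submit(self, payload):
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"payload": payload, "done": False}
        return job_id

    def finish(self, job_id):
        # In reality, time passes; here we flip the flag by hand.
        self._jobs[job_id]["done"] = True

    def poll(self, job_id):
        job = self._jobs[job_id]
        return ("done", job["payload"].upper()) if job["done"] else ("running", None)

def workflow_submit(service, data):
    """First workflow: start the long-running processing, return a ticket."""
    return service.submit(data)

def workflow_collect(service, job_id):
    """Second workflow: pick up and post-process results once available."""
    state, result = service.poll(job_id)
    if state != "done":
        return None          # not ready; run this workflow again later
    return f"post-processed: {result}"

svc = FakeBatchService()
ticket = workflow_submit(svc, "input data")
print(workflow_collect(svc, ticket))   # None: still running
svc.finish(ticket)                     # simulate the days passing
print(workflow_collect(svc, ticket))   # post-processed: INPUT DATA
```

The key design point is that the ticket (not an open connection) is the only thing carried between the two workflows.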

If your workflow is likely to take longer than 24 hours, remember to increase its expiry time.
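Extending the expiry amounts to sending the run a new timestamp. The sketch below only builds the timestamp; the request shape shown in the comment (URL path and media type) is an assumption, so check it against your server's REST API documentation before relying on it.

```python
from datetime import datetime, timedelta, timezone

def new_expiry(hours):
    """Build an ISO-8601 UTC timestamp `hours` from now."""
    when = datetime.now(timezone.utc) + timedelta(hours=hours)
    return when.strftime("%Y-%m-%dT%H:%M:%SZ")

# Hypothetical request -- not executed here; verify the path locally:
#   PUT {server}/rest/runs/{run-id}/expiry
#   Content-Type: text/plain
#   Body: new_expiry(48)

stamp = new_expiry(48)
print(stamp)   # e.g. 2024-06-01T12:00:00Z
```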

Tuning the Limits

Q: What are the implications of raising the limits “MaxRuns” and “OperatingLimit”? Are there maximum values?

A: MaxRuns is the maximum number of workflow runs, in any state, that can exist at once. OperatingLimit is the maximum number of runs in the Operating state, where a workflow engine is actually carrying out the workflow (as opposed to you initialising things or collecting the results); it should be lower than MaxRuns, since it limits a subset of the things that MaxRuns limits.
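The relationship between the two limits can be shown with a toy model. The class, state names, and numbers below are illustrative only, not Taverna Server's implementation.

```python
# Toy model: MaxRuns caps runs in *any* state; OperatingLimit caps only
# the subset of runs that are actually Operating.

class RunLimits:
    def __init__(self, max_runs, operating_limit):
        assert operating_limit <= max_runs, \
            "OperatingLimit should not exceed MaxRuns"
        self.max_runs = max_runs
        self.operating_limit = operating_limit
        self.runs = {}   # run id -> "Initialized" or "Operating"

    def create(self, run_id):
        if len(self.runs) >= self.max_runs:
            raise RuntimeError("MaxRuns reached: cannot create another run")
        self.runs[run_id] = "Initialized"

    def start(self, run_id):
        operating = sum(1 for s in self.runs.values() if s == "Operating")
        if operating >= self.operating_limit:
            raise RuntimeError("OperatingLimit reached: run must wait")
        self.runs[run_id] = "Operating"

limits = RunLimits(max_runs=3, operating_limit=1)
limits.create("a")
limits.create("b")       # two runs exist, within MaxRuns
limits.start("a")        # fine: first Operating run
try:
    limits.start("b")    # blocked: only one run may Operate at a time
except RuntimeError as e:
    print(e)
```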

There are no hard-coded maximums, but that doesn't mean you should simply raise the limits to very high values.

The fundamental physical limit on MaxRuns is typically the amount of disk space you've got attached to the system (specifically, in the filesystem that is used for working space, which is usually /tmp). The amount of space required per run is highly dependent on the workflow being run; if you're making many copies of large files, it will obviously take more space per run than if you are just passing around short strings.

The usual physical limit on OperatingLimit is due to the amount of memory required to run the workflows, which is highly variable. (Each running workflow has at least one Java process to contain the execution engine, which may have many threads.) However, it is also quite possible to have workflows that consume a large amount of CPU, so that may also be a limiting factor. Since it depends quite strongly on what your typical workflow mix really is, it's hard to give definitive advice on how to tune.
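A back-of-envelope sizing along the lines described above might look like this. All figures are made-up examples; measure the disk and memory footprint of your own typical workflow mix instead.

```python
# MaxRuns is bounded by working disk space; OperatingLimit is bounded by
# the memory each execution-engine process needs.

def estimate_max_runs(free_disk_gb, per_run_disk_gb):
    """Disk-bound ceiling on MaxRuns."""
    return free_disk_gb // per_run_disk_gb

def estimate_operating_limit(free_ram_gb, per_engine_ram_gb):
    """Memory-bound ceiling on concurrently Operating runs."""
    return free_ram_gb // per_engine_ram_gb

# e.g. 500 GB free in the working filesystem, ~5 GB of files per run:
print(estimate_max_runs(500, 5))          # 100
# e.g. 64 GB RAM, ~2 GB per execution-engine Java process:
print(estimate_operating_limit(64, 2))    # 32
```

If the workflows are CPU-heavy rather than memory-heavy, the core count would supply a third, analogous ceiling.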

Note that you can also tune the amount of memory that each workflow run is allocated via the administrative API, which allows setting various flags to be passed into the runtime.
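A call to the administrative API to raise per-run memory might be composed as below. The resource path and property name are guesses, not the documented API; consult your server's admin interface documentation for the real names before sending anything.

```python
# Compose (but do not send) a hypothetical admin update that would raise
# the memory allocated to each run's execution engine.

def build_admin_update(server, prop, value):
    """Return a (method, url, body) triple for a hypothetical admin PUT."""
    return ("PUT", f"{server}/admin/{prop}", str(value))

method, url, body = build_admin_update(
    "https://example.org/taverna-server",   # placeholder server URL
    "defaultRunMemoryMB",                   # hypothetical property name
    1024)
print(method, url, body)
```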
