

It would be a good idea to link to a short article explaining what web services are and why they are useful, especially for scientist developers, i.e. to give context. What is considered the best short introduction (web services 101)?

There are two main types of Web services: RESTful (REpresentational State Transfer) and WSDL (Web Services Description Language)/SOAP.

RESTful services allow a client to manipulate resources (a resource can be essentially any coherent and meaningful concept that may be addressed) on the server providing the service. WSDL services allow the client to call operations on the server.


When you get to giving the guidelines, it might be an idea to have three sections: general guidelines; WSDL service guidelines; REST service guidelines. At the moment the sections jump from one to another, sometimes with no indication to the reader that you are referring to a specific style of service, unless the reader already has a good understanding of web services.


Add yourself if you edit this page

Alex Nenadic, Alan Williams, Robert Haines, Anton Güntsch


Alasdair Gray

RESTful services

RESTful services use the HTTP protocol. They are based around the concept of a resource and the HTTP methods that are used to access, replace, create or delete those resources. As described on Wikipedia, a common mapping of services is:






Collection URI

  • GET - List the URIs and perhaps other details of the collection's members.
  • PUT - Replace the entire collection with another collection.
  • POST - Create a new entry in the collection. The new entry's URL is assigned automatically and is usually returned by the operation.
  • DELETE - Delete the entire collection.

Element URI

  • GET - Retrieve a representation of the addressed member of the collection, expressed in an appropriate Internet media type.
  • PUT - Replace the addressed member of the collection, or if it doesn't exist, create it.
  • POST - Treat the addressed member as a collection in its own right and create a new entry in it.
  • DELETE - Delete the addressed member of the collection.

REST stands for REpresentational State Transfer. REST services are a type of Web service that typically exposes some of the following four types of operations:

  • GET - to get a resource
  • POST - to make a new resource or to perform a request (such as search)
  • PUT - to update a resource
  • DELETE - to delete a resource

Taverna supports all four of the above operations; based on the chosen operation users can configure different parameters.




Configuration parameters

GET and DELETE

  • Inputs: URL input parameters
  • Outputs: response body, status code, redirection
  • Configuration: URL signature, HTTP headers (such as Accept, etc.)

POST and PUT

  • Inputs: URL input parameters, request body
  • Outputs: response body, status code, redirection
  • Configuration: URL signature, HTTP headers (such as Accept, Content-Type, etc.)

URL input parameters are used to generate service input ports dynamically at the time of the service configuration. During configuration, the service URL signature is defined using a standard notation that indicates the parameters that distinguish the resource, and what we want to know about the resource and any filters, e.g.:

Code Block{id}{id}{id}?title={name}

The elements surrounded by {} are URL input parameters. Each one is used to generate an input port for the service, named after the parameter. Parameters are replaced with the data present at those ports at the time of execution. Taverna URL-encodes data used for URL parameters by default; the data can also be sent as it arrives, without any encoding.
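As a minimal sketch of how a URL signature of this kind can be expanded (this is not Taverna's own implementation; the function names, the encode flag and the example.org signature are assumptions made for illustration), the following Python derives the input port names from the {} placeholders and substitutes URL-encoded values for them:

```python
import re
from urllib.parse import quote

# Matches one {parameter} placeholder in a URL signature.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def input_ports(signature):
    """Return the parameter names found in a URL signature, in order of first use."""
    names = []
    for name in PLACEHOLDER.findall(signature):
        if name not in names:
            names.append(name)
    return names


def expand_signature(signature, values, encode=True):
    """Replace every {name} in the signature with the matching value."""
    def substitute(match):
        value = str(values[match.group(1)])  # KeyError if a port has no data
        return quote(value, safe="") if encode else value
    return PLACEHOLDER.sub(substitute, signature)


# Hypothetical signature, not a real service.
signature = "http://example.org/services/{id}?title={name}"
print(input_ports(signature))
print(expand_signature(signature, {"id": 42, "name": "blast search"}))
```

With encoding switched on, the space in "blast search" is sent as %20; with encode=False the value is inserted exactly as it arrives.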

Various HTTP headers can also be configured. The most notable ones are the Accept header (for all four operations), and the Content-Type (for PUT and POST only).

It is also possible to define a value for any other standard HTTP header and to add service-specific ones. Taverna simply includes the header name and value as they are when making the HTTP request and does not perform any action based on the headers being set. This means that, for example, there is no point in setting the Content-MD5 header unless you calculate the MD5 digest of the message yourself and set the value of the header to contain the digest, as Taverna will not do it for you.
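For illustration, a Content-MD5 value could be computed in a separate step before the request is made. The sketch below assumes the usual definition of the header (the base64 encoding of the binary MD5 digest of the body); the content_md5 helper and the request body are hypothetical:

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 header value: the base64-encoded MD5 digest of the body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


body = b"<query><term>kinase</term></query>"  # hypothetical request body
headers = {
    "Content-Type": "application/xml",
    "Content-MD5": content_md5(body),  # computed by us, not by the client tool
}
print(headers)
```

The resulting value would then be supplied to the service's Content-MD5 header alongside the body it was computed from.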


GET

Configured using a URL signature as described above, and optionally with the Accept header (the MIME type for the representation of the resource), and any other HTTP header the user may wish to configure.

Returns the body of the response, the final status code, and final redirection if applicable.

PUT

Configured using a URL signature as described above, and optionally with both the Content-Type (of the message being sent to the service), and the Accept header for the representation of the result. Other HTTP headers may be included as well.

Can take either the body representing the resource (according to the Content-Type), values as parameters, or a combination of both.

Returns the body of the response, the final status code, and final redirection if applicable.

POST

Same as PUT.

DELETE

Configured using a URL signature as described above.



POST is also commonly used as a means of performing a complex get or search operation. The POSTed data specifies the parameters to the search.

Tools to build REST services

The Java API for XML Web Services (JAX-WS) provides full support for building and deploying REST Web services. It is tightly integrated with the Java Architecture for XML Binding (JAXB) for binding XML to Java data and is included in Java 6.

Some guidelines on building REST Web services can be found online.

WSDL/SOAP Web services

Web Service Description Language (WSDL) Web services operate by exchanging Simple Object Access Protocol (SOAP) messages with clients over HTTP.

Taverna aims to support Web services that are compliant with the Web Services Interoperability (WS-I) standard, but this is a long-term goal and currently there are a few nuances. However, it is possible to create Web services that are compatible with Taverna if you follow certain guidelines.


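A WSDL binding has a style (RPC or document) and a use (encoded or literal), giving four combinations: RPC/encoded, RPC/literal, document/encoded and document/literal. There is also a fifth binding style, commonly referred to as document/literal wrapped. Thus, developers have five binding styles to choose from when creating a WSDL file. A good description of the differences between these styles can be found online.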


Although Taverna supports bindings that are RPC/encoded and RPC/literal to a fair extent, the preferred binding style is document/literal wrapped, i.e. the WSDL should have “style” attributes that are set to “document” and “use” attributes set to “literal”. This is particularly the case when dealing with complex types; for primitive types no problems are anticipated.

Currently not tested

The following are untested, and although not proven to fail, the behaviour is currently undefined. For this reason it is advised that the use of the following features is avoided.

  • Multiple WSDL imports. Taverna has only been tested on services that contain either no, or only one import of an additional WSDL file. For WSDLs that import more than one additional WSDL document, particularly if that WSDL has a different service endpoint to the others, the behaviour of Taverna is currently unclear. It is expected that it will fail when invoking the Web service. This does not affect imports of XML schema - this has been thoroughly tested and works as expected as long as schemas are publicly available.
  • Multiple service endpoints. For a given WSDL Taverna currently only references the first service endpoint. If more than one exists then operations belonging to the second endpoint are expected to fail.
  • Ambiguous type names. In the unusual case when an operation requires inputs that contain identically named types that belong to different namespaces it is expected that Taverna should not have any problems. However, because of the unusual nature of this it is untested and therefore not recommended.

Currently not working

The following are known to fail in Taverna and should be avoided.

  • Cyclic references. When processing the result of invoking an operation, Taverna resolves the XML into a single document. If the response contains cyclic references, this is detected and an error occurs (to prevent an infinitely long document). For this reason cyclic references should be avoided. (Taverna will, however, work with such an operation as long as the cyclic reference is not contained within the response data structure).
  • Overloaded operations. For a given service, Taverna distinguishes between its operations only by name. The operation signature is not used to distinguish between operations of the same name. For this simple reason, Taverna does not support overloaded operations.
  • xsd:anyType. The xsd:anyType type can be used to represent a parameter that can be of any type. Although Taverna can invoke a service that deals with the anyType type, the XML splitting mechanism for the message to be sent/received cannot work since there is no information about the data structure required or received. Such services can only be used by providing and/or manipulating the XML directly inside the workflow.
  • Sending SOAP messages with attachments. This refers to the method of using Web services to send and receive files as SOAP message attachments. Because a WSDL document does not describe whether a service can accept an attachment, Taverna cannot know this and therefore does not support it. However, if a service returns attachments (there can be more than one), Taverna can extract them and return them on a special service output port called attachmentList.


Service creation and deployment guidelines

Design the service

Consider how a service consumer will use the service

“An interface is a contract between data provider and data consumer”, Lincoln Stein, 2002.

Web services are nearly always implemented as an afterthought. Service providers usually already have some local code that they want to expose as a public service. So they generate an interface to it automatically in the "hack and publish" approach. The interface ends up being cumbersome and hard to use and understand, tied to the underlying local implementation and configuration, exposing internal ids, class names and formats.

Service providers should write contract-first services: define the service's interface and data types first, the way you want service consumers to see them, and then implement the service according to that interface. Also refer to "Contract-First Web Services: 6 Reasons to Start with WSDL and Schema" for more details.

Design your services with compatibility and interoperability in mind

More than 70% of the steps in a workflow are "gluey things" that convert data from one format to the other and less than 30% do some science with the data. If there is a data compliance model in your service domain you should consider coding against it.

There are a few initiatives that try to define data exchange formats and service ontologies, such as BioXSD Data Exchange Format, EDAM Data and Methods Ontology, BioSharing, VO Table interchange format, etc.

If your service returns data for which well-defined vocabularies exist (e.g. ISO country names, ISO languages), use them and document their use. Client systems will then be able to easily integrate services that use the same vocabularies.


Worth mentioning which domains these exchange formats are from and urls to them.

Consider reliability and stability of your services

Users value services that are reliable and stable but services decay over time if they are not maintained properly. Reliability and stability increase users' confidence and rating of your services.

Consider how registries and consumers can monitor the service

It is extremely useful for service consumers or service registries to be able to test whether a service is available and whether it is running correctly. It is good practice for service providers to consider how their service can be tested by external bodies. For example, a "ping" operation can be provided to check that a WSDL service is "alive". Some registries, such as the BioCatalogue, allow for tests to be included in the description of a service and those tests are automatically run by the registry.

Do not "press a button" on existing code.

It is very easy to think that providing a service can be done by simply running a tool over existing code. This may "tick the box" of making a service available, but the service is likely to be almost unusable by service consumers. When generating a service, it is necessary to think about what the service consumer needs. The best way to do this is to create the services "contract first" i.e. design the service's WSDL or REST and associated document formats, then from those generate the classes and methods for the service; the service methods are then implemented by calls to the existing code. Alternatively (and more commonly) the existing code should be wrapped and annotated so that consumer-oriented services are generated.


Any examples of the contract first or annotation approach - worth linking to


This section seems to be saying the same as the "Consider how a service consumer will use the service" section above – Alasdair

Allow the service to be used by consumers who know minimal information about the service

It is easy for the service provider to assume that service consumers will know almost as much about the service as they do. This assumption leads to a lack of documentation, strange features (parameters that change meaning or that should not be used together), and changes in the semantics of output and input data.

The service provider should assume that the service consumer's knowledge of the service is minimal. The service consumer merely wants to achieve the task exposed via the service.

Create services that do what consumers want to do

Services should expose consumer-oriented tasks i.e. what a service consumer wants to do. Services should not expose provider-implemented operations i.e. what the service provider's code does to perform a task.

If the service that is being exposed corresponds to methods or functions within the service provider's code, then the service consumer is forced to use it in a specific order to achieve a task and to process implementation-specific data.

Make your "REST service" truly RESTful

The concept of Hypermedia as the Engine of Application State and the suggestions described by Roy Fielding are extremely useful when creating a truly RESTful service. "A REST client needs no prior knowledge about how to interact with any particular application or server beyond a generic understanding of hypermedia." i.e. include links to related resources.
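As a minimal sketch of what "include links to related resources" can look like, the following Python builds a representation that carries its own links, and shows a client finding a related resource by its relation name rather than by constructing a URL. The field names (links, rel, href), the identifiers and the example.org URLs are all assumptions made for illustration, not a prescribed format:

```python
import json

# Hypothetical representation of one resource, with links to related resources.
sequence = {
    "id": "P12345",
    "length": 431,
    "links": [
        {"rel": "self", "href": "http://example.org/sequences/P12345"},
        {"rel": "organism", "href": "http://example.org/organisms/9606"},
        {"rel": "collection", "href": "http://example.org/sequences"},
    ],
}


def find_link(representation, rel):
    """Return the href of the first link with the given relation, or None."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


print(json.dumps(sequence, indent=2))
print(find_link(sequence, "organism"))
```

A client written this way only needs to know the relation names; the service provider remains free to change the URL layout.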

Structure REST resources



Will there be any content here - or could we have a sentence and point to a guide on structuring ?

Avoid creating WSDL services with an unreasonable number of operations

There are a number of WSDL services that have a large number (e.g. hundreds) of operations. This makes service annotation and usage very difficult. Consider grouping your operations into several services.


An example of such a WSDL service and in what way does it make it difficult to use (a bit more text by way of explanation needed)

Be consistent and standard

Follow service standards for your domain

There are a number of standards for services in particular domains, such as the TAPIR protocol, Web Map Service and the IVOA services. If your service is in a domain that has such standards then it is essential that you follow them.


Please mention which domains these are from.

It should be noted that most of these standards use similar concepts such as the need to deliver a capabilities document that describes the particular data or algorithms that a given service exposes. Such capabilities documents often include the description of the errors that the service can/should return. It is a good idea to deliver a capabilities document even if you are not implementing against a standard.


Need a sentence on what a capabilities document is

If there is no current standard for your domain/service, consider creating a group to develop such a standard. It will make the life of service consumers much easier and will ease the adoption of web services in your domain.

Comment from Anton Güntsch

For the (existing) recommendation "Follow service standards for your domain" I am not quite sure whether you really want this from a workflow developer's point of view. If you take TAPIR as an example (which, by the way, doesn't have a lot of uptake in the community compared to BioCASE, DiGiR and DwC-A): it might fulfil all the requirements of biodiversity primary data networking, but it is hard to understand and therefore not easy to use for persons who come across it for the first time and want to integrate it into their workflows/systems. Do we really want to recommend the use of such "standards"? I would rather say: use the standards of your community, but consider adding simplified, streamlined access points to make "external" users happy.

Use a registered format

If possible, use a format registered with the Internet Engineering Task Force (IETF). This is particularly important when the service is sending or receiving "leaf" data. If the format you want to use is not registered with the IETF, then take the small amount of effort needed to register it.


what is meant by 'leaf' data ?

How does this interact with domain specific standards, e.g. VOTable? I guess in most cases the domain ones will be built on top of IETF standards, but perhaps some explanation is needed.

Try to re-use formats

There are a large number of existing formats in which data can be specified. Wherever possible an existing format should be used by the service. The minimal advantage to the service provider of defining their own format is normally outweighed by maintenance costs; there are also inevitable costs for the service consumer in understanding and processing different formats.


This feels repetitive of the section immediately above and the "Design your services with compatibility and interoperability in mind" section

Specify the format

If a service is returning XML then the service provider should specify the xsd to which the data conforms. Similarly, if the service takes XML then the xsd of the data should be documented.

If the service is a WSDL service, then the data format should be specified via inclusion and referencing of the xsd within the WSDL. It is not correct to just specify any.


We need to be specific - is that xsd:anyType or is something else meant.

Return data in the format requested by the client

To quote from "Implementing REST Web Services: Best Practices and Guidelines": A resource may have more than one representation. There are four frequently used ways of delivering the correct resource representation to consumers:


An introductory sentence starting with 'A resource representation is a ....' to give some context to this section.

  1. Server-driven negotiation. The service provider determines the right representation from prior knowledge of its clients or uses the information provided in HTTP headers like Accept, Accept-Charset, Accept-Encoding, Accept-Language, and User-Agent. The drawback of this approach is that the server may not have the best knowledge about what a client really wants.
  2. Client-driven negotiation. A client initiates a request to a server. The server returns a list of available representations. The client then selects the representation it wants and sends a second request to the server. The drawback is that a client needs to send two requests.
  3. Proxy-driven negotiation. A client initiates a request to a server through a proxy. The proxy passes the request to the server and obtains a list of representations. The proxy selects one representation according to preferences set by the client and returns the representation back to the client.
  4. URI-specified representation. A client specifies the representation it wants in the URI query string.


    Note that for URI-specified representation, the representation is specified as an extension to the URI, not as a parameter of it.

    When the service provider receives a request for a resource in a specific format (character set etc.), it should either:

  5. return data in that format, or
  6. return a HTTP 406 Not Acceptable error

Points 5 and 6 are not at the same level as 1-4

It should never return data in a default format that does not match that specifically requested. Incorrectly returning data in a default format prevents the client from being notified of the format problem and also causes the client to process data in the wrong format.
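The rule "return the requested format or 406, never a silent default" can be sketched as follows. This is a deliberately simplified illustration, not a complete implementation of HTTP content negotiation: it honours q-values and wildcards but ignores media type parameters other than q, and the function names and the AVAILABLE list are assumptions:

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def matches(pattern, media_type):
    """True if a pattern such as text/* or */* covers the media type."""
    if pattern == "*/*":
        return True
    if pattern.endswith("/*"):
        return media_type.startswith(pattern[:-1])
    return pattern == media_type


def negotiate(accept_header, available):
    """Return (status, media_type); 406 with None when nothing acceptable exists."""
    if not accept_header.strip():
        return 200, available[0]  # the client stated no preference
    for pattern, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(pattern, media_type):
                return 200, media_type
    return 406, None


# Hypothetical list of representations the service can produce.
AVAILABLE = ["application/xml", "application/json"]
print(negotiate("application/json", AVAILABLE))
print(negotiate("text/csv", AVAILABLE))
```

The important design choice is the last line of negotiate: when the client asked for something specific that the service cannot produce, the result is 406 rather than the first available format.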

REST GET services are very similar to fetching pages for a web browser. Some service providers simply deliver an HTML document corresponding to what they would return when the corresponding web page is fetched.

Unless the service consumer is requesting an HTML document, the service provider should not return an HTML document. The service should return the data that underlies the HTML document.


Because it is possible to use web-content providers as if they are REST services, it is possible that apparent REST services within a workflow will actually be fetching HTML documents. Wherever possible it should be made clear that it (what is 'it') is not a "proper" REST service.

Be consistent with what is returned

Some REST services return JSON for certain resources, a choice of JSON or XML for other resources, and XML for other resources. This is very confusing for service consumers. Services should be consistent in the formats in which they can return data.

When a service can return a resource in alternate formats, the data returned must (as far as the formats allow) contain the same information. For example, if the JSON that is returned contains the name, address and telephone number of a person, so should the XML.
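One way to keep alternate formats in step is to generate every representation from a single internal record. The following sketch assumes a flat record with hypothetical field names and values; real data would need a richer mapping, but the principle is the same:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the single source for every representation.
person = {
    "name": "Ada Example",
    "address": "1 Sample Street",
    "telephone": "+44 161 000 0000",
}


def to_json(record):
    """Serialise the record as a JSON object."""
    return json.dumps(record, sort_keys=True)


def to_xml(record, root_name="person"):
    """Serialise the same record as XML, one child element per field."""
    root = ET.Element(root_name)
    for key in sorted(record):
        ET.SubElement(root, key).text = str(record[key])
    return ET.tostring(root, encoding="unicode")


def fields_in_json(text):
    """Field names present in a JSON representation."""
    return set(json.loads(text))


def fields_in_xml(text):
    """Field names present in an XML representation."""
    return {child.tag for child in ET.fromstring(text)}


json_text = to_json(person)
xml_text = to_xml(person)
print(json_text)
print(xml_text)
print(fields_in_json(json_text) == fields_in_xml(xml_text))
```

A check like the final comparison could be run as part of the service's own tests, so that a field added to one format cannot be forgotten in the other.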


Need a glossary or something (with links at the top - people will know XML but might not know JSON esp. if we are targeting scientists developers)

Return the correct error codes

Service providers should read the HTTP Status Code definitions and familiarize themselves with them.

Services should only return error codes as specified in the HTTP specification. Creating error codes that are specific to a service breaks the HTTP specification and prevents clients from using the service.

Services themselves should never intentionally return any error code in the 500 family. These are reserved for use by the harness within which the service is running (e.g. Apache). This way, if you see a 500 error code, you know that it is not your service that is broken but something further down the stack.


Further down the stack - is geek speak - needs to be made more developer scientist friendly.

Service providers should not only document what errors are returned, but also check (as far as feasible) what happens when unexpected requests are received.


When authorization is required, return a 401 (Unauthorized) error, not a 403 (Forbidden). In English, "Forbidden" seems appropriate, but the term is overloaded: in the HTTP specification, a client gaining authorization will not resolve a Forbidden error.

It is not uncommon for an HTTP 500 code (Internal Server Error) to be returned when attempting to access data that does not exist. The correct code should be returned instead, for example 404 (Not Found) if the resource does not exist, or 301 (Moved Permanently) if the resource has been permanently moved.
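The choices described in this section can be sketched as a small decision function that only ever returns standard status codes. The function name, its parameters and the order of the checks are assumptions made for illustration; a real service would derive these facts from its own request handling:

```python
from http import HTTPStatus


def choose_status(exists, moved_to=None, needs_auth=False,
                  authenticated=False, permitted=True):
    """Pick an HTTP status for a request, using only standard codes."""
    if needs_auth and not authenticated:
        return HTTPStatus.UNAUTHORIZED        # 401: supplying credentials may help
    if moved_to is not None:
        return HTTPStatus.MOVED_PERMANENTLY   # 301: the resource has a new permanent URI
    if not exists:
        return HTTPStatus.NOT_FOUND           # 404, not 500, for data that does not exist
    if not permitted:
        return HTTPStatus.FORBIDDEN           # 403: credentials will not help
    return HTTPStatus.OK


status = choose_status(exists=False)
print(status.value, status.phrase)
```

Note that the function never returns a code in the 500 family; those are left to the server software hosting the service.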

Use standard security mechanisms

There are standard mechanisms for limiting access for both WSDL and REST services, for example WS-Security and HTTPS. It is almost never a good idea for a service provider to use a novel security mechanism.

For example, some services require that a token is passed as one of the input parameters. However, that token is commonly constant for a given service consumer. In order to call the service, it is usual for service consumers to hard-code the token value. When the service is included in, for example, a workflow, the value of the token is shared along with the workflow. If a standard mechanism such as HTTP username+password had been used, then the security credentials would remain hidden.
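To illustrate the difference, the sketch below builds (but does not send) a request that carries HTTP username+password credentials in the Authorization header, so that the URL stored in a workflow contains no secret. The URL, user name and password are hypothetical, and HTTP Basic authentication like this should only be used over HTTPS:

```python
import base64
from urllib.request import Request


def build_request(url, username, password):
    """Build (but do not send) a GET request carrying HTTP Basic credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical URL and credentials; in practice the credentials would come
# from a credential store or a prompt, never from the shared workflow itself.
request = build_request("https://example.org/sequences/P12345", "alice", "s3cret")
print(request.full_url)  # the URL that gets shared contains no secret
print(request.get_header("Authorization"))
```

Because the credentials travel in a standard header, client tools can keep them in a credential manager and omit them when the workflow is shared.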

Comply with Web Services Interoperability Standard (WS-I)


The WS-I link is not the most informative page. Perhaps one of the other pages might be more useful:

"WS-I profiles define how existing WS (Web Services) specifications should be used in order to achieve maximum interoperability. The profiles effectively clarify the way existing standards should be used because the final documents were not clear in places or the flexibility they allowed was leading to interoperability nightmares. If the rules of WS-I profiles are followed, it is expected (but not guaranteed) that the resulting deployments would interoperate (at least as far as the underlying infrastructure is concerned)."

You can choose to make your Web service non-compliant, depending on your needs. For example, encoded style (RPC/encoded), SOAP over JMS protocols, and secured Web services, do not comply with the WS-I Basic Profile.


Might be worth stating why someone would want/need to make their WS non-compliant - e.g. examples of genuine need ?

Keep the service simple and task-oriented

Never expose implementation details

The internal details of the service implementation should be hidden, as far as possible. For example, just returning a serialization of a Java object is not a sensible or usable response to a service call.

The exposure of even internal ids can cause problems as the services may assume that an internal id is passed in. How is the service consumer meant to know the id?

Wherever possible, implementation classes, data and ids should not be exposed by the service. The service provider should look at what is actually returned by some example service calls and consider if they are actually usable.


This section feels more general and should be near the top of this document

A service should do one thing not many

Polymorphism is when the task performed by the service is determined by one of the parameters to the service call. For a polymorphic service, the validity, meaning or permitted values for other parameters may depend upon the value of the controlling parameter. It is possible to have polymorphic WSDL or REST services.

For example, consider a polymorphic service that allows you to query one of several databases, such as books or films. A query may then appear as:

where it is not valid to specify the director for a book and the semantics of author switches from writer to screenwriter depending upon the database.

It is far better to have separate resources corresponding to, in this case, the separate databases with the correct parameters specified for the resources.
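As a sketch of what separate resources could look like for the books and films example, the following Python gives each database its own resource with its own set of valid parameters, so an invalid combination is rejected instead of silently changing meaning. The base URL, resource names and parameter names are hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameter names, for illustration only.
BASE = "http://example.org"

# Each resource declares exactly the parameters that make sense for it.
ALLOWED = {
    "books": {"title", "author"},
    "films": {"title", "director", "screenwriter"},
}


def query_url(resource, **params):
    """Build a query URL for one resource, rejecting parameters it does not define."""
    if resource not in ALLOWED:
        raise ValueError(f"unknown resource: {resource}")
    unexpected = set(params) - ALLOWED[resource]
    if unexpected:
        raise ValueError(f"{sorted(unexpected)} not valid for {resource}")
    return f"{BASE}/{resource}?{urlencode(sorted(params.items()))}"


print(query_url("books", author="Austen"))
print(query_url("films", director="Kurosawa", screenwriter="Hashimoto"))
```

Here the films resource has an explicit screenwriter parameter, so the meaning of author no longer depends on which database is being queried.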


worth indicating what a re-engineered set of endpoints would look like using the above example.

Transfer references to data until the data is actually requested

It is normally better to transfer references to data, for example a URL, rather than the actual data, especially when the data is of significant size. The only exception to this is when the data is explicitly fetched, for example by a HTTP GET operation.


Well put - but might need to put in some more text as this is quite a common issue with how to handle data when using web services - not sure what extra info is needed though - apart from 'note this is important - and here is a good example.'

Do not specify the format as a parameter of a REST service

The HTTP protocol used by REST has a mechanism for specifying in the message header the format that the data should be returned in. For example:

Accept: application/xml

Accept: text/xml

REST services should not specify the format as part of the URL or message body e.g.

This is not RESTful. Such a resource should be requested with the Accept header set to application/xml.
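From the client's side, requesting a format through the header rather than the URL can be sketched as follows. The resource URI is hypothetical and the requests are only built, not sent:

```python
from urllib.request import Request

# Hypothetical resource URI; note that it contains no format parameter.
RESOURCE = "http://example.org/sequences/P12345"


def request_as(url, media_type):
    """Build (but do not send) a GET request asking for one representation."""
    return Request(url, headers={"Accept": media_type})


xml_request = request_as(RESOURCE, "application/xml")
json_request = request_as(RESOURCE, "application/json")

# Both requests address the same resource; only the Accept header differs.
print(xml_request.full_url == json_request.full_url)
print(xml_request.get_header("Accept"), json_request.get_header("Accept"))
```

The resource keeps a single URI regardless of how many formats the service supports, which is what allows the URI to be used as a stable reference.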


Worth specifying why is not being RESTful is a bad thing - e.g. path of least surprise - harder to learn etc

The use of a format parameter can cause problems. For example, if the service consumer requests "" with "Accept: application/json" then it should be impossible for the transaction to take place.

An alternative is to use URI-specified resource representation. This would use the extension .xml in the URI, for example

Avoid parameter abuse

Many services abuse their input parameters to pass information that changes, for example, the method invoked, the database searched or the content type of the returned data. This parameter abuse hides the resources used and is particularly common in REST services (also see the Avoid polymorphism and Do not specify the format as a parameter of a REST service points above).

For example, not all services follow the content negotiation principle whereby you can ask for the resource in different formats just by changing the HTTP header value to another known MIME type; some services expose the content type as an actual parameter to the REST call.

Example of using a service as a parameter:

Example of using a database name as a parameter:


not clear if example are showing good practice or bad practice (I think bad) - might be worth saying what the good equivalents would look like - is it worth putting in pertinent HTTP header lines in these examples - however I guess these might be at the mercy of point and click software that makes web services out of existing code


Feels like a lot of repetition of what has gone before 

Avoid complicated XML input and output data structures

XML data structures with a high level of nesting are hard to read and understand. Try to keep your data structures clean and simple. Do not expose implementation details in XML that results from automatically generating services from existing code.

An example of a not very good XML is shown in the image below.


Where is the image ?

How does this play out with use domain standards?

Avoid anonymous message attachments

(for WSDL services)


What is this? How would you generate anonymous messages?

Why is this bad and what problems can it lead to

Avoid obscure WSDL

Even though WSDL documents are not meant for human consumption, avoid WSDL documents such as <wsdl:part name="in0" type="xsd:string"/>. Parameter name "in0" means nothing to a user trying to invoke your service.
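A service provider could check a WSDL document for such names before publishing it. The sketch below is one possible heuristic, not an established tool: the WSDL fragment is hypothetical apart from the in0 part quoted above, and the pattern of "generated-looking" names is an assumption that would need tuning for a given toolkit:

```python
import re
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hypothetical WSDL fragment; only the in0 part comes from the text above.
FRAGMENT = """
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <wsdl:message name="searchRequest">
    <wsdl:part name="in0" type="xsd:string"/>
    <wsdl:part name="maxResults" type="xsd:int"/>
  </wsdl:message>
</wsdl:definitions>
"""

# Assumed pattern for names typically left behind by code generators.
GENERATED = re.compile(r"^(in|out|arg|param|return)\d*$", re.IGNORECASE)


def obscure_part_names(wsdl_text):
    """Return the names of message parts that look auto-generated."""
    root = ET.fromstring(wsdl_text)
    parts = root.iter(f"{{{WSDL_NS}}}part")
    return [p.get("name") for p in parts if GENERATED.match(p.get("name", ""))]


print(obscure_part_names(FRAGMENT))
```

Any name reported by such a check is a candidate for renaming to something that tells the service consumer what the value means, as maxResults does here.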


Should the title for this be use meaningful parameter names in WSDL documents (can automatic WS creation tools get in the way here ?)

Apply good software engineering practices when naming parameters?


Some of the subsections above get right down to the nitty-gritty of the service style (WSDL/REST), while some other bits are very generic.

Test and maintain the service

Assess and document performance limitations

Predicting performance bottlenecks for a service can be difficult. There are, however, tools such as Apache JMeter which can be used to assess the behaviour of services in a structured way. Detected performance limitations should then be included in the service documentation.

Use your services in your own systems and workflows

Many service implementations lack a clear use case and are just exposed in addition to a human-readable portal for an unknown future application. To ensure that services provide the expected functionality and performance, developers should try to use them as much as possible in their own systems, for example by having web portals use these services rather than local APIs.

Document and register the service

Describe the service

Many services are poorly described. Even when the services are registered with, for example, the BioCatalogue, only minimal effort is put into documenting them. An undocumented service is an unusable service. The documentation should include, at least,

  • description of the task that the service performs - what it does from a service consumer's point of view
  • what can be used as input, with example values and description of what happens if the value is not specified
  • what will be returned as output, including example values
  • possible error messages, including what they mean from the point of view of the input data and the intended task. The error message should not be described by relation to what has gone wrong in the provider's code

It is very useful if the documentation includes:

  • example programs, scripts and/or workflows (including example input data and results) that use the service
  • other services that work well with the service

Document any choreography

Some services require that they are called as part of a group and in a specific order - this is often termed a "choreography". An example of services requiring a choreography are the WSDL operations generated by Soaplab. If your services are expected to work as part of a choreography, or in a more extreme case, will not work except in a choreography, then you must document the choreography. The documentation should include the description of the order in which the individual services must be called and also, as far as possible, what will happen if services are called "out of order".


What form should this documentation of choreography take? Is there a format, or is what is meant here examples and human-readable text?

Register your service

Unless you intend to keep your service private, it is very good practice to register it. There are a number of service registries. Some of them, such as seekda, are general and can be used for any WSDL service. Other registries are domain-specific, for example the BioCatalogue for Life Science Web Services or one of the registries of the International Virtual Observatory Alliance (IVOA) for archives of astronomical data.

Consider separating availability information

If your service is currently unavailable, then normally service consumers will have no idea what has gone wrong or when the service will re-appear. Some services, such as those registered with IVOA, allow the registration of a URL where availability information is held, with the actual services preferably being at a separate address.

If the registries where you register your service do not support availability information, then it is a good idea to press them to support such information. Although there is currently no standard mechanism for describing availability information, one such mechanism is ...


Pointer to the way BioCatalogue does it ?

Standing up services

Consider cloud-based service provisioning

Hosting your services on a cloud provides greater reliability, stability and capacity for users in comparison to services run on a desktop PC under someone's desk.


Can you make a financial case for why providing a cloud service is better? At the end of the day, providers are likely to be grant funded and if they have not budgeted for providing the service, then there may be no money for hosting on the cloud and deploying on their desktop is free.

Other resources

Making your Web services compatible with Workflow Management Systems (e.g. Taverna)

In addition to the general advice on building Web services above, there are still a few things to bear in mind when building your services to work with workflow management systems such as Taverna. Please refer to the "Web services in Taverna" document for details.