There are two main types of Web services: REST (REpresentational State Transfer) and WSDL (Web Services Description Language)/SOAP.
REST services use the HTTP protocol. As described on Wikipedia, a common mapping of HTTP methods to services is:
Collection URI, such as http://example.com/resources/
- GET: List the URIs and perhaps other details of the collection's members.
- PUT: Replace the entire collection with another collection.
- POST: Create a new entry in the collection. The new entry's URL is assigned automatically and is usually returned by the operation.
- DELETE: Delete the entire collection.
Element URI, such as http://example.com/resources/item17
- GET: Retrieve a representation of the addressed member of the collection, expressed in an appropriate Internet media type.
- PUT: Replace the addressed member of the collection, or if it doesn't exist, create it.
- POST: Treat the addressed member as a collection in its own right and create a new entry in it.
- DELETE: Delete the addressed member of the collection.
POST is also commonly used as a means of performing a complex get or search operation. The POSTed data specifies the parameters of the search.
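The mapping above can be sketched as a small in-memory dispatcher. This is only an illustration under stated assumptions: the class name `ResourceStore`, the `item<N>` id scheme and the chosen status codes are hypothetical, and POST on an element URI is left out for brevity (it falls through to 405).

```python
from itertools import count


class ResourceStore:
    """In-memory stand-in for a collection such as /resources/ (hypothetical)."""

    def __init__(self):
        self.items = {}
        self._ids = count(1)

    def handle(self, method, item_id=None, body=None):
        """Return (status, payload) for a request on the collection or one member."""
        if item_id is None:  # collection URI
            if method == "GET":
                return 200, sorted(self.items)  # list the member ids
            if method == "PUT":
                self.items = dict(body)  # replace the whole collection
                return 200, None
            if method == "POST":
                new_id = f"item{next(self._ids)}"  # the server assigns the id
                while new_id in self.items:
                    new_id = f"item{next(self._ids)}"
                self.items[new_id] = body
                return 201, new_id
            if method == "DELETE":
                self.items.clear()
                return 204, None
        else:  # element URI
            if method == "GET":
                if item_id not in self.items:
                    return 404, None
                return 200, self.items[item_id]
            if method == "PUT":
                created = item_id not in self.items
                self.items[item_id] = body  # replace, or create if absent
                return (201 if created else 200), None
            if method == "DELETE":
                if item_id not in self.items:
                    return 404, None
                del self.items[item_id]
                return 204, None
        return 405, None  # method not supported on this URI
```

For example, `ResourceStore().handle("POST", body={"name": "a"})` creates an entry and returns its assigned id, while `handle("PUT", "item17", {...})` creates or replaces the member addressed by the caller.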
Tools to build REST services
The Java API for XML Web Services (JAX-WS) provides full support for building and deploying REST Web services. It is tightly integrated with the Java Architecture for XML Binding (JAXB) for binding XML to Java data and is included in Java 6.
Some guidelines on building REST Web services can be found online.
WSDL/SOAP Web services
WSDL Web services are defined by their WSDL document, an XML format that represents an interface to a Web service. WSDL is a machine-readable description of the operations (functions) and parameters offered by the service, i.e. the XML message types that the service receives and produces and that are wrapped in the SOAP messages exchanged between the client and the service.
A WSDL binding describes how a particular Web service is bound to the underlying SOAP messaging protocol (or any other protocol used as a carrier). A WSDL-SOAP binding can be either a Remote Procedure Call (RPC)-style binding or a document-style binding. A SOAP binding can also have an encoded use or a literal use. This gives four style/use models:
- RPC/encoded
- RPC/literal
- document/encoded
- document/literal
There is also a fifth binding style that is commonly referred to as document/literal wrapped. Thus, developers have five binding styles to choose from when creating a WSDL file. A good description of the differences between these styles can be found online.
Common service creation mistakes
"Press a button" service publishing
It is tempting to think that a service can be provided simply by running a tool over existing code. This may "tick the box" of making a service available, but the result is likely to be almost unusable by service consumers. When creating a service, it is necessary to think about what the service consumer needs. The best way to do this is to create the service "contract first": design the service's WSDL or REST interface and the associated document formats, generate the classes and methods for the service from those, and then implement the service methods as calls to the existing code. Alternatively (and more commonly), the existing code should be wrapped and annotated so that consumer-oriented services are generated.
Assumption of knowledge
It is easy for the service provider to assume that service consumers will know almost as much about the service as the provider does. This assumption leads to a lack of documentation, strange features (parameters that change meaning or that should not be used together), and changes in the semantics of output and input data.
The service provider should assume that the service consumer's knowledge of the service is minimal. The service consumer merely wants to achieve the task exposed via the service.
Lack of documentation
Many services are poorly described. Even when the services are registered with, for example, the BioCatalogue, only minimal effort is put into documenting them. An undocumented service is an unusable service. At a minimum, the documentation should include:
- description of the task that the service performs - what it does from a service consumer's point of view
- what can be used as input, with example values and description of what happens if the value is not specified
- what will be returned as output, including example values
- possible error messages, including what they mean from the point of view of the input data and the intended task. The error message should not be described by relation to what has gone wrong in the provider's code
It is very useful if the documentation includes:
- example workflows (including example input data and results) that use the service
- other services that work well with the service
Services should expose consumer-oriented tasks i.e. what a service consumer wants to do. Services should not expose provider-implemented operations i.e. what the service provider's code does to perform a task.
If the service that is being exposed corresponds to methods or functions within the service provider's code, then the service consumer is forced to use it in a specific order to achieve a task and to process implementation-specific data.
Exposure of implementation data
The internal details of the service implementation should be hidden, as far as possible. For example, just returning a serialization of a Java object is not a sensible or usable response to a service call.
The exposure of even internal ids can cause problems as the services may assume that an internal id is passed in. How is the service consumer meant to know the id?
Wherever possible, implementation classes, data and ids should not be exposed by the service. The service provider should look at what is actually returned by some example service calls and consider if they are actually usable.
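One way to keep implementation details out of a response is to build the consumer-facing representation explicitly instead of serializing the internal object. The sketch below is an assumption-laden illustration: the `SequenceRecord` class and all of its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """Hypothetical internal record, as the provider's code might store it."""
    row_id: int      # internal database key, meaningless to consumers
    accession: str   # public identifier that consumers already know
    organism: str
    cache_path: str  # implementation detail


def to_public(record):
    """Build the consumer-facing representation, omitting internal fields."""
    return {"accession": record.accession, "organism": record.organism}
```

Listing the public fields one by one means that a new internal field is never exposed by accident; serializing the whole object would publish `row_id` and `cache_path` as well.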
Do not return HTML unless it is HTML data
REST GET services are very similar to fetching pages for a web browser. Some service providers simply deliver an HTML document corresponding to what they would return when the corresponding web page is fetched.
Unless the service consumer is requesting an HTML document, the service provider should not return an HTML document. The service should probably return the data that underlies the HTML document.
Because it is possible to use web-content providers as if they were REST services, apparent REST services within a workflow may actually be fetching HTML documents. Wherever possible it should be made clear that such a source is not a "proper" REST service.
Do not specify the format as a parameter of a REST service
The HTTP protocol used by REST has a mechanism for specifying, in the message header, the format in which the data should be returned. For example:
Accept: application/xml
REST services should not specify the format as part of the URL or message body, e.g.
http://www.example.com/fred?format=xml
Such a resource should be requested from http://www.example.com/fred with the header Accept: application/xml.
The use of a format parameter can cause problems. For example, if the service consumer requests "http://www.example.com/fred?format=xml" with "Accept: application/json", the two instructions contradict each other and the service cannot honour both.
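On the provider side, selecting the response format from the Accept header can be sketched as follows. This is a simplified illustration, not a complete implementation of HTTP content negotiation: the `SUPPORTED` list and the function name `choose_format` are hypothetical, and media-type parameters other than `q` are ignored.

```python
# Hypothetical formats offered by the service, in the server's order of preference.
SUPPORTED = ["application/xml", "application/json"]


def choose_format(accept_header, supported=SUPPORTED):
    """Pick a media type for the response; None means nothing is acceptable (HTTP 406)."""
    ranges = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))

    if not ranges:
        return supported[0]  # no Accept header: any format is acceptable

    best, best_q = None, 0.0
    for media_type in supported:
        main_type = media_type.partition("/")[0]
        # The most specific matching range wins: exact, then type/*, then */*.
        for candidate in (media_type, f"{main_type}/*", "*/*"):
            match = [q for r, q in ranges if r == candidate]
            if match:
                if match[0] > best_q:
                    best, best_q = media_type, match[0]
                break
    return best
```

With this approach the URL identifies only the resource, so the same URL can serve XML to one consumer and JSON to another without any format parameter.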
Be consistent with what is returned
Some REST services return JSON for certain resources, a choice of JSON or XML for other resources, and XML for other resources. This is very confusing for service consumers. Services should be consistent in the formats in which they can return data.
Specify the format
If a service is returning XML then the service provider should specify the XSD or DTD to which the data conforms. Similarly, if the service takes XML then the XSD or DTD of the data should be documented.
If the service is a WSDL service, then the data format should be specified via inclusion and referencing of the XSD within the WSDL. It is not correct to just specify xsd:any.
Try to re-use formats
There are a large number of existing formats in which data can be specified. Wherever possible an existing format should be used by the service. The minimal advantage to the service provider of defining their own format is normally outweighed by maintenance costs; there are also inevitable costs for the service consumer in understanding and processing different formats.
Return the correct errors
Many service providers do not return the correct errors. For example, it is not uncommon for an HTTP 500 (Internal Server Error) code to be returned when a consumer attempts to access data that does not exist, where a 404 (Not Found) code would be correct. Service providers should not only document what errors are returned, but also check (as far as feasible) what happens when unexpected requests are received.
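A minimal sketch of separating consumer errors from provider faults is shown below. The `RECORDS` store, the `get_record` function and the id validation rule are hypothetical; the status codes come from Python's standard `http.HTTPStatus`.

```python
from http import HTTPStatus

RECORDS = {"item17": {"name": "example"}}  # hypothetical data store


def get_record(record_id, records=RECORDS):
    """Return (status, body), distinguishing client errors from server faults."""
    if not record_id or not record_id.isalnum():
        # Malformed input is reported as such, in terms of the input data.
        return HTTPStatus.BAD_REQUEST, {"error": "record id must be alphanumeric"}
    try:
        return HTTPStatus.OK, records[record_id]
    except KeyError:
        # Missing data is a 404 for the consumer, not a 500 from the provider.
        return HTTPStatus.NOT_FOUND, {"error": f"no record named {record_id!r}"}
```

The error bodies describe the problem from the point of view of the request, not in terms of what went wrong inside the provider's code.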
Avoid polymorphism
Polymorphism is when the task performed by the service is determined by one of the parameters to the service call. For a polymorphic service, the validity, meaning or permitted values of other parameters may depend upon the value of the controlling parameter. It is possible to have polymorphic WSDL or REST services.
For example, consider a polymorphic service that allows you to query one of several databases, such as books or films. A query then names the database alongside parameters such as author and director, but it is not valid to specify the director for a book, and the semantics of author switch from writer to screenwriter depending upon the database.
It is far better to have separate resources corresponding to, in this case, the separate databases with the correct parameters specified for the resources.
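As a rough sketch of the separate-resources approach, each resource can have its own handler that accepts only the parameters that make sense for it. The function names, parameter names and return values here are hypothetical.

```python
def search_books(*, author=None, title=None):
    """Hypothetical handler for a books resource; 'author' always means the writer."""
    return {"resource": "books", "author": author, "title": title}


def search_films(*, director=None, screenwriter=None, title=None):
    """Hypothetical handler for a films resource; each role has its own parameter."""
    return {
        "resource": "films",
        "director": director,
        "screenwriter": screenwriter,
        "title": title,
    }
```

Calling `search_books(director="...")` fails immediately with a `TypeError`, so an invalid combination cannot be expressed at all, and no parameter changes meaning depending on another.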
Avoid creating services with unreasonable number of operations
There are a number of services that have a very large number of operations. For example, the 'togows' services described in BioCatalogue have between 72 and 369 operations each. This makes annotation and usage very difficult.
Avoid parameter abuse
Many services abuse their input parameters to pass information that changes, for example, the method invoked, the database searched or the content type of the returned data. This parameter abuse hides the resources used and is particularly prevalent in REST services (see also the Avoid polymorphism and Do not specify the format as a parameter of a REST service points above).
For example, not all services follow the content negotiation principle whereby you can ask for the resource in different formats just by changing the HTTP header value to another known MIME type; some services expose the content type as an actual parameter to the REST call.
Service as a parameter:
Database name as a parameter:
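As a hypothetical illustration of both patterns, the sketch below takes a URL that passes the service and the database as query parameters and rewrites it so that they become part of the resource path. The parameter names `service` and `db`, the function name and the example URL are all assumptions made for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def to_resource_url(url, promote=("service", "db")):
    """Move the listed query parameters into the path; keep the rest as the query."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Pull out the parameters that really identify a resource.
    segments = [query.pop(name)[0] for name in promote if name in query]
    path = "/".join([parts.path.rstrip("/")] + segments)
    rest = urlencode(query, doseq=True)
    return f"{parts.scheme}://{parts.netloc}{path}" + (f"?{rest}" if rest else "")


abusive = "http://www.example.com/api?service=search&db=books&author=Austen"
print(to_resource_url(abusive))
```

After the rewrite, only genuine search criteria such as `author` remain as parameters, and the resource being queried is visible in the URL itself.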
Favour contract-first over code-first approach
“An interface is a contract between data provider and data consumer”, Lincoln Stein, 2002.
Web services are nearly always implemented as an afterthought. Service providers usually already have some local code that they want to expose as a public service. So they generate an interface to it automatically in the "hack and publish" approach. The interface ends up being cumbersome and hard to use and understand, tied to underlying implementation, exposing internal ids, class names and formats.
Service providers should write contract-first services - define the service's interface and data types first and then implement the service according to the interface.
Please refer to "Contract-First Web Services: 6 Reasons to Start with WSDL and Schema" for more details.
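The idea can be sketched in a few lines. Here an abstract class stands in for the contract that a WSDL and schema would normally express, and the implementation adapts existing code to it. All names (`SequenceSearch`, `legacy_lookup`, `LegacyBackedSearch`) and the data are hypothetical.

```python
from abc import ABC, abstractmethod


class SequenceSearch(ABC):
    """The contract: written first, in the consumer's terms (hypothetical)."""

    @abstractmethod
    def find_by_accession(self, accession: str) -> dict:
        """Return {'accession': ..., 'organism': ...} or raise LookupError."""


def legacy_lookup(key):
    """Stand-in for existing provider code with its own conventions."""
    table = {"P12345": ("P12345", "Homo sapiens", 42)}  # last field: internal id
    return table.get(key)


class LegacyBackedSearch(SequenceSearch):
    """Implements the contract by calling the existing code."""

    def find_by_accession(self, accession):
        row = legacy_lookup(accession)
        if row is None:
            raise LookupError(f"unknown accession {accession!r}")
        acc, organism, _internal_id = row  # the internal id stays internal
        return {"accession": acc, "organism": organism}
```

Because the contract is fixed first, the existing code can later be replaced or refactored without changing what service consumers see.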
Design service with compatibility and interoperability in mind
More than 70% of the steps in a workflow are "gluey things" that convert data from one format to another, and less than 30% do some science with the data. If there is a data compliance model in your service domain, you should consider coding against it. There are several initiatives to define data exchange formats and service ontologies, such as the BioXSD Data Exchange Format, the EDAM Data and Methods Ontology, BioSharing and the VOTable interchange format.