
Requirements for Semantic Infrastructure

This page discusses the API-level and (middleware) user requirements for Semantic Web (Linked Data) support in Taverna and other e-Science tools (e.g. e-Laboratories for the BioBanking community).

Background information: AIDA

The AIDA (Adaptive Information Disclosure Application) toolbox is a suite of services for knowledge management (semantic infrastructure) and knowledge extraction. AIDA provides (middleware) components for user-oriented applications such as Taverna or the AIDA Web Application, and requirements from these applications drive its development. The core services support Semantic Web applications by providing Java code (Apache licensed) and SOAP-based services for communication with Semantic Web repositories. This works both at a low level (e.g. SPARQL endpoints supporting the Sesame API and others) and via mappings to simpler ontologies, in particular SKOS, which enables browsing of RDF graphs in terms of 'narrower-than' and 'broader-than' relations. A Taverna plugin is being developed to enable personalised annotation of workflow results, so that data from personal workflows can become part of the Linked Data cloud.

User Focus groups

  • Data miners (e-Lico project)
  • Human Geneticists and BioBankers (BioSemantics focus group Human Genetics Department Leiden, NL)
  • Taverna users
  • Concept Web users
  • Social scientists (to be specified by Tom Visser)

General requirements for Semantic Infrastructure

User wishes
  • Infrastructure for personal semantic annotations and semantically annotated data as part of the Linked Data cloud.
    While users can annotate with familiar terms from their own domain, the way the actual RDF data is stored stays invisible to them. Setting up and using the repository should be as easy as setting up a Dropbox or Google account. Applications such as Taverna should be aware of such a personal Linked Data store, presumably through a reference kept in the user preferences.
  • Authorized edits of semantic data from within applications (pertains to controlled collaborative ontology building, where changes are linked to the author of the changes)

Requirements for Semantic Support within Taverna 

This part discusses the API-level and (middleware) user requirements for Semantic Web support in Taverna.

User wishes specific to Taverna (e-Lico, BioSemantics/Human Genetics, Social sciences?)

  • Define personal annotations of workflow results at design time: these will be linked to the results produced at run-time, presumably connected 'under the hood' to Taverna provenance.
  • Annotate workflow results after running the workflow.
  • A way to store the links between workflow and semantic annotations (in theory this can all be done 'by reference', cf. RDF principles; myExperiment seems an obvious choice to provide workflow identifiers).
  • Workflow components that will interrogate the semantic store at run-time (current AIDA services already allow this, but how much of that should be visible in the workflow?)
  • Simple browsing and drag-and-drop of semantic data at design time (before running), run-time (components in the workflow), and after run-time (workflow results)
  • Advanced SPARQL querying at design time, run-time, after run-time.
  • Collaborative ontology building = authorized edits of semantic data from within Taverna

Quicklist of general requirements:

  • Identifiers for ports (especially this!)
  • Identifiers for workflow components
  • A way to store and represent (i.e. export) the semantic annotations of workflow components 
    • A URL to the default repository to store user annotations (this will be provided by AID/CWA, details in a separate project "Semantic Dropbox")
    • SAWSDL (Semantic Annotations for WSDL), for export of workflows that contain semantic annotations

Use Case Scenario:

A user has started to compose a workflow from several web services. One of the web services produces chromosome position output, which is declared in WSDL to be of type Integer. The user wants to assign it a Semantic Data Type of "Chromosome Position" (@URI TBD, e.g. from an NCBO ontology at http://bioportal.bioontology.org). The user right-clicks on the corresponding port and selects "Label with Semantic Type" from the context menu. A pop-up window lets the user select the appropriate concept from OWL/RDF by autocompletion, search, or browsing (this functionality is already provided by AIDA). The concept URI is then stored in association with the port identifier. Example triple: (taverna:port12345 hasSemanticDataType GO:'Chromosome Position').
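The example triple above could be written out as a single N-Triples statement. The following is a minimal sketch only: the class name, the port and concept URIs, and the namespace of hasSemanticDataType are placeholders invented for illustration, since the page leaves the actual URIs to be determined.

```java
import java.net.URI;

/** Sketch: a port annotation expressed as one RDF triple in N-Triples syntax. */
final class PortAnnotationTriple {
    // Hypothetical predicate URI; the page only names the property "hasSemanticDataType".
    static final URI HAS_SEMANTIC_DATA_TYPE =
            URI.create("http://example.org/taverna/annotation#hasSemanticDataType");

    /** Formats subject, predicate and object URIs as a single N-Triples statement. */
    static String toNTriple(URI portId, URI predicate, URI conceptUri) {
        return "<" + portId + "> <" + predicate + "> <" + conceptUri + "> .";
    }

    public static void main(String[] args) {
        // Placeholder identifiers, not real Taverna or ontology URIs.
        URI port = URI.create("http://example.org/taverna/port/12345");
        URI concept = URI.create("http://example.org/ontology/ChromosomePosition");
        System.out.println(toNTriple(port, HAS_SEMANTIC_DATA_TYPE, concept));
    }
}
```

Because both the port and the concept are plain URIs, the annotation can live in any RDF store without the workflow file itself having to change.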

API requirements:

Warning: Pseudocode scratchpad from rusty programmer - to be refined later

In general, URIs (interoperable identifiers) are made part of Semantic Java Objects such as SemanticType and WorkflowPort.

get and setSemanticType

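getSemanticType retrieves the SemanticType (Java Object) associated with a given port ID.

SemanticType getSemanticType(SemanticRepository currentSemanticRepository, WorkflowPort id)
{
    // Connect to the repository that was passed in
    RepositoryConnection connection = currentSemanticRepository.getConnection();
    // Look up the semantic type stored for this port
    SemanticType type = connection.getSemanticType(id);
    // .... (remaining steps still to be specified)
    return type;
}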
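To make the intended get/set behaviour concrete, here is a small self-contained sketch that uses an in-memory map in place of a real repository. Everything in it is an assumption for illustration: the shapes of SemanticType and WorkflowPort, the InMemorySemanticRepository class, and the example URIs do not come from AIDA or Taverna.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** A semantic type identified by a concept URI (hypothetical shape). */
final class SemanticType {
    final URI conceptUri;
    SemanticType(URI conceptUri) { this.conceptUri = conceptUri; }
}

/** A workflow port identified by a URI (hypothetical shape). */
final class WorkflowPort {
    final URI id;
    WorkflowPort(URI id) { this.id = id; }
}

/** In-memory stand-in for a semantic repository; a real one would talk to an RDF store. */
final class InMemorySemanticRepository {
    private final Map<URI, SemanticType> typesByPort = new HashMap<>();

    /** Associates a semantic type with a port, replacing any earlier association. */
    void setSemanticType(WorkflowPort port, SemanticType type) {
        typesByPort.put(port.id, type);
    }

    /** Returns the type stored for the port, or an empty Optional if none was set. */
    Optional<SemanticType> getSemanticType(WorkflowPort port) {
        return Optional.ofNullable(typesByPort.get(port.id));
    }

    public static void main(String[] args) {
        InMemorySemanticRepository repository = new InMemorySemanticRepository();
        WorkflowPort port = new WorkflowPort(URI.create("http://example.org/taverna/port/12345"));
        repository.setSemanticType(port,
                new SemanticType(URI.create("http://example.org/ontology/ChromosomePosition")));
        repository.getSemanticType(port)
                .ifPresent(type -> System.out.println(type.conceptUri));
    }
}
```

Returning an Optional is one possible design choice: it makes the "port has no annotation yet" case explicit for the caller instead of relying on null.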

AIDA has services that make it possible to do the above with calls to web services. For example, see addRDF and selectQuery at http://ws.adaptivedisclosure.org/.
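As an illustration of the kind of query a select-style service could be given, the sketch below builds a SPARQL SELECT string that looks up the semantic type of a port. It deliberately stops at building the string, because the signature of AIDA's selectQuery is not shown on this page. The class name, method name, and URIs are assumptions.

```java
import java.net.URI;

/** Sketch: builds a SPARQL query that retrieves the semantic type(s) of one port. */
final class PortTypeQuery {
    /**
     * Returns a SELECT query binding ?type to every object of
     * (portId, predicate, ?type) in the queried graph.
     */
    static String selectSemanticTypes(URI portId, URI predicate) {
        return "SELECT ?type WHERE { <" + portId + "> <" + predicate + "> ?type }";
    }

    public static void main(String[] args) {
        // Placeholder URIs, not real Taverna identifiers.
        URI port = URI.create("http://example.org/taverna/port/12345");
        URI predicate = URI.create("http://example.org/taverna/annotation#hasSemanticDataType");
        System.out.println(selectSemanticTypes(port, predicate));
    }
}
```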

See http://www.adaptivedisclosure.org/aida for details and javadocs.
