Copyright © TDWG (2010). All Rights Reserved.

This version: tdwg_tapir_specification_2010-05-05.htm

Latest version: www.tdwg.org/activities/tapir/specification

Previous working draft: www.tdwg.org/dav/subgroups/tapir/1.0/docs/tdwg_tapir_specification_2009-09-08.htm

Date: 05 May 2010

Technical writers of this document

Main contributors to the creation of the TAPIR protocol

Abstract

This document specifies the TAPIR protocol and the structure and syntax of TAPIR messages. This work is a product of the TDWG TAPIR task group, details of which can be found at www.tdwg.org/activities/tapir.

Target audience

This document contains technical details about the TAPIR protocol. Previous knowledge about HTTP, XML and XML Schema is assumed. This document was written for developers that need to build TAPIR compliant software, network administrators that need to prepare TAPIR documents, and users that want to directly interact with TAPIR services.

There are other documents about TAPIR available. For a general overview there is an Executive Summary. More specific information about how to build TAPIR networks can be found in the TAPIR Networks Guide.

Status of this document

Final version approved as a TDWG standard.

The English language version of this document is the only normative version available.

Comments about this document can be sent to the TDWG Architecture Group mailing list: tdwg-tag@lists.tdwg.org. Subscription and archives are open to the public.

Copyright notice

This document follows the Creative Commons License Deed: Attribution 3.0

You are free:

For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to this web page. Any of the above conditions can be waived if you get permission from TDWG. Apart from the remix rights granted under this license, nothing in this license impairs or restricts the author's moral rights. Your fair use and other rights are in no way affected by the above. This is a human-readable summary of the Legal Code (the full license).

Disclaimer

This document and the information contained herein are provided on an "as is" basis. TDWG makes no warranties regarding the information provided, and disclaims liability for damages resulting from its use.

Table of Contents


  1. Introduction
  2. Symbols, terminology conventions, and examples
  3. General Aspects of the Protocol
  4. TAPIR XML Documents
  5. Operations
  6. Global Parameters
  7. Counting and Paging
  8. Filters, Expressions and Operators
  9. KVP (Key-Value Pair) Requests
  10. The TAPIR XML Schema
  11. Appendix


1. Introduction

This document specifies the TDWG Access Protocol for Information Retrieval (TAPIR). TAPIR is a Web Service protocol to perform queries across distributed databases of varied physical and logical structure. It was originally designed to be used by federated networks.

TAPIR is intended for communication between applications, using HTTP as the transport mechanism. Its functionality is available through five types of request-response operations addressing the following needs: retrieve service metadata, retrieve service settings, inspect available content, perform queries, and monitor service availability. TAPIR does not include operations for adding, updating or deleting data on provider databases. Requests can be encoded in XML or simple URL parameters. Responses are always structured in XML. TAPIR uses the XML Schema Definition language to describe and validate the structure of XML request and response messages sent between a client (the requesting software) and a server (the provider of the data or service).

The underlying implementation and data model of provider databases remain opaque to TAPIR clients because all queries reference elements (concepts) from data abstraction layers (conceptual schemas). TAPIR providers advertise through the capabilities operation which elements from one or more data abstraction layers are supported by the service. Clients can therefore formulate queries referencing these elements.

Since TAPIR is not bound to any particular data abstraction layer, it is necessary to define (or reuse) at least one data abstraction layer to set up a TAPIR service. Networks or individual providers are free to define their own data abstraction layers.

TAPIR was also designed to be independent of any particular structure for search responses (known as output models). It is necessary to define at least one output model, or to reference an existing one, to use the search operation. Networks or individual providers are free to define their own output models.

TAPIR's flexibility makes it suitable both for very simple service implementations, where the provider only responds to a set of pre-defined queries, and for more advanced implementations, where the provider software can dynamically parse complex queries referencing output models supplied by the client.

Although TAPIR XML requests and responses can be validated using an XML Schema, the full TAPIR protocol contains additional rules that need to be followed and are specified in this document.

2. Symbols, terminology conventions, and examples

The following conventions are used throughout this document to aid clarity:

<...> Enclosing angle brackets indicate that the enclosed term refers to an XML element name.
@... @ sign before a term indicates that the term is an attribute name (but @ is not part of the name).

The following symbols are used to define elements and terms:

::= Used to indicate the content equivalent of an element or term.
? When used after an element or term in a term definition, it indicates that the element or term is optional. When used in a URL it indicates a parameter list for a GET statement.
+ Indicates that an element or term must be represented one or more times.
* Indicates that an element or term can occur zero or more times.
| A vertical bar between terms or elements indicates alternatives, as in a choice from a list.

The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are applied in the normative sense according to RFC 2119.

This specification makes use of numerous examples. They all refer to fictitious service addresses and data, unless otherwise declared. Most examples focus on specific parts of TAPIR messages, omitting other parts for the sake of clarity. This means that most XML examples will not be entirely valid according to the TAPIR XML Schema since they are incomplete.

3. General Aspects of the Protocol

3.1. Available Operations

The TAPIR protocol defines five operations: metadata, capabilities, inventory, search and ping.

Each operation consists of a request and the corresponding response. Being a stateless protocol, operation calls are always completely independent from each other.

3.2. Message Transport

TAPIR messages (requests or responses) are intended to be transmitted by means of the Hypertext Transfer Protocol (HTTP) version 1.0 or greater. TAPIR providers must accept both GET and POST methods for requests.

3.3. Access Points

TAPIR access points (or end points) are represented by an HTTP Uniform Resource Locator (URL) which is used to interact with the service. Access points therefore include the transport protocol (HTTP or HTTPS), the host name, an optional port number, an optional path with or without a script name, and an optional query string. The URL used to interact with a TAPIR service must be valid according to the HTTP Common Gateway Interface (CGI) standard, and the query part of the URL must be encoded to protect special characters according to UTF-8 RFC 3986.

To ensure the widest access possible to the service, TAPIR providers are recommended to use default port numbers (80 for HTTP, 443 for HTTPS) since clients may sometimes be located in places with restrictive firewall policies.

3.4. Message Encoding

3.4.1. Requests

TAPIR requests can be encoded in two ways:

3.4.1.1. Key-Value Pairs (KVP) Parameters

Requests are possible through HTTP GET or POST with the specific KVP parameters for each operation. Support for KVP request encoding is mandatory for all TAPIR service implementations.

An example of a KVP TAPIR capabilities request is:

http://example.net/tapir.cgi?op=capabilities
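As a minimal sketch of how a client might assemble such a KVP request, the following Python snippet URL-encodes the parameters with the standard library. The function name build_kvp_url is an illustrative assumption and not part of TAPIR; only the "op" parameter and the example address come from this document.

```python
from urllib.parse import urlencode


def build_kvp_url(access_point: str, params: dict) -> str:
    """Append URL-encoded KVP parameters to a TAPIR access point."""
    # An access point may already carry a query string, so pick the separator.
    separator = "&" if "?" in access_point else "?"
    # urlencode percent-encodes reserved characters, using UTF-8 by default.
    return access_point + separator + urlencode(params)


print(build_kvp_url("http://example.net/tapir.cgi", {"op": "capabilities"}))
# prints http://example.net/tapir.cgi?op=capabilities
```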

3.4.1.2. XML

TAPIR requests can be encoded entirely as an XML document. The support of XML request encoding is optional for TAPIR service implementations. XML requests have the following general format:

  <?xml version="1.0" encoding="utf-8" ?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0">
    <header>
      <!-- header specific elements -->
    </header>
    <operation_name>
      <!-- operation specific parameters -->
    </operation_name>
  </request>

A capabilities request, which does not require specific parameters, takes the following form:

  <?xml version="1.0" encoding="utf-8" ?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0">
    <header>
      <!-- header specific elements -->
    </header>
    <capabilities/>
  </request>

XML requests can be sent to a service in two ways: through the "request" parameter (using GET or POST), or as raw POST data.

Since all encodings can potentially be present in the same HTTP request, the following precedence should take place:

  1. If the provider supports XML encoding, when the "request" parameter is present (either through GET or POST), TAPIR parameters must be taken from the corresponding XML.
  2. When the "request" parameter is not present and the HTTP request includes other parameters (either through GET or POST), TAPIR parameters must be read as KVP.
  3. If the provider supports XML encoding, when the HTTP method is POST without parameters, TAPIR parameters must be taken from the XML as raw POST data.
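The precedence rules above can be sketched as a small decision function. This is only an illustration: the function name, the returned labels and the "unspecified" outcome (used for situations the three rules do not cover, such as a "request" parameter sent to a provider without XML support) are assumptions of this sketch, not part of the protocol.

```python
from typing import Mapping


def select_encoding(method: str, params: Mapping[str, str],
                    raw_body: bytes, supports_xml: bool) -> str:
    """Decide where the TAPIR parameters of one HTTP request come from."""
    if "request" in params:
        # Rule 1: the "request" parameter wins when XML is supported.
        return "xml-parameter" if supports_xml else "unspecified"
    if params:
        # Rule 2: no "request" parameter, but other parameters are present.
        return "kvp"
    if method.upper() == "POST" and supports_xml:
        # Rule 3: POST without parameters, XML read from the raw POST data.
        return "xml-raw-post"
    # Anything else is not covered by the three rules (sketch assumption).
    return "unspecified"
```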

3.4.2. Responses

TAPIR responses should always be encoded in valid XML and by default have the HTTP Content-Type set to "text/xml". The charset must be consistent across the HTTP header, the XML declaration and the output stream. It is not recommended to omit the charset in the HTTP header because RFC 3023 indicates that in this case parsers should assume "us-ascii", which may be inconsistent with the XML declaration and the output stream. The recommended charset to be used with "text/xml" is UTF-8.
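A client or test harness could check this consistency roughly as follows. The sketch uses only the Python standard library; the function names are illustrative assumptions, and as a simplification it treats both a missing charset in the HTTP header and a missing encoding in the XML declaration as inconsistent.

```python
import re
from email.message import Message
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Return the lower-cased charset parameter of a Content-Type value."""
    message = Message()
    message["Content-Type"] = content_type
    return message.get_content_charset()


def declared_encoding(body: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if any."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    return match.group(1).decode("ascii").lower() if match else None


def charsets_consistent(content_type: str, body: bytes) -> bool:
    """Check the HTTP header, the XML declaration and the byte stream."""
    charset = header_charset(content_type)
    if charset is None or charset != declared_encoding(body):
        return False
    try:
        body.decode(charset)  # the output stream must decode with it too
    except (LookupError, UnicodeDecodeError):
        return False
    return True
```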

The XML content of TAPIR responses normally includes: a header section, the specific result from the requested operation and an optional diagnostics section. All structure that encloses the specific operation result is known as the TAPIR envelope. The envelope consists of the response element, the header content, the operation element, the summary content (for inventory or search operations) and the diagnostics content.

Metadata, capabilities, inventory and ping responses should always include the TAPIR envelope. Only search responses may omit it, when the "envelope" parameter is set to "false". The "envelope" parameter is only available in the search operation. When the "envelope" parameter is set to "false", the content returned will be completely determined by the output model definition, and the service is free to perform HTTP content negotiation. This may result in a different HTTP Content-Type being returned (for instance "application/rdf+xml" for RDF encoded in XML). When the "envelope" parameter is set to "false" and the search returns no results, the service must return an HTTP 204 code (No Content).

  <?xml version="1.0" encoding="utf-8" ?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0">
    <header>
      <!-- header specific elements -->
    </header>
    <operation_name>
      <!-- operation specific results -->
    </operation_name>
    <diagnostics>
      <!-- diagnostics information -->
    </diagnostics>
  </response>

Example of the general message format of a TAPIR response. Everything except what goes in the placeholder indicated by "operation specific results" is known as the TAPIR envelope.
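To illustrate the envelope structure, the following sketch locates the operation element inside an enveloped response using Python's standard xml.etree.ElementTree module. The function name and the error messages are assumptions of this example; it performs no schema validation.

```python
import xml.etree.ElementTree as ET

TAPIR_NS = "http://rs.tdwg.org/tapir/1.0"
ENVELOPE_PARTS = {"header", "diagnostics"}


def operation_element(xml_text: str) -> ET.Element:
    """Return the operation element enclosed by a TAPIR response envelope."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{TAPIR_NS}}}response":
        raise ValueError("root element is not a TAPIR <response>")
    for child in root:
        # Tags are stored as "{namespace}local"; keep the local part only.
        local_name = child.tag.rsplit("}", 1)[-1]
        if local_name not in ENVELOPE_PARTS:
            return child
    raise ValueError("no operation element found")
```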

3.5. Namespaces

The namespace for this version of TAPIR is http://rs.tdwg.org/tapir/1.0

The TAPIR namespace must be the default namespace in all TAPIR messages.

TAPIR metadata responses include elements from the following namespaces:

  Prefix  Namespace                                  Source
  xml     http://www.w3.org/XML/1998/namespace       XML
  xsd     http://www.w3.org/2001/XMLSchema           XML Schema
  dc      http://purl.org/dc/elements/1.1/           Dublin Core
  dct     http://purl.org/dc/terms/                  Dublin Core Terms
  geo     http://www.w3.org/2003/01/geo/wgs84_pos#   Basic geo vocabulary
  vcard   http://www.w3.org/2001/vcard-rdf/3.0#      VCARD

TAPIR metadata responses may change the prefixes for all namespaces above, except the "xml" prefix which is reserved according to Namespaces in XML 1.1.

TAPIR search responses may include other namespaces according to the response structure being used.

3.6. Data Abstraction

3.6.1. Conceptual Schemas

In TAPIR, conceptual schemas can be understood as a formal definition of concepts that are used for querying and reporting the content of databases. They provide the necessary data abstraction layer to be used on top of specific implementations from each participant of a TAPIR federated network. Conceptual schemas usually focus on specific areas of knowledge, providing data models with various levels of detail.

Although the main TAPIR operations always need to reference concepts, the protocol was created to be independent of any particular conceptual schema. TAPIR networks and data providers are free to create or choose from existing conceptual schemas. However, it is important to note that the interoperability level across different TAPIR providers will depend on the conceptual schemas that they use. TAPIR providers can only understand queries that reference known concepts, i.e., concepts that were locally mapped and that are advertised in their capabilities. Therefore, TAPIR clients cannot send the same search request to two TAPIR providers that have mapped different conceptual schemas (unless the request references, by alias, equivalent query templates or output models to which both providers have assigned the same alias).

TAPIR messages can reference concepts from multiple conceptual schemas, which means that conceptual schemas can be modularised and extended if necessary.

TAPIR does not enforce any particular format or encoding for conceptual schemas. Examples of languages that can be used to define conceptual schemas include XML Schema, RDF Schema, XMI and others. From TAPIR's perspective, conceptual schemas should minimally list the concepts, indicating their meaning and datatypes. Since provider software may typically need to parse existing conceptual schemas during configuration, a common format is suggested in appendix 2 to describe conceptual schemas used by TAPIR networks.

3.6.2. Concepts

Concepts are general definitions of classes of objects or their characteristics. They are defined externally to TAPIR. Although concepts can potentially represent classes, relationships or attributes, this version of TAPIR limits its use to attributes (e.g., species name, observation date, locality name, registration number, etc.) whose context is defined by the conceptual schema or output model being used.

In TAPIR, concepts are referenced by identifiers, which are always treated as simple strings. These references take place in different parts of the protocol, such as filter expressions, output model mappings, capabilities responses and inventory operations. The TAPIR XML Schema defines a "qualifiedConceptReferenceType" that is used by most elements representing concepts, and which consists of a complex type with an attribute called "id". TAPIR makes no assumptions about how concept identifiers are defined and it does not enforce any particular pattern. However, fully qualified concept identifiers are recommended to be globally unique, permanently resolvable, and free from reserved characters for the query term of URLs.

By being globally unique they can be distinguished from any other possible concepts. By being permanently resolvable, a formal definition of the concept can be retrieved whenever necessary. By being free from reserved characters for the query term of URLs, they can be used directly as HTTP GET parameters in TAPIR KVP request encoding.

When concepts come from a data abstraction layer defined in XML Schema, the recommendation for concept identifiers is to concatenate the namespace of the schema with the local xpath to the instance element that corresponds to the concept. Taking the following XML Schema as an example of a conceptual schema:

  <?xml version="1.0" encoding="UTF-8"?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
             targetNamespace="http://example.net/cs/1.0/"
             elementFormDefault="qualified">
    <xs:element name="Person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="Name" type="xs:string"/>
          <xs:element name="Email" type="xs:string"/>
          <xs:element name="Birth" type="xs:date" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute name="Id" type="xs:string" use="required"/>
      </xs:complexType>
    </xs:element>
  </xs:schema>

The element represented by the xpath "/Person/Name" would be identified as:

http://example.net/cs/1.0/Person/Name

And the "Id" attribute would be identified as:

http://example.net/cs/1.0/Person/@Id
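The recommended concatenation can be sketched in a few lines of Python. The function name is an illustrative assumption, and the sketch assumes a namespace that is meant to be joined to the path with a single "/".

```python
def concept_id(target_namespace: str, local_xpath: str) -> str:
    """Concatenate a schema namespace with the local XPath of a concept."""
    # Normalise the join point so that exactly one "/" separates the parts.
    return target_namespace.rstrip("/") + "/" + local_xpath.lstrip("/")


print(concept_id("http://example.net/cs/1.0/", "/Person/Name"))
# prints http://example.net/cs/1.0/Person/Name
print(concept_id("http://example.net/cs/1.0/", "/Person/@Id"))
# prints http://example.net/cs/1.0/Person/@Id
```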

Providers can also associate an alias to each concept, declaring aliases in capabilities responses. In this case, concepts may also be referenced directly by the alias (when no alias is associated with the corresponding conceptual schema) or by the following notation when both the concept and its conceptual schema are associated with an alias:

Concept_Alias '@' Conceptual_Schema_Alias

Assuming that the concept in the previous example is associated with the alias "PersonName" and its conceptual schema is associated with the alias "cs1.0", this would enable the concept to be also identified using the following short notation:

PersonName@cs1.0
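Because fully qualified identifiers may themselves contain "@" (as in the "Id" attribute example), a client or provider would probably resolve short notations by lookup rather than by splitting the string. The following sketch assumes a hypothetical alias table built from a capabilities response; the function names and the table layout are not defined by TAPIR.

```python
from typing import Dict, Optional


def short_notation(concept_alias: str, schema_alias: Optional[str] = None) -> str:
    """Build the alias notation: Concept_Alias '@' Conceptual_Schema_Alias."""
    if schema_alias is None:
        return concept_alias
    return f"{concept_alias}@{schema_alias}"


def resolve(reference: str, alias_table: Dict[str, str]) -> str:
    """Map a short notation to its concept identifier, if one is known."""
    # Unknown references are returned unchanged and treated as identifiers.
    return alias_table.get(reference, reference)


# Hypothetical alias table, as it might be derived from capabilities.
aliases = {short_notation("PersonName", "cs1.0"):
           "http://example.net/cs/1.0/Person/Name"}
print(resolve("PersonName@cs1.0", aliases))
# prints http://example.net/cs/1.0/Person/Name
```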

3.7. Output Models

Output models are central to the search operation. They define or reference a generic XML response structure based on XML Schema and specify a mapping between nodes in the schema and concepts from one or more conceptual schemas. To a certain extent, the mapping section gives meaning to XML nodes in the structure, and clearly shows that the same concepts can be represented and structured in different ways in XML. Different output models can therefore be created for the same set of concepts.

Output models may indicate which global element defined in the response structure should be the root element in search responses. When not specified, the first global element must be used.

Output models must specify an indexing element by pointing to a node in the structure. The indexing element will be used as a reference for counting records and paging results.

Output models can be defined and used in various ways. They can be created or recognised by the provider and then advertised as <knownOutputModels> in the capabilities response. If the provider supports <anyOutputModels> then the client may create their own models either as external documents or as in-line definitions in an XML search request. All required concepts in each output model advertised by the provider must refer to concepts mapped by the provider and advertised as <mappedConcepts> in the capabilities response.

The different ways of using output models in TAPIR allow for providers with different levels of service capability. Some providers may have a fixed (hard-coded) way of producing XML results that corresponds to each of their known output models, which means they do not need to have the ability to dynamically parse output model definitions. On the other hand, providers that have the ability to parse output model definitions (<anyOutputModels> capability) may choose to parse known models in the same way as they do for arbitrary models provided by clients.

3.8. Query Templates

Query templates extend the idea of output models. If output models define what type of content should be returned and how it should be structured, query templates can add pre-defined, parameterised filters and other constraints depending on the operation.

Query templates can be used by search and inventory operations. An inventory template specifies one or more concepts and an optional filter. A search template specifies an output model, an optional filter, and an optional order by parameter (pointing to concepts in the output model).

In search operations, the same output model can be referenced and used by many different search templates, which could be related to different parts of the output model, or include different filter criteria for different contexts, etc.

The same flexibility for creating and using output models is available for query templates. Data providers can create their own specific templates, or choose from existing templates and then declare them in capabilities responses. Providers can also have the ability to dynamically parse arbitrary templates defined by clients.

4. TAPIR XML Documents

TAPIR XML documents can be messages exchanged in operations or external resources such as output models and query templates, which can be referenced by search or inventory requests. All TAPIR documents must be valid XML documents and have a root element that validates against the TAPIR XML Schema. The root element of a TAPIR document declares its purpose and includes the namespaces referenced inside the document. It is recommended that the root element includes the "xsi" namespace and the TAPIR schema location to facilitate validation of messages. The possible root elements for each type of TAPIR document are:

  Root element ::= ( <request> | <response> | <outputModel> | <inventoryTemplate> | <searchTemplate> )

  <?xml version="1.0" encoding="UTF-8" ?> 
  <request xmlns="http://rs.tdwg.org/tapir/1.0" 
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0 
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
     <!--   header and operation specific content -->
  </request>

Example: Root element with namespace declarations in a TAPIR request document
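As a rough illustration, the kind of a TAPIR document can be determined from its root element before any further processing. The sketch below uses Python's standard ElementTree parser; the function name is an assumption, and the check is no substitute for validation against the TAPIR XML Schema.

```python
import xml.etree.ElementTree as ET

TAPIR_NS = "http://rs.tdwg.org/tapir/1.0"
ROOT_ELEMENTS = {"request", "response", "outputModel",
                 "inventoryTemplate", "searchTemplate"}


def document_kind(xml_text: str) -> str:
    """Return the local name of the root element of a TAPIR document."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        # ElementTree stores qualified names as "{namespace}local".
        namespace, local_name = root.tag[1:].split("}", 1)
    else:
        namespace, local_name = "", root.tag
    if namespace != TAPIR_NS or local_name not in ROOT_ELEMENTS:
        raise ValueError(f"not a TAPIR root element: {root.tag}")
    return local_name
```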

4.1. Request and Response Documents

TAPIR XML request documents consist of a header followed by an element indicating the operation, which in turn may contain additional parameters.

  Request type ::= <header> 
                   ( <metadata> | <capabilities> | <inventory> | <search> | <ping> )

TAPIR XML response documents usually consist of a header followed by an element indicating the operation and including the result. The operation element can be followed by an optional diagnostics element.

  Response type ::= <header>
                    ( <metadata> | <capabilities> | <inventory> | <search> | <pong> | <error> | <logged> ) 
                    <diagnostics>?

This general structure of a TAPIR <response> element including a <header>, an operation related element and an optional <diagnostics> is the TAPIR envelope. The TAPIR envelope can be turned off in search operations by setting the "envelope" parameter to "false". In this case the root element of a response will be determined by the output model response structure definition, and the TAPIR namespace should not be included at all.

4.1.1. Header

The purpose of headers is to give information about the source and destination of the operation, as well as timestamp and software related to the source. Headers must be present in all XML requests and all XML responses, except in search responses when the parameter "envelope" is set to "false" in the corresponding request. A TAPIR header has three parts:

  Header type ::= <source>+ <destination>? <custom>?

The <source> element gives information about where the message originated and is repeatable to enable tracing back through any intermediary steps when the message has passed through more than one server in a cascading operation. Each intermediary service must add its address as a new <source> item at the end of the list.

The <destination> element is used to indicate the final target for a TAPIR message. It can be used when there are intermediary layers between the client and the server. This element is intended to help communication between clients and message brokers. The destination element takes a simple string, which will usually be a URI but can be anything, including codes or identifiers specific to networks. The <destination> element is optional and a TAPIR provider is free to ignore it.

A <custom> element serves as an extension slot for any additional information not defined in the schema. It can be used to put whatever extra information an implementer wishes to add.

Source elements correspond to each software agent that created or processed the message until it reached the current service that received the message. Each <source> element has three parts:

  Source type ::= @accesspoint @sendtime <software>?

In requests, the "accesspoint" attribute of each <source> element must contain the IP address of the corresponding software agent, except the last <source> element. The IP address of the last source should always be taken from the REMOTE_ADDR environment variable. In responses, the "accesspoint" attribute must contain the service URL.

The "sendtime" attribute must be used to record the time that the message was sent or processed in the associated software agent. Its content must be recorded in ISO 8601 datetime format.

The <software> element can be used to identify the software used to process the TAPIR message. It is defined as follows:

  Software type ::= @name @version <dependencies>*

The attributes "name" and "version" are simple strings and should be used to indicate the name and version number of the software.

The <dependencies> element can be used to list any other software, libraries, frameworks or operating systems related to the declared software. The dependencies element includes repeatable instances of <dependency>, which refers back to the software type.

  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
        <software name="TapirClient" version="3.0"/>
      </source>
    </header>
    <capabilities/>
  </request>

Example of a TAPIR request message encoded in XML. Here the client is asking for service capabilities.
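The header rules can be sketched as a helper that an intermediary service might use to append its own <source> entry after the existing ones and before any <destination> or <custom> element. The function name, the software names and the IP addresses (taken from the documentation range 192.0.2.0/24) are hypothetical; the sketch assumes the incoming header is already valid, so that all <source> elements come first.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

TAPIR_NS = "http://rs.tdwg.org/tapir/1.0"


def add_source(header: ET.Element, accesspoint: str, name: str,
               version: str, when: datetime) -> ET.Element:
    """Append a <source> element at the end of the header's source list."""
    source_tag = f"{{{TAPIR_NS}}}source"
    source = ET.Element(source_tag, {
        "accesspoint": accesspoint,
        "sendtime": when.isoformat(),  # ISO 8601 datetime
    })
    ET.SubElement(source, f"{{{TAPIR_NS}}}software",
                  {"name": name, "version": version})
    # Existing sources are the first children, so their count is the index.
    position = sum(1 for child in header if child.tag == source_tag)
    header.insert(position, source)
    return source


header = ET.Element(f"{{{TAPIR_NS}}}header")
ET.SubElement(header, f"{{{TAPIR_NS}}}destination").text = "example-provider"
sent = datetime(2005, 11, 11, 12, 23, 56, tzinfo=timezone.utc)
add_source(header, "192.0.2.10", "TapirClient", "3.0", sent)
add_source(header, "192.0.2.20", "ExampleBroker", "1.0", sent)
```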

4.1.2. Operation-specific Elements

Operation elements can contain specific parameters (in requests) or specific results (in responses) related to the operation. They will be described in more detail in the next sections.

4.1.3. Fatal Errors

Errors are normally listed in the diagnostics section of a TAPIR message. When there are extreme errors that cause the system not to be able to formulate a proper response, for instance if a database connection error occurs or the user requests a service that the provider does not supply, then the <error> element may be used inside the <response> body.

  <response>
    <!-- omitting header -->
    <error code="DBS_CONNECTION_ERROR" 
           level="fatal" 
           time="2005-11-11T12:23:57.023+01:00">Could not connect to database</error>
  </response>

Example of a fatal error message given in replacement for the normal operation element in a response

4.1.4. Log-only Operations

When requests have the "log-only" attribute set to true, TAPIR providers produce a <logged> element in responses instead of the expected operation element. Log-only operations can be used to report back to the original data providers when users access their data from data aggregators.

4.1.5. Diagnostics

The <diagnostics> element can be used by responses for declaring errors and statistics. It consists of an optional list of multiple <diagnostic> elements.

  diagnostic type ::= @code? @level @time? message

The "code" attribute is an optional system code for the error or message. There are no standard codes defined in this specification. TAPIR providers are free to define and use their own codes.

The "level" attribute is mandatory and can assume the following values to indicate the severity of the diagnostic:

  @level ::= ( "debug"|"info"|"warn"|"error"|"fatal" )

An optional ISO 8601 datetime value can be included in the "time" attribute.

Diagnostic messages are text strings defined by implementations.

  <response>
    <!-- omitting header -->
    <!-- omitting operation -->
    <diagnostics>
      <diagnostic level="error" code="REQ_READ_MODEL_FAILED">The requested model could not be read.</diagnostic>
      <diagnostic level="warn" code="REQ_STRUCTURE_UNKNOWN_SCHEMA_TAG">
        Unknown xml schema element "test" encountered in line 3. Ignored.
      </diagnostic>
      <diagnostic level="info" code="RSP_ELEM_DROP">
        XML element "myel" dropped because it misses required attribute "myattr"
      </diagnostic>
      <diagnostic level="debug">Start reading the datasource configurations</diagnostic>
    </diagnostics>
  </response>

Example of a diagnostics section in a response document
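A client could extract diagnostics from a response along the following lines. The sketch uses Python's standard ElementTree parser; the function name is illustrative, and treating the level values as ordered from "debug" (least severe) to "fatal" (most severe) is an assumption of this example rather than a rule stated by this specification.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

TAPIR_NS = "http://rs.tdwg.org/tapir/1.0"
# Assumed severity order, following the order of the @level definition.
LEVELS = ("debug", "info", "warn", "error", "fatal")


def diagnostics_at_least(xml_text: str,
                         minimum: str) -> List[Tuple[str, Optional[str], str]]:
    """List (level, code, message) for diagnostics at or above a level."""
    root = ET.fromstring(xml_text)
    threshold = LEVELS.index(minimum)
    found = []
    for diagnostic in root.iter(f"{{{TAPIR_NS}}}diagnostic"):
        level = diagnostic.get("level")
        if level in LEVELS and LEVELS.index(level) >= threshold:
            message = (diagnostic.text or "").strip()
            found.append((level, diagnostic.get("code"), message))
    return found
```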

4.2. Output Models

TAPIR search requests and search templates can refer to an output model document which is accessed as an external resource by its URL. An output model document must be an XML document that validates against the "outputModelType" defined in the TAPIR XML Schema. Output model documents have no header or diagnostic sections and use <outputModel> as the root element including the namespace declarations.

Output model documents can include optional (but recommended) documentation elements for a name (<label>) and description (<documentation>) that are of value in managing multiple output models and informing users of their function.

Output models can be defined with the following elements:

<structure>
Defines the search response structure using a subset of the XML Schema language. Required.

<rootElement>
Indicates the name of a global element in the structure that should be the root element in search responses. The name is specified in the @name attribute. When not present, the first global element must be the root element. Optional.

<indexingElement>
Points to a node in the structure, usually unbounded, that should be used for counting the records returned and paging results. The indexing element is referenced by the @path attribute using a simple XPath that points to a response structure node. Required.

<mapping>
Maps nodes from the structure definition (given under <structure>) to concepts, literals or environment variables known to the provider software. Required.

4.2.1. Response Structures

Response structures define through XML Schema how search responses should be structured in XML. A response structure must start with the root element <schema> and all its elements must be in the namespace http://www.w3.org/2001/XMLSchema. TAPIR response structures must always specify a "targetNamespace" and there should be at least one global element definition. The global element that should be used to instantiate the root element in the resulting XML is specified in the output model. When not specified, the first global element should be used for this purpose.

TAPIR providers must declare as part of their capabilities to what extent they understand the XML Schema language. This may vary from full support to a large set of XML Schema constructs to no knowledge at all (in cases when only a limited number of pre-defined output models or query templates needs to be understood, or in cases when searches are not supported). Only providers that declare they support "any" output models must be able to dynamically parse response structures to understand how search results should be generated. In these cases, the minimum set of XML Schema constructs that needs to be understood is known as the "basic schema language", which includes:

TAPIR providers are not forced to guarantee the entire validity of search responses according to the XML Schema defined in the response structure, except to the extent of their own declared XML Schema capabilities. When response structures are parsed, providers are recommended to raise warnings instead of errors when an unsupported XML Schema construct is found.

4.2.2. Output Model Mapping

Each data node in the output model structure must be mapped to one or more concepts (<concept>), literals (<literal>) or system variables (<variable>). The entire mapping is made of individual <node> mappings, where the attribute @path takes a simple XPath to identify the output structure node. Individual mappings are completed by specifying one or more sub-elements (<concept>|<literal>|<variable>) associated with each node.

Node identifiers follow a subset of the XPath language using relative location paths from the root element inside the schema definition (inside <structure>) to the desired node. All steps are separated by the "/" separator. The XPath expression used here is actually based on the corresponding nodes of an instance document, and not the schema definition itself. Attribute nodes need the prefix "@".
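The path subset described above can be illustrated with a short Python sketch. The function name `parse_node_path` and the tuple layout it returns are assumptions made for this illustration only; they are not defined by TAPIR. The sketch also assumes that an attribute step can only appear last in a path.

```python
def parse_node_path(path):
    """Split a node path into (prefix, local_name, is_attribute) steps.

    Only the subset described in this section is handled: steps separated
    by "/", an optional namespace prefix, and "@" marking attribute nodes.
    """
    # The examples in this document start paths with "/", so accept that form.
    trimmed = path[1:] if path.startswith("/") else path
    steps = []
    for raw in trimmed.split("/"):
        if not raw:
            raise ValueError("empty step in path: %r" % path)
        is_attribute = raw.startswith("@")
        name = raw[1:] if is_attribute else raw
        prefix, _, local = name.rpartition(":")
        steps.append((prefix or None, local, is_attribute))
    # Assumption: an attribute node can only be the final step.
    if any(step[2] for step in steps[:-1]):
        raise ValueError("attribute step must be last: %r" % path)
    return steps
```

For instance, "/dataset/specimen/@catnum" yields three steps, the last one flagged as an attribute, while "/r:records/r:record/dc:modified" keeps the prefixes "r" and "dc" so that they can be resolved against the declarations in the "outputModel" element.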

When multiple mappings are cited for the same node, the final result must be the concatenation of the respective values.

  <node path="/FeatureCollection/featureMember/LocationGML/Point/coordinates">
    <concept id="http://example.net/schema/Longitude" required="true"/>
    <literal value=","/>
    <concept id="http://example.net/schema/Latitude" required="true"/>
  </node>

Example: Concatenated concepts in an output model mapping

When the response structure makes use of different namespaces, they all need to be declared and associated with a prefix in the "outputModel" element. In this case, node paths must include the prefixes. The same applies to the "indexingElement" path.

  <node path="/r:records/r:record/dc:modified">
    <concept id="http://example.net/schema/DateLastModified" required="true"/>
  </node>

Example: Node mapping involving different namespaces. Prefixes "r" and "dc" must be declared in the "outputModel" element

Concepts and variables have an optional attribute "required" which defaults to "false". When a concept is required but is either not mapped by the provider or is mapped but evaluates to NULL, the provider must return an error. When a variable is required and is either not available from the provider or evaluates to NULL, the provider must return an error. If a concept or variable is optional, associated with a mandatory node, and is either not understood by the provider or evaluates to NULL, the node must be included with empty content in the response. In this case providers are recommended to raise a warning in the diagnostics. If a concept or variable is optional, associated with an optional node, and is either not understood by the provider or evaluates to NULL, the node must not be included in the response. The resulting XML should always be "greedy", which means that when a provider has content for a node, the node must always be included in the response. The following table summarizes how providers should behave in all possible situations.

  @required   mandatory node (*)   provider has content (**)   no content (***)
  ---------   ------------------   -------------------------   ----------------------------------
  true        yes                  include node                raise error
  true        no                   include node                raise error
  false       yes                  include node                include empty node & raise warning
  false       no                   include node                do not include node

(*) node is mandatory in response structure.

(**) action when provider has associated content.

(***) action when provider does not have associated content, or content is NULL.
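The table can also be read as a small decision function. The following Python sketch is only an illustration; the function name `node_action` and the returned strings are assumptions chosen to mirror the table cells.

```python
def node_action(required, node_is_mandatory, has_content):
    """Return the action for one mapped concept or variable.

    required:          value of the @required attribute
    node_is_mandatory: whether the node is mandatory in the response structure
    has_content:       whether the provider has non-NULL content for it
    """
    if has_content:
        # "Greedy" output: available content is always included.
        return "include node"
    if required:
        return "raise error"
    if node_is_mandatory:
        return "include empty node and raise warning"
    return "do not include node"
```

Note that whether the node is mandatory only matters when the mapping is optional and there is no content.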

In a concatenation, optional concepts or variables should be replaced by an empty string if they evaluate to NULL or are not understood by the provider.
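A minimal Python sketch of this concatenation rule follows. It assumes that each part of a node mapping has already been evaluated to a value (with `None` standing for NULL or for a concept that is not understood) and that literals are passed as optional parts. The function name and the tuple layout are hypothetical.

```python
def concatenate_node_value(parts):
    """Concatenate the evaluated parts of one node mapping.

    parts is a list of (value, required) tuples in mapping order; a value
    of None stands for NULL or for a concept that is not understood.
    """
    pieces = []
    for value, required in parts:
        if value is None:
            if required:
                raise ValueError("required concept or variable has no content")
            value = ""  # optional parts without content become empty strings
        pieces.append(str(value))
    return "".join(pieces)
```

With the longitude and latitude mapping shown earlier, the parts ("-71.92", required), (",", literal) and ("45.256", required) would produce "-71.92,45.256".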

The <mapping> element also includes an optional Boolean @automapping attribute. When automapping is set to "true", nodes should be automatically mapped to their equivalent concept identifiers (by concatenating namespace and local path). This is done when the model’s structural schema is also seen as a conceptual schema, therefore avoiding redundant mappings. This kind of special model is also referred to as a canonical model.
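As a rough illustration of automapping, the Python sketch below derives a concept identifier from a target namespace and a node path. The text above only states that namespace and local path are concatenated, so the details here are assumptions: a single "/" joins the two parts, namespace prefixes are dropped from each step, and the "@" marker of attribute steps is kept. The function name is hypothetical.

```python
def automap_concept_id(target_namespace, node_path):
    """Build a concept identifier from a schema namespace and a node path.

    Assumptions (not stated in the specification text): one "/" joins the
    parts, prefixes are removed from steps, and "@" is kept for attributes.
    """
    steps = []
    for step in node_path.strip("/").split("/"):
        marker = "@" if step.startswith("@") else ""
        local = step.lstrip("@").rpartition(":")[2]
        steps.append(marker + local)
    return target_namespace.rstrip("/") + "/" + "/".join(steps)
```

Under these assumptions, the namespace "http://example.net/simple_specimen" and the path "/dataset/specimen/identification/name" give "http://example.net/simple_specimen/dataset/specimen/identification/name".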

  <?xml version="1.0" encoding="UTF-8"?>
  <outputModel xmlns="http://rs.tdwg.org/tapir/1.0"
               xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                                   http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd
                                   http://www.w3.org/2001/XMLSchema
                                   http://www.w3.org/2001/XMLSchema.xsd">
    <label>Specimen Records</label>
    <documentation>Simple output model for specimen data.</documentation>
    <structure>
      <xs:schema targetNamespace="http://example.net/simple_specimen">
        <xs:element name="dataset">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="specimen" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="identification" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="name" type="xs:string"/>
                          <xs:element name="identifier" type="xs:string" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attribute name="date" type="xs:string" use="optional"/>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                  <xs:attribute name="catnum" type="xs:int" use="required"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:schema>
    </structure>
    <rootElement name="dataset"/>
    <indexingElement path="/dataset/specimen"/>
    <mapping>
      <node path="/dataset/specimen/@catnum">
        <concept id="http://example.net/schema1/CatalogNumber" required="true"/>
      </node>
      <node path="/dataset/specimen/identification/name">
        <concept id="http://example.net/schema1/ScientificName" required="true"/>
      </node>
      <node path="/dataset/specimen/identification/identifier">
        <concept id="http://example.net/schema2/PersonName"/>
      </node>
      <node path="/dataset/specimen/identification/@date">
        <concept id="http://example.net/schema2/DateText"/>
      </node>
    </mapping>
  </outputModel>

Example: A complete example of an output model showing the use of <structure>, <rootElement>, <indexingElement> and <mapping> elements
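To show how a client or provider might read such a document, the following Python sketch extracts the indexing element path and the concept mappings with the standard library. It is a simplified illustration: only <concept> sub-elements are read, the helper name `read_mapping` is an assumption, and the sample document is a shortened version of the example above without the <structure> section.

```python
import xml.etree.ElementTree as ET

TAPIR_NS = {"t": "http://rs.tdwg.org/tapir/1.0"}

SAMPLE_MODEL = """<outputModel xmlns="http://rs.tdwg.org/tapir/1.0">
  <indexingElement path="/dataset/specimen"/>
  <mapping>
    <node path="/dataset/specimen/@catnum">
      <concept id="http://example.net/schema1/CatalogNumber" required="true"/>
    </node>
    <node path="/dataset/specimen/identification/identifier">
      <concept id="http://example.net/schema2/PersonName"/>
    </node>
  </mapping>
</outputModel>"""


def read_mapping(output_model_xml):
    """Return (indexing_path, {node_path: [(concept_id, required), ...]})."""
    root = ET.fromstring(output_model_xml)
    indexing = root.find("t:indexingElement", TAPIR_NS)
    mapping = {}
    for node in root.findall("t:mapping/t:node", TAPIR_NS):
        concepts = []
        for concept in node.findall("t:concept", TAPIR_NS):
            # @required defaults to "false" when it is not present.
            required = concept.get("required", "false") in ("true", "1")
            concepts.append((concept.get("id"), required))
        mapping[node.get("path")] = concepts
    return indexing.get("path"), mapping


indexing_path, node_mapping = read_mapping(SAMPLE_MODEL)
```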

4.3. Query Templates

4.3.1. Inventory Templates

TAPIR inventory requests can refer to an inventory template document which is accessed as an external resource by its URL. An inventory template must be an XML document that validates against the "inventoryTemplateType" defined in the TAPIR XML Schema. Inventory template documents have a root element <inventoryTemplate> that includes its namespace declarations. There are no header or diagnostic sections in template documents.

Inventory templates can include optional (but recommended) documentation elements for a name (<label>) and description (<documentation>) that are of value in managing multiple templates and informing users of their function. The body of an inventory template includes a list of concepts upon which the inventory should be built and an optional filter section which allows the use of client-supplied parameters.

  <?xml version="1.0" encoding="UTF-8"?>
  <inventoryTemplate xmlns="http://rs.tdwg.org/tapir/1.0"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                                         http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <label>Specimen Scientific Names</label>
    <documentation>Search and list unique scientific names of specimen 
    identifications ordered alphabetically. The parameter 
    "name" can be used as a filter condition indicating the 
    first characters of interest.</documentation>
    <concepts>
      <concept id="http://example.net/schema/ScientificName"/>
    </concepts>
    <filter>
      <like>
        <concept id="http://example.net/schema/ScientificName"/>
        <parameter name="name"/>
      </like>
    </filter>
  </inventoryTemplate>

Example: A simple inventory template which allows the client to supply a scientific name (with wild cards if required) as a filter

4.3.2. Search Templates

TAPIR search requests can refer to a search template document which is accessed as an external resource by its URL. A search template must be an XML document that is valid with respect to the "searchTemplateType" defined in the TAPIR XML Schema.

Search templates can include optional (but recommended) documentation elements for a name (<label>) and description (<documentation>) that are of value in managing multiple templates and informing users of their function. The body of a search template includes a choice of external or internal output model and elements for refining, filtering and ordering the search results, as follows:

<outputModel>
Reference to an external output model document or inline definition of an output model.

<filter>
Optional element to specify filter conditions (see section about Filters).

<orderBy>
Optional repeatable element that can be used to declare one or more concepts for ordering returned data. If the optional Boolean attribute @descend is set to "true", a descending ordering will be used instead of the default ascending one. The ordering behaviour is determined by the concept datatype declared in the capabilities response.

  <?xml version="1.0" encoding="UTF-8"?>
  <searchTemplate xmlns="http://rs.tdwg.org/tapir/1.0"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                                      http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <label>Specimen by Name</label>
    <documentation>Search specimens by their scientific name. Result is 
    ordered by name (ascending) and catalog number (descending). 
    A parameter "name" can be used to build the filter.</documentation>
    <externalOutputModel location="http://example.net/models/names.xml"/>
    <filter>
      <like>
        <concept id="http://example.net/schema/ScientificName"/>
        <parameter name="name"/>
      </like>
    </filter>
    <orderBy>
      <concept id="http://example.net/schema/ScientificName"/>
      <concept id="http://example.net/schema/CatalogNumber" descend="true"/>
    </orderBy>
  </searchTemplate>

Example: An XML search template definition based on an external output model.
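The ordering declared in a search template can be sketched in Python as follows. This is an illustration under stated assumptions: records are represented as dictionaries keyed by concept identifier, values are compared as plain Python values (a real provider would order according to the concept datatype), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

TAPIR_NS = {"t": "http://rs.tdwg.org/tapir/1.0"}

SAMPLE_TEMPLATE = """<searchTemplate xmlns="http://rs.tdwg.org/tapir/1.0">
  <orderBy>
    <concept id="http://example.net/schema/ScientificName"/>
    <concept id="http://example.net/schema/CatalogNumber" descend="true"/>
  </orderBy>
</searchTemplate>"""


def read_order_by(search_template_xml):
    """Return [(concept_id, descending), ...] in declaration order."""
    root = ET.fromstring(search_template_xml)
    keys = []
    for concept in root.findall("t:orderBy/t:concept", TAPIR_NS):
        # @descend is optional; ascending order is the default.
        descending = concept.get("descend", "false") in ("true", "1")
        keys.append((concept.get("id"), descending))
    return keys


def sort_records(records, keys):
    """Order records by several keys using repeated stable sorts."""
    ordered = list(records)
    # Apply the least significant key first so earlier keys take priority.
    for concept_id, descending in reversed(keys):
        ordered.sort(key=lambda rec: rec[concept_id], reverse=descending)
    return ordered
```

Applied to the sample template, records are ordered by scientific name ascending, and records sharing a name are ordered by catalog number descending.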

5. Operations

5.1. Metadata

The Metadata operation retrieves a basic description of the TAPIR service, such as its title, an abstract, keywords, related people and organizations, and copyright details. The inclusion of a language attribute in content elements allows content to be served in multiple languages.

Metadata responses should include enough information to be used by registries, such as UDDI, and to be used by other directory services that need to know general information about the content provided. Metadata are always related to a single TAPIR data provider, which is regarded as a completely independent service.

5.1.1. Metadata Request

Metadata is the default operation in TAPIR. The simplest way to invoke this operation is by calling the TAPIR access point directly without any parameters.

Using XML, the Metadata operation can be invoked by inserting the <metadata/> element after the header section in a request document. Metadata requests take no arguments or parameters.

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
      </source>
    </header>
    <metadata/>
  </request>

Example: An XML metadata request document

In KVP request encoding, the Metadata operation can be invoked with a single parameter:

http://example.net/tapir.cgi?op=metadata

Or with no parameters at all:

http://example.net/tapir.cgi
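A client could assemble such KVP request URLs with a small helper like the Python sketch below. The helper name `kvp_url` is hypothetical, and only the `op` parameter shown above is taken from this document.

```python
from urllib.parse import urlencode


def kvp_url(access_point, operation=None, **params):
    """Build a KVP request URL for a TAPIR access point.

    Leaving out the operation yields the bare access point, which invokes
    the default Metadata operation.
    """
    query = {}
    if operation is not None:
        query["op"] = operation
    query.update(params)
    if not query:
        return access_point
    return access_point + "?" + urlencode(query)
```

Calling `kvp_url("http://example.net/tapir.cgi", "metadata")` reproduces the first URL above, and calling it without an operation reproduces the second.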

5.1.2. Metadata Response

The structure of metadata responses must conform to the "metadataResultType" defined in the TAPIR XML Schema. Many of the elements in the "metadataResultType" are derived from the DC (Dublin Core) schemas.

The "metadataResultType" also includes an optional @xml:lang attribute to define a default language associated with all language-aware elements. Language-aware elements are those elements whose content is expressed in natural language. They are all unbounded and accept an optional attribute @xml:lang to specify the related language code. The language tag syntax of @xml:lang attributes is defined by the RFC 4646, Tags for the Identification of Languages (based on ISO 639) and the list of codes can be found in the IANA Language Subtag Registry. The @xml:lang attribute applies to the element that defines it and also to all of its sub-elements. Sub-elements can also specify the @xml:lang attribute, in which case it will override the default language defined in the scope of any parent elements.
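The scoping rule for @xml:lang can be sketched in Python as a recursive walk that passes the inherited language down to sub-elements. The function name and the tiny sample document are assumptions for illustration; the sample is not a valid TAPIR response.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_languages(element, inherited=None):
    """Yield (element, language) pairs, applying xml:lang inheritance."""
    # An element's own xml:lang overrides the language of its ancestors.
    lang = element.get(XML_LANG, inherited)
    yield element, lang
    for child in element:
        yield from effective_languages(child, lang)


doc = ET.fromstring(
    '<metadata xml:lang="en">'
    "<title>Global Dragonflies Database</title>"
    '<title xml:lang="pt">Base de Dados Global</title>'
    "</metadata>"
)
languages = [(el.tag, lang) for el, lang in effective_languages(doc)]
```

Here the first title inherits "en" from its parent, while the second overrides it with "pt".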

The metadata elements include:

<dc:title>
The name or names of the service, which may be in multiple languages (using the xml:lang attribute to identify the language code). String. Required.

<dc:type>
The type of resource according to the Dublin Core Type Vocabulary. This value should be the same for all TAPIR providers: http://purl.org/dc/dcmitype/Service, unless the type vocabulary is refined or changed in the future. The purpose is to indicate that the resource is actually a service. String. Required.

<accesspoint>
The URL of the service. String. Required.

<dc:description>
The description may include, but is not limited to, an abstract, a table of contents, a reference to a graphical representation of content, or a free-text account of the content. Can be provided in different languages. String. Required.

<dc:language>
The language of content that can be returned by search and inventory responses. This element must follow RFC 4646 and use language codes specified by the IANA Language Subtag Registry. More than one language can be specified, in case the provider can serve content in multiple languages. When there is no linguistic content the "zxx" code must be used. String. Required.

<dc:subject>
Subject and Keywords. Typically, a subject will be expressed as keywords, key phrases or classification codes that describe content provided by the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme. String. Optional.

<dct:bibliographicCitation>
Recommended practice is to include sufficient bibliographic detail to identify the resource as unambiguously as possible, whether or not the citation is in a standard form. Can be provided in different languages. String. Optional.

<dc:rights>
Information about who can access the resource or about its security status, access regulations, etc. String. Can be provided in different languages. Optional.

<dct:modified>
Date on which the service was last modified. Date string, optional.

<dct:created>
Date on which the service was created. Date string, optional.

<indexingPreferences>
Used to inform data aggregators and indexers about the preferred start time, duration and frequency for performing this operation. Optional element with three attributes:

  • @startTime: In the XML Schema time format.
  • @maxDuration: In the XML Schema duration format.
  • @frequency: In the XML Schema duration format.

<relatedEntity>
A required, complex element indicating the entities related to the service.

<custom>
Optional element of any type to include any additional information that goes beyond the standard TAPIR metadata.

5.1.2.1. Related Entities

Related Entities describe one or more entities and their roles with respect to the service. In UDDI terms, TAPIR Related Entities correspond to Business Entities. A Related Entity can be, for example, the organization or group that is hosting the service, providing the data, sponsoring the network, etc. This allows any organization, or even person, that is somehow related to the service to be acknowledged.

Related Entities are defined by the "relatedEntityInformationType", which is comprised of <role> and <entity> elements defined by the "entityInformationType".

The elements defined by the "relatedEntityInformationType" are as follows:

<role>
Used to specify one or more roles of a related entity. The suggested vocabulary includes two values, "data supplier" or "technical host", but accepts other values. String. Required.

<entity>
Required complex element with the following sub-nodes:

@type
Attribute indicating if the entity is an "organization" (default) or a "person". String. Optional.

<identifier>
A globally unique identifier for the entity. String. Optional.

<name>
One or more names for the organization. Includes an @xml:lang attribute to record language. String. Required.

<acronym>
The usual acronym for the organization. String. Optional.

<logoURL>
A URL pointing to a small logo of the organization. String. Optional.

<description>
Text description of the entity. Includes an @xml:lang attribute to record language. String. Optional.

<address>
Text to indicate where the entity is located. It can include street name, street number, district, city, county and other complements. Use the next elements to specify state/province, country and zip code. String. Optional.

<regionCode>
Region (e.g., state or province) of the specified address. Use standard abbreviation accepted in the country. String. Optional.

<countryCode>
Country code (ISO 3166-1-alpha-2 code) of the specified address. String. Optional.

<zipCode>
Zip or postal code of the specified address. String. Optional.

<relatedInformation>
A URL that points to further information about the entity. String. Optional.

<hasContact>
Required complex element for details of individuals related to the entity. Includes role and VCARD details (see Contacts).

<geo:Point>
Optional complex element to indicate the entity location in decimal degrees (datum WGS84). Conforms to the W3C Basic Geo Vocabulary.

  • <geo:lat>: Latitude in decimal degrees (WGS84). Float. Required.
  • <geo:long>: Longitude in decimal degrees (WGS84). Float. Required.
  • <geo:alt>: Altitude in meters. Float. Optional.

<custom>
Slot available for extending the scope of the entity information provided. AnyType. Optional.

5.1.2.2. Contacts

Related entities must indicate at least one contact and its role.

<role>
Used to specify one or more roles of a contact with respect to the service. The suggested vocabulary includes two values, "data administrator" or "system administrator", but accepts other values. String. Required.

<vcard:VCARD>
Complex element from the VCARD namespace for personal details. Includes an @xml:lang attribute to record language.

<vcard:FN>
Free text non-atomised, full name of contact. Includes optional @xml:lang attribute. String. Required.

<vcard:TITLE>
Free text for contact's job title (e.g., "Director"). Includes optional @xml:lang attribute. String. Optional.

<vcard:TEL>
Free text for contact's telephone number. Includes optional @xml:lang attribute and optional enumerated @TYPE attribute with any of the values home, msg, work, pref, voice, fax, cell, pager, bbs, modem, car, isdn, or pics. String. Optional.

<vcard:EMAIL>
Text for contact's email. String. Optional.

<custom>
Optional element that allows the inclusion of any additional structured information related to the contact.

  <?xml version="1.0" encoding="UTF-8"?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0"
            xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:dct="http://purl.org/dc/terms/"
            xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
            xmlns:vcard="http://www.w3.org/2001/vcard-rdf/3.0#"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0 
                                http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source accesspoint="http://example.net/tapir.cgi" 
              sendtime="2005-11-11T12:23:56.023+01:00">
        <software name="TapirProvider" version="1.0"/>
      </source>
    </header>
    <metadata>
      <dc:title>Global Dragonflies Database</dc:title>
      <dc:type>http://purl.org/dc/dcmitype/Service</dc:type>
      <accesspoint>http://example.net/tapir.cgi</accesspoint>
      <dc:description>Global database about Dragonflies observation and specimen records</dc:description>
      <dc:language>en</dc:language>
      <dc:subject>dragonflies dragonfly observation specimen arthropoda insecta odonata</dc:subject>
      <dct:bibliographicCitation>Global Dragonflies Database</dct:bibliographicCitation>
      <dc:rights>Creative Commons License</dc:rights>
      <dct:modified>2006-07-01T09:35:14+01:00</dct:modified>
      <dct:created>2006-01-01T00:00:00+01:00</dct:created>
      <indexingPreferences startTime="01:30:00Z" maxDuration="PT1H" frequency="P1M" />
      <relatedEntity>
        <role>data supplier</role>
        <entity type="organization">
          <identifier>http://purl.org/biodiv/myorg</identifier>
          <name>My Organization</name>
          <acronym>MYORG</acronym>
          <logoURL>http://example.net/myorg.png</logoURL>
          <description>My Organization hosts and maintains biodiversity databases</description>
          <relatedInformation>http://example.net/myorg</relatedInformation>
          <hasContact>
            <role>data administrator</role>
            <vcard:VCARD>
              <vcard:FN>My Name</vcard:FN>
              <vcard:TITLE>Director</vcard:TITLE>
              <vcard:TEL>11 11 11111111</vcard:TEL>
              <vcard:EMAIL>myname@example.net</vcard:EMAIL>
            </vcard:VCARD>
          </hasContact>
          <geo:Point>
            <geo:lat>45.256</geo:lat>
            <geo:long>-71.92</geo:long>
          </geo:Point>
        </entity>
      </relatedEntity>
    </metadata>
  </response>

Example: A metadata response document
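A consumer such as a registry might check that the elements marked as required in this section are present before using a metadata response. The Python sketch below is a simplified illustration and not a substitute for validation against the TAPIR XML Schema; the helper name, the "t" prefix and the deliberately incomplete sample are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {
    "t": "http://rs.tdwg.org/tapir/1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# (label used in this document, path relative to <metadata>)
REQUIRED = [
    ("dc:title", "dc:title"),
    ("dc:type", "dc:type"),
    ("accesspoint", "t:accesspoint"),
    ("dc:description", "dc:description"),
    ("dc:language", "dc:language"),
    ("relatedEntity", "t:relatedEntity"),
]

SAMPLE_RESPONSE = """<response xmlns="http://rs.tdwg.org/tapir/1.0"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <metadata>
    <dc:title>Global Dragonflies Database</dc:title>
    <dc:type>http://purl.org/dc/dcmitype/Service</dc:type>
    <accesspoint>http://example.net/tapir.cgi</accesspoint>
    <dc:language>en</dc:language>
  </metadata>
</response>"""


def missing_metadata(response_xml):
    """Return the labels of required metadata elements that are absent."""
    root = ET.fromstring(response_xml)
    metadata = root.find("t:metadata", NS)
    if metadata is None:
        return [label for label, _ in REQUIRED]
    return [label for label, path in REQUIRED if metadata.find(path, NS) is None]


missing = missing_metadata(SAMPLE_RESPONSE)
```

For the incomplete sample, the result lists the description and the related entity as missing.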

5.2. Capabilities

The Capabilities operation is used to retrieve the essential settings and technical information about a TAPIR service.

5.2.1. Capabilities Request

In XML, the Capabilities operation is invoked by inserting the <capabilities/> element after the header section in a request document. Capabilities requests take no arguments or parameters.

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
      </source>
    </header>
    <capabilities/>
  </request>

Example: An XML capabilities request document
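Request documents of this shape can also be generated programmatically. The following Python sketch builds a request with a header and a single operation element; it works the same way for `metadata` as for `capabilities`. The function name is hypothetical and the xsi:schemaLocation attribute is omitted for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

TAPIR_NS = "http://rs.tdwg.org/tapir/1.0"


def build_request(operation, sendtime=None):
    """Return a TAPIR XML request document for a parameterless operation."""
    # Serialize the TAPIR namespace as the default namespace.
    ET.register_namespace("", TAPIR_NS)
    if sendtime is None:
        sendtime = datetime.now(timezone.utc).isoformat()
    request = ET.Element("{%s}request" % TAPIR_NS)
    header = ET.SubElement(request, "{%s}header" % TAPIR_NS)
    ET.SubElement(header, "{%s}source" % TAPIR_NS, sendtime=sendtime)
    # The operation element follows the header section.
    ET.SubElement(request, "{%s}%s" % (TAPIR_NS, operation))
    return ET.tostring(request, encoding="unicode")


document = build_request("capabilities", "2005-11-11T12:23:56.023+01:00")
```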

In KVP request encoding, the Capabilities operation can be invoked with a single parameter, as follows:

http://example.net/tapir.cgi?op=capabilities

5.2.2. Capabilities Response

Capabilities responses contain five mandatory top-level sections to indicate available operations, supported request encodings and parameters, mapped concepts, available variables and global settings. An optional section <archives> can be used to indicate possible dump files available. Another optional <custom> section can be used to include any extra information not covered by the other sections.

5.2.2.1. Operations

The <operations> element is intended to return the list of operations supported by the service, including optional capabilities that are specific to each operation. When an operation is supported, an element with the same name must be present inside the <operations> element. Ping, metadata and capabilities are mandatory operations that are simply declared with no further arguments.

  <operations>
    <ping/>
    <metadata/>
    <capabilities/>
  </operations>

Example: A provider supporting only the mandatory operations

A provider that supports the inventory operation must indicate one or more supported inventory templates, or the <anyConcepts/> capability. Providers that only support inventory templates should not accept inventory requests that do not reference them. Each inventory template must be indicated with a <template> element with an attribute @location pointing to an external document defining the template. An optional @alias attribute can be specified, in which case the alias can also be used as the value of the template parameter in KVP inventory requests. A WSDL (Web Service Description Language) document describing the inventory template and its interface can optionally be included with an attribute @wsdl.

  <operations>
    <ping/>
    <metadata/>
    <capabilities/>
    <inventory>
      <templates>
        <template location="http://example.net/tmpl/collector_inventory.xml"/>
        <template location="http://example.net/tmpl/genus_inventory.xml"
                  alias="genus_inventory"
                  wsdl="http://example.net/tmpl/genus_inventory.wsdl" />
      </templates>
    </inventory>
  </operations>

Example: A provider supporting inventory with templates
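On the provider side, the value of the template parameter in a KVP request can be resolved against the declared templates roughly as in the Python sketch below. The function name and the list-of-tuples representation of the declared templates are assumptions for illustration.

```python
def resolve_template(value, templates):
    """Return the location of the declared template matching value.

    templates is a list of (location, alias) tuples taken from the
    <template> elements, with alias set to None when not declared.
    Returns None when the value matches no declared template.
    """
    for location, alias in templates:
        if value == location or (alias is not None and value == alias):
            return location
    return None


declared = [
    ("http://example.net/tmpl/collector_inventory.xml", None),
    ("http://example.net/tmpl/genus_inventory.xml", "genus_inventory"),
]
```

With these declarations, both "genus_inventory" and the full location of the genus template resolve to the same document, while an unknown value resolves to nothing and the request should not be accepted by a provider that only supports templates.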

Providers declaring the <anyConcepts/> capability should accept inventory requests involving one or more concepts that were advertised as being mapped by the provider. They must also support inventory requests involving any external inventory template that references any known concepts. Therefore, providers in this situation must additionally support arbitrary filters. The <anyConcepts/> capability is declared without any further arguments.

  <operations>
    <ping/>
    <metadata/>
    <capabilities/>
    <inventory>
      <anyConcepts/>
    </inventory>
  </operations>

Example: A provider supporting inventory in any concept

Providers are also allowed to support both <anyConcepts/> and templates, in case they wish to point to specific inventory templates for any particular reason. However, providers are not allowed to advertise support of the inventory operation just with an empty <inventory/> element.

If a provider supports the search operation, it must indicate either one or more supported search templates, or the <outputModels/> capability. Providers that only support search templates should not accept search requests that do not reference them. Each search template must be indicated with a <template> element having an attribute @location pointing to an external document defining the template. An optional @alias attribute can be specified, in which case the alias can also be used as the value of the template parameter in KVP search requests. A WSDL (Web Service Description Language) document describing the search template and its interface can optionally be included with an attribute @wsdl.

  <operations>
    <ping/>
    <metadata/>
    <capabilities/>
    <inventory>
      <anyConcepts/>
    </inventory>
    <search>
      <templates>
        <template location="http://example.net/tmpl/search_by_taxonomy.xml"/>
        <template location="http://example.net/tmpl/search_by_geography.xml"
                  alias="geo"
                  wsdl="http://example.net/tmpl/search_by_geography.wsdl" />
      </templates>
    </search>
  </operations>

Example: A provider supporting search with templates

Providers declaring the <outputModels/> capability should indicate either one or more <knownOutputModels> or the <anyOutputModels> capability. Both can be declared, but an empty <outputModels/> element will be considered invalid.

Providers that declare a specific list of output models must understand filters and "order by" parameters, and they can optionally also support the <anyOutputModels> capability. When a provider declares support for a specific output model, it must be able to process any search request that references that same output model, either directly or through a search template. Known output models are declared with the <outputModel> element with a @location attribute pointing to the document defining it. An optional @alias attribute can be specified, in which case the alias can be used as the value of the output model parameter in KVP search requests.

  <operations>
    <ping/>
    <metadata/>
    <capabilities/>
    <inventory>
      <anyConcepts/>
    </inventory>
    <search>
      <outputModels>
        <knownOutputModels>
          <outputModel location="http://example.net/models/taxonomy_rss.xml"/>
          <outputModel location="http://example.net/models/geography_kml.xml" alias="kml"/>
        </knownOutputModels>
      </outputModels>
    </search>
  </operations>

Example: A provider supporting search with known output models

Providers may also declare the <anyOutputModels> capability, in which case they need to indicate which subset of the XML Schema language they understand. <anyOutputModels> refers to the ability to respond to search requests involving arbitrary output model definitions, assuming they make use of concepts that are mapped by the provider. Output models include a response structure defined with XML Schema. XML Schema is a large and very complex specification and TAPIR does not expect providers to be able to understand or parse the whole language. The minimum set of the XML Schema language that needs to be understood by providers in this case is represented by the <basicSchemaLanguage/> capability, and it includes the following constructs of XML Schema: targetNamespace definition, element definition (including minOccurs and maxOccurs), attribute definitions (including attribute "use"), local definitions of complexType and simpleType, sequences, and the "all" definition. Therefore, when a provider declares the <anyOutputModels> capability, it must declare inside it at least the element <basicSchemaLanguage/>.

  <operations>
    <ping/>
    <metadata/>
    <capabilities/>
    <inventory>
      <anyConcepts/>
    </inventory>
    <search>
      <outputModels>
        <anyOutputModels>
          <responseStructure>
            <basicSchemaLanguage/>
            <import/>
          </responseStructure>
        </anyOutputModels>
      </outputModels>
    </search>
  </operations>

Example: A provider supporting search with any output models

The following constructs of XML Schema can be optionally supported and declared as part of the <anyOutputModels> element:

Providers are allowed to support both search templates and output models. However, they are not allowed to advertise support of the search operation with an empty <search/> element.

When providers support search with output models they are also allowed to support both <knownOutputModels> and <anyOutputModels>, but they are not allowed to declare an empty <outputModels/> element.
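The declaration rules in this section can be summarized as a simple consistency check. The Python sketch below is an illustration only: it works on namespace-free fragments like the examples above (in a real capabilities response the elements are in the TAPIR namespace), and the function name and message texts are assumptions.

```python
import xml.etree.ElementTree as ET


def check_operations(operations_xml):
    """Return a list of problems found in an <operations> declaration."""
    ops = ET.fromstring(operations_xml)
    problems = []
    # Ping, metadata and capabilities are mandatory operations.
    for name in ("ping", "metadata", "capabilities"):
        if ops.find(name) is None:
            problems.append("missing mandatory operation: " + name)
    # These elements must not be declared without any content.
    for path in ("inventory", "search", "search/outputModels"):
        element = ops.find(path)
        if element is not None and len(element) == 0:
            problems.append("empty <%s/> is not allowed" % path.split("/")[-1])
    return problems
```

A declaration with only the three mandatory operations yields no problems, while one containing `<search><outputModels/></search>` is reported because of the empty <outputModels/> element.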

Besides standard operations, providers can also declare they support custom operations inside an optional <custom> element.

5.2.2.2. Requests

The <requests> element is intended to provide information on what request encodings the service can respond to, whether it handles log-only requests and what filter capabilities are supported.

This section includes three sub-sections:

<encoding>
Indicates what request encodings are supported. The options are <kvp/> (Key-Value Pairs) and <xml/>. Support of key-value pairs is mandatory, while the XML encoding is optional.

<globalParameters>
Indicates if the provider "accepts", "requires" or "denies" log-only requests.

<filter>
Lists the expressions and Boolean operators understood by the provider service and which may be used in constructing filters.

  <requests>
    <encoding>
      <kvp/>
    </encoding>
    <globalParameters>
      <logOnly>denied</logOnly>
    </globalParameters>
    <filter/>
  </requests>

Example: Fragment of a <requests> declaration in a capabilities response document showing the minimum functionality that a TAPIR service must be able to provide. Note that in this case no filter capabilities are declared.

Filters are used in search and inventory operations. The <filter> element in capabilities responses lists all the filter operations and terms that are supported by the provider. When a provider declares the filter <encoding> element, a minimum set of filtering capabilities must be supported and indicated for the sake of clarity. The only optional filtering capability in this case is related to the arithmetic operators.

  <requests>
    <encoding>
      <kvp/>
      <xml/>
    </encoding>
    <globalParameters>	
      <logOnly>accepted</logOnly>
    </globalParameters>
    <filter>
        <encoding>
          <expression>
            <concept/>
            <literal/>
            <parameter/>
            <variable/>
            <arithmetic>
              <add/>
              <sub/>
              <div/>
              <mul/>
            </arithmetic>
          </expression>
          <booleanOperators>
            <logical>
              <not/>
              <and/>
              <or/>
            </logical>
            <comparative>
              <equals caseSensitive="false"/>
              <greaterThan/>
              <greaterThanOrEquals/>
              <lessThan/>
              <lessThanOrEquals/>
              <in/>
              <isNull/>
              <like caseSensitive="false"/>
            </comparative>
          </booleanOperators>
        </encoding>
    </filter>
  </requests>

Example: A TAPIR provider declaring support for both XML and KVP encodings, accepting log-only requests, and declaring the complete filter functionality.

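When a provider supports filters, it can indicate whether comparisons such as <equals> and <like> are case sensitive through the @caseSensitive attribute, as in the example above. When the attribute is not present, it defaults to "true" (case sensitive).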
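
The default described above can be made explicit on the client side. The following is a minimal sketch, not part of the specification: the function names are hypothetical, and it assumes the client has already retrieved the <requests> section of a capabilities response as text. It reads each declared comparative operator and applies the "true" default when the caseSensitive attribute is absent.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def case_sensitivity(requests_xml: str) -> dict:
    """Map each declared comparative operator to its case sensitivity."""
    declared = {}
    for element in ET.fromstring(requests_xml).iter():
        if local_name(element.tag) != "comparative":
            continue
        for operator in element:
            # An absent caseSensitive attribute means "true" (case sensitive).
            flag = operator.get("caseSensitive", "true").strip().lower()
            declared[local_name(operator.tag)] = flag in ("true", "1")
    return declared


# Hypothetical fragment: <equals> is declared case insensitive,
# <like> carries no attribute and therefore falls back to the default.
FRAGMENT = """
<requests>
  <filter>
    <encoding>
      <booleanOperators>
        <comparative>
          <equals caseSensitive="false"/>
          <like/>
        </comparative>
      </booleanOperators>
    </encoding>
  </filter>
</requests>
"""

print(case_sensitivity(FRAGMENT))
```

The flag is only meaningful for string comparisons; the sketch reports the default for every declared operator simply because it does not try to distinguish them.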

5.2.2.3. Concepts

The <concepts> element is intended to provide details of recognised conceptual schemas and individually mapped concepts from those schemas. At least one conceptual schema must be mapped with at least one concept. The underlying database structure remains opaque to TAPIR clients, which interact with the underlying database by reference to the mapped concepts. Recognised conceptual schemas are declared using the @namespace and @location attributes of the <schema> element, both of which are required and must be valid URIs. An optional @alias can be assigned to the schema.

Within each schema declaration the provider must list the recognised concepts of that schema using the <mappedConcept> element. Each concept is declared through an instance of the <mappedConcept> element, which can contain five attributes:

@id
The fully qualified concept identifier. Required. String.

@searchable
Denotes whether the concept can be used in filters. Optional (defaults to "true"). Boolean.

@required
Denotes that this is a mandatory concept that needs to be present in output models of all search requests. Optional (defaults to "false"). Boolean.

@alias
A local alias associated with the concept. Optional. String.

@datatype
Indicates the corresponding datatype of the mapped concept. This will determine the ordering behaviour in results and the I/O data formats supported by the provider for the concept. Must be one of the fully qualified XML Schema primitive datatypes. The datatype declared should always try to match the datatype defined by the conceptual schema. If the conceptual schema uses a non-primitive datatype for the concept, the match should be made with the corresponding primitive datatype (going back in the type hierarchy). Optional (defaults to http://www.w3.org/2001/XMLSchema#string). String.
  <concepts>
    <schema namespace="http://example.net/s/1"
            location="http://example.net/s/1/schema.xsd">
      <mappedConcept id="http://example.net/s/1/CollectionCode"/>
      <mappedConcept id="http://example.net/s/1/CollectionName" searchable="false"/>
    </schema>
    <schema namespace="http://example.net/s/2"
            location="http://example.net/s/2/schema.xsd">
      <mappedConcept id="http://example.net/s/2/CatalogNumber" datatype="http://www.w3.org/2001/XMLSchema#decimal"/>
      <mappedConcept id="http://example.net/s/2/Rights" required="true"/>
    </schema>
  </concepts>

Example: A concepts declaration that might appear in a capabilities response document

When a <mappedConcept> and its <schema> are declared with an @alias attribute, providers must be able to understand requests that reference the concept by the short notation: concept_alias@schema_alias. When a <mappedConcept> is declared with an @alias attribute but its <schema> is not, providers must be able to understand requests that reference only the concept by its alias. These two situations do not exempt providers from understanding requests referencing concepts by fully qualified identifiers.
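
The three reference forms described above (fully qualified identifier, concept_alias@schema_alias, and concept alias alone) can be resolved with a single lookup table. The following is a minimal sketch under stated assumptions: the dictionary layout, the function names and the aliases "Country", "Genus" and "schema1" are hypothetical and only illustrate the rule; they are not defined by the TAPIR schema.

```python
def build_alias_index(schemas):
    """Build a lookup from every accepted reference form to the concept id.

    `schemas` is a list of dicts with an optional "alias" key and a
    "concepts" list of dicts holding "id" and an optional "alias".
    """
    index = {}
    for schema in schemas:
        schema_alias = schema.get("alias")
        for concept in schema["concepts"]:
            # Fully qualified identifiers must always be understood.
            index[concept["id"]] = concept["id"]
            concept_alias = concept.get("alias")
            if not concept_alias:
                continue
            if schema_alias:
                # Both aliases declared: short notation concept@schema.
                index[f"{concept_alias}@{schema_alias}"] = concept["id"]
            else:
                # Only the concept has an alias: the alias alone is used.
                index[concept_alias] = concept["id"]
    return index


def resolve(reference, index):
    """Return the fully qualified concept id for a request reference."""
    try:
        return index[reference]
    except KeyError:
        raise ValueError(f"unmapped concept reference: {reference}") from None


# Hypothetical mapping used only for illustration.
SCHEMAS = [
    {
        "alias": "schema1",
        "concepts": [
            {"id": "http://example.net/schema1/Country", "alias": "Country"},
            {"id": "http://example.net/schema1/Genus"},
        ],
    },
    {
        "concepts": [
            {"id": "http://example.net/s/2/Rights", "alias": "Rights"},
        ],
    },
]

INDEX = build_alias_index(SCHEMAS)
print(resolve("Country@schema1", INDEX))
print(resolve("Rights", INDEX))
```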

5.2.2.4. Variables

The <variables> element indicates system environment variables that can be used as filter expressions (see section about filter expressions). Each supported variable must be indicated with an element of the same variable name inside the element <environment>. When the <variables> element is empty it means that no variables are supported.

  <variables>
    <environment>
      <accessPoint/>
      <date/>
      <timestamp/>
      <metadataLanguage/>
      <dataSourceName/>
      <dataSourceDescription/>
      <subject/>
      <rights/>
      <bibliographicCitation/>
      <dataSourceLanguage/>
      <lastUpdate/>
      <dateCreated/>
      <technicalContactName/>
      <technicalContactEmail/>
      <contentContactName/>
      <contentContactEmail/>
    </environment>
  </variables>

Example: Fragment of a capabilities response declaring all environmental variables defined by the TAPIR XML Schema and an extra one.

5.2.2.5. Settings

The <settings> element indicates specific service settings related to server overload caused by requests for excessive amounts of data. There are five settings of interest to client software, all of them optional, and further ones can be declared using the <custom> element. The five settings all take positive integer values:

minQueryTermLength
The minimum length of a wildcard string used in "like" expressions.

maxElementRepetitions
The maximum number of repetitions allowed for any repeatable XML elements in search and inventory responses. It can also be used as a reference for paging.

maxElementLevels
The maximum number of nested XML levels allowed for search responses (not including the TAPIR envelope).

maxResponseTags
The maximum number of XML tags that can be returned in responses.

maxResponseSize
The maximum size in kilobytes allowed to be returned in responses.
  <settings>
    <minQueryTermLength>2</minQueryTermLength>
    <maxElementRepetitions>100</maxElementRepetitions>
    <maxElementLevels>20</maxElementLevels>
  </settings>

Example: Fragment of a capabilities response showing service settings.

5.2.2.6. Archives

An <archives> element can be used to advertise one or more dump files available. A dump file contains all records from the provider at a particular time and can be useful for clients that need to harvest entire datasets. When a dump file is available, clients can make use of it when interacting with the provider for the first time, and then use incremental harvesting through the search operation in subsequent interactions. Note that incremental harvesting will only be possible if the provider has mapped some concept that indicates when a record was last edited, and ideally another concept that indicates when a record was deleted. Such concepts should be part of a data abstraction layer and are therefore outside the scope of this specification.

Each dump file is represented by an <archive> element that can contain the following attributes:

@format
Format of the archive. It can be "xml" or any custom value. Mandatory. String.

@location
URL from where the archive can be downloaded. Mandatory. Any URI.

@creation
Timestamp when the archive was created. Mandatory. DateTime.

@compression
When the archive is compressed, this attribute must be present to indicate the type of compression. When present, it can be "gzip" or any custom value. When absent, no compression was used. Optional. String.

@numberOfRecords
Number of records in the archive. Mandatory. Integer.

@outputModel
Output model used to generate the archive. This attribute should only be present if the archive format is "xml". Optional. Any URI.
  <archives>
    <archive format="xml" 
             location="http://example.net/file.gz" 
             creation="2005-10-31T12:23:56.023+01:00" 
             compression="gzip" 
             numberOfRecords="351056" 
             outputModel="http://example.net/model.xml"/>
  </archives>

Example: A provider advertising a dump file with all records at a specific time

  <?xml version="1.0" encoding="UTF-8"?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0 
                                http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source accesspoint="http://example.net/tapir.cgi" 
              sendtime="2005-11-11T12:23:56.023+01:00">
        <software name="TapirProvider" version="1.0"/>
      </source>
    </header>
    <capabilities>
      <operations>
        <ping/>
        <metadata/>
        <capabilities/>
        <inventory>
          <anyConcepts/>
        </inventory>
        <search>
          <outputModels>
            <anyOutputModels>
              <responseStructure>
                <basicSchemaLanguage/>
              </responseStructure>
            </anyOutputModels>
          </outputModels>
        </search>
      </operations>
      <requests>
        <encoding>
          <kvp/>
          <xml/>
        </encoding>
        <globalParameters>	
          <logOnly>accepted</logOnly>
        </globalParameters>
        <filter>
            <encoding>
              <expression>
                <concept/>
                <literal/>
                <parameter/>
                <variable/>
                <arithmetic/>
              </expression>
              <booleanOperators>
                <logical>
                  <not/>
                  <and/>
                  <or/>
                </logical>
                <comparative>
                  <equals caseSensitive="false"/>
                  <greaterThan/>
                  <greaterThanOrEquals/>
                  <lessThan/>
                  <lessThanOrEquals/>
                  <in/>
                  <isNull/>
                  <like caseSensitive="false"/>
                </comparative>
              </booleanOperators>
            </encoding>
        </filter>
      </requests>
      <concepts>
        <schema namespace="http://example.net/s"
                location="http://example.net/s/schema.xsd">
          <mappedConcept id="http://example.net/s/CatalogNumber"/>
          <mappedConcept id="http://example.net/s/ScientificName"/>
        </schema>
      </concepts>
      <variables/>
      <settings>
        <minQueryTermLength>2</minQueryTermLength>
        <maxElementRepetitions>100</maxElementRepetitions>
        <maxElementLevels>20</maxElementLevels>
      </settings>
    </capabilities>
  </response>

Example: A full capabilities response document containing only the mandatory sections

5.3. Inventory

The Inventory operation is used to retrieve distinct values for one or more concepts specified as parameters. It returns aggregated data in the manner of a SELECT DISTINCT query in SQL, as opposed to the individual records returned by the Search operation. When more than one concept is specified, inventory responses must return distinct combinations of values.

Concepts used as parameters may come from different conceptual schemas and must be specified either with their fully qualified identifiers or with aliases. If aliases are used, the TAPIR implementation must be configured to use the relevant concept name server.

If a provider supports the inventory operation, it must advertise either one or more inventory templates or support the <anyConcepts> capability. Providers may support both options and choose how they wish to process requests.

5.3.1. Inventory Request

Inventory requests can make use of inventory templates or may specify the concept(s) and an optional filter directly in the message. Paging parameters can also be used.

In XML, the inventory operation can be invoked by inserting the <inventory> element after the header section in a request document, and then specifying an inventory template or the specific parameters.

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
      </source>
    </header>
    <inventory>
      <template location="http://example.net/tmpl/genus_inventory.xml"/>
    </inventory>
  </request>

Example: An XML inventory request document using a template.

In KVP request encoding, the same example could be invoked with:

http://example.net/tapir.cgi?op=inventory&template=http://example.net/tmpl/genus_inventory.xml

If the template included a filter with a parameter "type" restricting results according to the basis of record, it could be invoked with

http://example.net/tapir.cgi?op=inventory&template=http://example.net/tmpl/genus_inventory.xml&type=specimen

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
      </source>
    </header>
    <inventory>
      <concepts>
        <concept id="http://example.net/schema1/Country"/>
        <concept id="http://example.net/schema1/Genus"/>
      </concepts>
    </inventory>
  </request>

Example: An XML inventory request document (looking for unique combinations of genus and country) specifying concepts but no filter.

In KVP request encoding, the same example could be invoked with

http://example.net/tapir.cgi?op=inventory&concept=http://example.net/schema1/Country&concept=http://example.net/schema1/Genus

or, using concept aliases

http://example.net/tapir.cgi?op=inventory&count=false&start=0&limit=100&concept=Country@schema1&concept=Genus@schema1

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
      </source>
    </header>
    <inventory count="true" limit="100" start="0">
      <concepts>
        <concept id="http://example.net/schema1/Country" tagName="country"/>
        <concept id="http://example.net/schema1/Genus" tagName="genus"/>
      </concepts>
      <filter>
        <like>
          <concept id="http://example.net/schema1/Genus"/>
          <literal value="Luzu*"/>
        </like>
      </filter>
    </inventory>
  </request>

Example: An XML inventory request document (looking for unique combinations of genus and country) specifying concepts, custom tag names for the resulting values, paging parameters and a filter.

In KVP request encoding, the same example could be invoked with

http://example.net/tapir.cgi?op=inventory&count=true&start=0&limit=100&concept=Country@schema1&concept=Genus@schema1&tagname=country&tagname=genus&filter=Genus@schema1 like "Luzu*"
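
KVP requests such as the ones in this section can be assembled programmatically. The following is a minimal sketch: the function name and its arguments are hypothetical, while the parameter names (op, count, start, limit, concept, tagname, filter) are the ones used in the examples above. The examples show the URLs in readable, unencoded form; the sketch percent-encodes the values, which is general practice for HTTP query strings rather than a rule stated in this section.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def inventory_url(access_point, concepts, tag_names=(), filter_text=None,
                  count=None, start=None, limit=None):
    """Build a KVP inventory request URL with repeated concept parameters."""
    params = [("op", "inventory")]
    if count is not None:
        params.append(("count", "true" if count else "false"))
    if start is not None:
        params.append(("start", str(start)))
    if limit is not None:
        params.append(("limit", str(limit)))
    # One "concept" parameter per concept, in the requested order.
    params.extend(("concept", concept) for concept in concepts)
    # One "tagname" parameter per custom tag name, in the same order.
    params.extend(("tagname", name) for name in tag_names)
    if filter_text is not None:
        params.append(("filter", filter_text))
    return f"{access_point}?{urlencode(params)}"


url = inventory_url(
    "http://example.net/tapir.cgi",
    ["Country@schema1", "Genus@schema1"],
    tag_names=["country", "genus"],
    filter_text='Genus@schema1 like "Luzu*"',
    count=True,
    start=0,
    limit=100,
)
print(url)
# Decoding the query string recovers the original key-value pairs.
print(parse_qsl(urlsplit(url).query))
```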

5.3.2. Inventory Response

The structure of inventory responses must conform to the "inventoryResultType" defined in the TAPIR XML Schema. The body of the inventory message must list the concepts used to create the inventory using the <concepts> element. Individual inventory records, which represent unique combinations of multiple concepts, are returned as one or more <record> elements.

Each <record> element lists the value or values found in the order that concepts are listed under <concepts>. If count was requested, then each <record> must include an attribute @count, giving the number of occurrences of this combination in the underlying data source. If paging was requested, then a <summary> element must also be returned. The order of <record> elements should be ascending according to the concept's datatype declared in the capabilities response.

  <?xml version="1.0" encoding="UTF-8"?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0 
                                http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source accesspoint="http://example.net/tapir.cgi" 
              sendtime="2005-11-11T12:23:56.023+01:00">
        <software name="TapirService" version="1.0"/>
      </source>
    </header>
    <inventory>
      <concepts>
        <concept id="http://example.net/schema1/Country"/>
        <concept id="http://example.net/schema1/Genus"/>
      </concepts>
      <record>
        <value>AUSTRALIA</value>
        <value>Calicium</value>
      </record>
      <record>
        <value>AUSTRALIA</value>
        <value>Fellhanera</value>
      </record>
      <summary start="0" next="2" totalReturned="2" totalMatched="35"/>
    </inventory>
  </response>

Example: An inventory response showing country and genus combinations

If the request references an unmapped concept, an <error> should be returned.

When the request specifies a custom "tagName" for the concept, then this name should be used instead of the default <value> tag.

  <?xml version="1.0" encoding="UTF-8"?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0 
                                http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source accesspoint="http://example.net/tapir.cgi" 
              sendtime="2005-11-11T12:23:56.023+01:00">
        <software name="TapirService" version="1.0"/>
      </source>
    </header>
    <inventory>
      <concepts>
        <concept id="http://example.net/schema1/Country"/>
        <concept id="http://example.net/schema1/Genus"/>
      </concepts>
      <record>
        <country>AUSTRALIA</country>
        <genus>Calicium</genus>
      </record>
      <record>
        <country>AUSTRALIA</country>
        <genus>Fellhanera</genus>
      </record>
      <summary start="0" next="2" totalReturned="2" totalMatched="35"/>
    </inventory>
  </response>

Example: An inventory response showing country and genus combinations with values enclosed by custom tag names
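
Because values are matched to concepts by position, a client can read inventory records without knowing whether default <value> tags or custom tag names were used. The following is a minimal sketch with hypothetical function names; it relies only on the response layout shown in the examples above (a <concepts> list, <record> elements with an optional count attribute, and an optional <summary>).

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def read_inventory(response_xml: str):
    """Return (concept ids, [(values, count or None)], summary attributes)."""
    root = ET.fromstring(response_xml)
    inventory = next(
        (e for e in root.iter() if local_name(e.tag) == "inventory"), None
    )
    if inventory is None:
        raise ValueError("no <inventory> element in response")
    concepts, records, summary = [], [], {}
    for child in inventory:
        name = local_name(child.tag)
        if name == "concepts":
            concepts = [concept.get("id") for concept in child]
        elif name == "record":
            # Values are positional, so the tag names are ignored.
            values = tuple(value.text or "" for value in child)
            count = child.get("count")
            records.append((values, int(count) if count is not None else None))
        elif name == "summary":
            summary = dict(child.attrib)
    return concepts, records, summary


SAMPLE = """<response xmlns="http://rs.tdwg.org/tapir/1.0">
  <inventory>
    <concepts>
      <concept id="http://example.net/schema1/Country"/>
      <concept id="http://example.net/schema1/Genus"/>
    </concepts>
    <record count="10">
      <country>AUSTRALIA</country>
      <genus>Calicium</genus>
    </record>
    <record count="20">
      <country>AUSTRALIA</country>
      <genus>Fellhanera</genus>
    </record>
    <summary start="0" next="2" totalReturned="2" totalMatched="35"/>
  </inventory>
</response>"""

concepts, records, summary = read_inventory(SAMPLE)
for values, count in records:
    print(dict(zip(concepts, values)), count)
```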

5.4. Search

The Search operation is used to return non-aggregate records from data sources. Search requests make use of output models and filters to select the requested data. The returned records may also be counted and paged.

If a provider supports the search operation, it must advertise either one or more search templates or it must support the <outputModels> capability. Providers may support both options and choose how they wish to process requests. If a provider supports the <outputModels> capability, it must advertise either one or more known output models or it must support the <anyOutputModels> capability.

5.4.1. Search Request

Search requests can make use of search templates or specify all parameters directly in the message. Paging parameters can also be used.

In XML, the search operation can be invoked by inserting the <search> element after the header section in a request document, and then specifying a search template or the specific parameters.

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
      </source>
    </header>
    <search>
      <template location="http://example.net/tmpl/search_by_taxon.xml"/>
    </search>
  </request>

Example: An XML search request document using a template.

In KVP request encoding, the same example could be invoked with

http://example.net/tapir.cgi?op=search&template=http://example.net/tmpl/search_by_taxon.xml

If the template included a filter with a parameter "genus" restricting results according to a specified genus name, it could be invoked with

http://example.net/tapir.cgi?op=search&template=http://example.net/tmpl/search_by_taxon.xml&genus=Physalis

In addition to referring to external output models (both those known to the provider and user-defined ones) by their URI, it is possible to declare an output structure directly within a request document through the <outputModel> element, whose structure is identical to that used in external models.

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0" 
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00"/>
    </header>
    <search count="true" start="0" limit="1000">
      <outputModel>
        <structure>... concepts and relationships ... </structure>
        <rootElement name="... name of global element ..."/>
        <indexingElement path="... node path(s) for paging and counting ..."/>
        <mapping>... concept mapping elements ...</mapping>
      </outputModel>
      <filter>
        <like>
          <concept id="http://example.net/schema/ScientificName"/>
          <literal value="Luzu*"/>
        </like>
      </filter>
      <orderBy>
        <concept id="http://example.net/schema/Family"/>
        <concept id="http://example.net/schema/ScientificName"/>
      </orderBy>
    </search>
  </request>

Example: Simplified example of an XML search document with in-line outputModel definition

When used in KVP encoding, output models must always be externally defined, and referenced by the parameter "model". Output model definitions cannot be encoded in KVP.

http://example.net/tapir.cgi?op=search&start=0&limit=10&model=http://example.net/models/specimens.xml&filter=http://example.net/schema/ScientificName like "Luzu*"&orderby=http://example.net/schema/ScientificName

The same example with concept aliases would be

http://example.net/tapir.cgi?op=search&start=0&limit=10&model=http://example.net/models/specimens.xml&filter=ScientificName@schema like "Luzu*"&orderby=ScientificName@schema

Search operations also include the possibility to remove the TAPIR envelope, i.e., only the content that goes inside the "search" element is returned. This can be specified by the parameter "envelope". In XML it is an optional attribute (defaults to true, meaning that the envelope is returned) of the element search, and in KVP it is an independent parameter with the same name.

If an error occurs when the envelope is turned off, the response should be an "error" element containing the error message. It should usually be possible to get more information about the error by sending another request with the envelope turned on and then inspecting the diagnostics.
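
A client that turns the envelope off therefore has to inspect the root element of the response before treating it as data. The following is a minimal sketch with hypothetical function names. It assumes the error is reported as a root element whose local name is "error", whatever its namespace, and it cannot distinguish that from an output model whose own root element happens to be called "error".

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def read_unwrapped_search(body_xml: str):
    """Parse a search response requested without the TAPIR envelope.

    Returns the root element of the data, or raises RuntimeError carrying
    the message found in a root "error" element.
    """
    root = ET.fromstring(body_xml)
    if local_name(root.tag) == "error":
        message = "".join(root.itertext()).strip()
        raise RuntimeError(message or "unspecified TAPIR error")
    return root


# Hypothetical response bodies, used only for illustration.
DATA = (
    '<dataset xmlns="http://example.net/simple_specimen">'
    '<specimen catnum="234"/></dataset>'
)
FAILURE = "<error>Unknown output model</error>"

print(local_name(read_unwrapped_search(DATA).tag))
try:
    read_unwrapped_search(FAILURE)
except RuntimeError as exc:
    # A follow-up request with the envelope turned on would expose diagnostics.
    print("provider reported:", exc)
```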

5.4.2. Search Response

Whichever method is used to formulate a search query, the provider software processes it to select data from its underlying data source and returns the result to the client in the form of an XML response message. The way in which the provider chooses to do this is not defined by the protocol. Search responses with the TAPIR envelope must validate against the "searchResultType" defined by the TAPIR XML Schema.

  <?xml version="1.0" encoding="UTF-8"?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                                http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source accesspoint="http://example.net/tapir.cgi" 
              sendtime="2005-11-11T12:23:56.023+01:00"/>
    </header>
    <search>
      <dataset xmlns="http://example.net/simple_specimen">
        <specimen catnum="234">
           <identification>
             <name>Luzula luzuloides</name>
           </identification>
        </specimen>
        <specimen catnum="290">
          <identification>
            <name>Luzula alpestris</name>
          </identification>
        </specimen>
      </dataset>
      <summary start="0" totalReturned="2"/>
    </search>
  </response>

Example: A search response document.

5.5. Ping

The Ping operation provides a means of establishing whether services are currently on-line and whether appropriate wrapper software is installed. This operation can also provide basic data about response times. It does not require a query to be run against a connected database, as is sometimes required of metadata and capabilities requests. Data providers are free to include as part of diagnostics any extra information that may be of value to monitor networks.

5.5.1. Ping Request

In XML, the ping operation is invoked by inserting the <ping/> element after the header section in a request document. Ping takes no arguments or parameters.

  <?xml version="1.0" encoding="UTF-8"?>
  <request xmlns="http://rs.tdwg.org/tapir/1.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                               http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source sendtime="2005-11-11T12:23:56.023+01:00">
        <software name="TapirClient" version="1.0"/>
      </source>
    </header>
    <ping/>
  </request>

Example: TAPIR Ping Request. The simplest operation in TAPIR.

In KVP request encoding, the ping operation can be invoked with a single parameter

http://example.net/tapir.cgi?op=ping

5.5.2. Ping Response

  <?xml version="1.0" encoding="UTF-8"?>
  <response xmlns="http://rs.tdwg.org/tapir/1.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://rs.tdwg.org/tapir/1.0
                                http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd">
    <header>
      <source accesspoint="http://example.net/tapir.cgi" 
              sendtime="2005-11-11T12:23:57.023+01:00">
        <software name="TapirProvider" version="1.0"/>
      </source>
    </header>
    <pong/>
  </response>
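
A monitoring client can treat a service as alive when the response document contains the <pong/> element shown above. The following is a minimal sketch with hypothetical function names. It only inspects a response body that has already been retrieved, and it assumes that <pong/> appears as a direct child of <response>, as in the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def is_pong(response_xml: str) -> bool:
    """Return True if the document is a TAPIR response containing <pong/>."""
    try:
        root = ET.fromstring(response_xml)
    except ET.ParseError:
        # Anything that is not well-formed XML counts as a failed ping.
        return False
    if local_name(root.tag) != "response":
        return False
    return any(local_name(child.tag) == "pong" for child in root)


PONG = """<response xmlns="http://rs.tdwg.org/tapir/1.0">
  <header>
    <source accesspoint="http://example.net/tapir.cgi"
            sendtime="2005-11-11T12:23:57.023+01:00"/>
  </header>
  <pong/>
</response>"""

print(is_pong(PONG))
print(is_pong("<html><body>Service unavailable</body></html>"))
```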

6. Global Parameters

All operations can make use of common global parameters. In XML, global parameters are passed as specific attribute values of the operation element, and are defined in the "globalParametersGroup". In KVP, they are passed as individual parameters. The list of global parameters includes:

log-only
The log-only parameter can be used to instruct the service to log the request but not process it as it normally would. It can be used to forward requests to the original providers when users are querying third-party cached databases. This way, original providers can track usage of their data. Although it is possible to send log-only requests, clients must check the provider capabilities response before sending them. The <globalParameters> element of capabilities responses declares whether log-only requests are required, accepted or denied. Providers that are unable to log requests on their system or do not want to receive them should reply to a log-only request with an error message using the <error> element. If log-only requests are accepted or required, then the response must include the element <logged/> instead. Boolean. Optional (defaults to "false").

xslt
The xslt parameter points to an XML style sheet that can be used by XSLT processors to transform TAPIR responses. When xslt is specified, providers are instructed to include the corresponding "xml-stylesheet" processing instruction after the XML header, as shown below. However, providers are also free to place restrictions on the allowable domains related to the stylesheet location. In this case, if a stylesheet comes from an unknown location, it can be ignored - but an associated warning should be raised inside the diagnostics section. String representing a URL. Optional.

  <?xml version="1.0" encoding="utf-8" ?>
  <?xml-stylesheet type="text/xsl" href="http://example.net/trans.xsl"?>
  <!-- TAPIR response -->
Example: Fragment of a TAPIR response showing the XML headers when the xslt parameter is passed.

7. Counting and Paging

Inventory and search operations include counting and paging functionality. Both are done with reference to an indexing element (search operation) or to a record element (inventory operation). In XML, counting and paging parameters are defined in the "pagingParametersGroup" and passed as attribute values of the operation element. In KVP, counting and paging are done through specific individual parameters.

count
For inventory operations, the count parameter instructs the service to return the total number of distinct records and the number of occurrences of each record matching the selection criteria. For search operations, the count parameter instructs the service to return the number of records returned and the number of records that matched the query in the @totalReturned and @totalMatched attributes of the <summary> element respectively. Boolean. Optional (default = "false").

start
Index of the first record to be returned when paging results. Non-negative integer. Optional (default = "0", which corresponds to the index of the first record).

limit
Indicates the maximum number of records to be returned when paging results. If a request asks for count but specifies limit = "0" then no records are returned but the total number of records found is recorded in @totalMatched. Non-negative integer. Optional (when not specified it means "unlimited").

In XML, a typical use of paging parameters attributes would be

    <search count="true" start="0" limit="50">
     ....... 
    </search>

The same parameters in KVP would be

http://example.net/tapir.cgi?op=search&count=1&start=0&limit=50&...

When paging is requested, response documents must include a <summary> element of the "resultSummaryType", which returns information in the following set of attributes:

@start
Index of the first element that was returned.

@next
Index of the next record that could be retrieved using a subsequent request.

@totalReturned
The number of records actually being returned.

@totalMatched
If count was requested the totalMatched attribute gives the "estimated" number of total matching records - not necessarily the number of valid records that can possibly be returned by paging through the entire record set. This happens because records can be "dropped" or ignored if they fail to produce a valid XML representation according to the response structure.
  ...
  <record count="10">
    <value>AUSTRALIA</value>
    <value>Calicium</value>
  </record>
  <record count="20" >
    <value>AUSTRALIA</value>
    <value>Fellhanera</value>
  </record>
  <summary start="0" next="2" totalReturned="2" totalMatched="35"/>
  ...

Example: Fragment of an inventory response showing count and paging values.
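The relationship between the paging parameters and the summary attributes can be sketched as follows. This is a minimal illustration, not provider code: the function `page` and the dictionary layout are hypothetical, and the sketch assumes that @next is simply omitted when no further records remain.

```python
def page(records, start=0, limit=None, count=False):
    """Apply start/limit to a list of matching records and build the summary values."""
    end = len(records) if limit is None else start + limit
    selected = records[start:end]
    summary = {"start": start, "totalReturned": len(selected)}
    if start + len(selected) < len(records):
        # Assumption: "next" is only reported while more records can be retrieved.
        summary["next"] = start + len(selected)
    if count:
        summary["totalMatched"] = len(records)
    return selected, summary


matching = list(range(35))  # stand-in for 35 matching records
first_page, first_summary = page(matching, start=0, limit=2, count=True)
print(first_summary)
```

With 35 matching records, `start=0` and `limit=2`, the summary values correspond to those in the fragment above. Calling `page(matching, limit=0, count=True)` returns no records but still reports the total in `totalMatched`.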

8. Filters, Expressions and Operators

8.1. Filters

Inventory and search operations may contain a filter specifying conditions to restrict returned data to a specific subset. TAPIR filters encode expressions and operators in an atomised form that can be translated to other query languages (e.g., SQL).

The ability to dynamically parse filters is optional in TAPIR and is advertised in the capabilities response. Providers that do not support filters may still support query templates advertised in their capabilities. In this case, when the query template includes a filter in its definition, the meaning and the functionality of the filter must be understood by the provider. The provider may hard-code a local query that implements the entire filter and then, when processing a request, substitute the parameters that usually appear in query template filters with their respective values.

8.2. Expressions

The atomised values in a TAPIR filter are represented by expressions. There are three types of expressions in TAPIR: simple expressions, complex expressions and variables. Expressions evaluate to a single value and are used by filter operators.

8.2.1. Simple Expressions

Simple expressions include elements that are directly associated with a single value. There are three possible types of simple expressions:

8.2.2. Complex Expressions

Complex expressions are represented by four arithmetic operators. The list below shows the arithmetic operators followed by their respective XML element.

Addition: <add>
Subtraction: <sub>
Multiplication: <mul>
Division: <div>

Arithmetic operators are binary, so they always combine exactly two expressions as their arguments. The first argument must always be associated with the leftmost expression in the operation. This means that in subtractions the first expression corresponds to the minuend and the second corresponds to the subtrahend. In divisions the first expression corresponds to the dividend while the second corresponds to the divisor.

  <add>
    <literal value="20" />
    <literal value="22" />
  </add>

Example: Use of a binary arithmetic operator.
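The argument order rule can be illustrated with a small evaluator. This is only a sketch: it ignores XML namespaces, handles literals only, and assumes the element names `sub`, `mul` and `div` by analogy with the `add` element shown above and the operator names used in the KVP grammar.

```python
import operator
import xml.etree.ElementTree as ET

# Element name -> Python operation; names other than "add" are assumed.
ARITHMETIC = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def evaluate(element):
    """Evaluate a literal or a binary arithmetic element to a number."""
    if element.tag == "literal":
        return float(element.get("value"))
    left, right = list(element)  # fails unless there are exactly two arguments
    # The first child is the leftmost operand (minuend or dividend).
    return ARITHMETIC[element.tag](evaluate(left), evaluate(right))


difference = ET.fromstring('<sub><literal value="50"/><literal value="8"/></sub>')
print(evaluate(difference))  # 50 - 8, not 8 - 50
```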

8.2.3. Variables

Variables are elements that represent environment variables from the data provider system. TAPIR defines the following system variables that may be supported by provider implementations:

In XML, variables are represented by a <variable> element with a "name" attribute, such as

  <variable name="date" />

Variables that are supported by a data provider must be advertised in capabilities responses. Data providers are also free to define and make use of additional system variables.

8.3. Logical and Comparative Operators

TAPIR supports a range of Boolean (logical and comparative) operators for building filters.

8.3.1. Comparative Operators

There are three types of comparative operators - unary, binary, and multiple.

8.3.1.1. Unary Comparative Operators

Unary comparative operators always take a single concept as argument. The only operator of this type is the isNull operator.

8.3.1.2. Binary Comparative Operators

Binary comparative operators always compare a concept with an expression. The first argument must always be the concept and is associated with the leftmost expression in the operation. The following operators are binary: equals, like, greaterThan, lessThan, greaterThanOrEquals and lessThanOrEquals.

8.3.1.3. Multiple Comparative Operators

Multiple comparative operators always compare a concept with one or more simple expressions. The only operator of this type is the "in" operator, which is equivalent to a sequence of "or" operators comprising "equals" comparisons between the concept and each simple expression.
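The equivalence between "in" and a sequence of "or" operators over "equals" comparisons can be sketched as below. The tuple representation of filter nodes and the function names are hypothetical and chosen only for illustration.

```python
def expand_in(concept, values):
    """Rewrite an "in" comparison as nested "or" operators over "equals" comparisons."""
    comparisons = [("equals", concept, value) for value in values]
    expanded = comparisons[0]
    for comparison in comparisons[1:]:
        expanded = ("or", expanded, comparison)
    return expanded


def matches(node, record):
    """Evaluate the expanded form against a record given as a dictionary."""
    kind = node[0]
    if kind == "equals":
        return record.get(node[1]) == node[2]
    if kind == "or":
        return matches(node[1], record) or matches(node[2], record)
    raise ValueError("unsupported operator: " + kind)


expanded = expand_in("country", ["Spain", "Portugal"])
print(expanded)
```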

8.3.2. Logical operators

There are two types of logical operators – unary and multiple.

8.3.2.1. Unary Logical Operators

Unary logical operators take as argument a single Boolean operator, which can be any comparison operator or any logical operator.

8.3.2.2. Multiple Logical Operators

Multiple logical operators combine two or more Boolean operators, which can be any comparison operator or any logical operator.
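How logical operators nest over comparison operators can be illustrated with a recursive evaluator. This is a sketch under stated assumptions: filter nodes are hypothetical tuples of the form (operator, arguments...), records are dictionaries keyed by concept, and only the isNull and equals comparisons are implemented.

```python
def evaluate_filter(node, record):
    """Evaluate a nested Boolean operator tree against one record."""
    op, *args = node
    if op == "isNull":
        return record.get(args[0]) is None
    if op == "equals":
        return record.get(args[0]) == args[1]
    if op == "not":
        (inner,) = args  # unary: exactly one Boolean operator
        return not evaluate_filter(inner, record)
    if op == "and":
        return all(evaluate_filter(arg, record) for arg in args)
    if op == "or":
        return any(evaluate_filter(arg, record) for arg in args)
    raise ValueError("unsupported operator: " + op)


tree = ("and",
        ("not", ("isNull", "genus")),
        ("or", ("equals", "country", "Spain"), ("equals", "country", "Portugal")))
print(evaluate_filter(tree, {"genus": "Abies", "country": "Spain"}))
```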

8.4. Additional Interpretation Rules

When a TAPIR provider receives a request, there are additional rules that need to be followed when interpreting filters:

9. KVP (Key-Value Pair) Requests

TAPIR requests can be encoded as KVP, as opposed to the XML encoding, and can be sent through HTTP GET or HTTP POST. Therefore, interaction with a TAPIR service can be done by means of URLs using CGI-style parameters. For instance, to ping a TAPIR service one can use the simple KVP GET encoded message

http://example.net/tapir.cgi?op=ping

All TAPIR operations can be invoked with KVP, though output model definitions cannot be expressed with KVP.

9.1. Parameter Rules

Parameter names are always case insensitive. Parameter values are case insensitive by default, except when used with "equals" or "like" comparisons and the provider explicitly declared these operators to be case sensitive (see capabilities response for more details).

Parameters may be specified in any order. Any unknown parameters can be ignored. Parameters without values can also be ignored.
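A provider might apply these rules when reading a query string roughly as follows. The set `KNOWN` is a deliberately small, hypothetical subset of parameter names, and the function name is not defined by TAPIR.

```python
from urllib.parse import parse_qsl

# Hypothetical subset of the parameter names a provider recognises.
KNOWN = {"op", "count", "cnt", "start", "s", "limit", "l"}


def normalise(query_string, known=KNOWN):
    """Lower-case parameter names and drop unknown or empty parameters."""
    params = {}
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        name = name.lower()  # parameter names are case insensitive
        if name not in known or value == "":
            continue  # unknown parameters and parameters without values are ignored
        params.setdefault(name, []).append(value)
    return params


print(normalise("OP=search&Limit=50&foo=bar&start=&COUNT=1"))
```

Values are collected in lists because some parameters may occur more than once in a request.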

When creating custom parameters in filters, it is necessary to make sure that their names do not conflict with TAPIR specific parameters (see Appendix for the full list of reserved parameter names).

9.2. Global Parameters

The following parameters can be used in all TAPIR operations:

op = [ p | ping | m | metadata | c | capabilities | i | inventory | s | search ]
Specifies the requested TAPIR operation.

xslt = [ URI ]
Gives the address of an XML style sheet to be included after the XML header.

log-only = [ true | false | 1 | 0 ]
Used to indicate if the request should only be logged, not processed. Returns a log message instead of data.

source-ip = [ URI | NONE ]
Used to indicate the IP address of the original client when the message was processed by intermediate agents.
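The short and long forms of "op" and the accepted Boolean spellings could be normalised as in the sketch below. The helper names are hypothetical; lower-casing the values relies on parameter values being case insensitive by default.

```python
OPERATIONS = {
    "p": "ping",
    "m": "metadata",
    "c": "capabilities",
    "i": "inventory",
    "s": "search",
}


def resolve_operation(value):
    """Map a short or long "op" value to the long operation name."""
    value = value.lower()
    name = OPERATIONS.get(value, value)
    if name not in OPERATIONS.values():
        raise ValueError("unknown operation: " + value)
    return name


def parse_boolean(value):
    """Interpret the Boolean spellings accepted by parameters such as log-only."""
    value = value.lower()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError("not a Boolean value: " + value)


print(resolve_operation("s"), parse_boolean("1"))
```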

9.3. Ping Parameters

op = [ ping | p ]
The ping request has no other parameters.

9.4. Metadata Parameters

op = [ metadata | m ]
The metadata request has no other parameters.

9.5. Capabilities Parameters

op = [ capabilities | c ]
The capabilities request has no other parameters.

9.6. Inventory Parameters

op = [ inventory | i ]

cnt, count = [ true | false | 1 | 0 | NONE ]
Indicates if the total number of distinct records and the number of occurrences for each record must be returned.
s, start = [ non-negative integer | NONE ]
Index of the first record to be returned.
l, limit = [ non-negative integer | NONE ]
The number of records to be returned.

A choice must be made to use either a template, or one or more direct references to concepts with an optional filter.

t, template = [ URI | string ]
The URL of an Inventory Template document or an alias of a template declared in the capabilities response. When a template is present the concept and filter parameters are ignored.

OR

c, concept = [ fully qualified identifiers or aliases ]
One or more concepts.
n, tagname = [ string ]
One or more custom tag names (one for each concept).
f, filter = [ expression ]
A KVP filter.

9.7. Search Parameters

op = [ search | s ]

cnt, count = [ true | false | 1 | 0 | NONE ]
Indicates if the count of the records returned in the response and the number of matching records should be returned.
s, start = [ non-negative integer | NONE ]
Index of the first record to be returned.
l, limit = [ non-negative integer | NONE ]
The number of records to return.
e, envelope = [ true | false | 1 | 0 | NONE ]
Indicates if the TAPIR envelope (response, header, search, summary and diagnostics tags) should be suppressed or not.

A choice must be made to use either a template, or an output model parameter with optional "filter" and "orderby" parameters. The "template" parameter takes precedence over the "model" so if both are present the "model" and the optional "filter" and "orderby" parameters should be ignored.

t, template = [ URI | string ]
The URL of a Search Template document or an alias of a template declared in the capabilities response. When a template is present, the model, filter, orderby and descend parameters should be ignored.

OR

m, model = [ URI | string ]
The URL of an output model document or an alias of an output model declared in the capabilities response.
f, filter = [ expression ]
A KVP filter.
o, orderby = [ fully qualified concept identifiers or aliases ]
One or more concept identifiers to order results.
d, descend = [ true | false | 1 | 0 ]
Indicates if the order should be ascending or descending for each concept specified in "orderby". When present, "descend" must occur the same number of times as "orderby".
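The precedence of "template" over "model" and the pairing of "descend" with "orderby" can be sketched as follows. The function and the input layout (lower-case long parameter names mapped to lists of values) are hypothetical, and treating a missing "descend" as ascending order is an assumption of this sketch, not something stated above.

```python
TRUE_VALUES = {"true", "1"}


def resolve_search(params):
    """Decide which search parameters take effect for one request."""
    if params.get("template"):
        # A template takes precedence: model, filter, orderby and descend are ignored.
        return {"template": params["template"][0]}
    if not params.get("model"):
        raise ValueError('a search needs either "template" or "model"')
    orderby = params.get("orderby", [])
    descend = params.get("descend", [])
    if descend and len(descend) != len(orderby):
        raise ValueError('"descend" must occur as many times as "orderby"')
    # Assumption: without "descend", every concept is ordered ascending.
    flags = [d.lower() in TRUE_VALUES for d in descend] or [False] * len(orderby)
    return {
        "model": params["model"][0],
        "filter": params.get("filter", [None])[0],
        "orderby": list(zip(orderby, flags)),
    }


print(resolve_search({"model": ["m1"], "orderby": ["a", "b"], "descend": ["1", "false"]}))
```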

9.8. Filters

Filter expressions in KVP requests are the infix equivalents of their XML counterparts.

9.8.1. Backus-Naur Form (BNF) Grammar for Filter Expressions

  <expression>                    ::= <logical_operator> | <comparative_operator>
  
  <comparative_operator>          ::= <unary_comparison_expression> |
                                      <binary_comparison_expression> | 
                                      <unbound_comparison_expression>
  
  <logical_operator>              ::= <unary_logical_expression> | <binary_logical_expression>
  
  <literal>                       ::= '"' <string> '"'
  
  <concept>                       ::= <concept_alias> | <qualified_concept>
  
  <concept_alias>                 ::= <local_concept_alias> "@" <namespace_alias>
  
  <local_concept_alias>           ::= <string>
  
  <namespace_alias>               ::= <string>
  
  <qualified_concept>             ::= <string>
  
  <value>                         ::= <literal> | <concept> | <arithmetic_expression>
  
  <arithmetic_expression>         ::= <value> <arithmetic_operator> <value>
  
  <unary_comparison_expression>   ::= <unary_comparison_operator> <concept>
  
  <binary_comparison_expression>  ::= <value> <binary_comparison_operator> <value>
  
  <unbound_comparison_expression> ::= <unbound_comparison_operator> <expression> {<expression>}
  
  <unary_logical_expression>      ::= <unary_logical_operator> <expression>
  
  <binary_logical_expression>     ::= <expression> <binary_logical_operator> <expression>
  
  <unary_comparison_operator>     ::= "isNull"
  
  <binary_comparison_operator>    ::= "equals" | "like" | "greaterThan" | "lessThan" | 
                                      "greaterThanOrEquals" | "lessThanOrEquals"
  
  <unbound_comparison_operator>   ::= "in"
  
  <unary_logical_operator>        ::= "not"
  
  <binary_logical_operator>       ::= "and" | "or"
  
  <arithmetic_operator>           ::= <add> | <div> | <mul> | <sub>
  
  <add>                           ::= "+"
  <sub>                           ::= "-"
  <mul>                           ::= "*"
  <div>                           ::= "/"
  
  <string>                        ::= <any_char> { <any_char> }

9.8.2. Operators Precedence

The following lists show the precedence of filter operators, from highest to lowest:

expressions
arithmetic_operator, comparative_operator, logical_operator
arithmetic_operators
mul, div, add, sub
comparative_operators
isNull, equals, like, greaterThan, lessThan, greaterThanOrEquals, lessThanOrEquals
logical_operators
not, and, or

Blocks can be formed by using simple parentheses ( ).

9.8.3. Examples

  isnull country@cs1 or FullScientificName@cs2 like "Abies*" and country@cs1 equals "Spain"

The same example can be more explicit using parentheses, as follows:

  ((isnull country@cs1) or ((FullScientificName@cs2 like "Abies*") and (country@cs1 equals "Spain")))
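The effect of the precedence rules can be reproduced with a small recursive-descent parser that rewrites a KVP filter into its fully parenthesised form. This is a sketch covering only a subset of the grammar: arithmetic expressions and the "in" operator are left out, all names are hypothetical, and operator keywords are matched case-insensitively on the assumption that "isnull" in the example stands for "isNull".

```python
import re

TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|\(|\)|[^\s()"]+)')
BINARY = {"equals", "like", "greaterthan", "lessthan",
          "greaterthanorequals", "lessthanorequals"}


def tokenize(text):
    """Split a filter into quoted literals, parentheses and bare words."""
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError("cannot tokenize filter at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i].lower() if self.i < len(self.tokens) else None

    def take(self):
        if self.i >= len(self.tokens):
            raise ValueError("unexpected end of filter")
        self.i += 1
        return self.tokens[self.i - 1]

    def parse_or(self):  # lowest precedence
        left = self.parse_and()
        while self.peek() == "or":
            self.take()
            left = "(%s or %s)" % (left, self.parse_and())
        return left

    def parse_and(self):
        left = self.parse_not()
        while self.peek() == "and":
            self.take()
            left = "(%s and %s)" % (left, self.parse_not())
        return left

    def parse_not(self):
        if self.peek() == "not":
            self.take()
            return "(not %s)" % self.parse_not()
        return self.parse_comparison()

    def parse_comparison(self):  # comparisons bind tighter than logical operators
        if self.peek() == "(":
            self.take()
            inner = self.parse_or()
            if self.peek() != ")":
                raise ValueError("missing closing parenthesis")
            self.take()
            return inner
        if self.peek() == "isnull":
            operator_token = self.take()
            return "(%s %s)" % (operator_token, self.take())
        left = self.take()
        if self.peek() not in BINARY:
            raise ValueError("expected a binary comparison operator")
        operator_token = self.take()
        return "(%s %s %s)" % (left, operator_token, self.take())


def explicit(text):
    """Return the filter with every operator application parenthesised."""
    parser = Parser(tokenize(text))
    result = parser.parse_or()
    if parser.peek() is not None:
        raise ValueError("unexpected token: " + parser.tokens[parser.i])
    return result


print(explicit('isnull country@cs1 or FullScientificName@cs2 like "Abies*" '
               'and country@cs1 equals "Spain"'))
```

Applied to the first example above, the sketch prints the second, explicitly parenthesised form.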

10. The TAPIR XML Schema

The official version of the TAPIR XML Schema (tdwg_tapir.xsd) is located at:

http://rs.tdwg.org/tapir/1.0/schema/tdwg_tapir.xsd

11. Appendix

11.1. Reserved Parameter Names

Parameter names defined in filters can be any string valid according to the HTTP Common Gateway Interface (CGI) standard. But as TAPIR operations can be called through pure KVP requests, some parameter names are reserved as TAPIR parameters and cannot be used as parameter names in filters.

The following parameter names are reserved for TAPIR:

  c
  cnt
  concept
  count
  descend
  d
  e
  envelope
  f
  filter
  l
  limit
  log-only
  m
  model
  n
  o
  op
  orderby
  s
  start
  t
  tagname
  template
  xslt
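A client or template author could check a custom filter parameter name against this list with a check such as the one below. The function name is hypothetical; the comparison is case-insensitive because parameter names are case insensitive.

```python
RESERVED = frozenset(
    "c cnt concept count descend d e envelope f filter l limit log-only "
    "m model n o op orderby s start t tagname template xslt".split()
)


def is_reserved(name):
    """Return True if the name collides with a reserved TAPIR parameter name."""
    return name.lower() in RESERVED


print(is_reserved("Limit"), is_reserved("genus"))
```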

11.2. Simple XML Encoding for Conceptual Schemas

The following XML Schema defines a standard encoding that can be used to describe conceptual schemas:

  <?xml version="1.0"?>
  <xsd:schema targetNamespace="http://rs.tdwg.org/tapir/cns/1.0" 
              xmlns="http://rs.tdwg.org/tapir/cns/1.0" 
              xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              elementFormDefault="qualified" 
              attributeFormDefault="unqualified" xml:lang="en" >
    <xsd:annotation>
      <xsd:documentation>
        Simple XML Encoding to describe Conceptual Schemas used by TAPIR networks.
      </xsd:documentation>
    </xsd:annotation>
    <xsd:element name="cns">
      <xsd:annotation>
        <xsd:documentation>
          Root element consisting of one schema element.
        </xsd:documentation>
      </xsd:annotation>
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="schema" type="conceptualSchemaType"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:complexType name="conceptualSchemaType">
      <xsd:annotation>
        <xsd:documentation>
          Type representing a conceptual schema. It must contain the namespace attribute,
          one or more label elements (for different languages), one or more location 
          elements (when the conceptual schema can be downloaded from multiple places) 
          and one concepts element with at least one concept. An optional alias can be 
          assigned to the schema.
        </xsd:documentation>
      </xsd:annotation>
      <xsd:sequence>
        <xsd:element name="label" type="langType" maxOccurs="unbounded"/>
        <xsd:element name="alias" type="xsd:string" minOccurs="0"/>
        <xsd:element name="location" type="xsd:anyURI" maxOccurs="unbounded"/>
        <xsd:element name="concepts">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="concept" type="conceptType" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
      <xsd:attribute name="namespace" type="xsd:anyURI" use="required"/>
    </xsd:complexType>
    <xsd:complexType name="conceptType">
      <xsd:annotation>
        <xsd:documentation>
          Type representing a concept. It must contain the id attribute and the
          datatype element. An optional alias and documentation (doc) element can be 
          present. The documentation can be a textual description or a link. The required 
          attribute can be used to indicate if this concept is mandatory or not.
          Although not enforced by this schema, the datatype must be a fully qualified 
          XML Schema primitive datatype.
        </xsd:documentation>
      </xsd:annotation>
      <xsd:sequence>
        <xsd:element name="alias" type="xsd:string" minOccurs="0"/>
        <xsd:element name="datatype" type="xsd:string"/>
        <xsd:element name="doc" type="langType" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:attribute name="id" type="xsd:string" use="required"/>
      <xsd:attribute name="required" type="xsd:boolean"/>
    </xsd:complexType>
    <xsd:complexType name="langType">
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <xsd:attribute ref="xml:lang" use="optional"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:schema>

The example below is an excerpt of an instance document describing DarwinCore as a conceptual schema:

  <cns xmlns="http://rs.tdwg.org/tapir/cns/1.0">
    <schema namespace="http://rs.tdwg.org/dwc/dwcore/">
      <label>DarwinCore v1.4</label>
      <alias>dwc_1_4</alias>
      <location>http://rs.tdwg.org/dwc/tdwg_dw_core.xsd</location>
      <concepts>
        <concept id="http://rs.tdwg.org/dwc/dwcore/GlobalUniqueIdentifier" required="true">
          <alias>GlobalUniqueIdentifier</alias>
          <datatype>http://www.w3.org/2001/XMLSchema#string</datatype>
          <doc>http://wiki.tdwg.org/twiki/bin/view/DarwinCore/GlobalUniqueIdentifier</doc>
        </concept>
        <concept id="http://rs.tdwg.org/dwc/dwcore/DateLastModified" required="true">
          <alias>DateLastModified</alias>
          <datatype>http://www.w3.org/2001/XMLSchema#dateTime</datatype>
          <doc>http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DateLastModified</doc>
        </concept>
      </concepts>
    </schema>
  </cns>
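Reading such an instance document is straightforward with a namespace-aware XML parser. The sketch below uses Python's standard library; the embedded document is a shortened, well-formed copy of the excerpt, and the function name and the returned dictionary layout are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"cns": "http://rs.tdwg.org/tapir/cns/1.0"}

DOCUMENT = """<cns xmlns="http://rs.tdwg.org/tapir/cns/1.0">
  <schema namespace="http://rs.tdwg.org/dwc/dwcore/">
    <label>DarwinCore v1.4</label>
    <alias>dwc_1_4</alias>
    <location>http://rs.tdwg.org/dwc/tdwg_dw_core.xsd</location>
    <concepts>
      <concept id="http://rs.tdwg.org/dwc/dwcore/DateLastModified" required="true">
        <alias>DateLastModified</alias>
        <datatype>http://www.w3.org/2001/XMLSchema#dateTime</datatype>
      </concept>
    </concepts>
  </schema>
</cns>"""


def read_concepts(xml_text):
    """Return the schema namespace and a list of concept descriptions."""
    root = ET.fromstring(xml_text)
    schema = root.find("cns:schema", NS)
    concepts = []
    for concept in schema.iterfind("cns:concepts/cns:concept", NS):
        concepts.append({
            "id": concept.get("id"),
            "alias": concept.findtext("cns:alias", namespaces=NS),
            "datatype": concept.findtext("cns:datatype", namespaces=NS),
            "required": concept.get("required") in ("true", "1"),
        })
    return schema.get("namespace"), concepts


namespace, concepts = read_concepts(DOCUMENT)
print(namespace, [c["alias"] for c in concepts])
```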

11.3. Term Definitions

Backus-Naur Form
A metasyntax used to express context-free grammars. See http://en.wikipedia.org/wiki/Backus-Naur_form.

DarwinCore
Standard to facilitate the exchange of species occurrence data. See http://www.tdwg.org/activities/darwincore.

Dublin Core
Dublin Core Metadata Initiative. See http://dublincore.org.

GET
HTTP communication method where form data are encoded as parameters in an extension to a URL. The GET method is principally used to transmit requests for data to a web server (e.g., a simple database search).

HTML
Hypertext Markup Language. An application of Standard Generalised Markup Language (SGML), used for authoring pages for the World Wide Web.

HTTP
Hypertext Transfer Protocol, the commonly used protocol for transmitting requests and documents between applications on the World Wide Web.

KVP
Key-Value Pair. One of the possible encodings for TAPIR requests.

normative
Referring to a standard or set of norms that are understood to be correct. A normative document is one which describes how things ought to be and why.

POST
HTTP communication method that can include any kind of data or command. The data are encoded separately and do not form part of the URL as in a GET message, so this method is better for complex, sensitive, lengthy or non-ASCII data.

protocol
An agreed format for transmitting data between two or more applications.

Provider
In the context of TAPIR, an organization or person hosting one or more TAPIR services.

Provider software
Software running on a web server that facilitates access to data.

RDF
Resource Description Framework. See http://www.w3.org/RDF/.

RDF Schema
A language for describing vocabularies in RDF. See http://www.w3.org/TR/rdf-schema/.

TDWG
Taxonomic Databases Working Group. See http://www.tdwg.org/.

UDDI
Universal Description, Discovery and Integration. UDDI is a specification for maintaining standardised directories of information about web services.

URL
Uniform Resource Locator. The address of a resource on the Internet.

URI
Uniform Resource Identifier. A formatted string that serves as an identifier for a resource, typically, but not exclusively, on the Internet. URIs are used in HTML hyperlinks.

W3C
World Wide Web Consortium. See http://www.w3c.org.

Web Service
A service based on Internet Protocols, such as HTTP, SMTP or FTP.

wrapper
Provider software that allows standardised queries to be run against an underlying database.

WSDL
Web Services Description Language. An XML format for describing Web Services as a set of end points operating on messages containing either document-oriented or procedure-oriented information. WSDL is the language used by UDDI. See http://www.w3.org/TR/wsdl.

XMI
XML Metadata Interchange is an OMG (Object Management Group) standard for exchanging metadata information via XML. See http://www.omg.org/technology/documents/formal/xmi.htm.

XML
Extensible Markup Language developed by the W3C. A means of tagging data for transmission, validation and manipulation. See http://www.w3.org/XML and http://www.w3.org/TR/REC-xml.

XML Schema
A formal definition of the required and optional structure and content of XML formatted documents within its domain. See http://www.w3.org/XML/Schema.

XPath
Defines a way of locating and processing items in XML documents by using an addressing syntax based on the path through the document's logical tree structure. See http://www.w3.org/TR/xpath.

11.4. History of Changes

The following changes were made to this document since its first public release.

Date: September, 8th, 2009

Date: July, 21st, 2009

Date: February, 5th, 2009

Date: September, 18th, 2008

Date: February, 7th, 2008

Date: July, 18th, 2007

Date: February, 24th, 2007

Date: February, 7th, 2007

Date: January, 22nd, 2007