
API

Overview

BioGUID.org is primarily intended as a web service (rather than a web site). The website includes basic search functionality, and may continue to expand in the future. However, the main development focus is on web services, available through the APIs described below. As BioGUID.org expands in content and the web services mature, updates will be reported on this page. If you have any questions, or wish to report a bug or request new features, please contact Richard Pyle by email at deepreef [at] bishopmuseum.org.

Controlled Vocabularies

The following lists of terms are used by several BioGUID services.

Identifier Classes

Object Classes

Relationship Types

Search Services

Search Identifier

URL: http://bioguid.org/searchIdentifier
Description: This service allows searching for identifiers within the BioGUID index. Searches are conducted against both the raw identifier (e.g., a UUID, DOI, integer, or other text representing an identifier sensu stricto) and the identifier in full context (with dereference service prefixes and suffixes). For example, a search on ‘8BDC0735-FEA4-4298-83FA-D04F67C3FBEC’ yields the same results as a search on ‘urn:lsid:zoobank.org:act:8BDC0735-FEA4-4298-83FA-D04F67C3FBEC’ or ‘http://zoobank.org/8BDC0735-FEA4-4298-83FA-D04F67C3FBEC’.
Example Use Case 1: “I have an identifier, and I want to find out if any other identifiers exist that have been assigned to the same object.”
Example Use Case 2: “I have a non-self-resolving identifier, and I want to see what web services are available to dereference the identifier.”
Parameters:

Output Columns:
Examples:
  1. http://bioguid.org/searchIdentifier?q=234 [Search for identifiers with “234” as part of the identifier string]
  2. http://bioguid.org/searchIdentifier?q=234&format=html [Same as previous example, but formatted in HTML]
  3. http://bioguid.org/searchIdentifier?q=cf0ada98-2dc0-4402-8ada-b192335d2d7e [Search for identifiers with “cf0ada98-2dc0-4402-8ada-b192335d2d7e” as part of the identifier string]
  4. http://bioguid.org/searchIdentifier?q=cf0ada98-2dc0-4402-8ada-b192335d2d7e&format=html [Same as previous example, but formatted in HTML]
Notes: This service uses a form of full-text indexing that matches only complete “words”. Matches are not limited to exact matches on the full identifier, but the searched text must be delimited by standard word-breakers. For example, a search for “234” will not return identifiers such as “23456” or “1234”, but it will return identifiers such as “10.2108/zsj.27.234” and “10.3897/zookeys.234.3417”. All horizontal tab characters (ASCII-09), Line-Feed (LF) characters (ASCII-10), and Carriage Return (CR) characters (ASCII-13) are stripped from search terms, stored identifier values and indexed values, and all Unicode text is converted to non-Unicode text (most identifiers do not include Unicode characters anyway, and they certainly should not include embedded tab, LF and CR characters). Also, hyphens (-) and single-quote characters (') are stripped from both the entered search term and the index that is searched. This allows UUIDs and ISSNs (among others) to be searched both with and without embedded hyphens, and prevents other problems that single-quote characters can introduce. These characters are retained in the stored identifier values, however.
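The matching behaviour described in these notes can be approximated locally. The following is a minimal sketch, not the BioGUID implementation: it assumes that any run of non-alphanumeric characters acts as a word-breaker and that a multi-word search term matches when each of its words is present. The function names `words` and `matches` are hypothetical, and the Unicode conversion step is not modelled.

```python
import re

# Characters the notes describe as stripped before matching:
# tab, LF, CR, hyphen and single quote.
STRIPPED = str.maketrans("", "", "\t\n\r-'")


def words(text: str) -> list[str]:
    """Strip the removed characters, then split on assumed word-breakers."""
    cleaned = text.translate(STRIPPED)
    # Assumption: any run of non-alphanumeric characters separates words.
    return [w for w in re.split(r"[^0-9A-Za-z]+", cleaned) if w]


def matches(query: str, identifier: str) -> bool:
    """True if every complete word of the query occurs in the identifier."""
    identifier_words = set(words(identifier))
    query_words = words(query)
    return bool(query_words) and all(w in identifier_words for w in query_words)
```

Under these assumptions, `matches("234", "10.3897/zookeys.234.3417")` is true while `matches("234", "23456")` is false, and a UUID matches whether or not it is typed with hyphens.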

Search Identifier Domain

URL: http://bioguid.org/searchIdentifierDomain
Description: This service allows searching for Identifier Domains registered on BioGUID.org. Searches are conducted against the standard abbreviation, the full name, and the text description of the Identifier Domain.
Example Use Case 1: “Has my Identifier Domain been registered within BioGUID.org?”
Example Use Case 2: “Show me a list of Registered Identifier Domains matching a search term.”
Parameters:

Output Columns:
Examples:
  1. http://bioguid.org/searchIdentifierDomain?q=ZooBank [Search for ZooBank Identifier Domains]
  2. http://bioguid.org/searchIdentifierDomain?q=ZooBank&format=html [Same as previous example, but formatted in HTML]
  3. http://bioguid.org/searchIdentifierDomain?q=doi [Search for the DOI Identifier Domain]
  4. http://bioguid.org/searchIdentifierDomain?q=doi&format=html [Same as previous example, but formatted in HTML]
  5. http://bioguid.org/searchIdentifierDomain?q=USNM [Search for Identifier Domains with the standard abbreviation for the U.S. National Museum (Smithsonian Institution)]
  6. http://bioguid.org/searchIdentifierDomain?q=USNM&format=html [Same as previous example, but formatted in HTML]
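The example URLs for both search services follow the same pattern: a service name, a `q` parameter, and an optional `format` parameter. A minimal Python sketch for building and fetching such URLs is shown here. The helper names `search_url` and `fetch` are hypothetical, and because the default response format is not documented in this section, `fetch` returns the raw bytes without interpreting them.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://bioguid.org"


def search_url(service: str, q: str, fmt: Optional[str] = None) -> str:
    """Build a URL for searchIdentifier or searchIdentifierDomain."""
    params = {"q": q}
    if fmt is not None:
        # "html" is the only format value shown in the search examples.
        params["format"] = fmt
    return f"{BASE_URL}/{service}?{urlencode(params)}"


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Return the raw response body; decoding is left to the caller."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

For instance, `search_url("searchIdentifierDomain", "ZooBank", "html")` reproduces the second example above.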

Export Data

Data can be downloaded and extracted from BioGUID.org through both dynamic and static datasets. These datasets all share the same data structure, as follows:

Output Columns:

Dynamic Data Download

This service allows you to submit the UUID for an Identifier Domain, and get back a full index of all identifiers that BioGUID has indexed within that Identifier Domain.

URL: http://bioguid.org/identifierDomainIndex
Description: This service allows you to submit the UUID for an Identifier Domain, and get back a full index of all identifiers that BioGUID has indexed within that Identifier Domain. Optionally, all linked identifiers can also be included in the data export.
Example Use Case 1: “Show me a list of all DOIs in BioGUID.org.”
Example Use Case 2: “Show me a list of all ISSNs with their corresponding linked identifiers.”
Parameters:
Examples:
  1. http://bioguid.org/identifierDomainIndex?d=5ccc425c-8d84-427a-9dbe-40abc4ff5118&ir=1&format=tab [All DOIs in BioGUID.org, in tab-delimited form]
  2. http://bioguid.org/identifierDomainIndex?d=3aae0b26-2d17-4ec0-8d7f-55f46e158476&ir=1&format=html [All ISBN identifiers with all associated cross-linked identifiers, as an html table]
  3. http://bioguid.org/identifierDomainIndex?d=7d5ac2f3-5b2a-46b0-91c8-a6285aa757aa&ir=1&format=csv [All ISSN identifiers with all associated cross-linked identifiers, in comma-delimited form]
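A download request of this kind could be assembled and parsed as sketched below. This is an illustration under several assumptions: the `ir` parameter is guessed to be the switch that includes linked identifiers (its meaning is not spelled out in this section), the tab-delimited output is assumed to begin with a header row, and the helper names `index_url` and `parse_tab` are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

INDEX_URL = "http://bioguid.org/identifierDomainIndex"


def index_url(domain_uuid: str, include_related: bool = False, fmt: str = "tab") -> str:
    """Build a download URL for one Identifier Domain."""
    params = {"d": domain_uuid}
    if include_related:
        # Assumption: ir=1 requests the linked identifiers as well.
        params["ir"] = 1
    params["format"] = fmt  # "tab", "csv" and "html" appear in the examples
    return f"{INDEX_URL}?{urlencode(params)}"


def parse_tab(text: str) -> list[dict]:
    """Parse tab-delimited text into dictionaries keyed by the header row."""
    # Assumption: the first line of the response names the output columns.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]
```

Calling `index_url("5ccc425c-8d84-427a-9dbe-40abc4ff5118", include_related=True)` reproduces the first example URL; the downloaded text can then be passed to `parse_tab`.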

Certain kinds of pre-processed datasets will be provided as static datasets in downloadable Zip files. These take too long to generate dynamically, so they are periodically updated and archived on the BioGUID.org website. More of these static datasets will be developed as requests are submitted to us by email at deepreef [at] bishopmuseum.org.

Downloadable Static Datasets:

Write Services

Create Identifier Domain

Currently, new Identifier Domains can only be created through the website, here. Before entering a new Identifier Domain, please use the search feature to see if it already exists. Once you're sure it's not already in the system, go ahead and add it using the form on the Domains page. The fields for generating a new Identifier Domain record are as follows:

Note: AgentUUID and DereferenceServiceDescription are not yet supported in the upload web page. These fields will be supported at a later time. DereferenceServiceLogo is assumed to be the same as IdentifierDomainLogo, but this will also be improved to allow different logos at a later time.

Once inserted, the new Identifier Domain will be generated and IdentifierDomainUUID will be assigned. Use this IdentifierDomainUUID when submitting bulk uploads, as described below. Also, a DereferenceServiceUUID will be generated for any new Dereference Service created as part of this import.

Bulk Upload Identifiers

Anyone can now upload bulk content to BioGUID.org! For the moment, this can only be done through the website, but we plan to add a service that will allow batch files to be uploaded programmatically. To upload a batch of identifiers, create a CSV file encoded as UTF-8, with all values enclosed in double-quote characters ("), using the following column headers (and corresponding values):

  1. ObjectClass* [The Class of Object (Occurrence, Reference, Taxon, Agent, etc.) represented by the identifier; use Controlled Vocabulary]
  2. IdentifierDomainUUID* [The BioGUID identifier for the Identifier Domain]
  3. DereferencePrefix [Prefix text that is prepended to the Identifier to make it actionable]
  4. Identifier*
  5. DereferenceSuffix [Suffix text that is appended to the Identifier to make it actionable]
  6. RelatedObjectClass [The Class of Object (Occurrence, Reference, Taxon, Agent, etc.) represented by the related identifier; use Controlled Vocabulary]
  7. RelatedIdentifierDomainUUID [The BioGUID identifier for the Related Identifier Domain]
  8. RelatedDereferencePrefix [Prefix text that is prepended to the RelatedIdentifier to make it actionable]
  9. RelatedIdentifier [The identifier for the related record]
  10. RelatedDereferenceSuffix [Suffix text that is appended to the RelatedIdentifier to make it actionable]
  11. RelationshipType

*Items with an asterisk are required; all others are optional.

There should be NO BLANK LINES anywhere in the file (including the last line)!
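A file meeting these requirements can be produced with Python's standard `csv` module. The sketch below quotes every value, writes UTF-8, checks the three required columns, and omits the final line terminator. It reads the "no blank lines" rule as also ruling out a trailing newline and uses "\n" as the line terminator; both are assumptions, as are the helper names `build_csv` and `write_csv`.

```python
import csv
import io

COLUMNS = [
    "ObjectClass", "IdentifierDomainUUID", "DereferencePrefix", "Identifier",
    "DereferenceSuffix", "RelatedObjectClass", "RelatedIdentifierDomainUUID",
    "RelatedDereferencePrefix", "RelatedIdentifier", "RelatedDereferenceSuffix",
    "RelationshipType",
]
REQUIRED = ("ObjectClass", "IdentifierDomainUUID", "Identifier")


def build_csv(records: list[dict]) -> str:
    """Return the upload file content with every value double-quoted."""
    buffer = io.StringIO()
    # Assumption: "\n" line endings are accepted by the upload page.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(COLUMNS)
    for number, record in enumerate(records, start=1):
        missing = [c for c in REQUIRED if not record.get(c)]
        if missing:
            raise ValueError(f"record {number} is missing {missing}")
        # Optional columns that are absent are written as empty strings.
        writer.writerow([record.get(c, "") for c in COLUMNS])
    # Drop the final terminator so the file does not end with an empty line.
    return buffer.getvalue().rstrip("\n")


def write_csv(path: str, records: list[dict]) -> None:
    """Write the upload file to disk as UTF-8."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(build_csv(records))
```

Each record is a dictionary keyed by the column headers listed above; a record lacking any required value raises `ValueError` before anything is written.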

Download a sample file here.

Note: We have not yet tested this on very large CSV files. It should be fine for a file containing a few hundred thousand rows. If you have a dataset containing millions of identifiers, by all means give it a try, but be aware that batches of that size are untested. Also, this service tracks details about each identifier in terms of whether it has already been imported, or whether there are problems with the records as submitted. In the near future, these reports will be made available as soon as the batch of submitted records is processed.

The time it takes to process depends on the number of records, and the percentage of records that represent new objects (as opposed to identifiers that can be mapped to existing objects). It can range from a few seconds to a few hours. As we gather more data on uploaded datasets, we'll be able to make more accurate predictions about how long each batch import will take to fully process. You don't need to remain on the page for the entire process time; you only need to wait until the zip file has been transferred to the BioGUID server. This depends on your internet connection speed and the size of the zip file, but it shouldn't require more than a minute or two, unless the file is very large.

Logos

Small Logo

Large Logo


All content within the BioGUID.org site is available under the Creative Commons Zero license (Public Domain).