Ontology (information science)

Example of an ontology visualized: the Mason-ontology.

In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain.

In theory, an ontology is a "formal, explicit specification of a shared conceptualisation".[1] An ontology provides a shared vocabulary that can be used to model a domain, that is, the types of objects and/or concepts that exist, and their properties and relations.[2]

Ontologies are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture as a form of knowledge representation about the world or some part of it. The creation of domain ontologies is also fundamental to the definition and use of an enterprise architecture framework.

Overview

The term ontology has its origin in philosophy, and has been applied in many different ways. The core meaning within computer science is a model for describing the world that consists of a set of types, properties, and relationship types. What else an ontology provides varies, but these three elements are essential. There is also generally an expectation that the features of the model in an ontology closely resemble the real world.[3]

What ontology in computer science and ontology in philosophy have in common is the representation of entities, ideas, and events, along with their properties and relations, according to a system of categories. In both fields, one finds considerable work on problems of ontological relativity (e.g., Quine and Kripke in philosophy, Sowa and Guarino in computer science),[4] and debates concerning whether a normative ontology is viable (e.g., debates over foundationalism in philosophy, debates over the Cyc project in AI). Differences between the two are largely matters of focus. Philosophers are less concerned with establishing fixed, controlled vocabularies than are researchers in computer science, while computer scientists are less involved in discussions of first principles (such as debating whether there are such things as fixed essences, or whether entities must be ontologically more primary than processes).

History

Historically, ontologies arise out of the branch of philosophy known as metaphysics, which deals with the nature of reality – of what exists. This fundamental branch is concerned with analyzing various types or modes of existence, often with special attention to the relations between particulars and universals, between intrinsic and extrinsic properties, and between essence and existence. The traditional goal of ontological inquiry in particular is to divide the world "at its joints", to discover those fundamental categories, or kinds, into which the world’s objects naturally fall.[5]

During the second half of the 20th century, philosophers extensively debated the possible methods or approaches to building ontologies, without actually building any very elaborate ontologies themselves. By contrast, computer scientists were building some large and robust ontologies (such as WordNet and Cyc) with comparatively little debate over how they were built.

Since the mid-1970s, researchers in the field of artificial intelligence have recognized that capturing knowledge is the key to building large and powerful AI systems. AI researchers argued that they could create new ontologies as computational models that enable certain kinds of automated reasoning. In the 1980s, the AI community began to use the term ontology to refer to both a theory of a modeled world and a component of knowledge systems. Some researchers, drawing inspiration from philosophical ontologies, viewed computational ontology as a kind of applied philosophy.[6]

In the early 1990s, the widely cited Web page and paper "Toward Principles for the Design of Ontologies Used for Knowledge Sharing" by Tom Gruber[7] is credited with a deliberate definition of ontology as a technical term in computer science. Gruber introduced the term to mean "a specification of a conceptualization". That is, "an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as set of concept definitions, but more general. And it is a different sense of the word than its use in philosophy".[8]

According to Gruber (1993), "ontologies are often equated with taxonomic hierarchies of classes, class definitions, and the subsumption relation, but ontologies need not be limited to these forms. Ontologies are also not limited to conservative definitions – that is, definitions in the traditional logic sense that only introduce terminology and do not add any knowledge about the world.[9] To specify a conceptualization, one needs to state axioms that do constrain the possible interpretations for the defined terms."[1]

In the early years of the 21st century, the interdisciplinary project of cognitive science has been bringing the two circles of scholars closer together. For example, there is talk of a "computational turn in philosophy" that includes philosophers analyzing the formal ontologies of computer science (sometimes even working directly with the software), while researchers in computer science have been making more references to those philosophers who work on ontology (sometimes with direct consequences for their methods). Still, many scholars in both fields are uninvolved in this trend of cognitive science, and continue to work independently of one another, pursuing separately their different concerns.

Ontology components

Contemporary ontologies share many structural similarities, regardless of the language in which they are expressed. As mentioned above, most ontologies describe individuals (instances), classes (concepts), attributes, and relations. In this section each of these components is discussed in turn.

Common components of ontologies include:

  • Individuals: instances or objects (the basic or "ground level" objects)
  • Classes: sets, collections, concepts, classes in programming, types of objects, or kinds of things.
  • Attributes: aspects, properties, features, characteristics, or parameters that objects (and classes) can have
  • Relations: ways in which classes and individuals can be related to one another
  • Function terms: complex structures formed from certain relations that can be used in place of an individual term in a statement
  • Restrictions: formally stated descriptions of what must be true in order for some assertion to be accepted as input
  • Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form
  • Axioms: assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application. This definition differs from that of "axioms" in generative grammar and formal logic. In those disciplines, axioms include only statements asserted as a priori knowledge. As used here, "axioms" also include the theory derived from axiomatic statements.
  • Events: the changing of attributes or relations

Ontologies are commonly encoded using ontology languages.
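As a rough illustration of the first four components, the following Python sketch models classes, individuals, attributes, and relations with plain data structures. All names here (OntClass, Individual, instances_of, and the vehicle example) are hypothetical and chosen only for illustration; a real ontology would normally be written in an ontology language rather than in a general-purpose programming language.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OntClass:
    """A class (concept): a kind of thing, with an optional parent class."""
    name: str
    parent: Optional["OntClass"] = None

    def is_subclass_of(self, other: "OntClass") -> bool:
        # Follow parent links upward; a class counts as its own subclass.
        current = self
        while current is not None:
            if current is other:
                return True
            current = current.parent
        return False


@dataclass
class Individual:
    """An individual (instance): a ground-level object with attributes."""
    name: str
    cls: OntClass
    attributes: dict = field(default_factory=dict)


# Classes, arranged in a small hierarchy.
vehicle = OntClass("Vehicle")
car = OntClass("Car", parent=vehicle)
person = OntClass("Person")

# Individuals, each with attributes.
alice = Individual("alice", person, {"age": 34})
my_car = Individual("my_car", car, {"doors": 4})
individuals = [alice, my_car]

# Relations between individuals, stored as (subject, relation, object).
relations = {("alice", "owns", "my_car")}


def instances_of(cls: OntClass) -> list:
    """Names of individuals whose class is cls or a subclass of it."""
    return [i.name for i in individuals if i.cls.is_subclass_of(cls)]


print(instances_of(vehicle))  # my_car is a Car, and Car is a kind of Vehicle
```

The parent link stands in for the subsumption relation between classes. Restrictions, rules, and axioms would need a logic layer that this sketch leaves out.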

Domain ontologies and upper ontologies

A domain ontology (or domain-specific ontology) models a specific domain, or part of the world. It represents the particular meanings of terms as they apply to that domain. For example, the word card has many different meanings. An ontology about the domain of poker would model the "playing card" meaning of the word, while an ontology about the domain of computer hardware would model the "punch card" and "video card" meanings.
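The card example can be sketched as a lookup from a term to the concepts it denotes in each domain. The table DOMAIN_SENSES and the function senses are hypothetical names used only for this illustration.

```python
# Hypothetical mini domain ontologies: each maps a term to the
# concepts that the term denotes within that domain.
DOMAIN_SENSES = {
    "poker": {"card": {"PlayingCard"}},
    "computer hardware": {"card": {"PunchCard", "VideoCard"}},
}


def senses(term: str, domain: str) -> set:
    """Return the concepts a term denotes in a domain (empty set if none)."""
    return DOMAIN_SENSES.get(domain, {}).get(term, set())


print(senses("card", "poker"))
print(senses("card", "computer hardware"))
```

The same word resolves to different concepts depending on which domain ontology is consulted.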

An upper ontology (or foundation ontology) is a model of the common objects that are generally applicable across a wide range of domain ontologies. It contains a core glossary in whose terms objects in a set of domains can be described. There are several standardized upper ontologies available for use, including Dublin Core, GFO, OpenCyc/ResearchCyc, SUMO, and DOLCE. WordNet, while considered an upper ontology by some, is not strictly an ontology. However, it has been employed as a linguistic tool for learning domain ontologies[10].

The Gellish ontology is an example of a combination of an upper and a domain ontology.

Since domain ontologies represent concepts in very specific and often eclectic ways, they are often incompatible. As systems that rely on domain ontologies expand, they often need to merge domain ontologies into a more general representation. This presents a challenge to the ontology designer. Different ontologies in the same domain can also arise due to different perceptions of the domain based on cultural background, education, ideology, or because a different representation language was chosen.

At present, merging ontologies that are not developed from a common foundation ontology is a largely manual process and therefore time-consuming and expensive. Domain ontologies that use the same foundation ontology to provide a set of basic elements with which to specify the meanings of the domain ontology elements can be merged automatically. There are studies on generalized techniques for merging ontologies, but this area of research is still largely theoretical.
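The following sketch suggests why a shared foundation ontology makes merging easier to automate. It assumes that every domain term has already been mapped to exactly one foundation concept. The concept names, the two domain ontologies, and the merge function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical foundation (upper) ontology concepts.
FOUNDATION = {"Agent", "PhysicalObject", "Event"}

# Two hypothetical domain ontologies: domain term -> foundation concept.
SALES = {"Customer": "Agent", "Product": "PhysicalObject", "Purchase": "Event"}
SUPPORT = {"Client": "Agent", "Device": "PhysicalObject", "Ticket": "Event"}


def merge(ontologies: dict) -> dict:
    """Group the terms of several domain ontologies by foundation concept."""
    merged = defaultdict(set)
    for ont_name, mapping in ontologies.items():
        for term, concept in mapping.items():
            if concept not in FOUNDATION:
                raise ValueError(f"{ont_name}:{term} uses unknown concept {concept}")
            # Qualify each term with its source ontology to avoid name clashes.
            merged[concept].add(f"{ont_name}:{term}")
    return dict(merged)


merged = merge({"sales": SALES, "support": SUPPORT})
for concept in sorted(merged):
    print(concept, sorted(merged[concept]))
```

Terms grouped under the same foundation concept are only candidates for alignment. Whether sales:Customer and support:Client really mean the same thing still depends on how precisely the foundation ontology defines Agent.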

Ontology engineering

Ontology engineering (or ontology building) is a subfield of knowledge engineering that studies the methods and methodologies for building ontologies. It covers the ontology development process, the ontology life cycle, and the tool suites and languages that support them.[11][12]

Ontology engineering aims to make explicit the knowledge contained within software applications, and within enterprises and business procedures for a particular domain. Ontology engineering offers a direction towards solving the interoperability problems brought about by semantic obstacles, such as the obstacles related to the definitions of business terms and software classes. Ontology engineering is a set of tasks related to the development of ontologies for a particular domain.[13]

Ontology languages

An ontology language is a formal language used to encode the ontology. There are a number of such languages for ontologies, both proprietary and standards-based:

  • Common Algebraic Specification Language is a general logic-based specification language developed within the IFIP working group 1.3 "Foundations of System Specifications" and functions as a de facto standard in the area of software specifications. It is now being applied to ontology specifications in order to provide modularity and structuring mechanisms.
  • Common Logic is ISO/IEC standard 24707, a specification for a family of ontology languages that can be accurately translated into each other.
  • The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions.
  • DOGMA (Developing Ontology-Grounded Methods and Applications) adopts the fact-oriented modeling approach to provide a higher level of semantic stability.
  • The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language.
  • IDEF5 is a software engineering method to develop and maintain usable, accurate, domain ontologies.
  • KIF is a syntax for first-order logic that is based on S-expressions.
  • Rule Interchange Format (RIF) and F-Logic combine ontologies and rules.
  • OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources, and identified by URIs.
  • SADL captures a subset of the expressiveness of OWL, using an English-like language entered via an Eclipse Plug-in.
  • OBO, a language used for biological and biomedical ontologies.
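To give a flavour of what such languages express, the sketch below stores statements as subject-predicate-object triples, the general shape of RDF statements, and applies two if-then rules until no new statements appear. It is a plain-Python stand-in rather than an RDF or OWL implementation. The predicate names "type" and "subClassOf" only echo rdf:type and rdfs:subClassOf, and the function infer and the hardware example are hypothetical.

```python
def infer(triples: set) -> set:
    """Forward-chain two rules until no new triples are produced.

    Rule 1: if A subClassOf B and B subClassOf C, then A subClassOf C.
    Rule 2: if x type A and A subClassOf B, then x type B.
    """
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            if p not in ("subClassOf", "type"):
                continue
            for (s2, p2, o2) in facts:
                # Both rules chain through a subClassOf statement about o.
                if p2 == "subClassOf" and s2 == o:
                    new.add((s, p, o2))
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


asserted = {
    ("VideoCard", "subClassOf", "ExpansionCard"),
    ("ExpansionCard", "subClassOf", "HardwareComponent"),
    ("card42", "type", "VideoCard"),
}

closure = infer(asserted)
for triple in sorted(closure - asserted):
    print(triple)  # statements that were inferred rather than asserted
```

Real ontology languages differ widely in which rules and axioms they support; the two rules here cover only class subsumption.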

Examples of published ontologies

  • Basic Formal Ontology,[14] a formal upper ontology designed to support scientific research
  • BioPAX,[15] an ontology for the exchange and interoperability of biological pathway (cellular processes) data
  • BMO,[16] an e-Business Model Ontology based on a review of enterprise ontologies and business model literature
  • CCO (Cell-Cycle Ontology),[17] an application ontology that represents the cell cycle
  • CContology,[18] an e-business ontology to support online customer complaint management
  • CIDOC Conceptual Reference Model, an ontology for cultural heritage[19]
  • COSMO,[20] a Foundation Ontology (current version in OWL) that is designed to contain representations of all of the primitive concepts needed to logically specify the meanings of any domain entity. It is intended to serve as a basic ontology that can be used to translate among the representations in other ontologies or databases. It started as a merger of the basic elements of the OpenCyc and SUMO ontologies, and has been supplemented with other ontology elements (types, relations) so as to include representations of all of the words in the Longman dictionary defining vocabulary.
  • Cyc, a large Foundation Ontology for formal representation of the universe of discourse.
  • Disease Ontology,[21] designed to facilitate the mapping of diseases and associated conditions to particular medical codes
  • DOLCE, a Descriptive Ontology for Linguistic and Cognitive Engineering[22]
  • Dublin Core, a simple ontology for documents and publishing
  • Foundational, Core and Linguistic Ontologies[23]
  • Foundational Model of Anatomy,[24] an ontology for human anatomy
  • Gene Ontology for genomics
  • GUM (Generalized Upper Model),[25] a linguistically-motivated ontology for mediating between clients systems and natural language technology
  • Gellish English dictionary, an ontology comprising a dictionary and a taxonomy, with an upper ontology and a lower ontology that focuses on industrial and business applications in engineering, technology and procurement. See also Gellish as an open-source project on SourceForge.
  • GOLD,[26] General Ontology for Linguistic Description
  • IDEAS Group,[27] a formal ontology for enterprise architecture being developed by the Australian, Canadian, UK and U.S. Defence Depts.
  • Linkbase,[28] a formal representation of the biomedical domain, founded upon Basic Formal Ontology.
  • LPL, Lawson Pattern Language
  • NIFSTD Ontologies from the Neuroscience Information Framework: a modular set of ontologies for the neuroscience domain. See http://neuinfo.org
  • OBO Foundry, a suite of interoperable reference ontologies in biomedicine
  • Ontology for Biomedical Investigations, an open access, integrated ontology for the description of biological and clinical investigations
  • OMNIBUS Ontology,[29] an ontology of learning, instruction, and instructional design
  • Plant Ontology[30] for plant structures and growth/development stages, etc.
  • POPE, Purdue Ontology for Pharmaceutical Engineering
  • PRO,[31] the Protein Ontology of the Protein Information Resource, Georgetown University.
  • Program abstraction taxonomy
  • Protein Ontology[32] for proteomics
  • Systems Biology Ontology (SBO), for computational models in biology
  • Suggested Upper Merged Ontology, a formal upper ontology
  • SWEET,[33] Semantic Web for Earth and Environmental Terminology
  • ThoughtTreasure ontology
  • TIME-ITEM, Topics for Indexing Medical Education
  • YATO,[34] Yet Another Top-level Ontology
  • WordNet, a lexical reference system

Ontology libraries

The development of ontologies for the Web has led to the emergence of services that provide lists or directories of ontologies with search facilities. Such directories have been called ontology libraries.

The following are static libraries of human-selected ontologies.

  • DAML Ontology Library[35] maintains a legacy of ontologies in DAML.
  • Protege Ontology Library[36] contains a set of OWL, Frame-based and other format ontologies.
  • SchemaWeb[37] is a directory of RDF schemata expressed in RDFS, OWL and DAML+OIL.

The following are both directories and search engines. They include crawlers searching the Web for well-formed ontologies.

  • OBO Foundry / Bioportal[38] is a suite of interoperable reference ontologies in biology and biomedicine.
  • OntoSelect[39] Ontology Library is a directory and search service for RDF/S, DAML and OWL ontologies.
  • Ontaria[40] is a "searchable and browsable directory of semantic web data", with a focus on RDF vocabularies with OWL ontologies.
  • Swoogle is a directory and search engine for all RDF resources available on the Web, including ontologies.


References

  1. T. Gruber (1993). "A translation approach to portable ontology specifications". In: Knowledge Acquisition, 5(2): 199-220.
  2. F. Arvidsson and A. Flycht-Eriksson. Ontologies I. Retrieved 26 Nov 2008.
  3. L. M. Garshol (2004). Metadata? Thesauri? Taxonomies? Topic Maps! Making sense of it all on www.ontopia.net. Retrieved 13 October 2008.
  4. J. F. Sowa (1995). "Top-level ontological categories". In: International Journal of Human-Computer Studies, 43 (November/December), pp. 669-85.
  5. P. C. Benjamin et al. (1994). IDEF5 Method Report. Knowledge Based Systems, Inc.
  6. T. Gruber (2008). "Ontology". In: Encyclopedia of Database Systems, Ling Liu and M. Tamer Özsu (Eds.), Springer-Verlag.
  7. T. Gruber (1995). "Toward Principles for the Design of Ontologies Used for Knowledge Sharing". In: International Journal of Human-Computer Studies, 43(5-6): 907-928.
  8. T. Gruber (2001). What is an Ontology?. Online entry. Accessed Nov 9, 2009.
  9. H. B. Enderton (1972). A Mathematical Introduction to Logic. San Diego, CA: Academic Press.
  10. R. Navigli, P. Velardi (2004). "Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites". Computational Linguistics, 30(2), MIT Press, pp. 151-179.
  11. A. Gómez-Pérez, M. Fernández-López, O. Corcho (2004). Ontological Engineering: With Examples from the Areas of Knowledge Management, E-commerce and the Semantic Web. Springer.
  12. A. De Nicola, M. Missikoff, R. Navigli (2009). "A Software Engineering Approach to Ontology Building". Information Systems, 34(2), Elsevier, pp. 258-275.
  13. L. Pouchard, N. Ivezic and C. Schlenoff (2000). "Ontology Engineering for Distributed Collaboration in Manufacturing". In: Proceedings of the AIS2000 conference, March 2000.
  14. Basic Formal Ontology (BFO)
  15. BioPAX http://biopax.org
  16. A. Osterwalder (2002). "An e-Business Model Ontology for Modeling e-Business" at Bled, Slovenia, June 17 - 19, 2002.
  17. CCO
  18. CContology
  19. CIDOC Conceptual Reference Model
  20. COSMO
  21. Disease Ontology
  22. DOLCE
  23. Foundational, Core and Linguistic Ontologies
  24. Foundational Model of Anatomy
  25. Generalized Upper Model
  26. GOLD
  27. The IDEAS Group Website
  28. Linkbase
  29. OMNIBUS Ontology
  30. Plant Ontology
  31. PRO
  32. Protein Ontology
  33. SWEET
  34. YATO
  35. DAML Ontology Library
  36. Protege Ontology Library
  37. SchemaWeb
  38. OBO Foundry / Bioportal
  39. OntoSelect
  40. Ontaria
