Ontology vs. Vocabulary: How should I use them?

by | Sep 17, 2024 | 0 comments

owl reading a dictionary

Few terms in the world of semantic data require quite as much disambiguation as ontologies and vocabularies. Oftentimes, they may be used interchangeably as if they are just two different ways to say the same thing.

However, understanding what these terms truly mean can set a semantic data user apart, enabling them to drive value within their organization. Ultimately, our goal should be to understand how ontologies and vocabularies are both tools in the toolbelt of the successful knowledge engineer. Beyond simple understanding, we would hope to learn when and how to use both.

What is an Ontology?

Ontologies are data structures that we use to paint a broad picture of the world around us, often represented as an interlinking web of nodes that we call classes. If you’re coming from the world of relational databases, (i.e. SQL), you might think of an ontology as a database schema and each class as an individual table. The true value of an ontology, the links between the various classes, replaces the need for complicated joins between relational tables. Overall, this gives us the power to paint a more wholistic picture of the world.

 

Let’s picture a game of “20 questions” and use that to design a very simple ontology. Our goal is to get an “Answer”, so let’s represent that in the ontology. When visualizing ontologies, Classes often appear as circles.

Suppose now that our first question is “Is it an Animal?” Effectively, based on the answer to that question, we are delving into a finer degree of granularity. To reflect this with an ontology, we will create a “subclass” or more specific version of the Answer class. Important to note is that the subclass will automatically have all the properties available to its parent class. If you are coming from the world of object-oriented programming, the concept of Classes, Subclasses, and Inheritance will probably seem very familiar. Creating a subclass in the context of our example, it will look something like this. A dotted arrow in this diagram indicates a subclass/parent class relationship.

Expanding out this ontology slightly more, suppose we want to be able to represent a person who owns a dog. Understanding parent and subclasses, we know that both Person and Dog should be subclasses of Animal. However, we want to create a more concrete link between person and dog. Extending our ontology like this, we might end up with something resembling this. A solid arrow represents an Object Property, a relationship from one class to another.

This relationship can be either one-to-one or one-to-many, depending on the need. Here, though there is no difference in the visualization, we want to indicate a one-to-many relationship as one person may own multiple animals. Furthermore, since Dog is a subclass of Animal, this relationship indicates that a Person may own a Dog and/or any other type of Animal.

Another, less ideal way that we could create this model resembles the following:

Here we see that we have expanded the Ontology in a number of ways to reference more potential types of Animal as well as a distinction between a Pet Owner and a more specific type of Pet Owner, Dog Owner.

There are a few reasons why this ontology is less ideal, but largely this stems from the difference between a Pet Owner and a Dog Owner. If we think back to the beginning, where we established that an Ontology should paint a broad picture of the world, part of that ethos is to not create granularity for the sake of granularity. In other words, we need to consider what if any value we actually gain by having Dog Owner be a separate class to Pet Owner. If they are for all intents and purposes the same, the only difference being that a Dog Owner ostensibly owns at least one dog, this is nothing that can’t be reflected by simply having a Pet Owner class, or even better sticking to the Human class and having the HasPets Object Property there.

Of course, everything described above are simply guidelines and best practices. Ultimately, one of the great and empowering aspects of Ontology design is the lack of strict rules and the freedom to paint the world however best fits your use case requirements.

As mentioned earlier, another way to think about an ontology, using the above figure for reference, is in terms of Object-Oriented Programming. Often, as with our “20 Questions” example, we start with an abstract concept to be from which we expand out with subclasses and other relationships and links. So, an Insect, being a subclass of Animal has all the properties that we might define on Animal implicitly. A Pet Owner can also own one or many Insects, since the subclasses inherit the relationships from their parents as well.

As an additional, more real-world, exercise, let’s imagine that we are going to design a car, and we want to represent the factors that go into making a car. Obviously, the result is a complete car, so let’s start there.

Note that we do not create a specific class for something like “Toyota Camry”, since we are just painting a broad picture of the world.

As we are all aware, one of the key factors that uniquely defines a car is the manufacturer. We could define a text property on car for the Make, but it’s difficult to capture other useful metadata about a manufacturer in just a string. So, let’s create a second class in our ontology for the Manufacturer, and let’s link it to Car.

Note here that the relationships between classes in an ontology can be either one-to-one or one-to-many. Both are represented by just one arrow/one object property.

Expanding out the example further, we can easily get to something that looks like this:

So, with an understanding of ontologies. The question ultimately comes: If ontologies are tools that I can use to paint a broad/abstract picture of the world, how can I represent the state of the world as it actually is? Or, if we were going to ask the question more directly, how can we use the ontology that we designed above to represent a 1998 Toyota Camry with an Engine that was produced in Japan?

When we get to the point where we want to leverage ontologies to represent instance data, we need Vocabularies.

What is a Vocabulary?

If a dictionary contains all the words in a written Language with which we might construct thoughts written or spoken, a Vocabulary is effectively our collection of terms with which we might construct thoughts that pertain to our specific domain. Another way to think of a Controlled Vocabulary, in its most basic form, is as a collection of the reference data for your particular domain. So, for our example using cars, our controlled vocabulary might have terms for “Toyota”, “Camry”, “Japan”, etc. It’s important to note the difference between this type of reference data and what is commonly referred to as “instance data”. While our reference data might define “Camry” as a type of Make, a specific 1998 Toyota Camry that is on a car lot would be an instance of a Car.

This reference data is represented using ontologies as the backbone, i.e. the terms we are defining within the Vocabulary are all instances of previously defined classes in the ontology. Typically, in the world of semantic data, when we discuss Vocabularies, we are typically discussing a specific kind of Vocabulary called a Controlled Vocabulary. A Controlled Vocabulary is a Vocabulary that leverages a specific framework to give the Vocabulary a clearly defined structure that can add to its value and usability. Most semantic tools that work with vocabularies, including Mobi, prefer a specific framework called SKOS to provide this structure.

SKOS, or the Simple Knowledge Organization System, is an ontology designed by the W3C that we use to represent all manner of controlled vocabularies. For the purposes of this explanation, we won’t cover all the details of the SKOS ontology (although we would encourage you to explore this at your leisure), but we will focus on a few key parts of SKOS, starting with the Concept.

In SKOS, a Concept is exactly what it sounds like. This is a class meant to represent any concept we can imagine. In our example with cars, we would add an axiom to all the classes we defined to make them all subclasses of skos:Concept. Once we’ve done that, we can then create a new Concept which is of type Car, and then we can create a label.

SKOS gives us the power to create two main types of labels: Preferred Label and Alternate Label. Overall, our goal when defining Concepts is to enhance our data to be both searchable and more easily related to other classes. Often, our Preferred Label might have to be a primary key coming from our data source. In that situation, we can then leverage the Alternate Label to help us understand what we might be looking at. In many cases, we might consider the Alternate Label to be a synonym of the Preferred Label. These aren’t hard rules, but rather guidelines to consider when creating Concepts.

 

 

Here is what an example Controlled Vocabulary might look like inside Mobi.  This screenshot shows a concept describing “Toyota,” and some additional properties.

Here we finally get to instantiate an instance of a class from our ontology to represent a controlled vocabulary term.

When Should I Use an Ontology or Vocabulary?

In the field, it can be tempting to jump straight toward Vocabulary development. After all, the use cases where we don’t care about representing reference data are rare if they even exist at all. However, considering a progression from ontology to vocabulary can drastically increase the overall solution usability as well as value present in the representation of the data.

Step One: The “Canonical” or “Business” Ontology

Often, when we first define ontologies within an environment, we may be forced to define the class labels of our ontology in terms derived from our data source system. This is not ideal for many reasons, but a big reason that we try to steer away from this methodology when defining our own ontologies is simple human readability. Considering that there are many situations where one use-case might involve leveraging many different ontologies from many different source systems, and we very soon find ourselves with a need to create the “Canonical” Ontology, an ontology which puts all of our classes in simple, human-readable terms for the purpose of linking together multiple different source systems.

 

The Canonical Ontology has a secondary, more significant purpose. If we consider that the broader goal of any data scientist is to leverage data to answer some form of question relating to the domain, we can use the canonical ontology to distill the various different columns and tables defining our source data into one easily digestible representation of the organization’s knowledge. Ultimately, using the broad strokes described earlier, we hope to paint a picture of the domain knowledge. (For more information on why the Canonical Ontology is a useful tool, please read our article on Knowledge Graphs in the additional reading section.)

Step Two: Enhancing the Canonical Ontology with SKOS

The true reason that we don’t jump straight to this step is that we don’t want to waste time mapping Concepts that aren’t of any concern to the end-business questions. Let’s consider for a moment a question we asked earlier:  What kinds of Engines are produced in Japan and used in a 1998 Toyota Camry? If all our questions are of this nature, why would we need to worry about representing tires, auto dealers, or other concepts that exist in the domain of automobiles in our canonical ontology?

The best advice when it comes to ontology design is often to design a simple, easily modular schema that’s just enough to answer your existing business questions, but can easily be added to in order to answer new ones.  See our related “Don’t Boil the Ocean…” article.

So, once we know which questions we want to answer and what data we want to represent, we can then leverage SKOS Concepts to represent the instances of our classes representing controlled terms, remembering that our classes should be subclasses of SKOS Concept. Once we have successfully done this, we can then ask our question in the form of a SPARQL (please see additional reading section for SPARQL Syntax References).

What's Next?

With a proper understanding of best practice, ontologies and vocabularies can be utilized together to power expressive and powerful knowledge graph solutions the capability to scale and expand. An ontology describes the type of data and relationships a data consumer can expect and encodes human understanding into machine readable format. A vocabulary is a thesaurus of terms to be utilized across a knowledge graph providing consistently and clarity during data integration and dissemination. Mobi provides the same collaborative, compliant, versioned editing experience for both ontologies and vocabularies so you can seamlessly develop both of these critical artifacts in support of your data harmonization needs. Download Mobi and Contact Us to give it a try!

Additional Reading

To learn more about the SKOS Ontology, please view the W3C’s documentation.

To learn more about SPARQL Syntax, please view the W3C’s SPARQL documentation.

To see an example of a Controlled Vocabulary representing adverse effects utilizing both preferred and alternate labels, check out MedDRA.

To learn more about practical approaches to knowledge modeling, read our other knowledge article, “Don’t Boil the Ocean…”.

To learn more about how to use Mobi to represent both Ontologies and Controlled Vocabularies, please see our public documentation.

Need a Custom Solution?

Elevate your Mobi experience with our specialized professional services tailored to meet your unique needs. Our expert teams are ready to assist you in seamless integration and development of custom solutions, ensuring that Mobi aligns perfectly with your specific requirements. From fine-tuning configurations to crafting bespoke solutions, our professionals bring a wealth of experience to the table. Explore our service offerings to unlock the full potential of Mobi and let us help you navigate the complexities of semantic graph integration and development. Your journey with Mobi is about to become even more tailored and efficient with our dedicated professional services at your side.

Capabilities

Open Business Solution Platform

Tap the full potential of your enterprise data to deliver fresh insights.

Collaborative Data Modeling

Bring your team or community together to create, share, and evolve data models in a modern, web-based, collaborative environment.

Automated Data Enhancement

Mobi makes it easy to integrate data into your models. The included mapping tool aligns your data to generate interoperable data without writing a single line of code.

Intuitive Data Exploration

Bundled tools make it easy to search and explore your enterprise data without the hassle of writing queries or analyzing models. Oh, and the query tools are there too.

Modular & Extensible

Pluggable backend storage solutions, open REST and Java APIs, and a plugin framework make it easy to build solutions that extend and customize the Mobi Platform to solve your business problems.

Data Solutions

Mobi brings your team and community together through data collaboration. Break down the silos, because interoperability is everything.

What is Mobi?

Solutions

We have an abundance of experience working with many of the major knowledge graph platforms of the day and have successfully taken the tiniest spark of an ideas through to full production ready solutions.

Resources

Mobi is an open and collaborative knowledge graph platform for teams and communities to publish and discover data, data models, and analytics that are instantly consumable.

About

Recognizing the transformative potential of semantic knowledge graphs, we set out to bridge this gap and empower our customers to harness the full power of their data assets. Mobi emerged as our solution—a platform designed not just to meet our own needs but to elevate the capabilities of organizations worldwide in making better use of their knowledge models.

News

Contact