RDF & OWL - the basic framework for the modern information scientist

There is no doubt that the biggest public information network today is the Internet, and in particular the World Wide Web (also known as WWW, or simply the Web). That may change over time, but this is where we are now. We can find a lot of interesting resources on the Web, but the biggest problem is finding the relevant resource (digital object) and the knowledge it holds with minimal time and effort. This is the reason why the Semantic Web was invented.

The Semantic Web is an extension of the World Wide Web through standards set by the World Wide Web Consortium. The goal of the Semantic Web is to make Internet data machine-readable. To enable the encoding of semantics with the data, technologies such as Resource Description Framework (RDF) and Web Ontology Language (OWL) are used. These technologies are used to formally represent metadata. [source]

RDF

RDF stands for Resource Description Framework. RDF is a framework for describing resources on the web. RDF is designed to be read and understood by computers. RDF is not designed for being displayed to people. RDF is often written in XML (RDF/XML), although other serializations such as Turtle, N-Triples, and JSON-LD exist. [source]

SPARQL is an RDF query language—that is, a semantic query language for databases—able to retrieve and manipulate data stored in Resource Description Framework (RDF) format. [source]

If you want to know how SPARQL works, I recommend the training available at [video link] which teaches the basics of SPARQL. It is intended for those who are brand new to working with SPARQL, the core query language of Stardog's Enterprise Knowledge Graph. 

Example of RDF Model ...

[source]

The Resource Description Framework, more commonly known as RDF, is a graph data model that formally describes the semantics or meaning of information. It also represents metadata, that is, data about data.

RDF consists of triples. These triples are based on an Entity Attribute Value (EAV) model, in which the subject is the entity, the predicate is the attribute, and the object is the value. Each resource in a triple is identified by a Uniform Resource Identifier, or URI (an object may also be a plain literal value such as a string or a number). URIs look like web page addresses. In the graph, the subject and object of a triple are nodes, and the predicate is the labeled link between them.
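To make the triple structure concrete, here is a minimal Python sketch that stores two statements as (subject, predicate, object) tuples and prints them in N-Triples, one of the plain-text RDF syntaxes. The `http://example.org/flintstones#` namespace and the helper name `to_ntriples` are assumptions invented for this illustration; they do not come from any real vocabulary or library.

```python
# Hypothetical namespace, invented for this illustration only.
EX = "http://example.org/flintstones#"

# Each triple is (subject, predicate, object); here all three parts are URIs.
triples = [
    (EX + "FredFlintstone", EX + "livesIn", EX + "Bedrock"),
    (EX + "WilmaFlintstone", EX + "livesIn", EX + "Bedrock"),
]


def to_ntriples(triple):
    """Format one all-URI triple as an N-Triples line: <s> <p> <o> ."""
    subject, predicate, obj = triple
    return f"<{subject}> <{predicate}> <{obj}> ."


lines = [to_ntriples(t) for t in triples]
for line in lines:
    print(line)
```

The sketch only handles triples whose object is a URI; literal values such as strings or numbers would need their own quoting rules.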

Multiple triples link together to form an RDF model. The graph above describes the characters and relationships from the Flintstones television cartoon series. We can easily identify triples such as “WilmaFlintstone livesIn Bedrock” or “FredFlintstone livesIn Bedrock”. We now know that the Flintstones live in Bedrock, which is part of Cobblestone County in Prehistoric America.

The rest of the triples in the Flintstones graph describe the characters’ relations, such as hasSpouse or hasChild, as well as their occupational association (worksFor).

Fred Flintstone is married to Wilma and they have a child Pebbles. Fred works for the Rock Quarry company and Wilma’s mother is Pearl Slaghoople. Pebbles Flintstone is married to Bamm-Bamm Rubble who is the child of Barney and Betty Rubble. Thus, as you can see, many triples form an RDF model.
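The relationships described above can be sketched as a small set of triples in Python, together with a lookup function that treats `None` as a wildcard. This is roughly the idea behind a single triple pattern in a SPARQL query, but it is only a toy: the short names (for example `RockQuarry` or `BammBammRubble`) are my own spellings standing in for full URIs, and the `match` helper is hypothetical rather than part of any SPARQL engine.

```python
# Triples taken from the Flintstones description above.
# Short names stand in for full URIs (an assumption for readability).
graph = {
    ("FredFlintstone", "hasSpouse", "WilmaFlintstone"),
    ("FredFlintstone", "hasChild", "PebblesFlintstone"),
    ("FredFlintstone", "worksFor", "RockQuarry"),
    ("FredFlintstone", "livesIn", "Bedrock"),
    ("WilmaFlintstone", "livesIn", "Bedrock"),
    ("PebblesFlintstone", "hasSpouse", "BammBammRubble"),
    ("BarneyRubble", "hasChild", "BammBammRubble"),
    ("BettyRubble", "hasChild", "BammBammRubble"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in graph
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    )


# Who lives in Bedrock?
residents = [s for (s, _, _) in match(graph, predicate="livesIn", obj="Bedrock")]
print(residents)

# Who are the parents of Bamm-Bamm?
parents = [s for (s, _, _) in match(graph, predicate="hasChild", obj="BammBammRubble")]
print(parents)
```

A real SPARQL query can join several such patterns through shared variables; this sketch shows only the single-pattern case.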

RDFS

RDF Schema, more commonly known as RDFS, adds schema to the RDF. It defines a metamodel of concepts like Resource, Literal, Class, and Datatype and relationships such as subClassOf, subPropertyOf, domain, and range. RDFS provides a means for defining the classes, properties, and relationships in an RDF model and organizing these concepts and relationships into hierarchies.

RDFS specifies entailment rules or axioms for the concepts and relationships. These rules can be used to infer new triples, as we show in the following diagram.


Looking at this example, we see how new triples can be inferred by applying RDFS rules to a small RDF/RDFS model. In this model, we use RDFS to define that the hasSpouse relationship is restricted to humans. And as you can see, human is a subclass of mammal.

If we assert that Wilma is Fred’s spouse using the hasSpouse relationship, then we can infer that Fred and Wilma are human because, in RDFS, the hasSpouse relationship is defined to be between humans. Because we also know humans are mammals, we can further infer that Fred and Wilma are mammals.
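The inference described above can be sketched in a few lines of Python. The code hand-codes three RDFS-style rules (domain, range, and subclass) and applies them until nothing new appears. It is a simplified illustration, not a complete RDFS reasoner, and the short names such as `Human` and `Mammal` stand in for full URIs.

```python
# Asserted triples: a small schema plus one fact about Fred and Wilma.
asserted = {
    ("hasSpouse", "rdfs:domain", "Human"),
    ("hasSpouse", "rdfs:range", "Human"),
    ("Human", "rdfs:subClassOf", "Mammal"),
    ("FredFlintstone", "hasSpouse", "WilmaFlintstone"),
}


def infer(triples):
    """Apply domain, range, and subclass rules until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                # Domain rule: the subject of p is an instance of p's domain.
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))
                # Range rule: the object of p is an instance of p's range.
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))
                # Subclass rule: instances of a class belong to its superclasses.
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))
        if new <= closure:
            return closure
        closure |= new


# Keep only the triples that were not asserted explicitly.
inferred = infer(asserted) - asserted
for triple in sorted(inferred):
    print(triple)
```

Running it yields four new `rdf:type` triples: Fred and Wilma are each inferred to be a `Human` and, through the subclass rule, a `Mammal`.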

All credit for the RDF and RDFS examples goes to GraphDB Ontotext [source]

OWL

The W3C Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language such that knowledge expressed in OWL can be exploited by computer programs, e.g., to verify the consistency of that knowledge or to make implicit knowledge explicit. OWL documents, known as ontologies, can be published on the World Wide Web and may refer to or be referred from other OWL ontologies. OWL is part of the W3C’s Semantic Web technology stack, which includes RDF, RDFS, SPARQL, etc. [source]
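As a rough illustration of "making implicit knowledge explicit" and "verifying consistency", the Python sketch below hand-codes two OWL constructs: `owl:SymmetricProperty` and `owl:disjointWith`. A real OWL reasoner covers far more than this. The `Company` class, the individuals, and both helper functions are assumptions made up for the example.

```python
facts = {
    ("hasSpouse", "rdf:type", "owl:SymmetricProperty"),
    ("Human", "owl:disjointWith", "Company"),
    ("FredFlintstone", "hasSpouse", "WilmaFlintstone"),
    ("FredFlintstone", "rdf:type", "Human"),
    ("RockQuarry", "rdf:type", "Company"),
}


def symmetric_closure(triples):
    """Add (o, p, s) for every (s, p, o) whose property p is symmetric."""
    symmetric = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "owl:SymmetricProperty"
    }
    return set(triples) | {(o, p, s) for (s, p, o) in triples if p in symmetric}


def disjointness_violations(triples):
    """List (individual, class_a, class_b) typed with two disjoint classes."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return sorted(
        (individual, a, b)
        for (a, p, b) in triples if p == "owl:disjointWith"
        for individual, classes in types.items()
        if a in classes and b in classes
    )


# Implicit knowledge made explicit: Wilma is also Fred's spouse.
expanded = symmetric_closure(facts)
print(("WilmaFlintstone", "hasSpouse", "FredFlintstone") in expanded)

# The knowledge is consistent until something is both a Human and a Company.
print(disjointness_violations(expanded))
broken = expanded | {("RockQuarry", "rdf:type", "Human")}
print(disjointness_violations(broken))
```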

RDF vs OWL

Now you may ask what the difference between RDF and OWL is.
  • RDF is the data model of the Semantic Web
  • OWL is a knowledge representation language for authoring ontologies

Ontology

In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse. More simply, an ontology is a way of showing the properties of a subject area and how they are related, by defining a set of concepts and categories that represent the subject. [source]

An ontology formally defines a common set of terms that are used to describe and represent a domain. An ontology is domain-specific, and it is used to describe and represent an area of knowledge. It contains terms and the relationships among these terms. There is another level of relationship expressed by using a special group of terms: properties. These property terms describe various features and attributes of the concepts, and they can also be used to associate different classes together.

By having the terms and the relationships among these terms clearly defined, ontology encodes the knowledge of the domain in such a way that the knowledge can be understood by a computer. This is the basic idea of ontology. [Source]

Taxonomy

Taxonomy is the science of classification. Originally, it referred only to the classifying of organisms. Now, it is often used in a more general setting, referring to the classification of things or concepts, as well as the schemes underlying such a classification. In addition, a taxonomy normally has some hierarchical relationships embedded in its classifications. [Source]

Thesaurus

A thesaurus can be understood as an extension of a taxonomy: it takes a taxonomy as described above, allowing subjects to be arranged in a hierarchy, and in addition, it allows other statements to be made about the subjects. [Source]

KOS

Knowledge Organization Schemes, or KOS, is a general term that refers to a set of elements, often structured and controlled, that can be used for describing objects, indexing objects, browsing collections, etc. They are also used in many scientific areas, such as biology and chemistry, where naming and classifying are important. Both taxonomies and thesauri are Knowledge Organization Schemes. [Source]

SKOS

SKOS (Simple Knowledge Organization System) is an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading systems, and taxonomies within the framework of the Semantic Web.

SKOS provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). Encoding this information in RDF allows it to be passed between computer applications in an interoperable way.

Using RDF also allows knowledge organization systems to be used in distributed, decentralized metadata applications. Decentralized metadata is becoming a typical scenario, where service providers want to add value to metadata harvested from multiple sources.
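To give a feel for how a thesaurus looks once it is expressed as RDF, here is a small Python sketch that uses the SKOS properties `skos:prefLabel`, `skos:altLabel`, and `skos:broader`. The `ex:` concepts and both helper functions are hypothetical, and the sketch assumes each concept has at most one broader concept, which SKOS itself does not require.

```python
# A tiny, made-up thesaurus expressed as triples with SKOS properties.
skos_triples = {
    ("ex:animals", "skos:prefLabel", "Animals"),
    ("ex:mammals", "skos:prefLabel", "Mammals"),
    ("ex:humans", "skos:prefLabel", "Humans"),
    ("ex:humans", "skos:altLabel", "People"),
    ("ex:humans", "skos:broader", "ex:mammals"),
    ("ex:mammals", "skos:broader", "ex:animals"),
}


def find_concept(triples, label):
    """Return the concept whose preferred or alternative label equals label."""
    for s, p, o in sorted(triples):
        if p in ("skos:prefLabel", "skos:altLabel") and o == label:
            return s
    return None


def broader_chain(triples, concept):
    """Follow skos:broader links upward and return the ancestors in order."""
    parents = {s: o for (s, p, o) in triples if p == "skos:broader"}
    chain = []
    seen = {concept}  # guards against accidental cycles in the data
    while concept in parents and parents[concept] not in seen:
        concept = parents[concept]
        seen.add(concept)
        chain.append(concept)
    return chain


concept = find_concept(skos_triples, "People")
print(concept)
print(broader_chain(skos_triples, concept))
```

A search for the alternative label "People" lands on the same concept as "Humans", and the broader chain lets an application widen the search to mammals or animals.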

Tools

There is a variety of tools for the modeling and construction of semantic systems. A very nice list of tools is available at https://thematix.com/tools/

SKOSMOS

CONCLUSION

RDF is a standard format for representing data that has a network structure. These network structures are close to the directed graphs known from mathematics.

The basic unit of RDF is a statement in the form of a triple of subject, predicate, and object.

  • subject = URI of the entity we are describing
  • predicate = property (characteristic) of the entity that the statement is about
  • object = property value

An ontology identifies and distinguishes concepts and their relationships; it describes content and relationships [source]. This is where OWL comes into play.

A taxonomy formalizes the hierarchical relationships among concepts and specifies the term to be used to refer to each; it prescribes structure and terminology [source]. This is where faceted classification and thesauri come into play and significantly improve search relevance.

