This is a work-in-progress document.
Comments from JC to be incorporated in next draft:
1. The statements about how to do the conversion in say 3.3.1 and 3.3.2 starting "In particular" are very painful procedural descriptions. Please use declarative language, and it will be easier to understand and to critique.
2. I suggest moving the unguided translation as out of scope of this document. It would make it an easier document to write, and hence to get out the door. Of course, unguided would be better in some ways, but ... that could be in a follow up doc.
3. Obviously there are many unused references which will need pruning before a final version.
The Resource Description Framework (RDF) is a model developed by the W3C for representing information about resources in the World Wide Web. Topic Maps is a standard for knowledge integration developed by the ISO. The two specifications were developed in parallel during the late 1990s within their separate organizations for what at first appeared to be very different purposes. The results, however, turned out to have a lot in common, and this has led to calls for their unification.
While unification has to date not been possible (for a variety of technical and political reasons), a number of attempts have been made to uncover the synergies between RDF and Topic Maps and to find ways of achieving interoperability at the data level. There is now widespread recognition within the respective user communities that achieving such interoperability is a matter of some urgency. This document is the result of the work done by the Semantic Web Best Practices and Deployment Working Group of the W3C with the support of the ISO Topic Maps committee to address this issue. It provides a set of Guidelines for users who want to combine usage of the W3C's RDF/OWL family of specifications and the ISO's family of Topic Maps standards.
The purpose of this document is to present a solution to the problem of RDF/Topic Maps interoperability at the data level. It consists of guidelines that describe how to author topic maps and RDF documents in order to ensure maximum interoperability, and a set of rules for performing automated translation between RDF and Topic Maps.
As the word "guidelines" suggests, this document describes one possible way to perform the translation between RDF and Topic Maps, which is recommended as best practice. It is the result of an analysis of different possible approaches, some of which are described in [Survey].
The goal is to be able to translate data from one form to the other without unacceptable loss of information or corruption of the semantics. Furthermore, it must be possible to query the results of a translation in terms of the target model and it must be possible to share vocabularies across the two paradigms.
[RDF-Schema] and [OWL] are considered relevant to this work to the extent that the classes and properties they define are supportive of its goals. However, the current work explicitly excludes the following goals:
This document is aimed at anyone with an interest in the problem of RDF/Topic Maps interoperability and a willingness to acquire the necessary understanding of both formalisms. In particular it targets authors of Topic Maps and RDF documents; creators of tools for translating between RDF and Topic Maps; and those who seek reassurance that data can be easily reused across the two paradigms. The reader is expected to be familiar with both RDF and Topic Maps to a level that at least corresponds to the tutorial material in [Pepper 00] and [RDF-Primer]. To fully understand Chapter 5, the reader must in addition be familiar with the models described in [TMDM] and [RDF-Semantics], and the syntaxes described in [LTM] and [N3].
This document defines a set of guidelines to address the RDF/Topic Maps interoperability issue. The approach is twofold: the translation can be either guided or unguided. Guided translation is supported by a specific vocabulary defined here.
This document starts by stating the requirements that the Guidelines are intended to fulfill. This is followed by an informative prose description of the mapping between the RDF and Topic Maps models that underpins the interoperability guidelines. The description is structured by concept, starting with the most general concepts (things, proxies, assertions, etc.) and ending up with concepts that are specific to one paradigm or the other (e.g., scope and language tags).
Chapter 3 describes the rationale for these guidelines, for both guided and unguided translation. Chapter 4 contains guidelines for authoring RDF and Topic Maps, respectively. These are expressed as succinctly as possible so that they can be easily referenced. The rationale for these guidelines is to be found in Chapter 3.
Finally, Chapter 5 provides a formal exposition of rules for performing automated translations from RDF to Topic Maps and vice versa, based on the data models described in [TMDM] and [RDF-Semantics]. Once again, these rules are expressed as succinctly as possible for ease of reference, and the rationale for them is to be found in Chapter 3.
This document should provide Guidelines such that the following requirements are satisfied:
This chapter is informative: it provides a readable, informal, but essentially complete overview of how constructs in the two paradigms relate to each other. Details and formal definitions are left to Chapter 5.
The mapping mechanism consists of properties and classes in the rdftm: namespace that can be easily translated into TMs. The guidance consists in expressing some properties and some classes in RDF vocabularies as subproperties and subclasses of these rdftm properties and classes, respectively.
The following classes and properties have so far been identified:
The Topic Maps association examples in this section use the syntax described below.
The assertion:
where A, B, C, and S are topics; ra, rb, and rc are role types; and p represents the association relating A, B, and C, is expressed with the following syntax [LTM syntax]:
p( A : ra , B : rb, C : rc ) / S
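The notation above can be produced mechanically. The following is a minimal sketch in Python, not part of the guidelines; the function name `format_association` and the representation of role players as (topic, role type) pairs are assumptions made for illustration, and only a single scoping topic is handled.

```python
def format_association(assoc_type, players, scope=None):
    """Render an association in the LTM-style notation used in this section.

    players is a list of (topic, role_type) pairs; scope is an optional
    single scoping topic (multi-topic scopes are not handled here).
    """
    roles = ", ".join(f"{topic} : {role}" for topic, role in players)
    text = f"{assoc_type}( {roles} )"
    if scope is not None:
        text += f" / {scope}"
    return text


print(format_association("p", [("A", "ra"), ("B", "rb"), ("C", "rc")], scope="S"))
```

Keeping the role players as an ordered list, rather than a fixed subject/object pair, lets the same helper render unary, binary, and n-ary associations.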
There is a fundamental equivalence between subjects and resources. This equivalence may be refined as follows:
Given the above, topics that have one or more characteristics should always be mapped to resources, since a topic only exists in order to make assertions about its subject and the only way to make an assertion in RDF is to create a statement whose subject is a resource. Resources which are the subjects of RDF statements should always be mapped to topics. However, resources which are only objects of statements may not always be mapped to topics. (This will be discussed further below.)
There is a general correspondence between TM assertions and RDF statements, and between the type of a TM assertion and the property of an RDF statement.
Resource URIs, subject identifiers, subject locators, source locators. Blank nodes.
Both topics and resources may use URI references (or URIrefs) as identifiers (the term URIref is used here in accordance with W3C usage to mean a URI with an optional fragment identifier). However, in Topic Maps there are two ways in which a URIref can be used to identify a subject:
RDF does not make this distinction explicitly. The question therefore arises, when going from RDF to Topic Maps, whether to map the URIref of a resource to a subject locator or to a subject identifier; and, conversely, when going from Topic Maps to RDF, whether to map subject locators or subject identifiers (or neither, or both) to the URIrefs of resources.
Any solution which favours one type of identifier (say, subject identifiers) will lead to unnatural results with the identifiers of the other type. These guidelines therefore suggest a solution that retains some of the ambiguity of the RDF approach while at the same time preserving enough information to be able to perform roundtripping. The solution hinges on the assumption that topics with subject locators are explicitly or implicitly instances of the class InformationResource.
The classes and properties involved in the guidance for Identity are the following:
The rules for translation are as follows:
Resources become Topics. The guidance indicates whether the resource's URI must be translated as a subject locator or as a subject identifier; that is, it indicates whether or not the resource is an information resource. The class rdftm:InformationResource is used to specify the nature of the resource. It is also possible to declare properties explicitly as rdfs:subPropertyOf rdftm:subjectIdentifier so that their values are translated as subject identifiers.
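The decision described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not a normative rule: triples are modelled as plain (subject, predicate, object) string tuples, the function name `identity_for` is hypothetical, only explicit typing as rdftm:InformationResource is checked (implicit membership through subclasses is ignored), and the rdfs:subPropertyOf mechanism is not covered.

```python
RDF_TYPE = "rdf:type"
INFORMATION_RESOURCE = "rdftm:InformationResource"


def identity_for(resource, triples):
    """Decide how a resource's URI is carried over to the topic.

    Returns ("subject-locator", uri) when the resource is explicitly typed
    as rdftm:InformationResource, otherwise ("subject-identifier", uri).
    """
    is_information_resource = (resource, RDF_TYPE, INFORMATION_RESOURCE) in set(triples)
    kind = "subject-locator" if is_information_resource else "subject-identifier"
    return kind, resource


# Hypothetical data: ex:homepage is declared to be an information resource.
triples = [
    ("ex:homepage", RDF_TYPE, INFORMATION_RESOURCE),
    ("ex:puccini", "bio:born-in", "ex:lucca"),
]
print(identity_for("ex:homepage", triples))
print(identity_for("ex:puccini", triples))
```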
In particular:
Topics become Resources. It is always clear whether the URIs associated with a topic identify the real subject (i.e. the topic is the subject itself) or information resources describing the subject.
Topics map to resources, and there are two possible cases: (i) the topic represents an information resource and (ii) the topic represents another kind of resource. In (i) the topic will have at least one subject locator, and optionally one or more subject identifiers. In (ii) the topic will have at least one subject identifier and no subject locator.
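The two cases can be told apart from the identifiers a topic carries. The sketch below is a hypothetical Python helper (`classify_topic` is not a name defined by these guidelines); the third return value, for topics with no URI-based identity at all, is an assumption added so that the function is total.

```python
def classify_topic(subject_locators, subject_identifiers):
    """Classify a topic by the kind of resource it maps to.

    Case (i): at least one subject locator -> information resource.
    Case (ii): no subject locator, at least one subject identifier
    -> another kind of resource.
    """
    if subject_locators:
        return "information-resource"
    if subject_identifiers:
        return "other-resource"
    # Not covered by the two cases above; flagged rather than guessed.
    return "no-uri-identity"


print(classify_topic(["http://example.org/doc.html"], []))
print(classify_topic([], ["http://example.org/id/puccini"]))
```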
In particular,
Both RDF and Topic Maps make it possible to associate a name with a resource or a topic. In RDF a name is represented as the value of a property; in Topic Maps name types are used. Given this, properties map to name types and vice versa. However, past Topic Maps specifications also allowed untyped names. To comply with the current specification, authors of Topic Maps documents have to replace untyped names with names of type iso:topic-name.
The property rdfs:label deserves particular attention. It may be used in RDF to provide a human-readable version of a resource's name. However, rdfs:label needs special care when RDF is migrated to OWL ontologies, in particular if the ontology is to be OWL DL compliant. rdfs:label is predefined as an instance of owl:AnnotationProperty and hence cannot be used in property axioms. The only information in axioms for such properties is annotations. For more details the reader can refer to [OWL-ref] and [OWL-sem].
The classes and properties involved in the guidance for names are the following:
Given this, the rules for translating names are the following:
Topic Maps has the concept of variant names, which are always associated with a scope. Both variant and scope are Topic Maps concepts that have no direct counterpart in RDF. The approach is to use a single property for the name (as described above), to create a statement for the assertion, and to reify the statement in order to attach the variants.
The rules for translating variant names are the following:
Both RDF and Topic Maps have the concept of relationship.
RDF is a model of triples, hence relationships are binary. Each of them is represented by a subject, a property and an object. The subject is either a URIref or a blank node, and the object, which is the value of the property, can be a literal, a URIref, or a blank node. Topic Maps defines the concept of association, which is intended to be n-ary (e.g., unary, binary, and so on). Each association has a type and n role players. [Noy 05] identifies patterns for representing n-ary relations in RDF. While Topic Maps has the concept of role player, each of which may be given a type, RDF has only two roles in relations, subject and object.
There is a special type of relation in Topic Maps named occurrence. An occurrence is a relation between a subject and an information resource. The information resource may either be a value inside the topic map or an external information resource. Occurrences correspond to single RDF statements. In TM2RDF the conversion is simple, while in RDF2TM the issue is to decide how to treat an RDF statement: either as an occurrence or as an association.
Consider the following binary association in Topic Maps:
born-in( puccini : person, lucca : place )
used to state that the composer Puccini was born in Lucca. This assertion would be represented in RDF as follows:
ex:puccini bio:born-in ex:lucca
The properties involved in the guidance for translation of binary associations are the following:
Example:
ex:puccini bio:born-in ex:lucca
bio:born-in rdftm:subject-role bio:person
bio:born-in rdftm:object-role geo:place
becomes:
born-in( puccini : person, lucca : place )
Example:
born-in( puccini : person, lucca : place )
becomes:
ex:puccini bio:born-in ex:lucca
bio:born-in rdftm:subject-role bio:person
bio:born-in rdftm:object-role geo:place
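The RDF-to-Topic Maps direction of the examples above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the normative rule: triples are plain string tuples, the helper names are hypothetical, and naming the association type, the players, and the role types after the local part of each prefixed name is an assumption of this sketch.

```python
SUBJECT_ROLE = "rdftm:subject-role"
OBJECT_ROLE = "rdftm:object-role"


def local_name(qname):
    """Drop the prefix of a prefixed name such as 'bio:born-in'."""
    return qname.split(":", 1)[-1]


def statement_to_association(statement, triples):
    """Translate one RDF statement into (type, [(player, role), ...]).

    Returns None when the property lacks either role annotation, since
    the guided translation then has nothing to go on.
    """
    subject, prop, obj = statement
    roles = {p: o for s, p, o in triples
             if s == prop and p in (SUBJECT_ROLE, OBJECT_ROLE)}
    if SUBJECT_ROLE not in roles or OBJECT_ROLE not in roles:
        return None
    return (
        local_name(prop),
        [(local_name(subject), local_name(roles[SUBJECT_ROLE])),
         (local_name(obj), local_name(roles[OBJECT_ROLE]))],
    )


triples = [
    ("ex:puccini", "bio:born-in", "ex:lucca"),
    ("bio:born-in", SUBJECT_ROLE, "bio:person"),
    ("bio:born-in", OBJECT_ROLE, "geo:place"),
]
print(statement_to_association(triples[0], triples))
```

The reverse direction would emit the same three triples from the association type and its two typed role players.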
Topic Maps defines an occurrence as a special type of association. Occurrences can be internal or external. An occurrence is a binary association between a subject and an information resource. In particular, the value of an internal occurrence is a string and can have a datatype that is not a URI. If the datatype is a URI then the occurrence is external. Occurrences become RDF properties.
RDF does not have the concept of occurrence, so the problem is to decide whether an RDF property has to be treated as a Topic Maps association or as a Topic Maps occurrence.
The classes and properties involved in the guidance for occurrences are the following:
Given this, the rules for translation are the following:
The type-instance relationship is inherently binary, stating that some instance belongs to the extension of some class. In RDF this is expressed by means of an "ex:instance rdf:type ex:class" statement. In Topic Maps the equivalent statement would (in LTM syntax) be expressed as "iso:type-instance( instance : iso:instance, class : iso:type )".
In other words, both RDF and Topic Maps provide special vocabulary for expressing this particular relationship, without making the relationship part of the model proper. This means that in translating between the two, in this particular case it is necessary to mediate between the two built-in vocabularies.
The rules for translation are as follows:
Like the type-instance relationship, this relationship is by its very nature binary, and like type-instance it is represented in both RDF and Topic Maps using a special vocabulary external to the model itself. In RDF, the fact that A is a subclass of B is expressed with the statement "ex:A rdfs:subClassOf ex:B", whereas in Topic Maps it is expressed with the association "iso:supertype-subtype( A : iso:subtype, B : iso:supertype )".
The issue of handling n-ary relationships is closely connected to the work currently being undertaken by Natasha Noy, Alan Rector and Christopher Welty [Noy 05]. Although that document already describes useful patterns for representing n-ary relations in RDF, its authors are working on a specific vocabulary for this purpose. We are therefore waiting for the draft of that vocabulary before fixing the mapping rules for n-ary relations, having concluded that it is essential to share the same vocabulary rather than define our own.
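Because both relationships are expressed with built-in vocabulary on each side, the mediation amounts to a lookup. The following Python sketch illustrates this for the RDF-to-Topic Maps direction; the table and function names are hypothetical, and topics are written with the same names as the RDF nodes for simplicity.

```python
# RDF built-in property -> (association type, subject role, object role)
BUILT_IN = {
    "rdf:type": ("iso:type-instance", "iso:instance", "iso:type"),
    "rdfs:subClassOf": ("iso:supertype-subtype", "iso:subtype", "iso:supertype"),
}


def built_in_association(statement):
    """Map an rdf:type or rdfs:subClassOf statement to LTM-style notation.

    Returns None for any other property, which falls outside this sketch.
    """
    subject, prop, obj = statement
    if prop not in BUILT_IN:
        return None
    assoc_type, subject_role, object_role = BUILT_IN[prop]
    return f"{assoc_type}( {subject} : {subject_role}, {obj} : {object_role} )"


print(built_in_association(("instance", "rdf:type", "class")))
print(built_in_association(("A", "rdfs:subClassOf", "B")))
```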
Topic Maps defines the concept of scope as the context within which a statement is valid. Formally the scope is composed of a set of topics that together define the context. RDF does not have a matching concept, nor does it define any vocabulary for the representation of context.
For interoperability between RDF and Topic Maps, this guideline document defines a specific property in the rdftm: vocabulary, to be used with reified statements, in order to specify in what context the statement can be considered valid.
The following property is involved in the guidance for translation of scope:
Every set of RDF assertions of the form:
ex:X rdf:type rdf:Statement; rdf:subject ex:A; rdf:predicate ex:p; rdf:object ex:B; rdftm:scope ex:S .
maps to a Topic Maps assertion of the form:
p( A , B ) / S
Every Topic Maps assertion of the form:
p( A , B ) / S
maps to a set of RDF assertions of the form:
ex:X rdf:type rdf:Statement; rdf:subject ex:A; rdf:predicate ex:p; rdf:object ex:B; rdftm:scope ex:S .
Note: Some issues here are not yet covered. For instance, how is a scope consisting of more than one topic represented, and what happens if two equal statements have different scopes?
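The RDF-to-Topic Maps half of this mapping can be sketched as follows. The Python below is illustrative only: triples are plain string tuples, the function names are hypothetical, topics are named after the local part of each prefixed name (an assumption), and only a scope consisting of a single topic is handled.

```python
def local_name(qname):
    """Drop the prefix of a prefixed name such as 'ex:A'."""
    return qname.split(":", 1)[-1]


def scoped_association(statement_node, triples):
    """Turn a reified statement with rdftm:scope into 'p( A , B ) / S'.

    Returns None if the node is not a complete reified statement
    carrying a scope.
    """
    props = {p: o for s, p, o in triples if s == statement_node}
    required = ("rdf:subject", "rdf:predicate", "rdf:object", "rdftm:scope")
    if props.get("rdf:type") != "rdf:Statement" or any(k not in props for k in required):
        return None
    a, p, b, s = (local_name(props[k]) for k in required)
    return f"{p}( {a} , {b} ) / {s}"


triples = [
    ("ex:X", "rdf:type", "rdf:Statement"),
    ("ex:X", "rdf:subject", "ex:A"),
    ("ex:X", "rdf:predicate", "ex:p"),
    ("ex:X", "rdf:object", "ex:B"),
    ("ex:X", "rdftm:scope", "ex:S"),
]
print(scoped_association("ex:X", triples))
```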
@@ _To be done_
@@ To be done
This section will contain recommendations for people creating RDF and Topic Maps data, telling them what to do and what to avoid in order to ensure maximum RDF/Topic Maps interoperability. It will be filled in once both the guided and unguided translation rules are in place, since only then will it be clear what to recommend.
@@ _To be done_
@@ _To be done_