This is Steve's draft proposal.
The Resource Description Framework (RDF) is a model developed by the W3C for representing information about resources in the World Wide Web. Topic Maps is a standard for knowledge integration developed by the ISO. The two specifications were developed in parallel during the late 1990s within their separate organizations for what at first appeared to be very different purposes. The results, however, turned out to have a lot in common, and this has led to calls for their unification.
While unification has to date not been possible (for a variety of technical and political reasons), a number of attempts have been made to uncover the synergies between RDF and Topic Maps and to find ways of achieving interoperability at the data level. There is now widespread recognition within the respective user communities that achieving such interoperability is a matter of some urgency. Work has therefore been initiated by the Semantic Web Best Practices and Deployment Working Group of the W3C with the support of the ISO Topic Maps committee to address this issue.
A Working Group Draft containing a [Survey] of earlier approaches and an analysis of their strengths and weaknesses has already been produced. This document provides a set of Guidelines for users who want to combine usage of the W3C's RDF/OWL family of specifications and the ISO's family of Topic Maps standards.
The purpose of this document is to present a solution to the problem of RDF/Topic Maps interoperability at the data level. It consists of guidelines that describe how to author topic maps and RDF documents in order to ensure maximum interoperability, and a set of rules for performing automated translation between RDF and Topic Maps. The goal is to be able to translate data from one form to the other without unacceptable loss of information or corruption of the semantics. It should also be possible to query the results of a translation in terms of the target model and it should be possible to share vocabularies across the two paradigms.
[RDF-Schema] and [OWL] are considered relevant to this work to the extent that the classes and properties they define are supportive of its goals. However, it is explicitly not a goal of the current work to enable the general use of RDF Schema and OWL with Topic Maps, although this issue may be addressed later.
This document is aimed at anyone with an interest in the problem of RDF/Topic Maps interoperability and a willingness to acquire the necessary understanding of both models. In particular it targets authors of topic maps and RDF documents; creators of tools for translating between RDF and Topic Maps; and those who seek reassurance that data can be easily reused across the two paradigms. The reader is expected to be familiar with both RDF and Topic Maps to a level that at least corresponds to the tutorial material in [Pepper 00] and [RDF-Primer]. To fully understand Chapter 5, the reader must in addition be familiar with the models described in [TMDM] and [RDF-Semantics], and the syntaxes described in [LTM] and [N3].
This document starts by stating the requirements that the Guidelines are intended to fulfill. This is followed by an informative prose description of the mapping between the RDF and Topic Maps models that underpins the interoperability guidelines. The description is structured by concept, starting with the most general concepts (things, proxies, assertions, etc.) and ending up with concepts that are specific to one paradigm or the other (e.g., scope and language tags).
Chapter 4 contains guidelines for authoring RDF and Topic Maps, respectively. These are expressed as succinctly as possible so that they can be easily referenced. The rationale for these guidelines is to be found in Chapter 3.
Finally, Chapter 5 provides a formal exposition of rules for performing automated translations from RDF to Topic Maps and vice versa based on the data models described in [TMDM] and [RDF-Semantics]. Once again, these rules are expressed as succinctly as possible for ease of reference and the rationale for them is to be found in Chapter 3.
[RDF and Topic Maps will often be referred to as 'the paradigms'.]
This document should provide Guidelines such that the following requirements are satisfied:
[Explicit statement that naturalness has higher priority than completeness?]
[Should translation be allowed to fail when authoring guidelines have not been followed?]
Examples of edge cases that should not be catered for:
[This chapter is informative: It should provide a readable, informal, but essentially complete overview of how constructs in the two paradigms relate to each other. Where possible, section headings should use neutral terminology. Details should be left to Chapter 5, Translation guidelines.]
The mapping mechanism will consist of properties in the rdftm: namespace that can be easily translated into TMs. The guidance will consist in expressing some properties in RDF vocabularies as subproperties of these rdftm: properties. Some classes will also be defined.
The following classes and properties have so far been identified:
There is a fundamental equivalence between subjects and resources. This equivalence may be refined as follows:
Given the above, topics that have one or more characteristics should always be mapped to resources, since a topic only exists in order to make assertions about its subject and the only way to make an assertion in RDF is to create a statement whose subject is a resource. Resources which are the subjects of RDF statements should always be mapped to topics. However, resources which are only objects of statements may not always be mapped to topics. (This will be discussed further below.)
ISSUE: What to do about topics that have no characteristics?
Resource URIs, subject identifiers, subject locators, source locators. Blank nodes.
Both topics and resources may use URI references (or URIrefs) as identifiers (the term URIref is used here in accordance with W3C usage to mean a URI with an optional fragment identifier). However, in Topic Maps there are two ways in which a URIref can be used to identify a subject: directly, as the actual address (or locator) of the subject, in which case it is called a "subject locator"; or indirectly, as the locator of an information resource that provides some human-interpretable indication of the subject, in which case it is called a "subject identifier". It is always clear, in both the model and the interchange syntax, whether the URIref is a subject locator or a subject identifier.
RDF does not make this distinction explicitly. The question therefore arises, when going from RDF to Topic Maps, whether to map the URIref of a resource to a subject locator or to a subject identifier; and, conversely, when going from Topic Maps to RDF, whether to map subject locators or subject identifiers (or neither, or both) to the URIrefs of resources.
Any solution which favours one type of identifier (say, subject identifiers) will lead to unnatural results with the identifiers of the other type. These guidelines therefore suggest a solution that retains some of the ambiguity of the RDF approach while at the same time preserving enough information to be able to perform roundtripping. The solution hinges on the assumption that topics with subject locators are explicitly or implicitly instances of the class InformationResource.
The rules are as follows:
Examples
[wikipedia-tosca = "Wikipedia page about Tosca"
%"http://en.wikipedia.org/Tosca"]
http://en.wikipedia.org/Tosca
rdfs:label "Wikipedia page about Tosca" ;
rdf:type tm:InformationResource .
------------------------------------------------------
[wikipedia-tosca = "Wikipedia page about Tosca"
%"http://en.wikipedia.org/Tosca"
@"http://psi.ontopia.net/wikipedia/Tosca-page"
@"http://psi.unibo.net/wikipedia/Tosca-page" ]
http://en.wikipedia.org/Tosca
rdfs:label "Wikipedia page about Tosca" ;
rdf:type tm:InformationResource ;
rdftm:subjectIdentifier http://psi.ontopia.net/wikipedia/Tosca-page ,
http://psi.unibo.net/wikipedia/Tosca-page .
------------------------------------------------------
[wikipedia-tosca = "Wikipedia page about Tosca"
%"http://en.wikipedia.org/Tosca"
%"http://207.142.131.214/Tosca" ]
http://en.wikipedia.org/Tosca
rdfs:label "Wikipedia page about Tosca" ;
rdf:type tm:InformationResource ;
owl:sameAs http://207.142.131.214/Tosca .
------------------------------------------------------
[wikipedia-tosca = "Wikipedia page about Tosca"
@"http://psi.ontopia.net/wikipedia/Tosca-page"
@"http://psi.unibo.net/wikipedia/Tosca-page" ]
http://psi.ontopia.net/wikipedia/Tosca-page
rdfs:label "Wikipedia page about Tosca" ;
owl:sameAs http://psi.unibo.net/wikipedia/Tosca-page .
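The pattern followed by the four examples above can be sketched in Python. This is only an illustration inferred from the examples, not a normative rule set: the `Topic` class, the function name and the tuple representation of triples are assumptions made for the sketch, and topics with no identifiers at all (which would presumably become blank nodes) are not handled.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory topic; not part of TMDM or any real API.
@dataclass
class Topic:
    label: str
    subject_locators: list = field(default_factory=list)
    subject_identifiers: list = field(default_factory=list)

def topic_to_triples(topic):
    """Return (subject, predicate, object) tuples mirroring the examples."""
    triples = []
    if topic.subject_locators:
        # The first subject locator becomes the resource URIref.
        node = topic.subject_locators[0]
        triples.append((node, "rdfs:label", topic.label))
        triples.append((node, "rdf:type", "tm:InformationResource"))
        # Subject identifiers are kept explicitly to allow roundtripping.
        for si in topic.subject_identifiers:
            triples.append((node, "rdftm:subjectIdentifier", si))
        # Further subject locators are alternative addresses of the same resource.
        for sl in topic.subject_locators[1:]:
            triples.append((node, "owl:sameAs", sl))
    elif topic.subject_identifiers:
        # No locator: the first subject identifier becomes the URIref.
        node = topic.subject_identifiers[0]
        triples.append((node, "rdfs:label", topic.label))
        for si in topic.subject_identifiers[1:]:
            triples.append((node, "owl:sameAs", si))
    return triples
```

Which identifier is treated as "first" is arbitrary in this sketch; a real translator would need a deterministic choice.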
General correspondence between TM assertions and RDF statements.
General correspondence between the type of a TM assertion and the property of an RDF statement.
Untyped occurrences and associations use the TMDM untyped-foo subject identifiers as properties.
It should not be possible to perform transforms between vocabularies.
There are two basic, alternative approaches to untyped names:
Apart from untyped names, the rule in 3.3 applies: name types map to properties and properties map to name types.
Conclusion regarding variants:
Having considered three alternatives for variant names we reach the following conclusions:
Considerations
ACTION: Lars Marius to post to SWBPD enquiring about the semantics of rdfs:label.
Examples
[puccini = "Giacomo Puccini"]
concert-x
dc:title "Concert X" .
composer-y
foaf:name "Composer Y" .
[ skos:prefLabel "Economic cooperation" ;
skos:altLabel "Economic co-operation" ] .
http://www.megginson.com/exp/id/airports/ABBN
apt:name "BRISBANE , AUSTRALIA" .
[concert-x = "Concert X" : dc:title]
<topic id="concert-x">
<baseName>
<instanceOf><subjectIndicatorRef xlink:href="dc:title"/></instanceOf>
<baseNameString>Concert X</baseNameString>
</baseName>
</topic>
<topic id="concert-x">
<baseName>
<baseNameString>Concert X</baseNameString>
</baseName>
</topic>
----------------------------------------------------------------------
In unguided translation, translate to an internal occurrence if the object is a literal; translate to an association if the object is a URIref or a blank node. [[We could say translate to an external occurrence if the object is not the subject of some other statement. Do we want to do that?]]
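The unguided rule stated above can be sketched as a simple dispatch on the kind of object node. The three node classes and the function name are hypothetical, introduced only for this illustration; the open question about external occurrences is deliberately not implemented.

```python
from dataclasses import dataclass

# Hypothetical node types standing in for the object of an RDF statement.
@dataclass(frozen=True)
class Literal:
    value: str

@dataclass(frozen=True)
class URIRef:
    uri: str

@dataclass(frozen=True)
class BlankNode:
    label: str

def unguided_target(obj):
    """Decide which Topic Maps construct a statement maps to, by object kind."""
    if isinstance(obj, Literal):
        return "internal occurrence"
    if isinstance(obj, (URIRef, BlankNode)):
        return "association"
    raise TypeError("unsupported object node: %r" % (obj,))
```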
ISSUES:
Accepted as described in 3.5.2.1. It is not an error for the values of subject-role and object-role to be the same. This implies a symmetric relationship, and the assignment of subject/object roles is irrelevant.
Unguided translations not yet considered.
Use rdftm:subject-role and rdftm:object-role to specify correspondence between role types and subject/object:
bio:born-in rdftm:subject-role geo:person .
bio:born-in rdftm:object-role geo:place .
This caters for RDF2TM and TM2RDF. Given
born-in( puccini : person, lucca : place )
a translator can figure out that puccini is the subject. For symmetric relationships, one can only guess. [[Result is non-deterministic. Is this a problem?]]
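The role lookup described above might be sketched as follows. The `ROLE_MAP` table stands in for the rdftm:subject-role and rdftm:object-role statements; its dictionary layout, the function names and the use of qualified role types in the association are assumptions of this sketch.

```python
# Hypothetical table derived from rdftm:subject-role / rdftm:object-role statements.
ROLE_MAP = {
    "bio:born-in": {"subject": "geo:person", "object": "geo:place"},
}

def association_to_statement(assoc_type, roles, role_map=ROLE_MAP):
    """TM2RDF: roles is a list of two (player, role_type) pairs."""
    if len(roles) != 2:
        raise ValueError("expected a binary association")
    mapping = role_map[assoc_type]
    (p1, r1), (p2, r2) = roles
    if mapping["subject"] == mapping["object"]:
        # Symmetric relationship: the order is irrelevant, so keep the given one.
        return (p1, assoc_type, p2)
    if (r1, r2) == (mapping["subject"], mapping["object"]):
        return (p1, assoc_type, p2)
    if (r2, r1) == (mapping["subject"], mapping["object"]):
        return (p2, assoc_type, p1)
    raise ValueError("role types do not match the mapping")

def statement_to_association(statement, role_map=ROLE_MAP):
    """RDF2TM: assign role types to the subject and object of a statement."""
    subject, prop, obj = statement
    mapping = role_map[prop]
    return (prop, [(subject, mapping["subject"]), (obj, mapping["object"])])
```

For symmetric relationships this sketch simply keeps the order in which the role players were supplied, which is one way of making the otherwise arbitrary choice repeatable for a given input.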
Use rdftm:Subject and rdftm:Object as role types.
Fall back to same approach as n-ary associations.
ISSUES:
TM2RDF unproblematic: all occurrences map to statements whose values are either literals or URIs/blank nodes.
RDF2TM: any property for which the guidance does not lead to interpretation as a name or an association (or an identifier?) is regarded as an occurrence.
ISSUES:
Perhaps by generating a mapping statement when doing TM2RDF:
{tosca, synopsis, "http://www.azopera.com/learn/synopsis/tosca.shtml"}
becomes
tosca synopsis "http://www.azopera.com/learn/synopsis/tosca.shtml" .
synopsis rdftm:maps-to rdftm:Occurrence .
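As a rough illustration of this tentative idea, the sketch below emits the mapping statement during TM2RDF and reads it back during RDF2TM. The function names and the tuple representation are assumptions; rdftm:maps-to and rdftm:Occurrence are taken from the example above.

```python
MAPS_TO = "rdftm:maps-to"
OCCURRENCE = "rdftm:Occurrence"

def occurrence_to_triples(topic, occ_type, value):
    """TM2RDF: emit the statement plus a mapping statement for its property."""
    return [
        (topic, occ_type, value),
        (occ_type, MAPS_TO, OCCURRENCE),
    ]

def occurrence_properties(triples):
    """RDF2TM: collect the properties declared to map to occurrences."""
    return {s for (s, p, o) in triples if p == MAPS_TO and o == OCCURRENCE}
```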
Unproblematic. rdf:type corresponds to type-instance.
Unproblematic. rdfs:subClassOf corresponds to supertype-subtype.
Define rdftm:TRUE and use (binary) statement:
TM "unfinished(turandot)" becomes RDF "turandot unfinished rdftm:TRUE ."
Tentatively accepted, but doubts regarding acceptability to RDF community since it seems to be more natural to define classes for this kind of thing (e.g. the class of unfinished things). This needs to be clarified on the list.
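A minimal sketch of the tentatively accepted rdftm:TRUE approach, in both directions. The function names and tuple shapes are hypothetical, and the sketch says nothing about the class-based alternative mentioned above.

```python
TRUE = "rdftm:TRUE"

def unary_to_triple(assoc_type, player):
    """TM2RDF: unfinished(turandot) -> turandot unfinished rdftm:TRUE ."""
    return (player, assoc_type, TRUE)

def triple_to_unary(triple):
    """RDF2TM: return (assoc_type, player), or None if the object is not rdftm:TRUE."""
    subject, prop, obj = triple
    if obj != TRUE:
        return None
    return (prop, subject)
```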
Adhere to Noy's n-ary patterns.
No agreement reached as yet. Doubt as to whether Pattern 1 is really the equivalent of n-ary association. Definitely prefer Pattern 2. Noy's list pattern should probably not be supported.
X( A : rA , B : rB , C : rC )
could become one of the following:
P rdf:type X . P rA A . P rB B . P rC C . # Pattern 2
P rdf:type X . A rA P . P rB B . P rC C . # Pattern 1
In Pattern 1 role-player A is somehow privileged in that it is the subject of the relationship, whereas in Pattern 2 all role-players have the same status. There is no way to know how to pick the privileged role-player in an unguided translation.
Pattern 2 should therefore be the default for TM2RDF since it requires no additional information. Pattern 1 can be supported using rdftm:subject-role:
If X rdftm:subject-role rA . then choose Pattern 1, otherwise choose Pattern 2. Either pattern will produce X( A : rA , B : rB , C : rC ). Pattern 1 should lead to the generation of the rdftm:subject-role statement in order to enable roundtripping.
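The choice between the two patterns could be sketched like this. The function name, the `node` label for the relationship resource and the tuple representation are assumptions of the sketch, and no agreement on the patterns themselves is implied.

```python
def nary_to_triples(assoc_type, roles, subject_role=None, node="_:P"):
    """TM2RDF for X(A : rA, B : rB, ...); roles is a list of (player, role_type).

    Pattern 2 is used unless subject_role names a role type present in roles,
    in which case that role player becomes the subject (Pattern 1).
    """
    use_pattern_1 = subject_role is not None and any(
        role_type == subject_role for _, role_type in roles)
    triples = [(node, "rdf:type", assoc_type)]
    for player, role_type in roles:
        if use_pattern_1 and role_type == subject_role:
            # Pattern 1: the privileged role player points at the relationship node.
            triples.append((player, role_type, node))
        else:
            triples.append((node, role_type, player))
    if use_pattern_1:
        # Record the choice so that a later RDF2TM step can roundtrip.
        triples.append((assoc_type, "rdftm:subject-role", subject_role))
    return triples
```

Calling the function without `subject_role` yields the Pattern 2 triples; passing `subject_role="rA"` yields Pattern 1 plus the rdftm:subject-role statement.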
Should be straightforward, one-to-one, but check latest TMDM.
Should be no problem when reifying a binary relationship, but may be problematic with re-represented relationships as in Noy's Pattern "2". (If the point of reifying the relationship is to be able to make statements about it, how do we distinguish between statements that are part of the n-ary relationship and those that are not, when going from RDF to TM?)
Should be straightforward.
[A special case? RDF has a mechanism for attaching properties to an RDF model or document. Does anyone remember where to find it?]
Based on reification. As consistent as possible.
Special case names and internal occurrences that are scoped by a single natural language? - Yes. Need to define namespace for RFC3066 codes. See http://www.faqs.org/rfcs/rfc3066.html
Probably unproblematic but needs further discussion and examples. Probable solution is to use rdftm:scope property on a reified relationship.
Language tags are indeed special. Need to define namespace for RFC3066 codes. See http://www.faqs.org/rfcs/rfc3066.html. Mappings to OASIS PSIs? What about XTM PSIs? (Good opportunity to kill them?)
What about collections and containers? Any other issues?
Best practices, including provision of mapping information.
This chapter should be normative.
[This section contains a formal description of the translation procedure]
[Not sure which RDF spec is the best starting point for this. Perhaps RDF Concepts?]
[This section contains a formal description of the translation procedure]
[Could it be specified as an algorithm for processing a TMDM model, i.e., item type by item type? An attempt follows. Whether or not one agrees with the actual translation procedure that this attempt specifies, the attempt itself does seem to indicate that this approach to formally describing the procedure might work.]
FOREACH topic item:
FOREACH topic name item:
{1} What if the node in question is a blank node?