RDFTM.RDFTMInteroperabilityGuidelinesSteveDraft r1.1 - 19 Sep 2005 - 17:42 - ValentinaPresutti

This is Steve's draft proposal.

Guidelines for RDF/Topic Maps Interoperability

1 Introduction

1.1 Background

The Resource Description Framework (RDF) is a model developed by the W3C for representing information about resources in the World Wide Web. Topic Maps is a standard for knowledge integration developed by the ISO. The two specifications were developed in parallel during the late 1990s within their separate organizations for what at first appeared to be very different purposes. The results, however, turned out to have a lot in common, and this has led to calls for their unification.

While unification has to date not been possible (for a variety of technical and political reasons), a number of attempts have been made to uncover the synergies between RDF and Topic Maps and to find ways of achieving interoperability at the data level. There is now widespread recognition within the respective user communities that achieving such interoperability is a matter of some urgency. Work has therefore been initiated by the Semantic Web Best Practices and Deployment Working Group of the W3C with the support of the ISO Topic Maps committee to address this issue.

A Working Group Draft containing a [Survey] of earlier approaches and an analysis of their strengths and weaknesses has already been produced. This document provides a set of Guidelines for users who want to combine usage of the W3C's RDF/OWL family of specifications and the ISO's family of Topic Maps standards.

1.2 Purpose and target audience

The purpose of this document is to present a solution to the problem of RDF/Topic Maps interoperability at the data level. It consists of guidelines that describe how to author topic maps and RDF documents in order to ensure maximum interoperability, and a set of rules for performing automated translation between RDF and Topic Maps. The goal is to be able to translate data from one form to the other without unacceptable loss of information or corruption of the semantics. It should also be possible to query the results of a translation in terms of the target model and it should be possible to share vocabularies across the two paradigms.

[RDF-Schema] and [OWL] are considered relevant to this work to the extent that the classes and properties they define are supportive of its goals. However, it is explicitly not a goal of the current work to enable the general use of RDF Schema and OWL with Topic Maps, although this issue may be addressed later.

This document is aimed at anyone with an interest in the problem of RDF/Topic Maps interoperability and a willingness to acquire the necessary understanding of both models. In particular it targets authors of topic maps and RDF documents; creators of tools for translating between RDF and Topic Maps; and those who seek reassurance that data can be easily reused across the two paradigms. The reader is expected to be familiar with both RDF and Topic Maps to a level that at least corresponds to the tutorial material in [Pepper 00] and [RDF-Primer]. To fully understand Chapter 5, the reader must in addition be familiar with the models described in [TMDM] and [RDF-Semantics], and the syntaxes described in [LTM] and [N3].

1.3 Structure of this document

This document starts by stating the requirements that the Guidelines are intended to fulfill. This is followed by an informative prose description of the mapping between the RDF and Topic Maps models that underpins the interoperability guidelines. The description is structured by concept, starting with the most general concepts (things, proxies, assertions, etc.) and ending up with concepts that are specific to one paradigm or the other (e.g., scope and language tags).

Chapter 4 contains guidelines for authoring RDF and Topic Maps, respectively. These are expressed as succinctly as possible so that they can be easily referenced. The rationale for these guidelines is to be found in Chapter 3.

Finally, Chapter 5 provides a formal exposition of rules for performing automated translations from RDF to Topic Maps and vice versa based on the data models described in [TMDM] and [RDF-Semantics]. Once again, these rules are expressed as succinctly as possible for ease of reference and the rationale for them is to be found in Chapter 3.

1.4 Glossary

[RDF and Topic Maps will often be referred to as 'the paradigms'.]

Guided translation
...
Mapping information
...
RDF2TM
...
Round-trip
...
TM2RDF
...
Unguided translation
...

2. Requirements

This document should provide Guidelines such that the following requirements are satisfied:

MUST

  1. Data originating in one paradigm must merge cleanly with data originating in the other.
  2. Vocabularies must be reusable across the two paradigms.
  3. Queries written against one model must be usable with data translated from the other.
  4. Useful translations must be possible in the absence of mapping information specifically intended to guide the translation.
  5. It must be possible to provide specific mapping information in order to achieve an optimal (guided) translation.
  6. Constructs that cannot be handled by the translation mechanism (if any) must be specified in the Guidelines.
  7. Advice must be given to authors on how to ensure maximum interoperability.
  8. The results of a translation must be deterministic.
  9. The Guidelines themselves must not reference any namespaces apart from RDF, RDFS, OWL, XSD, TM, and RDFTM except to provide examples. (Guidance must be able to reference any namespace.)

SHOULD

  1. Round-tripping should be possible with both guided and unguided translations.
  2. It should be possible to implement the translation using event-based processing.
  3. Properties and classes defined in RDF, RDFS, OWL, and TMDM should be used where possible in order to aid and guide translations.
  4. There should be just one vocabulary that covers both directions and can be expressed in either RDF or Topic Maps.
  5. The results of a guided and an unguided translation should be as similar as possible.
  6. The translation mechanism should be capable of handling every possible construct in the source paradigm.

[Explicit statement that naturalness has higher priority than completeness?]

[Should translation be allowed to fail when authoring guidelines have not been followed?]

Examples of edge cases that should not be catered for:

  • Associations of same type use different pairs of role types
  • Untyped associations and occurrences (?) [could define an rdftm property for these "types": rdftm:untypedRelationship, or use rdfs:seeAlso]
  • Reified association roles

3. Informal description

[This chapter is informative: It should provide a readable, informal, but essentially complete overview of how constructs in the two paradigms relate to each other. Where possible, section headings should use neutral terminology. Details should be left to Chapter 5, Translation guidelines.]

The mapping mechanism will consist of properties in the rdftm: namespace that can be easily translated into TMs. The guidance will consist in expressing some properties in RDF vocabularies as subproperties of these rdftm: properties. Some classes will also be defined.

The following classes and properties have so far been identified:

  • rdftm:name (?)
  • rdftm:default-name
  • rdftm:subjectIdentifier
  • rdftm:subject-role
  • rdftm:object-role
  • rdftm:variant
  • rdftm:scope
  • rdftm:InformationResource
  • rdfs:subClassOf rdfs:Resource
  • owl:equivalentClass tmdm:InformationResource (?)

3.1 Things and proxies

There is a fundamental equivalence between subjects and resources. This equivalence may be refined as follows:

  1. Topics are to subjects as RDF nodes (excluding literals) are to resources. Subjects and resources are the "things" (entities, concepts, documents, whatever) about which assertions are made. Topics and RDF nodes (excluding literals) are the corresponding "proxies" that represent subjects and resources within the Topic Maps and RDF models respectively.
  2. In RDF, the distinction between the "thing" and its "proxy" tends to be blurred. In what follows we will therefore use the term "resource" rather than "RDF node" in order to stay closer to everyday RDF parlance.

Given the above, topics that have one or more characteristics should always be mapped to resources, since a topic only exists in order to make assertions about its subject and the only way to make an assertion in RDF is to create a statement whose subject is a resource. Resources which are the subjects of RDF statements should always be mapped to topics. However, resources which are only objects of statements may not always be mapped to topics. (This will be discussed further below.)

ISSUE: What to do about topics that have no characteristics?

3.2 Identity

Resource URIs, subject identifiers, subject locators, source locators. Blank nodes.

      Both topics and resources may use URI references (or URIrefs) as identifiers (the term URIref is used here in accordance with W3C usage to mean a URI with an optional fragment identifier). However, in Topic Maps there are two ways in which a URIref can be used to identify a subject: directly, as the actual address (or locator) of the subject, in which case it is called a "subject locator"; or indirectly, as the locator of an information resource that provides some human-interpretable indication of the subject, in which case it is called a "subject identifier". It is always clear, in both the model and the interchange syntax, whether the URIref is a subject locator or a subject identifier.

      RDF does not make this distinction explicitly. The question therefore arises, when going from RDF to Topic Maps, whether to map the URIref of a resource to a subject locator or to a subject identifier; and, conversely, when going from Topic Maps to RDF, whether to map subject locators or subject identifiers (or neither, or both) to the URIrefs of resources.

      Any solution which favours one type of identifier (say, subject identifiers) will lead to unnatural results with the identifiers of the other type. These guidelines therefore suggest a solution that retains some of the ambiguity of the RDF approach while at the same time preserving enough information to be able to perform roundtripping. The solution hinges on the assumption that topics with subject locators are explicitly or implicitly instances of the class InformationResource.

      The rules are as follows:

      • TM2RDF
        • If the topic has one or more subject locators, one subject locator (chosen at random) becomes the URI of the resource, and the resource is typed as tm:InformationResource. Additional subject locators become owl:sameAs properties. Any subject identifiers become rdftm:subjectIdentifier properties.
        • If the topic has one or more subject identifiers and no subject locators, one subject identifier (chosen at random) becomes the URI of the resource. Additional subject identifiers become owl:sameAs properties. (ISSUE 1: Perhaps inappropriate use of owl:sameAs? ISSUE 2: What if topic is already an instance-of tm:InformationResource? Roundtripping gets screwed up...)
        • Source locators (item identifiers?) are thrown away. (? Make sure we really want to do that.)
        • Topics with neither subject identifier nor subject locator become blank nodes.
      • RDF2TM
        • If the type of the resource is a subclass of rdftm:InformationResource, the URI becomes a subject locator. Any owl:sameAs properties become additional subject locators. The values of properties that are subproperties of rdftm:subjectIdentifier become subject identifiers.
        • If the type of the resource is not a subclass of rdftm:InformationResource, the URI becomes a subject identifier. Any owl:sameAs properties become additional subject identifiers.
        • Blank nodes become topics with no identifier.
        • ISSUE: Uncertainty about the usage of owl:sameAs.
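
The TM2RDF identity rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the proposal: the function name `tm2rdf_identity` and the representation of statements as plain string tuples are hypothetical, and the sketch sorts the locators instead of choosing one at random so that the output is repeatable.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def tm2rdf_identity(subject_locators: List[str],
                    subject_identifiers: List[str]
                    ) -> Tuple[Optional[str], List[Triple]]:
    """Return (node URI, identity statements); the URI is None for a blank node."""
    triples: List[Triple] = []
    if subject_locators:
        # Assumption: sort rather than pick at random, to get a repeatable result.
        uri, *extra = sorted(subject_locators)
        triples.append((uri, "rdf:type", "tm:InformationResource"))
        # Additional subject locators become owl:sameAs statements.
        triples += [(uri, "owl:sameAs", other) for other in extra]
        # Any subject identifiers become rdftm:subjectIdentifier statements.
        triples += [(uri, "rdftm:subjectIdentifier", si)
                    for si in sorted(subject_identifiers)]
        return uri, triples
    if subject_identifiers:
        uri, *extra = sorted(subject_identifiers)
        # Additional subject identifiers become owl:sameAs statements.
        triples += [(uri, "owl:sameAs", other) for other in extra]
        return uri, triples
    # Neither subject locator nor subject identifier: blank node.
    return None, triples
```

Calling `tm2rdf_identity(["http://en.wikipedia.org/Tosca"], [])` yields the Wikipedia URI together with a single rdf:type statement, which corresponds to the first example below.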

      Examples

        [wikipedia-tosca = "Wikipedia page about Tosca"
                          %"http://en.wikipedia.org/Tosca"]
      
        <http://en.wikipedia.org/Tosca>
          rdfs:label "Wikipedia page about Tosca" ;
          rdf:type  tm:InformationResource .
      
      ------------------------------------------------------
      
        [wikipedia-tosca = "Wikipedia page about Tosca"
                          %"http://en.wikipedia.org/Tosca"
                          @"http://psi.ontopia.net/wikipedia/Tosca-page"
                          @"http://psi.unibo.net/wikipedia/Tosca-page" ]
      
        <http://en.wikipedia.org/Tosca>
          rdfs:label "Wikipedia page about Tosca" ;
          rdf:type  tm:InformationResource ;
          rdftm:subjectIdentifier <http://psi.ontopia.net/wikipedia/Tosca-page> ,
                                  <http://psi.unibo.net/wikipedia/Tosca-page> .
      
      ------------------------------------------------------
      
        [wikipedia-tosca = "Wikipedia page about Tosca"
                          %"http://en.wikipedia.org/Tosca"
                          %"http://207.142.131.214/Tosca" ]
      
        <http://en.wikipedia.org/Tosca>
          rdfs:label "Wikipedia page about Tosca" ;
          rdf:type   tm:InformationResource ;
          owl:sameAs <http://207.142.131.214/Tosca> .
      
      ------------------------------------------------------
      
        [wikipedia-tosca = "Wikipedia page about Tosca"
                          @"http://psi.ontopia.net/wikipedia/Tosca-page"
                          @"http://psi.unibo.net/wikipedia/Tosca-page" ]
      
        <http://psi.ontopia.net/wikipedia/Tosca-page>
          rdfs:label  "Wikipedia page about Tosca" ;
          owl:sameAs  <http://psi.unibo.net/wikipedia/Tosca-page> .
      
      
      

      3.3 Assertions, types, and properties

      General correspondence between TM assertions and RDF statements.

      General correspondence between the type of a TM assertion and the property of an RDF statement.

      Untyped occurrences and associations use the TMDM untyped-foo subject identifiers as properties.

      It should not be possible to perform transforms between vocabularies.

      3.4 Names

      There are two basic, alternative approaches to untyped names:

      • Always translate untyped base names to the same property, e.g., rdfs:label, rdftm:name, tmdm:untyped-name.
        PRO: Simpler mapping.
        CON: Less natural result.
      • Translate untyped base names to different properties depending on the type of the topic/resource.
        PRO: More natural result.
        CON: More complex mapping.

      Apart from untyped names, the rule in 3.3 applies: name types map to properties and properties map to name types.

      Conclusion regarding variants:

      Having considered three alternatives for variant names we reach the following conclusions:

      • Any solution that inserts an extra node between a resource and the literal that is its name is very unnatural and should be avoided. This includes both the use of a complex object to represent the name as a whole and the use of collections or containers.
      • Some usages of collections/containers lead to losing the connection between the base name and its variant(s) and thus impact roundtripping.
      • The only alternative is to use a single property for the base name and to reify the statement in order to attach variants. This requires the following additional properties:
        • rdftm:variant
        • rdftm:scope

      Considerations

      ACTION: Lars Marius to post to SWBPD enquiring about the semantics of rdfs:label.

      Examples

      [puccini = "Giacomo Puccini"]
      
      concert-x
        dc:title   "Concert X" .
      composer-y
        foaf:name  "Composer Y" .
      [ skos:prefLabel "Economic cooperation" ;
        skos:altLabel  "Economic co-operation" ] .
      <http://www.megginson.com/exp/id/airports/ABBN>
        apt:name "BRISBANE , AUSTRALIA" .
      
      [concert-x = "Concert X" : dc:title]
      
      <topic id="concert-x">
        <baseName>
          <instanceOf><subjectIndicatorRef xlink:href="dc:title"/></instanceOf>
          <baseNameString>Concert X</baseNameString>
        </baseName>
      </topic>
      
      <topic id="concert-x">
        <baseName>
          <baseNameString>Concert X</baseNameString>
        </baseName>
      </topic>
      
      ----------------------------------------------------------------------
      
      

      3.5 Relationships

      3.5.1 Statements

      In unguided translation, translate to an internal occurrence if the object is a literal; translate to an association if the object is a URIref or a blank node. [[We could say translate to an external occurrence if the object is not the subject of some other statement. Do we want to do that?]]
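
A minimal Python sketch of this unguided rule follows. The classes `Literal`, `URIRef`, and `BlankNode` and the function `unguided_target` are hypothetical stand-ins for whatever node types an RDF toolkit provides; the bracketed question about external occurrences is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class URIRef:
    uri: str


@dataclass(frozen=True)
class BlankNode:
    label: str


def unguided_target(obj) -> str:
    """Name the Topic Maps construct an RDF statement maps to, given its object."""
    if isinstance(obj, Literal):
        return "internal occurrence"
    if isinstance(obj, (URIRef, BlankNode)):
        return "association"
    raise TypeError(f"not an RDF node: {obj!r}")
```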

      ISSUES:

      1. Should it be possible to map statements to anything except names, occurrences and associations (except for mappings that are predefined by RDFTM, e.g. for owl:sameAs)? Specifically, what about skos:subjectIndicator? (Resolution: Allowing any property, except one explicitly privileged by the Guidelines, such as tm:SubjectIdentifier, to become a subject identifier or a subject locator leads to problems with roundtripping. Therefore we don't allow this. ISSUE: What other alternatives exist?)
      2. What should the mapping vocabulary look like? Assuming we go for RDF: Should we use rdftm:maps-to (as in Garshol), or rdfs:subPropertyOf? In either case, possible values could be limited to tm:Name or tm:Occurrence (assuming the answer to the previous question is "no"). tm:Association is not necessary if we use the literal/non-literal distinction above.
      3. Is it then an error to map a statement whose value is not a literal to a name? (If so, variant names would constitute an exception to this, assuming we use blank nodes.)

      3.5.2 Binary associations

      Accepted as described in 3.5.2.1. It is not an error for the values of subject-role and object-role to be the same. This implies a symmetric relationship, in which case the assignment of subject/object roles is irrelevant.

      Unguided translations not yet considered.

      3.5.2.1 Guided RDF2TM and TM2RDF

      Use rdftm:subject-role and rdftm:object-role to specify correspondence between role types and subject/object:

          bio:born-in  rdftm:subject-role  geo:person .
          bio:born-in  rdftm:object-role   geo:place .
      

      This caters for RDF2TM and TM2RDF. Given

         born-in( puccini : person, lucca : place )
      

      a translator can figure out that puccini is the subject. For symmetric relationships, one can only guess. [[Result is non-deterministic. Is this a problem?]]
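
The guided TM2RDF direction might be sketched as below. The names `ROLE_MAP` and `association_to_statement` are hypothetical, statements are plain tuples, and for symmetric relationships the sketch sorts the players to get a repeatable result, which is an assumption of this example and not a decision of the draft.

```python
from typing import Dict, List, Tuple

# Mapping information: property -> (subject role type, object role type).
ROLE_MAP: Dict[str, Tuple[str, str]] = {
    "bio:born-in": ("geo:person", "geo:place"),
}


def association_to_statement(assoc_type: str,
                             roles: List[Tuple[str, str]],
                             role_map: Dict[str, Tuple[str, str]]
                             ) -> Tuple[str, str, str]:
    """Turn a binary association, given as (player, role type) pairs, into a statement."""
    if len(roles) != 2:
        raise ValueError("only binary associations are handled here")
    subject_role, object_role = role_map[assoc_type]
    if subject_role == object_role:
        # Symmetric relationship: the assignment is irrelevant, so sort the
        # players (assumption) to keep the output repeatable.
        (first, _), (second, _) = sorted(roles)
        return (first, assoc_type, second)
    players = {role: player for player, role in roles}
    return (players[subject_role], assoc_type, players[object_role])
```

With the mapping above, the association `born-in( puccini : person, lucca : place )` gives `("puccini", "bio:born-in", "lucca")` regardless of the order in which the roles are listed.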

      3.5.2.2 Unguided RDF2TM

      Use rdftm:Subject and rdftm:Object as role types.

      3.5.2.3 Unguided TM2RDF

      Fall back to same approach as n-ary associations.

      ISSUES:

      1. What happens when a given association type is used inconsistently w.r.t. role types?

      3.5.3 Occurrences

      TM2RDF unproblematic: all occurrences map to statements whose values are either literals or URIs/blank nodes.

      RDF2TM: any property for which the guidance does not lead to interpretation as a name or an association (or an identifier?) is regarded as an occurrence.

      ISSUES:

      1. How to achieve round-tripping in unguided translation:
        • In TM2RDF an external occurrence becomes a statement whose object is a resource
        • In RDF2TM, a statement whose object is a resource becomes an association

        Perhaps by generating a mapping statement when doing TM2RDF:

           {tosca, synopsis, "http://www.azopera.com/learn/synopsis/tosca.shtml"}
        

        becomes

           tosca      synopsis       "http://www.azopera.com/learn/synopsis/tosca.shtml" .
           synopsis   rdftm:maps-to  rdftm:Occurrence .
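
How a generated mapping statement could restore the round trip is sketched below in Python. The function names `occurrence_to_rdf` and `target_construct` are hypothetical and statements are plain tuples; the sketch only shows the idea floated in this issue, not an agreed mechanism.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def occurrence_to_rdf(topic: str, occ_type: str, locator: str) -> List[Triple]:
    """TM2RDF: emit the statement plus a mapping statement for its property."""
    return [
        (topic, occ_type, locator),
        (occ_type, "rdftm:maps-to", "rdftm:Occurrence"),
    ]


def target_construct(prop: str, triples: List[Triple]) -> str:
    """RDF2TM: a resource-valued statement is an occurrence only if so mapped."""
    if (prop, "rdftm:maps-to", "rdftm:Occurrence") in triples:
        return "occurrence"
    return "association"
```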
        

      3.5.4 Type-instance relationships

      Unproblematic. rdf:type corresponds to type-instance.

      3.5.5 Supertype-subtype relationships

      Unproblematic. rdfs:subClassOf corresponds to supertype-subtype.

      3.5.6 Unary relationships

      Define rdftm:TRUE and use (binary) statement:

      TM "unfinished(turandot)" becomes RDF "turandot unfinished rdftm:TRUE ."

      Tentatively accepted, but doubts regarding acceptability to RDF community since it seems to be more natural to define classes for this kind of thing (e.g. the class of unfinished things). This needs to be clarified on the list.
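
For illustration only, the tentatively accepted approach can be written as a pair of Python helpers. The names `unary_to_statement` and `statement_to_unary` are hypothetical, and statements are plain tuples.

```python
from typing import Optional, Tuple

TRUE = "rdftm:TRUE"


def unary_to_statement(assoc_type: str, player: str) -> Tuple[str, str, str]:
    """TM2RDF: a unary association becomes a statement whose object is rdftm:TRUE."""
    return (player, assoc_type, TRUE)


def statement_to_unary(triple: Tuple[str, str, str]) -> Optional[Tuple[str, str]]:
    """RDF2TM: return (association type, player), or None if not a unary pattern."""
    subject, prop, obj = triple
    return (prop, subject) if obj == TRUE else None
```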

      3.5.7 N-ary relationships

      Adhere to Noy's n-ary patterns.

      No agreement reached as yet. Doubt as to whether Pattern 1 is really the equivalent of n-ary association. Definitely prefer Pattern 2. Noy's list pattern should probably not be supported.

      X( A : rA , B : rB , C : rC )
      

      could become one of the following:

      P rdf:type X .   P rA A .   P rB B .   P rC C .   # Pattern 2
      P rdf:type X .   A rA P .   P rB B .   P rC C .   # Pattern 1
      

      In Pattern 1 role-player A is somehow privileged in that it is the subject of the relationship, whereas in Pattern 2 all role-players have the same status. There is no way to know how to pick the privileged role-player in an unguided translation.

      Pattern 2 should therefore be the default for TM2RDF since it requires no additional information. Pattern 1 can be supported using rdftm:subject-role:

      If X rdftm:subject-role rA . then choose Pattern 1, otherwise choose Pattern 2. Either pattern will produce X( A : rA , B : rB , C : rC ). Pattern 1 should lead to the generation of the rdftm:subject-role statement in order to enable roundtripping.
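
The choice between the two patterns might look as follows in Python. This is a sketch under assumptions: the function `nary_to_rdf` is hypothetical, statements are plain tuples, and the identifier of the node P is passed in by the caller.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def nary_to_rdf(assoc_type: str,
                roles: List[Tuple[str, str]],
                node: str,
                subject_role: Optional[str] = None) -> List[Triple]:
    """Translate an n-ary association given as (player, role type) pairs.

    Without subject_role the result follows Pattern 2; with it, the player of
    that role becomes the subject of its statement (Pattern 1).
    """
    triples: List[Triple] = [(node, "rdf:type", assoc_type)]
    for player, role in roles:
        if role == subject_role:
            triples.append((player, role, node))   # privileged role-player
        else:
            triples.append((node, role, player))
    if subject_role is not None:
        # Generate the mapping statement to enable roundtripping.
        triples.append((assoc_type, "rdftm:subject-role", subject_role))
    return triples
```

Calling `nary_to_rdf("X", [("A", "rA"), ("B", "rB"), ("C", "rC")], "P")` reproduces the Pattern 2 line above; adding `subject_role="rA"` reproduces the Pattern 1 line plus the `X rdftm:subject-role rA` statement.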

      3.6 Datatypes

      Should be straightforward, one-to-one, but check latest TMDM.

      3.7 Reification

      Should be no problem when reifying a binary relationship, but may be problematic with re-represented relationships as in Noy's Pattern "2". (If the point of reifying the relationship is to be able to make statements about it, how do we distinguish between statements that are part of the n-ary relationship and those that are not, when going from RDF to TM?)

      General guidelines

      Should be straightforward.

      Reified topic map/RDF model

      [A special case? RDF has a mechanism for attaching properties to an RDF model or document. Does anyone remember where to find it?]

      3.8 Scope

      Based on reification. As consistent as possible.

      Language tags

      Special case names and internal occurrences that are scoped by a single natural language? - Yes. Need to define namespace for RFC3066 codes. See http://www.faqs.org/rfcs/rfc3066.html

      Probably unproblematic but needs further discussion and examples. Probable solution is to use rdftm:scope property on a reified relationship.

      Language tags are indeed special. Need to define namespace for RFC3066 codes. See http://www.faqs.org/rfcs/rfc3066.html. Mappings to OASIS PSIs? What about XTM PSIs? (Good opportunity to kill them?)

      3.9 Other issues

      What about collections and containers? Any other issues?

      4. Authoring guidelines

      4.1 Guidelines for authors of RDF

      Best practices, including provision of mapping information.

      4.2 Guidelines for authors of Topic Maps

      Best practices, including provision of mapping information.

      5. Translation guidelines (rules?)

      This chapter should be normative.

      5.1 RDF to Topic Maps

      [This section contains a formal description of the translation procedure]

      [Not sure which RDF spec is the best starting point for this. Perhaps RDF Concepts?]

      5.2 Topic Maps to RDF

      [This section contains a formal description of the translation procedure]

      [Could it be specified as an algorithm for processing a TMDM model, i.e., item type by item type? An attempt follows. Whether or not one agrees with the actual translation procedure that this attempt specifies, the attempt itself does seem to indicate that this approach to formally describing the procedure might work.]

      Topic map items

      1. [Whatever has to be done first (reification?)]
      2. Perform topic item processing on each topic item.

      Topic items

      FOREACH topic item:

      1. Create an RDF node and use it as the subject of all the statements specified below unless some other subject is explicitly stated.
      2. IF the item has [subject locators] properties, THEN
        1. choose one [subject locators] property at random and use its value as the URI of the RDF node;
        2. for each additional [subject locators] property, create a corresponding owl:sameAs statement;
        3. for each [subject identifiers] property, create a corresponding rdftm:subjectIdentifier statement;
        4. create an rdf:type statement whose value is rdftm:InformationResource;
        OTHERWISE, IF the item has [subject identifiers] properties, THEN
        1. choose one [subject identifiers] property at random and use its value as the URI of the RDF node;
        2. for each additional [subject identifiers] property, create a corresponding owl:sameAs statement.
      3. Perform topic name item processing on each topic name item.
      4. Perform occurrence item processing on each occurrence item.
      5. Perform association role item processing on each association role item.

      Topic name items

      FOREACH topic name item:

      1. IF the item has a [variants] property, THEN
        1. Create an rdftm:topicName statement (/S/) whose value is a blank node (/B/);
        2. IF the item has a [type] property, THEN
          1. using /B/ as subject, create a statement whose property is the URI (/U/) of the node to which the topic item that is the value of the [type] property gave rise{1}, and whose value is the [value] property of the topic name item;
          2. create an rdf:type statement whose subject is /U/ and whose value is rdfs:label;
          OTHERWISE
          1. using /B/ as subject, create an rdfs:label statement whose value is the [value] property of the topic name item;
        3. do variant item processing on each of the [variants] properties;
        OTHERWISE
      2. IF the item has a [type] property, THEN
        1. create a statement (/S/) whose property is the URI (/U/) of the node to which the topic item that is the value of the [type] property gave rise{1}, and whose value is the [value] property of the topic name item;
        2. create an rdf:type statement whose subject is /U/ and whose value is rdfs:label;
        OTHERWISE
        1. create an rdfs:label statement (/S/) whose value is the [value] property of the topic name item.
      3. IF the item has a [scope] property, THEN
        1. reify /S/ and make the rdf:Statement node the subject of one rdftm:scope statement for each topic item in the value of the [scope] property;
        2. set the value of each rdftm:scope statement to the URI of the node to which the corresponding topic item in the value of the [scope] property gave rise{1}.
      4. IF the item has a [reified] property, THEN
        1. IF /S/ is not already reified, THEN reify /S/;
        2. set the URI of the rdf:Statement node of the reified statement /S/ to the URI of the node to which the topic item that is the value of the [reified] property gave rise{1}.
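
Step 3 of the procedure above (reifying /S/ and attaching scope) can be sketched in Python as follows. The function `reify_with_scope` and the default node label are hypothetical; the reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object) is the standard RDF one, and rdftm:scope is the property proposed in this draft.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def reify_with_scope(statement: Triple,
                     scope_uris: List[str],
                     node: str = "_:s1") -> List[Triple]:
    """Reify a statement and add one rdftm:scope statement per scoping topic."""
    subject, prop, obj = statement
    triples: List[Triple] = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", prop),
        (node, "rdf:object", obj),
    ]
    # Each theme in the [scope] property yields one rdftm:scope statement.
    triples += [(node, "rdftm:scope", theme) for theme in scope_uris]
    return triples
```

Step 4 would then only need to replace the node label with the URI of the reifying topic's node, where one exists.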

      Occurrence items

      1. ...

      Association role items

      1. ...

      ISSUES

      {1} What if the node in question is a blank node?

      6. Conclusion

