I have some drawings by Rembrandt in a British Museum database, and some paintings by Rembrandt in an RKD database. He's referred to as bm-people:Rembrandt in one and rkd-artists:rembrandt in the other. I use CIDOC CRM, so Rembrandt is related through crm:P14i_performed to the drawing/painting Production events.

The current representation is:

bm-people:Rembrandt a skos:Concept;
  skos:inScheme bm-people: ; skos:prefLabel "Rembrandt".
rkd-artists:rembrandt a skos:Concept;      
  skos:inScheme rkd-artists: ; skos:prefLabel "Rembrandt".

Use cases:

  1. The user can search for artists using a thesaurus auto-complete function. It relies on skos:inScheme and a further property of the ConceptScheme to collect all values for completion.
  2. If the user selects a value Rembrandt that's correlated between the two thesauri, he should see all works by Rembrandt, no matter which Rembrandt URI they relate to.

Aside: If you look in VIAF, you'll see Rembrandt correlated between 19 sources, including national libraries and Getty ULAN. There's a thousand names and a bunch of extra info about him.

If you look at the VIAF RDF record, you'll see he's represented as a foaf:Person (with a bunch of names). There are also 19 skos:Concepts from the 19 sources (including more names as skos:altLabel), which link back to the main VIAF URI using foaf:focus:

<skos:Concept rdf:about="http://viaf.org/viaf/sourceID/BNF%7C11940484#skos:Concept">
  <foaf:focus rdf:resource="http://viaf.org/viaf/64013650"/>
</skos:Concept>

There are also some owl:sameAs links equating the foaf:Person to URIs in other known sources:

<owl:sameAs rdf:resource="http://dbpedia.org/resource/Rembrandt"/>
<owl:sameAs rdf:resource="http://d-nb.info/gnd/11859964X"/>
<owl:sameAs rdf:resource="http://libris.kb.se/resource/auth/197544"/>
<owl:sameAs rdf:resource="http://www.idref.fr/027341925/id"/>

It would have been great if the BM and RKD thesauri were correlated to VIAF, but they're not. (End of aside.)

The question: which is the best way to correlate skos:Concepts representing the same resource?

I know that both the SKOS Primer and Reference say one should use skos:exactMatch and not owl:sameAs. They warn of "undesirable entailments that would follow from using owl:sameAs" but the example they give is a bit weak: "a concept cannot possess two different preferred labels in the same language". Frankly, that doesn't scare me much:

  • when both labels coincide that doesn't matter
  • when you use all prefLabels and altLabels for auto-complete, that's a plus

If I use skos:exactMatch and not owl:sameAs, I'll face these difficulties:

  1. I need to merge the labels of these concepts, because it would be silly to offer duplicate labels
  2. I lose the OWLIM sameAs optimization
  3. I need to implement custom inferences, e.g.:

crm:P14i_performed owl:propertyChainAxiom (skos:exactMatch crm:P14i_performed).
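
To make the effect of that axiom concrete, here is a minimal sketch in plain Python (no triple store) of what a reasoner would be expected to derive from it. It assumes triples are simple tuples of prefixed-name strings, treats skos:exactMatch as symmetric, and uses hypothetical production URIs; it is an illustration of the rule, not how OWLIM implements it.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed-name strings.
EXACT = "skos:exactMatch"
PERFORMED = "crm:P14i_performed"

data = {
    ("bm-people:Rembrandt", EXACT, "rkd-artists:rembrandt"),
    # Hypothetical production URIs, for illustration only.
    ("bm-people:Rembrandt", PERFORMED, "bm:drawing/1/production"),
    ("rkd-artists:rembrandt", PERFORMED, "rkd:painting/2926/production"),
}


def propagate_performed(triples):
    """Apply ?x skos:exactMatch ?y . ?y crm:P14i_performed ?z
    => ?x crm:P14i_performed ?z, repeating until nothing new appears."""
    result = set(triples)
    # skos:exactMatch is symmetric, so materialize the inverse links first.
    result |= {(o, EXACT, s) for (s, p, o) in triples if p == EXACT}
    while True:
        matches = [(s, o) for (s, p, o) in result if p == EXACT]
        performed = [(s, o) for (s, p, o) in result if p == PERFORMED]
        new = {
            (x, PERFORMED, z)
            for (x, y) in matches
            for (y2, z) in performed
            if y == y2
        }
        if new <= result:
            return result
        result |= new


def works_of(triples, artist):
    """All production events the given URI performed, after inference."""
    return {o for (s, p, o) in triples if s == artist and p == PERFORMED}


inferred = propagate_performed(data)
print(sorted(works_of(inferred, "bm-people:Rembrandt")))
print(sorted(works_of(inferred, "rkd-artists:rembrandt")))
```

Both calls list the same two production events, which is the behaviour use case 2 asks for. Note that only crm:P14i_performed is propagated; every other business property would need its own chain axiom.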

What's your advice?

asked 05 May '12, 14:04

Vladimir Ale...

edited 05 May '12, 14:06

Maybe I haven't expressed my concern well...

IMHO the main purpose of thesauri is to provide controlled (well-known) URIs for things, to be used in business data. The internal organization of a thesaurus is a secondary concern.

Notwithstanding that sameAs can lead to unintended consequences, it's the standard way to say two things are the same. E.g., the Europeana enrichment (links) uses sameAs to geonames, dbpedia, etc.

(08 Nov '12, 05:50) Vladimir Ale...

owl:sameAs has potentially unwelcome inferences. Imagine that I have a term "Monkey" which was added to my thesaurus today, and I claim it's owl:sameAs the concept "Monkey" in a dictionary that somebody else published many years ago.

{ thesaurus:Monkey a skos:Concept .
  thesaurus:Monkey skos:inScheme books:Thesaurus .
  thesaurus:Monkey skos:changeNote "Added 5 May 2012."@en .
  thesaurus:Monkey owl:sameAs dictionary:Monkey .
  dictionary:Monkey a skos:Concept .
  dictionary:Monkey skos:inScheme books:Dictionary . }
      => { dictionary:Monkey skos:inScheme books:Thesaurus .
           dictionary:Monkey skos:changeNote "Added 5 May 2012."@en . } .

The skos:Concept for a monkey doesn't represent an actual monkey, or the class of all monkeys. Think of it more as representing an entry in a thesaurus, dictionary, encyclopaedia or catalogue.
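
The entailment in the Monkey example can be reproduced with a small sketch. It is a naive, assumed reading of owl:sameAs (every statement about one URI is copied to its equals, in subject and object position) over tuples of strings, not a full OWL reasoner; language tags on literals are dropped for brevity.

```python
SAME_AS = "owl:sameAs"

monkey = {
    ("thesaurus:Monkey", "rdf:type", "skos:Concept"),
    ("thesaurus:Monkey", "skos:inScheme", "books:Thesaurus"),
    ("thesaurus:Monkey", "skos:changeNote", "Added 5 May 2012."),
    ("thesaurus:Monkey", SAME_AS, "dictionary:Monkey"),
    ("dictionary:Monkey", "rdf:type", "skos:Concept"),
    ("dictionary:Monkey", "skos:inScheme", "books:Dictionary"),
}


def smush(triples):
    """Copy every statement across owl:sameAs links until a fixed point is reached."""
    result = set(triples)
    while True:
        pairs = {(s, o) for (s, p, o) in result if p == SAME_AS}
        pairs |= {(b, a) for (a, b) in pairs}  # sameAs is symmetric
        new = set()
        for a, b in pairs:
            for s, p, o in result:
                if s == a:
                    new.add((b, p, o))  # substitute in subject position
                if o == a:
                    new.add((s, p, b))  # substitute in object position
        if new <= result:
            return result
        result |= new


merged = smush(monkey)
# The dictionary entry now carries the thesaurus's bookkeeping data.
print(("dictionary:Monkey", "skos:inScheme", "books:Thesaurus") in merged)
print(("dictionary:Monkey", "skos:changeNote", "Added 5 May 2012.") in merged)
```

Both checks print True. Replacing owl:sameAs with skos:exactMatch in the input leaves the data unchanged under this function, since nothing is copied across a mapping link.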

answered 05 May '12, 17:34

tobyink ♦

I understand how owl:sameAs can wreak havoc on thesaurus maintenance data, but my concern is as a user of thesaurus data. Should I use the propertyChainAxiom trick described above?

BTW, SKOS can span ConceptSchemes: a concept can have several skos:inScheme values, and broaderMatch creates hierarchies spanning two schemes.

(08 May '12, 09:19) Vladimir Ale...

The matter has been discussed many times on W3C lists, so I won't add much more. The short answer is that, in the SKOS context, the SKOS matching properties make much more sense than owl:sameAs. owl:sameAs already existed when we finished SKOS; if it had been a better fit, we'd have picked it.

The arguments in the SKOS Primer (http://www.w3.org/TR/skos-primer/#secmapping) are not as weak as you claim. These are examples where using owl:sameAs results in invalid SKOS data. And not the kind of SKOS data only used for vocabulary management: having non-unique prefLabels will break many data consumption scenarios.
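
As a rough illustration of the prefLabel point, the following sketch checks the SKOS constraint that a concept has at most one skos:prefLabel per language. The labels and language tags are hypothetical (the question's two concepts happen to share the same label), and the merge step simply assumes both URIs denote one resource, as owl:sameAs would entail.

```python
from collections import defaultdict


def pref_label_conflicts(labels):
    """labels: iterable of (concept, text, lang) tuples.
    Returns {(concept, lang): texts} for every concept that has more than
    one distinct prefLabel in the same language."""
    seen = defaultdict(set)
    for concept, text, lang in labels:
        seen[(concept, lang)].add(text)
    return {key: texts for key, texts in seen.items() if len(texts) > 1}


# Hypothetical labels: the two sources prefer different forms of the name.
separate = [
    ("bm-people:Rembrandt", "Rembrandt", "en"),
    ("rkd-artists:rembrandt", "Rembrandt van Rijn", "en"),
]

# Under owl:sameAs both URIs denote one resource, so both labels attach to it.
merged = [("bm-people:Rembrandt", text, lang) for (_, text, lang) in separate]

print(pref_label_conflicts(separate))
print(pref_label_conflicts(merged))
```

The first call reports no conflicts; the second reports two English prefLabels on one concept, which is the kind of data a consumer expecting a single display label may not handle well.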

Side notes:

  • Europeana is using owl:sameAs, yes, but not between SKOS Concepts.
  • There is a foaf:focus property, which can be interesting for linking SKOS Concepts to the "real" entities they represent.

answered 10 Dec '12, 07:06

Antoine Isaac

Ok Antoine, I see foaf:focus explained at http://lists.w3.org/Archives/Public/public-esw-thes/2010Aug/0002.html and http://xmlns.com/foaf/spec/#term_focus

So if bm: is a BM thesaurus and rkd: is an RKD thesaurus, it would look like this:

bm:Rembrandt skos:exactMatch rkd:Rembrandt.
bm:Rembrandt foaf:focus dbpedia:Rembrandt.
rkd:Rembrandt foaf:focus dbpedia:Rembrandt.

And when we represent data in eg CRM we need to use dbpedia:

dbpedia:Rembrandt crm:P14i_performed rkd/painting/2926/production.

instead of

rkd:Rembrandt crm:P14i_performed rkd/painting/2926/production.

(10 Dec '12, 10:36) Vladimir Ale...

The discussion so far basically comes down to:

  • one can't use the most efficient way to equate individuals (sameAs) and/or

  • one shouldn't use skos:Concepts directly in data

Neither of these is a satisfactory answer for me...

(19 Dec '12, 12:31) Vladimir Ale...

Well, if you regard sameAs as the most efficient way to equate individuals, then this hints that you've already decided you can live with its drawbacks, or that you are not applying its full semantics (which I'd find surprising given your background), or that you do it with some "protection" (say, named graphs). If so, then of course you can try it and see whether there are bad consequences in your case. I've said that in general it seems dangerous for SKOS cases, and we felt we had to provide an alternative with a lower ontological commitment that would fit the observed data better and be somewhat safer. I've never said that you can't use sameAs.

(19 Dec '12, 18:47) Antoine Isaac

Yes, you can do it like that. In fact I'd say that you can also state

rkd:Rembrandt crm:P14i_performed rkd/painting/2926/production.

But that's because I'm comfortable with having a resource being both a skos:Concept and a (say) foaf:Person at the same time: the SKOS model does not say that concepts are distinct from persons. I know other people would strongly object to it. It's up to you.

(19 Dec '12, 18:50) Antoine Isaac

Overall I think owl:sameAs, with standard semantics, is not suitable for exchange across boundaries between semantic systems. If, for instance, you're scraping triples off the floor and throwing them into a processing chain, owl:sameAs will definitely give you entailments you don't want.

Now, inside a perimeter that you control, the story is different. owl:sameAs behaves a particular way in your triple store and if you like what owl:sameAs does, then go ahead and use it.

I say it that way, because you're using OWLIM, and OWLIM implements owl:sameAs in a way that may or may not be standards correct or mathematically correct but that is certainly "correct" for building real applications. The case for owl:sameAs is much weaker if you use other tools.

Personally I've dealt with these problems by creating a wrapper that normalizes identifiers that cross the system perimeter, but this doesn't address all the problems involved when two concepts in the KB get merged.
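
One way such a normalizing wrapper could look is sketched below. This is an assumption about the general idea, not the actual implementation referred to above: a small union-find structure maps every known alias to one canonical URI, and triples are rewritten as they enter the system. The class and method names are hypothetical.

```python
class Canonicalizer:
    """Maps equivalent URIs to a single canonical representative (union-find)."""

    def __init__(self):
        self._parent = {}

    def find(self, uri):
        """Return the canonical URI for `uri` (itself if it has no known alias)."""
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the walk directly at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def declare_same(self, a, b):
        """Record that `a` and `b` identify the same thing."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Arbitrary but deterministic choice: keep the smaller string.
            keep, drop = sorted((ra, rb))
            self._parent[drop] = keep

    def normalize(self, triple):
        """Rewrite subject and object to their canonical URIs."""
        s, p, o = triple
        return (self.find(s), p, self.find(o))


canon = Canonicalizer()
canon.declare_same("bm-people:Rembrandt", "rkd-artists:rembrandt")

incoming = ("rkd-artists:rembrandt", "crm:P14i_performed", "rkd:painting/2926/production")
print(canon.normalize(incoming))
```

With this approach the store only ever sees one URI per artist, so no sameAs reasoning is needed inside the perimeter. As noted above, it does not by itself resolve conflicts between the merged descriptions (labels, scheme membership and so on).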

answered 08 May '12, 15:58

database_animal ♦

Antoine, I am also comfortable with having a person be both Person and skos:Concept, but if I cannot use sameAs then there's little value in it being a skos:Concept.

It seems to me VIAF got it right.

First they have a main URI that's foaf:Person and all URIs used in data (eg national library bibliographies) are given as sameAs (or "=" in Turtle):

<http://viaf.org/viaf/99366184> a rdaEnt:Person, foaf:Person;
  = <http://dbpedia.org/resource/William_Temple_(archbishop)>,
    <http://d-nb.info/gnd/118756435>,
    <http://libris.kb.se/resource/auth/230284>,
    <http://www.idref.fr/033849587/id>;

This would mean that none of the source URIs is a skos:Concept.

Then they copy all labels from all the sources to foaf:name. Unfortunately they lose the preferredness info, but I don't know how they could pick one source's prefLabel as globally preferred...

foaf:name "Temple, William",
    "Temple, William, 1881-1944",
    "Temple, William, Abp. of Canterbury, 1881-1944",
    "Temple, William, archevêque",
    "Temple, William, vesc. di Manchester, 1881-1944",
    "William Canterbury, Archbishop 1881-1944".

Finally they have one skos:Concept per source, with foaf:focus to the main URI:

<http://viaf.org/viaf/sourceID/BAV%7CADV11296505#skos:Concept> a skos:Concept;
  skos:inScheme <http://viaf.org/authorityScheme/BAV>;
  skos:prefLabel "Temple, William, vesc. di Manchester, 1881-1944";
  foaf:focus <http://viaf.org/viaf/99366184>.
<http://viaf.org/viaf/sourceID/BNF%7C12466527#skos:Concept> a skos:Concept;
  skos:inScheme <http://viaf.org/authorityScheme/BNF>;
  skos:prefLabel "Temple, William, 1881-1944";
  foaf:focus <http://viaf.org/viaf/99366184>.

So they use skos:Concept and foaf:focus only for "thesaurus bookkeeping info", but it seems the intention is to use the main URI (and sameAs source URIs) in business data.

The BnF data model also uses foaf:focus: http://data.bnf.fr/images/graphe_complet.jpg

So... the answer is: don't use skos:Concept in business data, use other URIs that are amenable to sameAs. Or else, implement extra rules that propagate business relations across skos:exactMatch.

Note: it seems to me we need to assume the following rule:

{?term1 foaf:focus ?entity. ?term2 foaf:focus ?entity} =>
{?term1 skos:exactMatch ?term2}
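
A minimal sketch of that rule, again over plain tuples: concepts are grouped by their foaf:focus value and skos:exactMatch is emitted between every pair in a group. One deliberate difference from the N3 rule as written is that reflexive matches (a concept matching itself) are skipped here. The function name is hypothetical; the URIs are the VIAF ones quoted above.

```python
from collections import defaultdict
from itertools import permutations

FOCUS = "foaf:focus"
EXACT = "skos:exactMatch"

BAV = "http://viaf.org/viaf/sourceID/BAV%7CADV11296505#skos:Concept"
BNF = "http://viaf.org/viaf/sourceID/BNF%7C12466527#skos:Concept"
PERSON = "http://viaf.org/viaf/99366184"

viaf = {
    (BAV, FOCUS, PERSON),
    (BNF, FOCUS, PERSON),
}


def exact_matches_from_focus(triples):
    """Derive skos:exactMatch between concepts that share a foaf:focus."""
    by_entity = defaultdict(set)
    for s, p, o in triples:
        if p == FOCUS:
            by_entity[o].add(s)
    return {
        (a, EXACT, b)
        for terms in by_entity.values()
        for a, b in permutations(terms, 2)  # ordered pairs, no self-matches
    }


for triple in sorted(exact_matches_from_focus(viaf)):
    print(triple)
```

For the two VIAF concepts this yields the match in both directions. Whether the rule is legitimate is a separate question: it treats foaf:focus as identifying, which neither SKOS nor FOAF states.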
answered 29 Jan '13, 10:37

Vladimir Ale...

edited 29 Jan '13, 10:51

Yes, the VIAF pattern is a good one, but we had seen it already before. Note that it won't solve all your owl:sameAs problems anyway. Whatever the types of the URIs, owl:sameAs may have unintended effects when reconciling descriptions with overlapping statements.

On "business data", I'm not sure I get your interpretation: there's not a business level and a non-business level. If the SKOS layer does not fit any business case then don't express the data for it.

The rule seems interesting indeed. Some kind of "co-denotation" axiom. But as SKOS always refused to endorse a strictly extensional approach to concept definition, this axiom would need to be endorsed by the creators of the "extension" property, ie foaf:focus.

(29 Jan '13, 12:10) Antoine

I tried to separate and assign comments as best I could. Please keep answers as answers to the original question folks, thanks. If you want to comment on an answer, look for the "add a comment" button, not the "Post Your Answer" button.

(29 Jan '13, 14:06) Signified ♦
Asked: 05 May '12, 14:04

Seen: 3,061 times

Last updated: 29 Jan '13, 14:08