Named Graphs have thus far served a very useful purpose (esp. in SPARQL) by supporting notions of provenance in systems that want to store and track data from multiple sources.

The RDF Next Steps effort has sparked some discussion on the future direction of Named Graphs in the RDF standards. However, I do have some concerns about taking Named Graphs further. To clarify, I'm highlighting concrete questions as follows:

What are the benefits/disadvantages of formalising a semantics for named graphs?

A related question, wrt. the specifics of some proposed Named Graph semantics, is here. Do we really need to nail down a semantics for named graphs? What would this buy us in terms of current adoption/standards? Is it really useful for the prevalent provenance use-case and SPARQL? Will it enable better integration of SPARQL and RDF(S)/OWL semantics? Or is it really only necessary for multi-graph documents?

Can "multi-graph documents" co-exist with "single-graph documents" on the Web? If so, in what form and how?

Thus far, Named Graphs have mainly been used in the context of encoding the provenance of multiple single-graph documents. Besides the (non-trivial) issue of multi-graph syntax (for which reasonable non-standard solutions exist for the non-XML based syntaxes), publishing multi-graph documents is, for me, a bit of an unknown. It has obvious benefits for multi-graph exchange between provenance-aware agents (e.g., dumping a SPARQL endpoint, or dumping a site's content) and may be useful, e.g., for returning quads to SPARQL CONSTRUCT queries. However, stepping outside provenance, things get a bit murky.
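
To make the provenance use case concrete, here is a minimal Python sketch. It assumes made-up example.org documents and prefixed names, and models a quad store as a plain set of tuples rather than using any real RDF library: each source is an ordinary single-graph document, and the store records the document URL as the fourth element.

```python
# Hypothetical single-graph documents, keyed by the URL they were fetched from.
# Each document is just a set of (subject, predicate, object) triples.
documents = {
    "http://example.org/alice.rdf": {
        ("ex:Alice", "foaf:knows", "ex:Bob"),
    },
    "http://example.org/bob.rdf": {
        ("ex:Bob", "foaf:knows", "ex:Alice"),
        ("ex:Alice", "foaf:knows", "ex:Bob"),
    },
}


def to_quads(docs):
    """Flatten documents into (s, p, o, g) quads, where g names the source."""
    return {(s, p, o, url) for url, triples in docs.items() for (s, p, o) in triples}


def sources_of(quads, triple):
    """Answer "says who?" for one triple: list the graphs that contain it."""
    return sorted(g for (s, p, o, g) in quads if (s, p, o) == triple)


quads = to_quads(documents)
print(sources_of(quads, ("ex:Alice", "foaf:knows", "ex:Bob")))
```

Note that the same triple appears once per source, so the graph name is what keeps the two claims apart.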

Multi-graph documents are an obvious (albeit not ideal) replacement for unwieldy RDF reification for speaking about (sets of) triples... useful for annotating triples with, e.g., temporal validity, probabilistic information, etc.

However, multi-graph documents would constitute (as such) a new data model which is not backwards compatible with the RDF (single-graph) model—e.g., think of a graph which is annotated as expired for the current time. I understand the need for multi-graph exchange between trusting provenance-aware agents, but should we be encouraging more general publishing of such multi-graph documents? Are standard syntaxes for multi-graph documents a step too far? How can multi-graph documents co-exist with single-graph documents/tools out there now? Is there a restricted form of multi-graph document that is backwards compatible and still proves useful? Will we need to start talking about quints to track the provenance of multi-graph information?
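
The "expired graph" worry can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ex:validUntil` property, the data and both helper functions are hypothetical, and `None` stands in for the default graph.

```python
# Hypothetical multi-graph document: the default graph (None) annotates the
# named graph ex:G1 with an expiry date using a made-up ex:validUntil property.
quads = [
    ("ex:Alice", "ex:worksFor", "ex:Acme", "ex:G1"),
    ("ex:G1", "ex:validUntil", "2009-12-31", None),
]


def single_graph_view(quads):
    """What a graph-unaware tool effectively sees: the graph name is dropped."""
    return {(s, p, o) for (s, p, o, g) in quads}


def graph_aware_view(quads, today):
    """Keep only triples whose named graph has not expired as of `today`.

    Dates are ISO 8601 strings, so plain string comparison orders them.
    """
    expiry = {s: o for (s, p, o, g) in quads if g is None and p == "ex:validUntil"}
    return {
        (s, p, o)
        for (s, p, o, g) in quads
        if g is not None and expiry.get(g, "9999-12-31") >= today
    }


stale = ("ex:Alice", "ex:worksFor", "ex:Acme")
print(stale in single_graph_view(quads))               # asserted by the old tool
print(stale in graph_aware_view(quads, "2010-09-03"))  # rejected by the new one
```

The two views disagree about the same document, which is the backwards-compatibility problem in miniature.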

asked 03 Sep '10, 00:57

Signified ♦

edited 03 Sep '10, 17:44

We should talk about "quints" rather than "quins" to keep it etymologically consistent.

(03 Sep '10, 08:26) Antoine Zimm... ♦

Though the question is definitely interesting to the Semantic Web community, I think it is not appropriate for Semantic Overflow. Reading the FAQ, I see: "Q: What kind of questions should I not ask here? A: Avoid asking questions that are subjective, argumentative, or require extended discussion. This is not a discussion board, this is a place for questions that can be answered!"

(03 Sep '10, 16:46) Antoine Zimm... ♦

C'mon Antoine ;) this is a very interesting and up-to-date question, and one I'm currently struggling with too, because the annotation (association, tagging, ...), personalisation and context use cases are getting more and more important today. Hence, we need a mechanism to express these concerns.

(03 Sep '10, 17:25) zazi

@Antoine, it's a fair point, but I can't help not really caring :). Like zazi, I find the topic interesting, and the question is not causing any undue obstruction to people asking concrete questions. I'm not looking for an all-encompassing answer; any insights into any of the sub-questions would be fine. For example, "what does a defined semantics for Named Graphs buy us?" Also "can multi-graph documents co-exist with single-graph documents?". I might edit the question a bit along these lines...

(03 Sep '10, 17:37) Signified ♦

Okay, edited the question to have two clear sub-questions people can tackle directly. The rest of the text hopefully gives "context" to the questions: i.e., where I'm coming from.

(03 Sep '10, 17:50) Signified ♦

However, stepping outside provenance, things get a bit murky.

Multi-graph documents are an obvious (albeit not ideal) replacement for unwieldy RDF reification for speaking about (sets of) triples... useful for annotating triples with, e.g., temporal validity, probabilistic information, etc.

As a replacement for RDF reification, I would suggest N-Quads, where the 4th element (context) refers to the reification statement, which then no longer needs to include the "reification triples" (the rdf:subject, rdf:predicate and rdf:object relations).
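
A rough Python sketch of the difference, assuming tuples as a stand-in for triples/quads and a made-up `ex:confidence` annotation property (neither `reify` nor `as_quad` is a library function):

```python
def reify(stmt_id, s, p, o):
    """Standard RDF reification: four triples describing one statement."""
    return [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]


def as_quad(stmt_id, s, p, o):
    """The quad alternative: the 4th element names the statement directly."""
    return [(s, p, o, stmt_id)]


# Hypothetical annotation attached to the statement identifier ex:S1.
annotation = ("ex:S1", "ex:confidence", "0.8")

reified = reify("ex:S1", "ex:Alice", "foaf:knows", "ex:Bob") + [annotation]
# In the quad version the annotation itself lives in the default graph (None).
quad_based = as_quad("ex:S1", "ex:Alice", "foaf:knows", "ex:Bob") + [annotation + (None,)]

print(len(reified), len(quad_based))
```

The annotation is identical in both cases; only the bookkeeping needed to point at the statement shrinks.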

Will we need to start talking about quints to track the provenance of multi-graph information?

So, I would say yes, because quads, too, need a named graph entailment for provenance and trust ;)
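
One possible reading of a "quint", sketched in Python purely for illustration: a quad plus a fifth element naming the multi-graph document it was read from. The `Quint` type, the `load` helper and the example.org dump URLs are all assumptions, not part of any proposal cited here.

```python
from typing import NamedTuple, Optional


class Quint(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: Optional[str]  # named graph inside the document (None = default graph)
    source: str           # the multi-graph document the quad was read from


def load(source_url, quads):
    """Tag every quad from one multi-graph document with that document's URL."""
    return [Quint(s, p, o, g, source_url) for (s, p, o, g) in quads]


# Two hypothetical dumps that both happen to use the graph name ex:G1.
dump_a = load("http://example.org/dump-a.nq", [("ex:Alice", "foaf:knows", "ex:Bob", "ex:G1")])
dump_b = load("http://example.org/dump-b.nq", [("ex:Alice", "foaf:knows", "ex:Carol", "ex:G1")])

# Only the fifth element tells the two ex:G1 graphs apart.
by_source = {q.source: q.obj for q in dump_a + dump_b}
print(by_source)
```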

PS: Further thoughts and a (probably) concrete use case can be found in "Time for quintuples?" on the Semantic Web mailing list.

PPS: Documents are simply the carrier medium for graphs ;) (from my point of view). Hence, it shouldn't really matter where the graphs are placed in the cloud. What matters more is how they are related (that is, which graph refers to which), right?

PPPS: An extension to multiple context elements would end in a set of context (/reification) statements, which might then be something like the proposed Triplesets ;) (see also here)

PPPPS: Compare Tolle's distinction between internal context, here identified with the reification statements, and external context, here identified with the Named Graph entailment and the description of the Named Graph (see also "Understanding Data by their Context Using RDF").

Edit: I still hold that quadruples are enough; to see why, please have a look at my recent proposal for an optional statement identifier.

answered 23 Sep '10, 20:06

zazi

edited 23 Feb '11, 09:38

Just a comment on the last part: the very reason for quads and named graphs in the first place is that in reality, when dealing with arbitrary data from arbitrary sources, it matters very much where a graph is placed: i.e., its provenance. Consider the claim "Obama type DrugAddict". The first thing that should spring to anyone's mind is: says who? Are they reputable (e.g., are they well-linked from diverse sources); is it Obama himself (or a close relative/colleague); is it on a .gov domain (FBI or official White House page); or is it some quack who just learnt RDF?

(23 Sep '10, 20:54) Signified ♦

"it matters very much where a graph is placed: i.e., its provenance" - yes, this provenance information should be included in the named graph description, which can also be signed by trust statements, e.g. as proposed in the Semantic Web Publishing Vocabulary. I simply don't really understand why we should talk about documents ;) So, I would vote for N-Quads + Named Graphs, for example:

    ex:NG1 {
        ex:APerson cco:skill <http://dbpedia.org/resource/Football_(soccer)> ex:CC1 .
    }

    ex:NG1 a rdfg:Graph ;
        dcterms:modified "2010-09-22T09:55:52+01:00"^^xsd:dateTime .

(23 Sep '10, 21:44) zazi

Please note also the added PPPPS ;)

(23 Sep '10, 21:45) zazi

"is it on a .gov domain" - this, for example, is something we have to describe to a "stupid" machine. A human may trust information from this domain. However, at the beginning a machine would treat all domains (authorities) in the same way, e.g. as a URI or a string. Later, if the machine gets further descriptions of these resources, it can reason about this information etc.

(23 Sep '10, 21:55) zazi

I may be coming to the conclusion that quadruples might be enough ;) (see http://lists.w3.org/Archives/Public/semantic-web/2010Sep/0175.html), although I tried to model my example with the quintuple approach (quads + named graph), see http://smiy.sourceforge.net/cco/examples/N3/cco_-_football_example.qng . The only problem I have noticed so far is that one might need to distinguish between statement-specific information (one triple/quad) and graph-specific information, because people sometimes have problems with Named Graphs that include only one element (the "statement case"), e.g. for provenance reasons.

(25 Sep '10, 15:23) zazi