I was wondering: how do you decide between making your data available as static RDF and creating a SPARQL endpoint? Should you do both? What things should you consider when making that decision?

asked 13 Nov '09, 22:42 by Aleksander K...

edited 18 Jan '11, 17:58 by scotthenninger ♦

Both.

(13 Nov '09, 22:57) spoon16

Doing both is the ideal solution: publish RDF with dereferenceable URIs for easy data retrievability and provide a SPARQL endpoint for flexible data access.

Note that publishing RDF doesn't have to be static: you can use SPARQL DESCRIBE queries to collect the data that is to be served under a dereferenceable URI. Also, by describing your dataset (with voiD), linking from the published RDF data to that description, and linking to the SPARQL endpoint, you create a complete picture and increase discoverability.
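For illustration, a minimal sketch of the DESCRIBE approach in Python, assuming the SPARQLWrapper library; the endpoint URL and resource URI are made-up placeholders:

```python
# Sketch: generate the RDF document to serve under a dereferenceable
# URI by running a DESCRIBE query against your own endpoint.
# Endpoint URL and resource URI are hypothetical placeholders.
from SPARQLWrapper import SPARQLWrapper

endpoint = SPARQLWrapper("http://data.example.com/sparql")
endpoint.setQuery("DESCRIBE <http://data.example.com/category/name>")

# For DESCRIBE (and CONSTRUCT) queries, convert() returns an rdflib Graph
graph = endpoint.query().convert()
print(graph.serialize(format="turtle"))
```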

But really, this is all just the ideal solution, and both parts have their own benefits, so it depends on what you want to do and what you can easily do. Are you organising a large dataset in a triple store anyway? Then it's probably not too hard to make it accessible through an open SPARQL endpoint. But if you mainly have your data organised in files, then publishing static content is the easiest and quickest thing to do, and better than nothing. You can even do nice dereferenceable Linked Data URIs with static files and a simple webserver, as sketched below.
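To make that last point concrete, a standard-library-only sketch of the usual trick: the (non-information) resource URI answers with a 303 redirect to the static RDF document describing it. The URL layout is an assumption:

```python
# Sketch: dereferenceable Linked Data URIs over static files.
# A request for /resource/<name> is redirected (303 See Other)
# to the static Turtle document /data/<name>.ttl.
from http.server import BaseHTTPRequestHandler, HTTPServer

class LinkedDataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/resource/"):
            name = self.path[len("/resource/"):]
            self.send_response(303)
            self.send_header("Location", "/data/%s.ttl" % name)
            self.end_headers()
        else:
            # The static documents themselves would normally be served
            # directly by the webserver; this sketch only does redirects.
            self.send_response(404)
            self.end_headers()

HTTPServer(("", 8000), LinkedDataHandler).serve_forever()
```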

answered 14 Nov '09, 00:12 by Simon Reinhardt

Consider the size of your dataset and the expected load. If you provide a SPARQL endpoint, you run the risk of your service becoming unresponsive due to complex (or poorly written) SPARQL queries, particularly if your dataset is large.

This risk might be mitigated by requiring users to register to use the endpoint, while the static RDF remains open access. If you are not expecting heavy usage of your data, you might consider that the benefits of providing a SPARQL endpoint outweigh the risk.
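A rough sketch of such a registration gate, assuming Flask and the requests library; the key store and upstream endpoint URL are invented for illustration:

```python
# Sketch: a thin proxy that only forwards SPARQL queries for
# registered API keys, with a server-side timeout as an extra guard.
from flask import Flask, abort, request
import requests

app = Flask(__name__)
REGISTERED_KEYS = {"alice-key", "bob-key"}    # hypothetical key store
UPSTREAM = "http://localhost:3030/ds/sparql"  # hypothetical endpoint

@app.route("/sparql")
def sparql_proxy():
    if request.args.get("apikey") not in REGISTERED_KEYS:
        abort(403)  # unregistered users get no SPARQL access
    resp = requests.get(UPSTREAM,
                        params={"query": request.args.get("query", "")},
                        timeout=10)  # don't let slow queries pile up
    return (resp.text, resp.status_code,
            {"Content-Type": resp.headers.get("Content-Type", "text/plain")})
```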

answered 15 Nov '09, 23:28 by Anna

Putting a layer between the user and the SPARQL endpoint might be a good idea. If you can determine a set of commonly-used queries, you could wrap them up in a web service.

(12 May '11, 14:18) drobnox
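A minimal sketch of that idea, assuming Flask and SPARQLWrapper; the endpoint URL and query template are invented for illustration:

```python
# Sketch: wrap a commonly-used query as a web service, so clients
# never send raw SPARQL to the endpoint.
from flask import Flask, jsonify, request
from SPARQLWrapper import SPARQLWrapper, JSON

app = Flask(__name__)
ENDPOINT = "http://data.example.com/sparql"  # hypothetical endpoint

@app.route("/people-by-name")
def people_by_name():
    name = request.args.get("name", "")
    sparql = SPARQLWrapper(ENDPOINT)
    # A fixed, parameterised template; the crude escaping below is
    # illustrative -- a real service would sanitise input properly.
    sparql.setQuery("""
        SELECT ?person WHERE {
          ?person <http://xmlns.com/foaf/0.1/name> "%s" .
        } LIMIT 100
    """ % name.replace('"', '\\"'))
    sparql.setReturnFormat(JSON)
    return jsonify(sparql.query().convert())
```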

This is a question that applies equally to relational databases and any other large system that uses some kind of shared data store. The problem is to process your query in the place where it will have the least impact on system performance (whilst maintaining your global architecture goals).

Another way to think about this problem is to ask yourself: which makes more sense - to take the processing to the data or take the data to the processing? This will depend on what your bottleneck resource is. If it's bandwidth, then that might suggest using SPARQL. If it's CPU load (or some other scarce resource) on your data store, then perhaps it makes more sense to send the data to somewhere where that resource is less stressed.
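As a toy illustration of the trade-off (hypothetical endpoint and dump URLs, using SPARQLWrapper and rdflib), the same question answered both ways:

```python
# Sketch: the same query run remotely (processing goes to the data)
# and locally (data comes to the processing).
from SPARQLWrapper import SPARQLWrapper, JSON
from rdflib import Graph

QUERY = "SELECT ?s WHERE { ?s a <http://xmlns.com/foaf/0.1/Person> } LIMIT 10"

# Option 1: the endpoint's CPU does the work; only the small result
# set crosses the wire -- cheap on bandwidth, costly for the server.
endpoint = SPARQLWrapper("http://data.example.com/sparql")
endpoint.setQuery(QUERY)
endpoint.setReturnFormat(JSON)
remote_results = endpoint.query().convert()

# Option 2: download the whole dump once -- costly on bandwidth --
# then spend your own CPU querying it locally as often as you like.
local_graph = Graph()
local_graph.parse("http://data.example.com/dump.ttl", format="turtle")
local_results = local_graph.query(QUERY)
```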

As Simon points out, the ideal scenario is to provide for both. This is particularly true if you're optimizing a distributed system. In that case, the bottleneck will tend to move between different parts of your system.

answered 17 Nov '09, 02:38 by Andrew ♦♦

Let me just add that RDF documents, as opposed to a SPARQL endpoint, need not be static files. It should be relatively easy to create a simple app that publishes your semantic data as RDF documents.

For example, you can have a resource with the URI http://data.example.com/category/name. Its Turtle representation could be served at a URL like http://www.example.com/category/name.ttl (note the changed host and the added file extension). Similarly, this resource could be served as HTML, RDF/XML or JSON by simply changing the extension (underneath, an HTTP redirect with content negotiation could happen).

This way you don't have to maintain individual files, which will become a burden as your dataset grows.
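A crude sketch of such an app, assuming Flask and rdflib; the data file, URL scheme and format table are illustrative assumptions:

```python
# Sketch: serve one resource in several serialisations, chosen by
# the URL extension, from a single in-memory rdflib Graph.
from flask import Flask, Response, abort
from rdflib import Graph, URIRef

app = Flask(__name__)

# rdflib serialiser name and MIME type per URL extension
FORMATS = {
    "ttl":  ("turtle", "text/turtle"),
    "rdf":  ("xml", "application/rdf+xml"),
    "json": ("json-ld", "application/ld+json"),  # needs JSON-LD support
}

data = Graph()
data.parse("dataset.ttl")  # hypothetical data file

@app.route("/<category>/<name>.<ext>")
def serve(category, name, ext):
    if ext not in FORMATS:
        abort(404)
    fmt, mime = FORMATS[ext]
    subject = URIRef("http://data.example.com/%s/%s" % (category, name))
    # Collect all triples about the resource -- a crude DESCRIBE
    doc = Graph()
    for p, o in data.predicate_objects(subject):
        doc.add((subject, p, o))
    if len(doc) == 0:
        abort(404)
    return Response(doc.serialize(format=fmt), mimetype=mime)
```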

By the way, have a look at the Linked Data Patterns book; it addresses these and other common issues.

answered 25 Jan '13, 04:14 by Tomasz Plusk...

If your data is frequently updated (a situation that the Semantic Web currently does not address very well), then publishing it as static RDF is very hard.

answered 23 Jan '13, 09:56 by lOlive

Care to elaborate or qualify that? :)

(23 Jan '13, 15:04) Signified ♦