What features or capabilities do you want or need in an RDF db system that just aren't available presently? Seems like RDF systems still lag very far behind RDBMSes in terms of features and capabilities. What are some of the maturity vectors for RDF DBs?

asked 27 Aug '10, 04:00

Kendall Clark 2

edited 16 Sep '10, 23:11

Signified ♦

Can you clarify what kind of RDBMS features and capabilities you are referring to? I see that some of the answers focus on the query language, not the database. So perhaps there are two comparisons to be made: RDBMS vs. RDF DB, and SQL vs. SPARQL.

(17 Sep '10, 03:34) scotthenninger ♦

GPU support

(15 Apr '11, 11:48) Nathan

OH, and solid spatial/GIS support

(15 Apr '11, 12:30) Nathan

Standardised full text search would be nice, as noted by another response. I'd also quite like to see support for a quad-based version of CONSTRUCT/DESCRIBE (perhaps outputting TriX or N-Quads).

Some sort of equivalent to SQL's views would be good: imagine I have a nice CONSTRUCT query figured out; I'd like to be able to save that query as a particular named graph - not just a snapshot of the query though; I'd want the graph to always contain the latest results. Similarly, it would be nice to use a SELECT query to build a view which could be used in a SPARQL 1.1 subquery.
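The view idea above can be roughly approximated today by re-running a saved graph pattern into a named graph with a SPARQL 1.1 Update request. The following is a minimal sketch only: the helper name `refresh_view_update` and the example.org graph IRI are made up for illustration, and the code merely builds the update string, leaving it to the caller to send it to a store. It produces a refreshed snapshot, not the always-current view asked for above.

```python
def refresh_view_update(view_graph: str, template: str, where: str) -> str:
    """Build a SPARQL 1.1 Update request that rebuilds a 'view' graph.

    view_graph, template and where are supplied by the caller; nothing
    here is specific to any particular triple store.
    """
    return (
        f"DROP SILENT GRAPH <{view_graph}> ;\n"
        f"INSERT {{ GRAPH <{view_graph}> {{ {template} }} }}\n"
        f"WHERE {{ {where} }}"
    )


# Hypothetical example: a view holding every foaf:Person with a name.
PATTERN = (
    "?p a <http://xmlns.com/foaf/0.1/Person> ; "
    "<http://xmlns.com/foaf/0.1/name> ?n ."
)
update = refresh_view_update("http://example.org/views/people", PATTERN, PATTERN)
print(update)
```

Something still has to decide when to re-run the update (a timer, or a hook on writes), which is exactly the view-maintenance work a store would ideally do itself.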

But mostly SPARQL's syntax/expressiveness is adequate - an important focus for future development should be allowing people to query more data, faster. Of course optimisation is part of that - many triple stores are thin layers over SQL databases, which is not necessarily the best way to store and index RDF data. But distributed SPARQL is part of that too - smart, distributed queries should allow you to have a lot more data at your fingertips.

answered 29 Aug '10, 18:19

tobyink ♦

Materialized views and view maintenance -- that's esp. good input. Thx!

(30 Aug '10, 17:04) Kendall Clark 2

Good point about quad-based CONSTRUCT/DESCRIBE... we had to hack that in too...

(31 Aug '10, 14:26) Signified ♦

Agree with standardized full text search (Apache Lucene, Solr) integration. The tolerant query feature (by means of corresponding indexes, e.g. k-gram) would also be helpful in addition to it.

(24 Mar '11, 05:07) Nikita Zhiltsov

SPARQL views could be easy to mimic. In a way the linked data URIs of Geonames are exactly that. http://geonames.org/798544/contains.rdf represents a view of sub-features of http://geonames.org/798544/. If you allow loading graphs dynamically during SPARQL execution you can use it transparently in your queries. Also, this solution is platform independent, and implementing a simple app to store and manage such views should be relatively easy in any stack.

(25 Jan '13, 12:11) Tomasz Pluskiewicz

Standard and practical ways to handle provenance, uncertainty and temporal qualifications.

answered 17 Sep '10, 04:33

Tim Finin

+1 for uncertainty. All human information is uncertain, especially the stuff we find on the internet, but all the semantic-web technologies assume everything's certain, which makes them impractical.

(23 Feb '11, 22:01) Cerin

Ran across this: http://semanticweb.com/structuring-data-with-probase_b18650

(24 Mar '11, 20:42) harschware ♦

From my Virtuoso-centric (Virtuoso-holic?) perspective, the key missing part is an adequate working environment for application developers. While we are adding advanced features, people continue to painfully debug trivial typos in names of predicates, variables and other things --- even trivial auto-completion is a problem.

So IDEs and textbooks, the more the better.

Technical issues mentioned above are important but they're on their way to the release already and they will cost us no more than months of work of me alone, that's much cheaper than the required long and intensive IDE work.

answered 30 Aug '10, 02:55

Ivan Mikhailov 1

Take a look at TopBraid Composer. In terms of an IDE for SPARQL it has a number of features to make it easy to design and debug SPARQL queries. Has autocomplete features for both variables and resource names.

(30 Aug '10, 19:19) scotthenninger ♦

+1 IDEs are definitely something we need particularly wrt auto-completion and it's something I'm working on at the moment because I've got fed up of not having syntax highlighting and auto-completion when editing my RDF by hand

(31 Aug '10, 07:48) Rob Vesse ♦

In terms of editing RDF by hand, there are a few RDF editors out there. I'll single out TopBraid Composer again ;-) as it has autocomplete features throughout. Also drag-and-drop, etc. The Free Edition has all of these RDF editing features.

(31 Aug '10, 13:45) scotthenninger ♦

True but it would require me to use Eclipse and I'm still trying to avoid ever having to do so ;-)

(01 Sep '10, 08:26) Rob Vesse ♦

Maybe Datao can help when trying to figure out the data model of an endpoint and the various molecules you can extract from it. Cf. http://Datao.net

(23 Jan '13, 12:33) lOlive

A standardized way of atomically minting new URIs on the basis of a pattern and a sequence (like MySQL's auto-increment). This is needed for apps where we have to generate new primary keys. Bonus points for being able to randomly select a unique value from a range (e.g. random user IDs).

In essence I do not know of a correct standard function that would answer generating-unique-ids
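To make the requested semantics concrete, here is a minimal in-process sketch of "pattern plus atomic sequence". The class `UriMinter` and the example.org pattern are hypothetical, and the lock only protects threads inside one Python process; a real store would have to provide the same guarantee inside its own transaction machinery, which is the missing piece this answer is about.

```python
import itertools
import threading


class UriMinter:
    """Mint URIs from a pattern and an auto-incrementing sequence."""

    def __init__(self, pattern: str, start: int = 1):
        self._pattern = pattern            # must contain a "{key}" placeholder
        self._counter = itertools.count(start)
        self._lock = threading.Lock()      # makes "take next value" atomic

    def mint(self) -> str:
        with self._lock:
            key = next(self._counter)
        return self._pattern.format(key=key)


minter = UriMinter("http://example.org/user/{key}")
minted = []


def worker():
    # Each thread mints 100 URIs; list.append is safe to call concurrently.
    for _ in range(100):
        minted.append(minter.mint())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(minted), len(set(minted)))
```

No two callers ever receive the same key, and no caller waits on anything but the counter itself. The sketch does not cover the random-selection bonus case.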

answered 06 Sep '10, 07:18

Jerven ♦

Thanks for this comment; it's also good feedback. It gets old reading another comment from Scott pushing TBC in a thread about RDF DBs. :)

(08 Sep '10, 05:56) Kendall Clark 2

This seems to be a case of topic hijacking, as the question is on missing RDF data store functionality and this is a more general topic of generating URIs for RDF. However there are many examples of tools (though none standardized that I know of) that use UUIDs, etc. A more specific example for this "question" is TopBraid Suite's smf:buildUniqueURI() that takes a string and creates a URI. If the URI matches an existing URI, it appends an auto-incremented number _0, _1, etc. It's not standardized, but a useful way to address this problem...

(10 Sep '10, 18:21) scotthenninger ♦

Edited comment to make it a bit clearer. It's not pushing anything and is noted as non-standard. It's an example that could address the issue. Others are free to add their examples and maybe we'll eventually reach the standard Jerven is asking for.

(10 Sep '10, 18:23) scotthenninger ♦

@scotthenninger I have a specific use case where I want the store to generate a unique URI in a non-blocking atomic operation, preferably using a standard SPARUL statement. I do not want to rewrite core queries every time we try a new store.

What you are missing is the capability to call TBC from within a SPARUL statement ;) Or any other URI-generating service. And I am quite certain that if TBC is not tightly integrated into the store you are not going to get atomic behavior for your new URIs.

(17 Sep '10, 09:36) Jerven ♦

I guess, the thread "URI template specifications for Linked Data publishing?" (http://www.semanticoverflow.com/questions/2858/uri-template-specifications-for-linked-data-publishing) is related to this issue.

(24 Mar '11, 08:58) zazi

@zazi Partly. The {key} part of the template needs to be filled in with an atomically generated unique value.

(24 Mar '11, 09:22) Jerven ♦

Better support for federation (that would be a nice kick in the b***s for RDBMS)

Consideration for transactions and even federated transactions (an even harder kick)

A good, wide range of supported Accept types on the SPARQL endpoint
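On the last point, content negotiation against a SPARQL endpoint is driven by the HTTP Accept header. A small sketch of what a client would send, assuming a hypothetical endpoint URL; the requests are built but never sent, and which of these media types a given store actually honours varies by product.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://example.org/sparql"   # hypothetical endpoint URL


def sparql_request(query: str, accept: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": accept})


# SELECT results: prefer JSON, fall back to the XML results format.
select_req = sparql_request(
    "SELECT * WHERE { ?s ?p ?o } LIMIT 10",
    "application/sparql-results+json, application/sparql-results+xml;q=0.8",
)

# CONSTRUCT results are RDF graphs, so ask for an RDF serialisation.
construct_req = sparql_request(
    "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10",
    "text/turtle, application/rdf+xml;q=0.8",
)

print(select_req.get_header("Accept"))
print(construct_req.get_header("Accept"))
```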

answered 23 Mar '11, 22:05

William Greenly

I'd like to have more control over physical layout of the data and indexes. In relational databases, we regularly see factor of 100-1000x improvements when we get this right, and I'm sure the same could be done for the RDF world.

Just try this experiment.

(i) Load 300 million rows into a single MySQL table with some indexes. (ii) Load 3 million rows into each of 100 different MySQL tables. (iii) Compare. You'll find that (ii) happens much more quickly than (i).
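A scaled-down harness for the experiment above could look like the following. To keep it self-contained it swaps in SQLite from the Python standard library for MySQL, uses an in-memory database, and loads 20,000 rows rather than 300 million; the table names and row contents are made up. At this size the timings say nothing about the claim, so to test it the row counts would need to be raised and the database moved to disk.

```python
import sqlite3
import time


def load(conn, table_names, rows_per_table):
    """Create each indexed table and bulk-insert rows; return elapsed seconds."""
    start = time.perf_counter()
    for name in table_names:
        conn.execute(f"CREATE TABLE {name} (s INTEGER, p INTEGER, o INTEGER)")
        conn.execute(f"CREATE INDEX {name}_spo ON {name} (s, p, o)")
        conn.executemany(
            f"INSERT INTO {name} VALUES (?, ?, ?)",
            ((i, i % 7, i * 31 % 1009) for i in range(rows_per_table)),
        )
    conn.commit()
    return time.perf_counter() - start


TOTAL_ROWS, N_TABLES = 20_000, 100   # scaled down from 300 million rows

one = sqlite3.connect(":memory:")    # case (i): a single big table
many = sqlite3.connect(":memory:")   # case (ii): the same rows in 100 tables
part_names = [f"part_{k}" for k in range(N_TABLES)]

t_one = load(one, ["big"], TOTAL_ROWS)
t_many = load(many, part_names, TOTAL_ROWS // N_TABLES)
print(f"one table: {t_one:.3f}s, {N_TABLES} tables: {t_many:.3f}s")
```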

The paradigm of "load everything into one huge triple store" might work for a place like IBM or NASA, that can afford a 50 machine cluster and where there isn't any consequence if a project succeeds or fails. However, you might as well hang out a sign that says "lean startups need not apply" because it makes the cost of entry incredibly high. Many of us are getting results using very stupid methods such as RDF -> relational mapping and batch processing because we find that we deliver answers in 20 minutes on cheap hardware, rather than taking a few days on an expensive cluster.

Just to start with, I'd like to see named graphs with completely separate physical storage and indexing. That is, building 100 named graphs with N triples each should cost 100 times what it costs to build 1 named graph with N triples. Beyond that, I'd like to see supplementary indexes that work on specific graph patterns, so I can build something like a multipart index in SQL.

answered 28 Aug '11, 20:40

database_animal ♦

Standardised syntax for keyword search in SPARQL.

answered 27 Aug '10, 16:02

Signified ♦

Keyword search or do you mean full-text search over RDF literals?

(27 Aug '10, 16:58) Kendall Clark 2

I don't draw a distinction... but yes, more accurately full-text search wrt. RDF literals. Although most such engines support full-text search, they do so in non-standard ways using non-standard syntaxes. It's strange to me that SPARQL hasn't tackled this issue.

(27 Aug '10, 18:18) Signified ♦

Okay, this is probably best input to the SPARQL WG; RDF databases can't make anything standard, other than de facto. :>

(27 Aug '10, 18:25) Kendall Clark 2

There is regex in SPARQL, which uses the XQuery 1.0/XPath 2.0 standard. But that will only search a specific literal. For searching all literals in a given graph/repository, yes, this is not trivial to do with existing RDF data stores.

(27 Aug '10, 20:12) scotthenninger ♦

Yep. Regex is really only suitable for filtering, and difficult to support in terms of lookups (definitely, most engines would only support Regexes in a post-filtering sense). Some Regexes (like starts-with patterns) could be efficiently supported for lookups, and some could be translated into inverted-index lookups, but would prefer a more "pure" full-text search support for which well-known inverted index engines can be employed.

(28 Aug '10, 00:56) Signified ♦
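The difference between regex post-filtering and an inverted-index lookup discussed in the comments above can be sketched in a few lines. This is a toy illustration only: the triples are made up, the tokenizer is a bare `\w+` split, and no real engine's full-text syntax is implied.

```python
import re
from collections import defaultdict

# Toy data: (subject, predicate, literal) triples, all made up.
triples = [
    ("ex:a", "rdfs:label", "Full text search in SPARQL"),
    ("ex:b", "rdfs:comment", "Regex filters scan every literal"),
    ("ex:c", "rdfs:label", "Inverted index lookup for text"),
]


def regex_scan(pattern):
    """Post-filtering: test the regex against every literal in turn."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [s for s, _, lit in triples if rx.search(lit)]


def build_index():
    """Inverted index: lower-cased token -> set of subjects."""
    index = defaultdict(set)
    for s, _, lit in triples:
        for token in re.findall(r"\w+", lit.lower()):
            index[token].add(s)
    return index


index = build_index()


def keyword_lookup(*words):
    """AND-query: intersect the posting sets of each keyword."""
    postings = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*postings)) if postings else []


print(regex_scan("text"))
print(keyword_lookup("text", "index"))
```

`regex_scan` touches every literal on every query, whereas `keyword_lookup` only reads the posting sets for the requested words, which is why a dedicated full-text index scales better than a FILTER with regex.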

Related Question which shows the proliferation of different standards for this - http://answers.semanticweb.com/questions/8676/do-you-use-full-text-search-with-sparql-if-so-how-and-why

(06 Sep '11, 09:07) Rob Vesse ♦

A standardized mechanism for the equivalent of RDBMS stored procedures. AllegroGraph provides a way to add your own Lisp code to the server, and with Jena you can extend the instantiation of Fuseki, intercept code at the right points and inject your own Java into the query engine and underlying triple store... But I fear vendor-specific solutions, à la Oracle, Sybase etc. devising custom stored procedure languages for their RDBMSs. It would be really nice if the W3C could head this off at the pass and devise something that co-exists well with the semantic web stack we are all used to. With something like that in hand, Jerven could solve his "atomically minting new URI's" issue.

answered 23 Mar '11, 21:56

harschware ♦

You mean such as SPIN functions, or does it need to be Java code? SPINx has JavaScript support at least, allowing you to declare SPARQL functions in RDF, so that any SPARQL engine can pick up the necessary JavaScript code when the function is called.

(24 Mar '11, 11:11) Holger Knublauch

right, add SPIN to the Fuseki, Allegro example.

(24 Mar '11, 18:35) harschware ♦

@holger I'm learning more about SPIN these days, and it does seem to fit the bill more than I first thought. Given its recent W3C Member Submission (http://www.w3.org/Submission/2011/02/Comment/), it certainly takes a big step towards standardisation.

(13 Apr '11, 23:43) harschware ♦

question asked: 27 Aug '10, 04:00

question was seen: 6,022 times

last updated: 25 Jan '13, 12:12