Hi, I use this query:

"SELECT DISTINCT ?vertices " +
"WHERE { " +
URIForQuery(centralVertex.id()) +
" tb:has_neighbor{0," + depth + "} " +
"?vertices . " +
"} ";

to get the vertices around a central vertex for my graph-based mind-map application. The "depth" in the query is the maximum distance from the central vertex within which I fetch nodes. When the depth is about 20 or more, the query is slow. But I need it to be fast because I want to load and show the mind map instantly. I would also like the depth to be able to go really high.

I use a Jena-MySQL repository. I know it would be faster with in-memory Jena, but I don't think loading a whole graph into memory is scalable for my app.

How can I speed things up? Thank you.

asked 28 Aug '12, 17:37

VincentBlouin

@VincentBlouin You may want to take a look at this question (http://answers.semanticweb.com/questions/12147/whats-the-best-way-to-parameterize-sparql-queries), which covers how to parameterize queries in common frameworks like Jena and should help you avoid unreadable string concatenation like that in your example query. HTH

(28 Aug '12, 19:24) Rob Vesse ♦
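
To illustrate the readability point in the comment above without relying on any framework API, here is a minimal sketch in plain Java that keeps the query text in one template and fills in the two variable parts. The class name `NeighborhoodQuery`, the method `build`, and the character check are hypothetical and not taken from the question; the sketch also assumes the `tb:` prefix is declared elsewhere, as in the original query, and that the vertex URI is passed without angle brackets.

```java
import java.util.Locale;

final class NeighborhoodQuery {
    // Query text from the question, kept in one readable piece.
    // %s is the central vertex URI, %d is the maximum depth.
    private static final String TEMPLATE =
        "SELECT DISTINCT ?vertices WHERE { <%s> tb:has_neighbor{0,%d} ?vertices . }";

    // Characters that must not appear inside <...> in a SPARQL IRI reference.
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    // Hypothetical helper: validates both inputs, then fills in the template.
    static String build(String centralVertexUri, int depth) {
        if (depth < 0) {
            throw new IllegalArgumentException("depth must be >= 0");
        }
        for (char c : centralVertexUri.toCharArray()) {
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                throw new IllegalArgumentException("illegal character in URI: " + c);
            }
        }
        // Locale.ROOT keeps the digits of the depth independent of the default locale.
        return String.format(Locale.ROOT, TEMPLATE, centralVertexUri, depth);
    }
}
```

Calling `NeighborhoodQuery.build("http://example.org/v1", 3)` would yield the same query shape as the concatenated version. This only tidies up how the string is built; it does not change how expensive the query is to evaluate.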

In the end I tried TDB, but it was still too slow, so I switched to Neo4j, which is better suited to graph traversal. For more details, see my answer here: http://stackoverflow.com/questions/11733868/efficient-traversal-search-algorithm-to-fetch-data-from-rdf/13617174#13617174

(30 Nov '12, 20:53) VincentBlouin

If you want to stick with using Jena, you might think about trying TDB. imo, you get better mileage if you use a database that's designed for RDF rather than sticking RDF into a relational database, but I'm biased.

fwiw, with a depth of 20 or more, that's a non-trivial query; there are a lot of joins going on behind the scenes. If trying a different database is a no-go, you can only try smaller depths or a different query.
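
As a rough illustration of what a "different query" could look like, one option is to expand the neighborhood one hop at a time on the client and stop at the depth limit, instead of asking for the whole path in a single query. The sketch below is plain Java over an in-memory adjacency map, which is only a stand-in for the store: in a real application each round would presumably be a simple one-hop lookup, and whether that is actually faster than the property path depends on the backend. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BoundedNeighborhood {
    // Breadth-first expansion: returns every vertex reachable from `start`
    // in at most `maxDepth` hops, including `start` itself (depth 0).
    static Set<String> within(Map<String, List<String>> neighbors,
                              String start, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        List<String> frontier = Collections.singletonList(start);

        // One iteration per hop; stop early when nothing new was discovered.
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String vertex : frontier) {
                List<String> out =
                    neighbors.getOrDefault(vertex, Collections.<String>emptyList());
                for (String neighbor : out) {
                    // Set.add returns false for vertices already seen,
                    // so cycles do not cause repeated work.
                    if (visited.add(neighbor)) {
                        next.add(neighbor);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }
}
```

Because already-visited vertices are never expanded again, the work is bounded by the size of the neighborhood actually reached rather than by the number of distinct paths of length up to the depth limit.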

answered 28 Aug '12, 19:19

mhgrove

Yes - a property path over an external database is going to be very expensive. TDB is a far better choice.

(29 Aug '12, 05:44) AndyS ♦

The way I understand it, TDB takes a lot of RAM. The requirements page http://jena.apache.org/documentation/tdb/requirements.html says that on a 32-bit machine I need at least 1 GB of Java heap space. Does TDB load a whole graph into memory? With a lot of data, I wonder how that number of gigabytes is going to grow. Is it expensive to do a disk write with TDB? Because if it's in memory and the server shuts down, I lose all changes. And in my app, each user has their own "model" or "RDF repository"; would it consume too much RAM to load many models?

(01 Sep '12, 11:43) VincentBlouin
