I'm having a problem with the Sesame Windows Client. I am trying to mirror DBpedia on my local Sesame installation (not really sure if that's a good idea). I uploaded a 1.85 GB file with no problems (it did take 3 hours, though). Now, when I try to upload another file to the same repo, I get a Java heap space exceeded exception.
Is it possible to mirror DBpedia entirely? Or is my understanding lacking somewhere?
Help :) Is there some other approach?
asked 22 Feb, 01:11
I don't think it's a good idea to try and mirror the complete DBPedia on a "local Sesame installation", unless you have taken quite a bit of care to have your local Sesame installation set up correctly. Simply dropping all of DBPedia's dump files in a single Sesame memory or native store definitely won't work: DBPedia is just too large for that, and Sesame's default stores are not designed for that kind of scale.
You can, of course, mirror parts of DBPedia quite easily, which can already help your query performance a lot. This is actually a tactic I often employ in projects: I create a local Sesame store into which I load things like the DBPedia ontology, and maybe one or two basic instance data files. I then use SPARQL federated queries to query over the combination of my local store and the remote DBPedia endpoint. If you do this right, you can get quite good query performance, and an added bonus is that you significantly lighten the load on the DBPedia server.
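To give an idea of what such a federated query could look like, here is a rough sketch in Python. It assumes the local store holds the DBPedia ontology and asks the remote endpoint only for instances, using the SPARQL 1.1 `SERVICE` keyword. The function names, the repository id `dbpedia-local` and the local URL are hypothetical; adjust them to your own setup.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"
# Assumption: a local Sesame server with a repository called "dbpedia-local".
LOCAL_REPOSITORY = "http://localhost:8080/openrdf-sesame/repositories/dbpedia-local"


def build_federated_query(parent_class, remote_endpoint=DBPEDIA_ENDPOINT, limit=10):
    """Build a query that reads the class hierarchy from the local store
    and fetches matching instances from the remote endpoint."""
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls ?thing WHERE {{
  # Answered by the local store (the ontology is loaded there).
  ?cls rdfs:subClassOf <{parent_class}> .
  # Answered by the remote endpoint.
  SERVICE <{remote_endpoint}> {{
    ?thing a ?cls .
  }}
}}
LIMIT {int(limit)}"""


def run_query(query, repository_url=LOCAL_REPOSITORY):
    """Send the query to the local repository over the SPARQL protocol (HTTP GET)."""
    request = Request(
        repository_url + "?" + urlencode({"query": query}),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the query; call run_query(...) once a local server is running.
    print(build_federated_query("http://dbpedia.org/ontology/Place", limit=5))
```

The point of this shape is that the cheap, selective part of the pattern is matched locally, so the remote server only has to answer the part you have not mirrored.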
To mirror the complete DBPedia dataset using Sesame, there are two basic approaches.
Of course, that's just the basics of the approach. Doing this kind of very-large-scale data mirroring will require a bit of tweaking and care. For a start, have a look at this article I wrote on loading large files. I'm not sure that using the Sesame Windows Client is really the best way to go about it (it might work, but it wasn't designed for these kinds of data sizes either).
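One tweak that may help with heap space errors on big uploads is to cut a dump into smaller files and upload those one at a time. The following is only a minimal sketch of that idea, not necessarily what the article recommends. It assumes the dump is in N-Triples format (one triple per line, so splitting on line boundaries is safe); the function name, chunk size and file naming are made up for illustration.

```python
import os


def split_ntriples(source_path, output_dir, lines_per_chunk=1_000_000):
    """Split an N-Triples file into numbered chunk files of at most
    lines_per_chunk lines each and return the chunk paths in order."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    os.makedirs(output_dir, exist_ok=True)
    chunk_paths = []
    out = None
    try:
        # newline="" keeps line endings exactly as they are in the source.
        with open(source_path, "r", encoding="utf-8", newline="") as source:
            for line_number, line in enumerate(source):
                if line_number % lines_per_chunk == 0:
                    # Start a new chunk file.
                    if out is not None:
                        out.close()
                    path = os.path.join(
                        output_dir, f"chunk-{len(chunk_paths):05d}.nt"
                    )
                    out = open(path, "w", encoding="utf-8", newline="")
                    chunk_paths.append(path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

Each chunk is itself a valid N-Triples file, so the chunks can be uploaded to the same repository one after another.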
answered 22 Feb, 18:07
Jeen Broekstra ♦