I have been trying to get a DBpedia resource using keyword search in SPARQL. I tried FILTER, but it caused a query timeout every time I executed the query. I had some success with the bif:contains function; however, I couldn't get that query working through the Jena ARQ API, as Virtuoso has not yet opened the SPARQL port for DBpedia. The query I'm trying looks something like this:
Description: Find a university whose name matches ($string), and get the city and country of that university. 'University' is only an example; it could be any 'organization'. The ($string) is multi-word.
How can I get this query working on DBpedia endpoint?
The broader information need: I want to retrieve a DBpedia resource for any type of organization and fetch its city and country information. The organization could be anything, e.g. 'Harvard University', 'Microsoft', 'Dell', etc. I thought I could get its location info from DBpedia, so do reply if there are other ways or sources to get this info too.
You could try a query like the following:
Please note that I use some FILTER statements to restrict the results by language. I also left a commented-out filter on a specific URI to give you an idea of how to run the query for a specific resource: as said before, this can be done to improve performance. If this approach fits your needs, you only have to:
1) retrieve all the URIs you are interested in with a full-text search:
2) then programmatically use each of them in the FILTER statement of this query:
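The two steps above can be sketched as a pair of query builders. This is only a minimal illustration, not the original queries: the function names `fulltext_query` and `detail_query` are hypothetical, the label is assumed to live in `rdfs:label`, and the `bif:contains` expression assumes a Virtuoso endpoint where every word of the keyword must match (words joined with AND).

```python
import re

RDFS = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"


def fulltext_query(keywords: str, limit: int = 20) -> str:
    """Step 1: find candidate URIs whose label contains every keyword."""
    # Keep only word characters so quotes cannot break the expression.
    words = re.findall(r"\w+", keywords)
    if not words:
        raise ValueError("no usable keywords")
    # Virtuoso full-text expression, e.g. 'Harvard' AND 'University'
    expr = " AND ".join(f"'{w}'" for w in words)
    return (
        RDFS
        + "SELECT DISTINCT ?s ?label WHERE {\n"
        + "  ?s rdfs:label ?label .\n"
        + f'  ?label bif:contains "{expr}" .\n'
        + "  FILTER (lang(?label) = 'en')\n"
        + "}\n"
        + f"LIMIT {limit}"
    )


def detail_query(uri: str) -> str:
    """Step 2: fetch the properties of one URI found in step 1."""
    if re.search(r'[<>"\s]', uri):
        raise ValueError("unsafe character in URI")
    return (
        "SELECT ?p ?o WHERE {\n"
        "  ?s ?p ?o .\n"
        f"  FILTER (?s = <{uri}>)\n"
        "}"
    )


print(fulltext_query("Harvard University"))
print(detail_query("http://dbpedia.org/resource/Harvard_University"))
```

The generated strings can then be sent to the endpoint with whatever SPARQL client you already use; step 2 would be called once per URI returned by step 1.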
A key to working with DBpedia, or any large RDF store for that matter, is to realize that any string search is very inefficient relative to matching a triple pattern. In your case, I'd suggest using a minimal query to discover the label for the entity you want. Something like:
Note that I'm using some heuristics here. I'm pretty sure that the label will start with "Harv" and that not many others will start with that string. fn:starts-with only needs to examine the first n characters, so it can be used for a more efficient search than regex or contains. Of course, if you don't know whether the keyword appears at the beginning of the label, or the case isn't known, then you will need to use regex, etc. But this can work in many cases. Note that shorter search strings will, of course, return quicker.
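As a rough sketch of such a prefix search, the helper below builds a label query from a prefix string. Note the assumptions: it uses the standard SPARQL 1.1 `STRSTARTS` function in place of `fn:starts-with`, it assumes the endpoint supports SPARQL 1.1, and the name `label_prefix_query` is made up for this example.

```python
def label_prefix_query(prefix: str, lang: str = "en", limit: int = 50) -> str:
    """Build a query for resources whose rdfs:label starts with `prefix`."""
    if not lang.replace("-", "").isalnum():
        raise ValueError("unexpected language tag")
    # Escape backslashes and double quotes for a SPARQL string literal.
    escaped = prefix.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?s ?label WHERE {\n"
        "  ?s rdfs:label ?label .\n"
        # STRSTARTS is the SPARQL 1.1 counterpart of fn:starts-with.
        f'  FILTER (STRSTARTS(STR(?label), "{escaped}"))\n'
        f'  FILTER (LANG(?label) = "{lang}")\n'
        "}\n"
        f"LIMIT {limit}"
    )


print(label_prefix_query("Harv"))
```

The match is case-sensitive, so it only helps when you know how the label begins.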
Then do the following to discover what properties are associated with the resource:
From this you will discover that the properties start with a lowercase letter, as is the custom, e.g. dbprop:city and dbprop:country. You will also discover that other resources do not have these properties, so you may need to use OPTIONAL and use whichever properties are associated with those resources.
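A minimal sketch of the OPTIONAL pattern follows, assuming that the `dbprop:` prefix maps to `http://dbpedia.org/property/` and that the resource URI is already known. The function name `location_query` is hypothetical; other resources may need different properties.

```python
import re


def location_query(uri: str) -> str:
    """Ask for city and country, tolerating resources that lack either one."""
    if re.search(r'[<>"\s]', uri):
        raise ValueError("unsafe character in URI")
    return (
        "PREFIX dbprop: <http://dbpedia.org/property/>\n"
        "SELECT ?city ?country WHERE {\n"
        # Each OPTIONAL may stay unbound without discarding the whole row.
        f"  OPTIONAL {{ <{uri}> dbprop:city ?city }}\n"
        f"  OPTIONAL {{ <{uri}> dbprop:country ?country }}\n"
        "}"
    )


print(location_query("http://dbpedia.org/resource/Harvard_University"))
```

Because both patterns are optional, a resource with a country but no city still yields a row, with `?city` left unbound.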
Again, the key here is using SPARQL to iteratively discover how the data is represented in smaller chunks the service is able to process efficiently, then grow the query you are working on, testing for the ability of the service to handle the request as you go.
After looking into some DBpedia resource pages, I also figured out that most entities have URIs in which spaces are replaced by underscore (_) characters, but NOT always! So one possible trick is to replace spaces with underscores to form the DBpedia resource URI and directly query other details, as shown below:
Simple Query Text: Harvard University
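The underscore trick can be sketched like this. It assumes the usual `http://dbpedia.org/resource/` namespace; the helper names `resource_uri` and `exists_query` are invented for the example. Since the guess does not always hit an existing resource, the second helper builds an ASK query that can be used to check the URI before relying on it.

```python
from urllib.parse import quote

RESOURCE_NS = "http://dbpedia.org/resource/"


def resource_uri(query_text: str) -> str:
    """Guess a DBpedia resource URI from plain query text."""
    # Collapse runs of whitespace, then join the words with underscores.
    name = "_".join(query_text.split())
    if not name:
        raise ValueError("empty query text")
    # Percent-encode anything that is not safe inside a URI path segment.
    return RESOURCE_NS + quote(name, safe="")


def exists_query(uri: str) -> str:
    """Build an ASK query that is true if the URI has any triples."""
    return f"ASK {{ <{uri}> ?p ?o }}"


uri = resource_uri("Harvard University")
print(uri)
print(exists_query(uri))
```

If the ASK query comes back false, falling back to a label search is probably the safer route.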
answered 14 Feb '12, 13:51