I was following along with this post and ended up with running code that I halted after 7 minutes out of impatience. I then wrote a Groovy script using Jena to mimic the behavior and got all the results back in a second or two. Is rdflib known to be slow when it queries over the wire, or is my code doing something very inefficient?

Python 2.7 + rdflib 3.0:

from rdflib import Graph, Namespace, RDF

store = Graph()
store.parse("http://source.data.gov.uk/data/education/bis-research-explorer/2010-03-04/education.data.gov.uk.nt", format="nt")

store.bind("PROJECT", "http://research.data.gov.uk/def/project/")
store.bind("FOAF", "http://xmlns.com/foaf/0.1/")

PROJECT = Namespace("http://research.data.gov.uk/def/project/")
FOAF = Namespace("http://xmlns.com/foaf/0.1/")

for organization in store.subjects(RDF.type, FOAF["Organization"]):
    for postcode in store.objects(organization, PROJECT["location"]):
        try:
            print postcode
            store.parse(postcode)
        except Exception:
            print '404 not found'

Groovy 1.7.5 + Jena 2.6.4:

import com.hp.hpl.jena.rdf.model.*

String url= 'http://source.data.gov.uk/data/education/bis-research-explorer/2010-03-04/education.data.gov.uk.nt'
Model model= ModelFactory.createDefaultModel().read(url, 'N-TRIPLE')

Property rdfType= model.getProperty('http://www.w3.org/1999/02/22-rdf-syntax-ns#', 'type')
Property projectLocation= model.getProperty('http://research.data.gov.uk/def/project/', 'location')
Resource foafOrg= model.getResource('http://xmlns.com/foaf/0.1/Organization')

model.listResourcesWithProperty(rdfType, foafOrg).each { org ->
    model.listObjectsOfProperty(org, projectLocation).each { loc ->
        try {
            println loc
        }
        catch (Exception e) {
            println '404 not found'
        }
    }
}

asked 03 Feb '11, 15:16

Ryan Kohl


The rdflib example calls store.parse(postcode) inside the loop. This fetches each postcode URI from the web, parses the response, and adds the result to your store. The Jena example only prints the URI. Depending on the number of postcodes and the speed of the server, this can make a huge difference!
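One way to confirm where the time goes is to collect the postcode URIs first (a purely local lookup) and then time each remote fetch separately. The following is only a minimal sketch: `timed_fetches` and `summarize` are hypothetical helper names, not part of rdflib, and the wiring shown in the trailing comment assumes the `store`, `RDF`, `FOAF` and `PROJECT` names from the question. It uses only the standard library and should run on Python 2.7 as well as Python 3.

```python
from __future__ import print_function
import time


def timed_fetches(uris, fetch):
    """Call fetch(uri) for each URI; return a list of (uri, seconds, error)."""
    results = []
    for uri in uris:
        start = time.time()
        error = None
        try:
            fetch(uri)
        except Exception as exc:
            # Keep the real exception instead of assuming every failure is a 404.
            error = exc
        results.append((uri, time.time() - start, error))
    return results


def summarize(results):
    """Return (number of fetches, number that failed, total seconds spent)."""
    failed = sum(1 for _, _, error in results if error is not None)
    total = sum(seconds for _, seconds, _ in results)
    return len(results), failed, total


# Hypothetical wiring with the rdflib script from the question:
#   postcodes = [p for org in store.subjects(RDF.type, FOAF["Organization"])
#                for p in store.objects(org, PROJECT["location"])]
#   print(len(postcodes), "postcode URIs found locally")
#   print(summarize(timed_fetches(postcodes, store.parse)))
```

If the local lookup finishes immediately and nearly all of the total comes from the fetches, the slowdown is the per-URI network round trip rather than rdflib's querying.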

answered 03 Feb '11, 18:58

cygri ♦

And that's what I get for trying out other people's code without looking too closely. Yes, commenting that parse instruction out returns the proper results immediately. Thanks for restoring my faith in rdflib!

(04 Feb '11, 13:51) Ryan Kohl