
DBpedia is an extreme source of triples, but it is too big for an ontology parser if you try to read an "offline" version of it (http://wiki.dbpedia.org/Downloads351).

I only need some DBpedia terms, such as <http://dbpedia.org/resource/Mad_cow>.

My idea so far:

  • Run a SPARQL query and convert the results into a new ontology (Jena's ResultSetFormatter.toModel, I think), very similar to this question: http://bit.ly/ao9KYP

Does anyone have another suggestion?

Thanks, Celso.

asked 10 Oct '10, 17:18

Celsowm

"Dbpedia is a extreme source of triples". Interesting way of putting it... :)

(10 Oct '10, 21:46) Signified ♦

In Portuguese, I would say: "Dbpedia tem tripla 'pra caramba!'" (roughly, "DBpedia has a heck of a lot of triples!")

(11 Oct '10, 03:11) Celsowm

Why not just grab the bits you want? You don't even need SPARQL:

$ java jena.rdfcat -out N3 \
     http://dbpedia.org/resource/Mad_cow \
     http://dbpedia.org/resource/Mad_Dog
...
<http://en.wikipedia.org/wiki/Mad_cow>
    foaf:primaryTopic <http://dbpedia.org/resource/Mad_cow> .

<http://dbpedia.org/resource/Mad_Dog_%28disambiguation%29>
    dbpprop:redirect <http://dbpedia.org/resource/Mad_Dog> .

<http://dbpedia.org/resource/Mad_cow>
    rdfs:label "Mad cow"@en ;
    dbpprop:redirect <http://dbpedia.org/resource/Bovine_spongiform_encephalopathy> ;
...
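
If Jena is not at hand, much the same fetch can be sketched in Python with only the standard library. This is an illustration, not a description of what rdfcat does internally: it assumes DBpedia answers an `Accept: text/turtle` header on resource URIs with an RDF serialisation, and the function names `build_resource_request` and `fetch_resource` are made up for this example.

```python
import urllib.request


def build_resource_request(uri, accept="text/turtle"):
    """Build a GET request that asks the server for an RDF serialisation of `uri`."""
    # Assumption: the server honours the Accept header (content negotiation).
    return urllib.request.Request(uri, headers={"Accept": accept})


def fetch_resource(uri, accept="text/turtle", timeout=30):
    """Download the RDF description of a single resource (needs network access)."""
    request = build_resource_request(uri, accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_resource("http://dbpedia.org/resource/Mad_cow")` should then return the triples for that one resource as text, which you can save or hand to whatever parser you use.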

However, if the concepts you want all meet some common condition, DESCRIBE is ideal:

DESCRIBE ?s {
    ... some condition for choosing ?s ...
}    
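
To make the DESCRIBE idea concrete, here is a rough Python sketch that wraps a condition in a DESCRIBE query and builds the GET URL for it. The endpoint `http://dbpedia.org/sparql`, the helper names, and especially the example condition (resources filed under one category via `dcterms:subject`) are assumptions for illustration; substitute whatever condition actually selects your `?s`.

```python
from urllib.parse import urlencode

# Assumption: the public DBpedia endpoint; swap in your own mirror if you have one.
ENDPOINT = "http://dbpedia.org/sparql"

# Hypothetical condition: every resource filed under a single category.
EXAMPLE_CONDITION = (
    "?s <http://purl.org/dc/terms/subject> "
    "<http://dbpedia.org/resource/Category:Butterflies> ."
)


def describe_query(condition):
    """Wrap a graph pattern that binds ?s in a DESCRIBE query."""
    return "DESCRIBE ?s WHERE {\n    " + condition.strip() + "\n}"


def describe_url(condition, endpoint=ENDPOINT):
    """Return a GET URL carrying the query in the SPARQL protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": describe_query(condition)})
```

Fetching `describe_url(EXAMPLE_CONDITION)` with an RDF `Accept` header should give back the description of every matching resource in one response, though a public endpoint may cap the size of the result.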

Finally, if you really want a large number of triples, it might be more polite to download a dump in N-Triples and then filter it for the bits you want.
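
Filtering a dump is easy because N-Triples puts one triple on each line. A minimal Python sketch, assuming the dump's subjects are plain URIs in angle brackets (the function names are invented for this example):

```python
def subject_of(line):
    """Return the subject URI of an N-Triples line, or None if it has no URI subject."""
    line = line.strip()
    end = line.find(">")
    # Blank lines, comments and blank-node subjects do not start with "<".
    if not line.startswith("<") or end == -1:
        return None
    return line[1:end]


def filter_by_subject(lines, wanted):
    """Yield only the lines whose subject is one of the `wanted` URIs."""
    wanted = set(wanted)
    for line in lines:
        if subject_of(line) in wanted:
            yield line
```

Because `filter_by_subject` is a generator, you can pass it an open file object and write the output with `writelines`, so the dump is streamed rather than loaded into memory. For a compressed dump, opening it with `bz2.open(path, "rt", encoding="utf-8")` should work the same way.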


answered 10 Oct '10, 17:38

Comment Bot

edited 10 Oct '10, 17:50

Is reasoning possible over these results, for example to get "superclasses"?

(10 Oct '10, 17:52) Celsowm

The first version certainly grabs the sameAs itself, but I'm not sure about the object. For the second you need to check, but you could add ?s owl:sameAs ?o to the condition and add ?o to the projection clause if required.

(10 Oct '10, 17:53) Comment Bot

And what about CONSTRUCT? (http://n2.talis.com/wiki/SPARQL_intro#CONSTRUCT)

(12 Oct '10, 03:55) Celsowm

CONSTRUCT would work too. It's less clear than DESCRIBE, in that it conflates the condition with the bits you want to extract. (Hrm, I see I read 'superclasses' as 'sameAs' above. Replace the latter with the former in my answer :-))

(12 Oct '10, 07:59) Comment Bot

I have had similar problems with the size of DBpedia. A simple solution is to find the downloads you are interested in and filter for the triples you want using grep. If I am interested in the article category "Butterflies" and I suspect there are useful triples in the article_categories download, I can use the following bash script to get a smaller set of triples that contain references to the URI of interest.

Hope this helps - Pete

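#!/bin/bash
# Keep only the triples that mention the Butterflies category URI.
mkdir -p dbpedia_nt/categories
grep -F "<http://dbpedia.org/resource/Category:Butterflies" article_categories_en.nt > dbpedia_nt/categories/butterflies.nt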

answered 12 Dec '11, 15:52

pete_devries

edited 12 Dec '11, 16:09

Signified ♦


question asked: 10 Oct '10, 17:18

question was seen: 3,032 times

last updated: 12 Dec '11, 16:09