I recently downloaded DBpedia 3.7 (the 22 GB all_languages.tar from the download server) and loaded it into a triple store. I was shocked to find that my triple store returned triples I could not find on either the live DBpedia server or the main DBpedia SPARQL endpoint. Specifically, the query below finds 14 triples in my triple store:

  <http://dbpedia.org/resource/Albert_Einstein> ?pf1 <http://dbpedia.org/resource/Stuttgart>

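For anyone who wants to reproduce this, the pattern above has to be wrapped in a full SELECT before an endpoint will accept it. Here is a minimal sketch that only builds the request URLs and does not contact any server. It assumes the endpoints accept the standard SPARQL protocol `query` parameter over HTTP GET; the function names are mine, for illustration only.

```python
from urllib.parse import urlencode

SUBJECT = "http://dbpedia.org/resource/Albert_Einstein"
OBJECT = "http://dbpedia.org/resource/Stuttgart"


def build_query(subject, obj):
    """Wrap the bare triple pattern in a complete SELECT query."""
    return f"SELECT ?pf1 WHERE {{ <{subject}> ?pf1 <{obj}> }}"


def build_request_url(endpoint, subject, obj):
    """Return a GET URL carrying the query in the 'query' parameter (assumed supported)."""
    return endpoint + "?" + urlencode({"query": build_query(subject, obj)})


if __name__ == "__main__":
    # The two endpoints mentioned in the question.
    for endpoint in ("http://dbpedia.org/sparql", "http://live.dbpedia.org/sparql"):
        print(build_request_url(endpoint, SUBJECT, OBJECT))
```

Pasting the printed URL into a browser, or passing it to any HTTP client, should return the bindings for `?pf1` if there are any.
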
Running that query at both dbpedia.org/sparql and live.dbpedia.org/sparql produced no results. Thinking my triple store might be at fault, I went digging in the 3.7 data dumps with this command:

perl -ne '/Albert_Einstein>/ && /wikiPageWikiLink/ && /Stuttgart>/ &&  print "$ARGV $.:  $_" ' page_links*.nt

which produced 14 triples with output like:

page_links_af.nt 77102:  <http://dbpedia.org/resource/Albert_Einstein> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Stuttgart> .
page_links_an.nt 448134:  <http://dbpedia.org/resource/Albert_Einstein> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Stuttgart> .
<.. snip ..>

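The same check can be sketched in Python for anyone without Perl at hand. This version is a bit stricter than the one-liner: it compares the subject, predicate and object positions exactly instead of matching substrings anywhere on the line. It assumes one triple per line with whitespace-separated terms, as in the output above. Note also that line numbers here restart for each file, whereas `$.` in the Perl command keeps counting across files. The function names are hypothetical.

```python
import glob

SUBJECT = "<http://dbpedia.org/resource/Albert_Einstein>"
PREDICATE = "<http://dbpedia.org/ontology/wikiPageWikiLink>"
OBJECT = "<http://dbpedia.org/resource/Stuttgart>"


def is_match(line):
    """True if the line is exactly the triple SUBJECT PREDICATE OBJECT."""
    parts = line.split(None, 2)
    if len(parts) != 3:
        return False
    s, p, rest = parts
    # Drop the trailing " ." that terminates an N-Triples statement.
    o = rest.rstrip().rstrip(".").rstrip()
    return s == SUBJECT and p == PREDICATE and o == OBJECT


def find_matches(lines, name="<input>"):
    """Return (name, line_number, text) for every matching line."""
    return [
        (name, n, line.rstrip("\n"))
        for n, line in enumerate(lines, 1)
        if is_match(line)
    ]


if __name__ == "__main__":
    # Same file glob as the Perl command.
    for path in sorted(glob.glob("page_links*.nt")):
        with open(path, encoding="utf-8", errors="replace") as fh:
            for name, n, text in find_matches(fh, path):
                print(f"{name} {n}:  {text}")
```
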
Does anyone know why the DBpedia dumps would contain data that is not in the DBpedia endpoints?

asked 20 Oct '11, 02:04

harschware ♦

I wouldn't be shocked about problems with DBpedia...

(28 Oct '11, 20:38) database_animal ♦

Looks like I was right: not all of the DBpedia exports are loaded into the public DBpedia SPARQL endpoint. The exact subset is listed on the page "Datasets loaded into the public DBpedia SPARQL Endpoint".

Thanks to Anja on the dbpedia-discuss list.

answered 28 Oct '11, 19:25

harschware ♦

I recently needed to do this myself and wrote a Groovy script that helps download everything (well, actually it just writes a text file listing the files needed and then shows you the wget command to recursively grab them all): https://bitbucket.org/tharsch/dbpediamirror

(16 Jan '14, 14:43) harschware ♦