I'd like to replicate a local dump of DBpedia. The DBpedia 3.8 downloads page has a lot of links to files in different languages, and so forth. Essentially, I am looking for a list of URLs for N-Quads files that, taken together, would constitute a DBpedia dump containing (at least) all of the data you could obtain by dereferencing DBpedia URIs as RDF.

The closest I could find to such a list was here, but I guess this might miss some dereferenceable information from the ... Can anyone suggest a better list, or say which additional files the list above might miss? (My last resort would be to screen-scrape all .nq links from the homepage, which I'm too lazy to do and hope will not be needed.)

They used to package it all up for us, à la http://downloads.dbpedia.org/3.7/all_languages.tar, but it looks as if they have stopped doing that: I don't see an all_languages.tar file for 3.8. DBpedia has always been such a moving target... sigh.

Short of an all-inclusive tarball, you could go to the top of the dump directory, http://downloads.dbpedia.org/3.7/, and run some web crawler downloader thingy on it. I found a list of Java-based ones at http://java-source.net/open-source/crawlers, and Crawler4j claims you can be up and running in 5 minutes.

Thanks! I ended up just sticking with the EN list here:
This should be sufficient for the moment. Otherwise, yep, I'll probably throw together some script to go through the folder structure. ... It seems like they should have something on their end, though. The organisation of those downloads has become extremely confusing.

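
For what it's worth, a script that goes through the folder structure could be sketched roughly as below, using only the Python standard library. This is a minimal sketch, not a tested mirror tool. The names `LinkParser`, `split_links` and `crawl` are made up for illustration, and it assumes that the download directory is served as a plain HTML index page with relative links, and that the N-Quads files end in `.nq` or `.nq.bz2`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(html, base_url, suffixes=(".nq", ".nq.bz2")):
    """Return (file_urls, subdirectory_urls) found in one listing page.

    The suffixes are an assumption about how the dump files are named.
    base_url is expected to end with a slash.
    """
    parser = LinkParser()
    parser.feed(html)
    files, subdirs = [], []
    for href in parser.hrefs:
        url = urljoin(base_url, href)
        if url.endswith(suffixes):
            files.append(url)
        # Only descend into directories below the current one, which
        # skips "parent directory" and column-sorting links.
        elif url.endswith("/") and url.startswith(base_url) and url != base_url:
            subdirs.append(url)
    return files, subdirs


def crawl(start_url):
    """Walk the listing from start_url and return all matching file URLs."""
    seen, queue, found = set(), [start_url], []
    while queue:
        url = queue.pop()
        if url in seen:
            continue
        seen.add(url)
        with urlopen(url) as response:
            html = response.read().decode("utf-8", errors="replace")
        files, subdirs = split_links(html, url)
        found.extend(files)
        queue.extend(subdirs)
    return sorted(set(found))
```

Calling `crawl()` on the top of the dump directory (e.g. http://downloads.dbpedia.org/3.7/ from above, or the 3.8 equivalent if it follows the same layout) should then give a list of URLs to hand to a downloader. It only lists files and does not download anything.
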
If you are attempting to replicate the public DBpedia endpoint, then you will want to see this: http://wiki.dbpedia.org/DatasetsLoaded

Ah yes, that's pretty much what I was looking for! Thanks!

