Is there a scheduled downtime for DBpedia, or did I just pick an unfortunate time to ask it about hockey? Both my code and my browser are getting 503s at the moment, where they weren't 10 minutes ago.
asked 09 May '11, 13:50
:( Too many people use DBpedia as a sandbox, and some badly behaved queries run much longer than the compiler predicts, causing redundant load and a wasted working set. Fortunately, many important applications use private replicas of DBpedia (or of the whole LOD), so they do not suffer from unpredictable overloads of the public server, and the public server does not suffer from the quite predictable overloads of these apps :)
The scheduled downtime of DBpedia is quite minor, and the cache warm-up after start is very aggressive, so the lag between server start and reaching 30-50% of peak performance is surprisingly small. Scheduled downtimes of LOD are a bit longer (at least, I've seen a maintenance page on LOD myself, but I've never seen one on DBpedia). The long part of LOD maintenance is free-text indexing, because both the number of graphs and the size of the vocabulary are "a bit bigger" than average. DBpedia is more convenient for free-text indexing because it is all a single graph.
In the name of science, we recently did some experiments where we crawled a selection of URLs from different domains, each month, for nine months during 2010.
Domains were accessed twice per second. The crawler had timeouts set at 128 s for sockets and 64 s for connecting. Crawls were run at midnight between Saturday and Sunday.
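For anyone wanting to reproduce a setup like this, here is a rough sketch of the politeness settings in Python. The two timeout values are the ones quoted above; the `DomainThrottle` class and its names are hypothetical, and reading "twice per second" as a minimum gap of 0.5 s between requests to the same domain is an assumption, not a description of the actual crawler.

```python
from urllib.parse import urlparse

CONNECT_TIMEOUT_S = 64    # connection timeout quoted in the post
SOCKET_TIMEOUT_S = 128    # socket (read) timeout quoted in the post
MIN_INTERVAL_S = 0.5      # assumed: "twice per second" per domain


class DomainThrottle:
    """Hypothetical helper: spaces out requests to the same domain."""

    def __init__(self, min_interval=MIN_INTERVAL_S):
        self.min_interval = min_interval
        self._next_slot = {}  # domain -> earliest time of the next request

    def wait_time(self, url, now):
        """Return how long to sleep before fetching `url` at time `now`."""
        domain = urlparse(url).netloc
        slot = self._next_slot.get(domain, now)
        delay = max(0.0, slot - now)
        # Reserve the next slot for this domain after the scheduled request.
        self._next_slot[domain] = now + delay + self.min_interval
        return delay
```

A crawler loop would call `wait_time(url, time.monotonic())`, sleep for the returned delay, and then fetch with the two timeouts applied by whatever HTTP client it uses.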
We derived a stability score of 86% for DBpedia from 8,942 URLs, meaning that each DBpedia URL was successfully retrieved, on average, in 7.74 of the 9 attempts.
47% of the URLs were retrieved in all snapshots.
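To make the two figures concrete, here is a minimal sketch of how such numbers could be computed. It assumes the stability score is simply the mean fraction of successful retrievals per URL, which matches the description above (7.74 / 9 ≈ 0.86); the function name and the toy data are made up for illustration and are not the real crawl results.

```python
def stability(snapshots):
    """snapshots maps each URL to a list of booleans, one per crawl.

    Returns (mean success fraction, fraction of URLs retrieved every time).
    """
    per_url = [sum(hits) / len(hits) for hits in snapshots.values()]
    score = sum(per_url) / len(per_url)
    always = sum(all(hits) for hits in snapshots.values()) / len(snapshots)
    return score, always


# Toy data only: three URLs, four crawls each.
toy = {
    "http://example.org/a": [True, True, True, True],
    "http://example.org/b": [True, False, True, True],
    "http://example.org/c": [False, False, True, True],
}
score, always = stability(toy)

# Sanity check on the figure quoted above: 7.74 of 9 attempts.
quoted = round(7.74 / 9, 2)
```

With the toy data, the first URL is the only one retrieved in every crawl, so the two numbers can differ a lot, just as 86% and 47% do above.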
Interpret those results how you will.