Hi. I am trying to figure out how to get actor names and dates of birth using a SPARQL query. This is what my query looks like now:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dbo: <http://dbpedia.org/ontology/>

select distinct(?name) as ?name min(?birth) as ?birth
{
  ?person <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:American_television_actors> .
  { ?person dbo:birthDate ?birth }
  UNION
  { ?person dbpedia2:birthdate ?birth. }
  ?person foaf:name ?name.
  Filter (regex(str(?birth),"[0-9]{4}-[0-9]{2}-[0-9]{2}"))
}
OFFSET 2000

When I counted all the records I got about 12,000. I am using OFFSET 2000 to retrieve all the records. The problem is that I am getting a lot of duplicates. I tried DISTINCT and MIN without success. Would I need a double query, or is there another way of writing this query?

Thank you very much in advance!

asked 02 Oct '11, 22:50

yanz

edited 02 Oct '11, 22:51


One way of doing this would be with a GROUP BY clause combined with a SAMPLE aggregate:

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT (sql:SAMPLE(?name) AS ?Name) (MIN(?birth) AS ?birthDate)
{
  ?person <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:American_television_actors> .
  { ?person dbo:birthDate ?birth }
  UNION
  { ?person dbpedia2:birthdate ?birth. }       
  ?person foaf:name ?name.
  FILTER( REGEX( STR(?birth),"[0-9]{4}-[0-9]{2}-[0-9]{2}"))    
}
GROUP BY ?person

This query groups by the URI of the person, which is the same regardless of how many different forms of their name are present in DBpedia, and then uses SAMPLE to pick just one name from each group.

When using GROUP BY you can only project the group variables (in this case ?person) or aggregates over the groups (in this case SAMPLE and MIN).

Note: because Virtuoso does not strictly follow the SPARQL 1.1 standard, you have to prefix SAMPLE with its special sql: prefix for this to work.
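
For endpoints that do follow SPARQL 1.1, the same grouping can be written with the standard SAMPLE aggregate and no sql: prefix. The following is a minimal sketch under that assumption and has not been tested against DBpedia. The dcterms: prefix and the ?anyName variable are introduced here only for readability, and ?person is projected as well so that two different people who share a name stay distinguishable.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dcterms: <http://purl.org/dc/terms/>

# One row per person: any one of their names, and their earliest birth value.
SELECT ?person (SAMPLE(?name) AS ?anyName) (MIN(?birth) AS ?birthDate)
WHERE {
  ?person dcterms:subject <http://dbpedia.org/resource/Category:American_television_actors> .
  { ?person dbo:birthDate ?birth }
  UNION
  { ?person dbpedia2:birthdate ?birth }
  ?person foaf:name ?name .
  # Keep only values that contain a YYYY-MM-DD date.
  FILTER( REGEX( STR(?birth), "[0-9]{4}-[0-9]{2}-[0-9]{2}" ) )
}
GROUP BY ?person
```

If the results are fetched in pages with OFFSET, adding ORDER BY ?person and a LIMIT after the GROUP BY clause should keep the pages from overlapping, since without an ordering the standard does not guarantee that rows come back in the same order on each request.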

answered 03 Oct '11, 04:06

Rob Vesse ♦

Thank you very much! It worked well. Now the only duplicates I have are either two people with the same name, which is fine, or the same person with two different birthdays, because I union dbo and dbpedia2 to get more results. Do you know if there is anything I can do about that, or would I have to remove those manually?

(03 Oct '11, 09:18) yanz
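
One possible way to stop the UNION from contributing a second birth date is to treat dbo:birthDate as the preferred source and fall back to dbpedia2:birthdate only when it is missing. This is a sketch, assuming the endpoint supports the SPARQL 1.1 BIND and COALESCE features. The variable names ?dboBirth, ?propBirth and ?date, and the dcterms: prefix, are made up for this example.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?person (SAMPLE(?name) AS ?anyName) (MIN(?date) AS ?birthDate)
WHERE {
  ?person dcterms:subject <http://dbpedia.org/resource/Category:American_television_actors> ;
          foaf:name ?name .
  # Each source is optional and only accepted if it contains a YYYY-MM-DD date.
  OPTIONAL {
    ?person dbo:birthDate ?dboBirth
    FILTER( REGEX( STR(?dboBirth), "[0-9]{4}-[0-9]{2}-[0-9]{2}" ) )
  }
  OPTIONAL {
    ?person dbpedia2:birthdate ?propBirth
    FILTER( REGEX( STR(?propBirth), "[0-9]{4}-[0-9]{2}-[0-9]{2}" ) )
  }
  # Prefer the ontology value; use the raw property value only as a fallback.
  BIND( COALESCE(?dboBirth, ?propBirth) AS ?date )
  # Drop people for whom neither source gave a usable date.
  FILTER( BOUND(?date) )
}
GROUP BY ?person
```

MIN still decides the result when a single property holds several values for the same person, so conflicting dates within one source would have to be checked by hand.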

See "How to eleminate redundancy from My Sparql Query against DBPedia End Point". The duplicates arise from actors with multiple birthdate properties (e.g. http://dbpedia.org/page/Lucie_Arnaz).
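
To see which actors are affected, a diagnostic query along these lines could list everyone who has more than one distinct birth value across the two properties. It is a sketch that assumes SPARQL 1.1 HAVING and COUNT(DISTINCT ...) are available on the endpoint; the ?birthValues variable and the dcterms: prefix are names chosen for this example.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dcterms: <http://purl.org/dc/terms/>

# People in the category with more than one distinct birth value.
SELECT ?person (COUNT(DISTINCT ?birth) AS ?birthValues)
WHERE {
  ?person dcterms:subject <http://dbpedia.org/resource/Category:American_television_actors> .
  { ?person dbo:birthDate ?birth }
  UNION
  { ?person dbpedia2:birthdate ?birth }
}
GROUP BY ?person
HAVING ( COUNT(DISTINCT ?birth) > 1 )
ORDER BY DESC(?birthValues)
```

Values that differ only in datatype or formatting may be counted as distinct here, so the list is a starting point for inspection and not a definitive list of conflicting birthdays.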

answered 03 Oct '11, 03:23

AB

question asked: 02 Oct '11, 22:50

question was seen: 3,498 times

last updated: 03 Oct '11, 09:18