I want to import the UniProt dataset into Jena. Since UniProt is distributed as RDF/XML and not as N3, I have to convert it with rdfcat into the right format before loading it into my database with tdbloader. However, I get these warnings when converting with rdfcat:

$ ./apache-jena-2.7.4/bin/rdfcat -out n3 -x zcatpipe | head
16:36:33 WARN  RDFDefaultErrorHandler :: file:zcatpipe(line 27878 column 198): {W108} Not an XML Name: 'http://purl.uniprot.org/SHA-384/270467CF2F1A4915613B11129BAFB5747CFA0FE4DA9B398EC35D9C5039B376EF6E07180933366A0FCB5306C809529D2B'
16:36:33 WARN  RDFDefaultErrorHandler :: file:zcatpipe(line 33157 column 196): {W108} Not an XML Name: 'http://purl.uniprot.org/SHA-384/65D63EDE7317100278EE1448ADB744A86A80751CCBC39CD0608A281BAB02CAB8F172A1BB52A364803FAE2676878488BF'
16:36:33 WARN  RDFDefaultErrorHandler :: file:zcatpipe(line 33379 column 196): {W108} Not an XML Name: 'http://purl.uniprot.org/SHA-384/95E617D67B0829DA2440DAD51501FB83E27B3177F4E0CBA9C3EE6CEC486460972D7C2A5BCEBA98A926716FAEAD78133C'
16:36:33 WARN  RDFDefaultErrorHandler :: file:zcatpipe(line 35309 column 196): {W108} Not an XML Name: 'http://purl.uniprot.org/SHA-384/E134CD65C93715F13C3A880F0B9DC1FD6E2A8D62C5D8A5AAD6B10A281C37B59C20593AD063F87A7F58C70BB54DDA9A6F'
16:36:33 WARN  RDFDefaultErrorHandler :: file:zcatpipe(line 35421 column 196): {W108} Not an XML Name: 'http://purl.uniprot.org/SHA-384/5FE424054DEC8721F7D72C959DE28400745F6EF77F1C95C3D20F2CC24647B060606DCCD2E2DA3DB7852B5CD92397E230'

Can anybody tell me why this warning appears (the ID seems legitimate to me) and whether (and how) I can ignore it?

EDIT: rapper also reports an "Illegal rdf:ID value" error.

EDIT2: For example, the fifth line of this pastebin data triggers the error (sorry, I struggled with the Markdown; it would not display my RDF data).

asked 11 Dec '12, 10:42

manuel

edited 12 Dec '12, 05:21

The dataset is huge, about 28 gigabytes. Could you share a small snippet that causes the problem?

(12 Dec '12, 04:15) utapyngo

Hi, sure: I added a link to a snippet.

(12 Dec '12, 05:28) manuel

  1. rdf:ID goes on a resource, but your data has it on a literal.
  2. rdf:ID takes a local name, not a URI; use rdf:about for URIs (which would also be wrong here, because of (1)).
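To illustrate point 2, here is a minimal Python sketch of why the parser reports "Not an XML Name". It assumes a simplified, ASCII-only approximation of the XML NCName rule (the real rule also admits many non-ASCII characters), and the helper name `is_ascii_ncname` and the third sample value are hypothetical, chosen only for this example.

```python
import re

# Simplified, ASCII-only approximation of an XML NCName:
# starts with a letter or underscore, then letters, digits, '.', '-' or '_'.
# Assumption: the full XML rule also allows many non-ASCII characters.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def is_ascii_ncname(value: str) -> bool:
    """Return True if value looks like a legal rdf:ID under the ASCII-only rule."""
    return NCNAME_ASCII.match(value) is not None


# Value taken from the first warning in the question: a full URI.
full_uri = (
    "http://purl.uniprot.org/SHA-384/"
    "270467CF2F1A4915613B11129BAFB5747CFA0FE4DA9B398EC35D9C5039B376EF"
    "6E07180933366A0FCB5306C809529D2B"
)

# The bare hash is not a legal name either, because it starts with a digit.
bare_hash = "270467CF2F1A4915613B11129BAFB5747CFA0FE4DA9B398EC35D9C5039B376EF"

# Hypothetical local name, shown only to contrast with the two cases above.
prefixed = "SHA-384_270467CF2F1A"

for candidate in (full_uri, bare_hash, prefixed):
    print(is_ascii_ncname(candidate), candidate[:40])
```

The full URI fails because it contains ':' and '/', which is why such a value belongs in rdf:about rather than rdf:ID.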

answered 12 Dec '12, 06:23

AndyS ♦

This was reported to us at UniProt and will be fixed in our next release, which is scheduled for Wednesday, 9 January 2013.

If you ignore it for now, all you will miss is whether a given annotated feature position is uncertain or not. This is also an object lesson in not using reification inside your data model, where it can lead to unexpected results. In the longer run, the specific pattern of triples describing the position of an annotated feature on the amino acid string is going to change to use the FALDO ontology.

In the meantime, please test our public SPARQL endpoint, which is in beta.


answered 14 Dec '12, 03:47

Jerven

edited 14 Dec '12, 10:20
