If you spend much time with RDF you'll run into a case where somebody confuses rdfs:label and rdf:label or otherwise messes up the name of a property (well-known or not).

I'd like some tool that, given (i) an OWL ontology and (ii) a Turtle file, reports cases where properties and types are used that are not defined in the ontology. I'm particularly interested in running this on the output of a processing chain to confirm that the documentation implied by the OWL ontology really applies to the data product.

Does something like this exist?

asked 17 Dec '11, 19:01

database_animal ♦


You might want to check out Pellet ICV -- it uses a slightly different view of the standard OWL semantics, which lets you use your OWL ontology to validate your instance data. It won't catch rdf:label vs rdfs:label (unless you said everything must have an rdfs:label), but it will flag objects which violate domain & range restrictions, cardinality violations, etc. ICV support is also built into Stardog.
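To make the "different view" concrete: under standard OWL semantics an rdfs:domain axiom lets a reasoner infer a missing type, whereas a closed-world constraint reading treats the missing type as a violation. The following is only a minimal sketch of that idea in plain Python, not how Pellet ICV is implemented. The function name domain_violations, the tuple-based triple representation and the ex: names are assumptions made for illustration, and subclass reasoning is ignored.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def domain_violations(ontology, data):
    """Closed-world reading of rdfs:domain.

    Both arguments are iterables of (subject, predicate, object) string
    tuples (an illustrative stand-in for parsed RDF). A triple is reported
    when its subject is not explicitly typed with the domain declared for
    its predicate. No subclass or other inference is applied.
    """
    # property -> set of declared domain classes
    domains = {}
    for s, p, o in ontology:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)

    # subject -> set of explicitly asserted types
    types = {}
    for s, p, o in data:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    violations = []
    for s, p, o in data:
        for required in sorted(domains.get(p, ())):
            if required not in types.get(s, set()):
                violations.append((s, p, required))
    return violations


if __name__ == "__main__":
    ontology = [("ex:worksFor", RDFS_DOMAIN, "ex:Person")]
    data = [
        ("ex:alice", RDF_TYPE, "ex:Person"),
        ("ex:alice", "ex:worksFor", "ex:acme"),
        ("ex:bob", "ex:worksFor", "ex:acme"),  # ex:bob has no asserted type
    ]
    for subject, prop, cls in domain_violations(ontology, data):
        print(f"{subject} uses {prop} but is not typed as {cls}")
```

An open-world reasoner would instead conclude that ex:bob is an ex:Person and report nothing, which is why plain OWL reasoning is of little use for validating data.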

answered 19 Dec '11, 08:37

mhgrove

I wonder whether Pellet ICV, which is essentially a closed-world constraint checker, can be used as "some tool that ... reports cases where properties and types are used that are not defined in the ontology."

That it can be used for other relevant tasks is beyond question, but the original request was only about checking for undeclared entities, with a primary focus on misspelled term names.

(19 Dec '11, 09:18) Michael Schneider ♦

As I said, it won't really help with name misspellings. But I suspect you could set things up such that it could catch undeclared types and/or properties; you'd just have to model specifically to try and capture these cases. I think it's generally useful for satisfying his requirement to "confirm that the documentation implied by the OWL ontology really applies to the data product".

(19 Dec '11, 09:28) mhgrove

"I suspect you could set things up such that it could catch undeclared types and/or properties"

That depends on the point at which Pellet ICV does the checking and on which parsing software is used. If checking is applied after parsing with the OWL API, then it will already be too late to find such problems, as the OWL API will invent the missing declarations as a matter of "repairing" the input data.

But even if a different parser is used, checking for undeclared or misspelled terms is better done immediately at the syntax level, not in a later processing stage after transformations may have been applied.

(19 Dec '11, 10:27) Michael Schneider ♦

Could this be done by issuing a few SPARQL queries?
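A rough sketch of what that could look like, assuming Python with the third-party rdflib package and that both the ontology and the data are available as Turtle text. The function name undeclared_terms, the ignore parameter and the choice of which rdf:type values count as a "declaration" are assumptions for illustration, not an established tool.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Every predicate used in the data.
USED_PROPERTIES = "SELECT DISTINCT ?term WHERE { ?s ?term ?o }"

# Every IRI used as the object of rdf:type in the data.
USED_CLASSES = "SELECT DISTINCT ?term WHERE { ?s a ?term . FILTER isIRI(?term) }"

# Every term the ontology declares as a class or property.
# Assumption: these six kinds are what counts as "declared".
DECLARED_TERMS = """
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?term WHERE {
  ?term a ?kind .
  FILTER (?kind IN (owl:Class, rdfs:Class, rdf:Property,
                    owl:ObjectProperty, owl:DatatypeProperty,
                    owl:AnnotationProperty))
}
"""


def _terms(graph, query):
    """Run a one-variable SELECT query and return the bound terms as strings."""
    return {str(row[0]) for row in graph.query(query)}


def undeclared_terms(ontology_ttl, data_ttl, ignore=(RDF_TYPE,)):
    """Return the sorted IRIs used in the data but not declared in the ontology.

    Terms listed in `ignore` are never reported; by default that is only
    rdf:type itself.
    """
    # Third-party dependency (pip install rdflib), imported lazily so the
    # query strings above can be reused without it.
    from rdflib import Graph

    onto = Graph()
    onto.parse(data=ontology_ttl, format="turtle")
    data = Graph()
    data.parse(data=data_ttl, format="turtle")

    declared = _terms(onto, DECLARED_TERMS) | set(ignore)
    used = _terms(data, USED_PROPERTIES) | _terms(data, USED_CLASSES)
    return sorted(used - declared)
```

With this reading, rdf:label is reported because no ontology declares it, but rdfs:label is reported too unless the ontology declares it or it is added to ignore. The queries run directly on the parsed triples, so nothing has had a chance to "repair" missing declarations.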

answered 19 Dec '11, 06:16

Fabian Cretton

One day, I decided to spend a bit of time to write a simple Perl script to check for misspelled well-known term names. Get it from

http://pastebin.com/1gbF9i3s

and feel free to adjust it to your particular needs. I hope that it's helpful. (It has saved me a lot of time when creating reasoning test cases, where misspelled term names are a primary source of difficult-to-detect errors.)

Btw, if you find any bug, please leave a comment. Thanks!
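The general idea of checking terms in well-known namespaces can be sketched in a few lines. The following is not the Perl script linked above (its contents are not reproduced here), only a minimal Python sketch of the same kind of check. KNOWN_TERMS covers just the RDF and RDFS vocabularies, check_term is a hypothetical helper name, and the rdf:_1, rdf:_2, ... container membership properties are not handled.

```python
import difflib

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Assumption: only these two vocabularies are checked; extend as needed.
KNOWN_TERMS = {
    RDF: {
        "type", "Property", "Statement", "subject", "predicate", "object",
        "Bag", "Seq", "Alt", "List", "first", "rest", "nil", "value",
        "XMLLiteral", "langString", "HTML", "PlainLiteral",
    },
    RDFS: {
        "Resource", "Class", "Literal", "Datatype", "Container",
        "ContainerMembershipProperty", "subClassOf", "subPropertyOf",
        "domain", "range", "label", "comment", "member", "seeAlso",
        "isDefinedBy",
    },
}


def check_term(iri):
    """Return None if the IRI looks fine, otherwise a human-readable hint.

    IRIs outside the namespaces in KNOWN_TERMS are not checked at all.
    """
    for ns, names in KNOWN_TERMS.items():
        if not iri.startswith(ns):
            continue
        local = iri[len(ns):]
        if local in names:
            return None
        # Right local name, wrong namespace (the rdf:label vs rdfs:label case).
        for other_ns, other_names in KNOWN_TERMS.items():
            if other_ns != ns and local in other_names:
                return f"<{iri}> is not defined; did you mean <{other_ns}{local}>?"
        # Otherwise look for a near miss within the same namespace.
        close = difflib.get_close_matches(local, sorted(names), n=1)
        if close:
            return f"<{iri}> is not defined; did you mean <{ns}{close[0]}>?"
        return f"<{iri}> is not defined in its namespace"
    return None


if __name__ == "__main__":
    for term in (RDF + "label", RDFS + "subClasOf", RDFS + "label"):
        print(term, "->", check_term(term) or "ok")
```

Because it only looks at IRIs, a check like this can run over the raw predicates and types of a file before any OWL-level processing takes place.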

answered 18 Dec '11, 14:46

Michael Schneider ♦

edited 18 Dec '11, 15:11

You could merge the OWL ontology with the Turtle file and check the result against the OWL 2 validator, with the OWL 2 DL profile. It will report all properties and classes that are used but not declared explicitly. It will also show you other potential warnings or errors with respect to the OWL DL syntax, but you don't have to care about them.

answered 18 Dec '11, 12:40

Antoine Zimmermann ♦

I don't think that the OWL API's validator is well-suited for database_animal's purpose. AFAIK, it checks input data from the point of view of the OWL 2 Structural Specification and will map RDF data into this formalism, applying "repairs" to enable such a mapping. So for a triple such as ex:foo rdf:label "Foo", it will first invent some property type for rdf:label and will then happily tell you that "the ontology and all of its imports are in the OWL 2 profile". It will, in fact, report issues w.r.t. OWL 2 DL, but then the validator is only interesting for input data intended to be strictly in OWL 2 DL.

(18 Dec '11, 14:22) Michael Schneider ♦

It will tell you exactly:

Use of undeclared data property rdf:label "Foo"

(19 Dec '11, 03:23) Antoine Zimmermann ♦

As I said, it does so only when asked whether the given input data is in OWL 2 DL or not. But if you use the validator in that particular mode, then it will also inform you about all kinds of "undeclared" terms when given DL-invalid data, such as RDFS vocabularies. Check it out with DCTerms at http://dublincore.org/2010/10/11/dcterms.rdf! Validator@OWL2DL produces a huge list of false positives: every single DCTerms name is reported to be an undeclared property or class! This is hardly what people have in mind when asking for a tool to check for misspelled (well-known) terms.

(19 Dec '11, 04:56) Michael Schneider ♦

Ah ok, I understand your point. Anyway, it's just something that came to mind and that could be used out of the box. I guess a completely satisfying solution would need some extra implementation, unless there is an existing tool for the job that I am not aware of.

(19 Dec '11, 07:43) Antoine Zimmermann ♦
Asked: 17 Dec '11, 19:01

Seen: 1,236 times

Last updated: 19 Dec '11, 10:27