Is the following Turtle/N3 long literal valid, or should the quote inside it be escaped?

"""This is a "quote" in a long literal"""

The Turtle specification says that you can use the \" escape inside long strings but doesn't say that you have to.

Is the above literal valid or should it be encoded as follows instead?

"""This is a \"quote\" in a long literal"""

Edit

As an addition to my question: is it still valid to leave the quote unescaped if it occurs at the end of the literal, e.g.

"""This ends in a quote""""

Or should it be the following:

"""This ends in a quote\""""

I've had reports from other people that some RDF parsers don't understand this syntax, even though they should, since a parser is supposed to use a longest-token (i.e. maximal munch) rule.

asked 28 Apr '10, 13:06

Rob Vesse ♦

edited 29 Apr '10, 17:40

Hi Rob, thanks for the question (which highlights the problem). I've just sent an email (http://bit.ly/brzlW8) to jena-devel about it. I think it's a real problem. Have you experienced it with a specific Turtle parser?

(30 Apr '10, 09:36) castagna

Just updated my answer (My money is on the latter). Good question, I'd ask the grammar geeks about it on the semweb list.

(30 Apr '10, 10:11) Comment Bot

The above literal is valid. From the spec:

longString  ::= #x22 #x22 #x22 lcharacter* #x22 #x22 #x22

Looking at lcharacter:

lcharacter  ::= echaracter | '\"' | #x9 | #xA | #xD

And in echaracter:

echaracter  ::= character | '\t' | '\n' | '\r'

character is basically every code point from #x20 upwards except the backslash, plus '\\' and the Unicode escapes.

So '"' is fine. It's probably clearer when you contrast longString with string, which accepts:

scharacter  ::=  ( echaracter - #x22 ) | '\"'

i.e. definitely not '"'.
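To make the contrast concrete, here is a rough Python sketch that transcribes the productions above into regular expressions. The helper names (`is_long_string`, `is_short_string`) are made up for illustration, and the `character` pattern assumes the spec's definition (Unicode escapes, `\\`, and everything from #x20 up except the backslash), so treat it as an approximation of the grammar rather than a conforming Turtle lexer.

```python
import re

# character: \uXXXX, \UXXXXXXXX, an escaped backslash, or any code point
# from #x20 upwards except the backslash itself (assumed from the spec).
CHARACTER = r'(?:\\u[0-9A-F]{4}|\\U[0-9A-F]{8}|\\\\|[\x20-\x5B\x5D-\U0010FFFF])'
# echaracter ::= character | '\t' | '\n' | '\r'
ECHARACTER = r'(?:' + CHARACTER + r'|\\[tnr])'
# lcharacter ::= echaracter | '\"' | #x9 | #xA | #xD
LCHARACTER = r'(?:' + ECHARACTER + r'|\\"|[\x09\x0A\x0D])'
# scharacter ::= ( echaracter - #x22 ) | '\"'
SCHARACTER = r'(?:(?!")' + ECHARACTER + r'|\\")'

LONG_STRING = re.compile('"""' + LCHARACTER + '*"""')
STRING = re.compile('"' + SCHARACTER + '*"')


def is_long_string(text):
    """True if the whole text can be derived from the longString production."""
    return LONG_STRING.fullmatch(text) is not None


def is_short_string(text):
    """True if the whole text can be derived from the string production."""
    return STRING.fullmatch(text) is not None


print(is_long_string('"""This is a "quote" in a long literal"""'))     # True
print(is_long_string(r'"""This is a \"quote\" in a long literal"""'))  # True
print(is_short_string('"This is a "quote" in a string"'))              # False
print(is_short_string(r'"This is a \"quote\" in a string"'))           # True
```

Both spellings of the long literal are accepted, while the short string only accepts the escaped form.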

Concerning new question

I think (much less certain here) the latter is correct, i.e. """This ends in a quote\"""". The first """ ends the string: you don't search for the longest run of lcharacters between two """, otherwise it would gobble up far too much. Jena, rapper and SemWeb agree, which boosts my confidence slightly.
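A minimal sketch of that reading, in Python: the scanner closes the literal at the first unescaped run of three quotes. The function name `scan_long_literal` is hypothetical, and this is not how Jena, rapper or SemWeb are actually implemented; it only illustrates the "first """ wins" behaviour.

```python
def scan_long_literal(text):
    """Scan a long literal at the start of text.

    Returns (raw_value, rest), where raw_value is the undecoded text between
    the delimiters and rest is whatever follows the closing triple quote.
    """
    if not text.startswith('"""'):
        raise ValueError("not a long literal")
    i = 3
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif text.startswith('"""', i):
            return text[3:i], text[i + 3:]  # the first """ closes the literal
        else:
            i += 1
    raise ValueError("unterminated long literal")


# Unescaped: the literal closes early and a stray quote is left over.
print(scan_long_literal('"""This ends in a quote""""'))
# Escaped: the whole input is consumed.
print(scan_long_literal(r'"""This ends in a quote\""""'))
```

With the unescaped form the leftover `"` is what a parser built this way would go on to reject.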

answered 28 Apr '10, 13:21

Comment Bot

edited 30 Apr '10, 10:01

Also, my example should work, since a conforming parser should apply the longest token rule when parsing the input.

(28 Apr '10, 21:19) Rob Vesse ♦

Surely it would never gobble up too much. Once it sees a sequence of 3 double quotes, all it should do is consume any immediately subsequent double quotes (since there is no valid syntax that would permit a quote to appear directly after a long literal without whitespace to separate it).

(30 Apr '10, 16:57) Rob Vesse ♦
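The rule proposed in this comment could be sketched roughly as follows in Python. The name `scan_long_literal_munch` is invented for the example, and it assumes that the last three quotes of the run act as the terminator while any earlier ones belong to the value; it is an illustration of the suggestion, not code from any real parser.

```python
def scan_long_literal_munch(text):
    """Scan a long literal, extending the closing run of quotes greedily.

    Returns (raw_value, rest). On reaching an unescaped triple quote, any
    quotes that follow immediately are consumed too; the final three close
    the literal and the earlier ones are kept as part of the value.
    """
    if not text.startswith('"""'):
        raise ValueError("not a long literal")
    i = 3
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif text.startswith('"""', i):
            j = i + 3
            while j < len(text) and text[j] == '"':
                j += 1  # swallow quotes directly after the triple quote
            return text[3:j - 3], text[j:]
        else:
            i += 1
    raise ValueError("unterminated long literal")


# The unescaped trailing quote ends up inside the value, with nothing left over.
print(scan_long_literal_munch('"""This ends in a quote"""" .'))
```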

I don't think it's a case of them agreeing. I think this is a corner case they haven't thought of and that no one has put in a test suite.

(30 Apr '10, 16:59) Rob Vesse ♦

Reported it on rapper's bug tracker and Dave Beckett reckons it's been fixed in a recent version, but until I get back in the office in a couple of days' time I don't have access to a suitable machine to test this on.

(03 May '10, 11:38) Rob Vesse ♦

Paolo reported on jena-devel. AndyS replied:

"Illegal Turtle.

The first 3 " in """" are the end of """-string, leaving a single " which is illegal."

(03 May '10, 22:14) Comment Bot

Hmmm ok, may still post to W3C Semantic Web list to try and get a definitive answer

(07 May '10, 08:45) Rob Vesse ♦

Answering my own question: it seems you don't need to escape double quotes, since one of the Turtle test suite tests uses an unescaped quote in a long literal.

http://www.w3.org/TeamSubmission/turtle/tests/test-23.ttl

answered 28 Apr '10, 13:15

Rob Vesse ♦
