Re: [tap] Parse error vs failure

Michael G Schwern <schwern@pobox.com> Wed, 04 February 2009 06:58 UTC

Message-ID: <49893C64.1050604@pobox.com>
Date: Tue, 03 Feb 2009 22:57:40 -0800
From: Michael G Schwern <schwern@pobox.com>
User-Agent: Thunderbird 2.0.0.19 (Macintosh/20081209)
MIME-Version: 1.0
To: Ovid <publiustemp-tapx@yahoo.com>
References: <4984B200.6060907@pobox.com> <439925.21633.qm@web65707.mail.ac4.yahoo.com>
In-Reply-To: <439925.21633.qm@web65707.mail.ac4.yahoo.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: tap@ietf.org
Subject: Re: [tap] Parse error vs failure
Precedence: list

Ovid wrote:
> ----- Original Message ----
>
>> From: Michael G Schwern <schwern@pobox.com>
>
>> In Oslo, when we came across an edge case that was considered nonsense, the
>> tendency was to say that it's a parse error.  For example, "not ok 4 # SKIP",
>> because why would you fail a test that didn't run?  Also "1..0" without a
>> skip, because why would you not run any tests and yet not skip the test?
>
> Your arguments made sense, but I have a question: what does 'not ok
> 4 # SKIP' even mean?  To my mind, that's a parse error because
> something has gone horribly wrong or the person writing the TAP
> producer is terribly confused.  There's no way the parser can
> understand what the intent is, so making it a parse error makes
> sense to me because if *I* can't parse it, how can a TAP consumer?
>
> I know I'm the odd man out here and I don't feel strongly enough
> about this to fight it :)

I think the important thing here is the difference between a syntax
error and a semantic error.  Let me illustrate:

"The color eight died" is syntactically correct.  There's the noun
"eight", modified by the adjective "color" and the verb "died" which
"the color eight" did.  There is meaning there to be parsed.  You can
draw a sentence diagram for it.  It is English.

"The color eight died" is a semantic error.  It is nonsense.  Eight is
not a color and numbers don't die.

You can have something which is syntactically correct but semantic
nonsense, and there is still meaning in semantic nonsense.

"not ok 4 # SKIP" is syntactically correct.  It applies to test #4.
That test failed.  The code was not run.

"not ok 4 # SKIP" is semantic nonsense.  You don't fail a test you
never ran.

Pragmatically, something has gone terribly wrong and the consumer
should flag that line.  Whether we call it a syntax error or a semantic
error, the desired result is the same: it still results in an error.
I hope that's clear.

This is, to a certain extent, splitting hairs, but that's part of what
formalizing TAP is about.  However, there are practical implementation
advantages to making it a semantic error.  Consider a grammar-based TAP
parser, rather than using ad-hoc regexes.

* Simpler grammar.

A test result line can be described as:

Test-Result [SP Test-Number] [SP Description] [SP "#" SP Directive [SP Reason]]

This allows all combinations of Test-Results with Directives.  Meaning is
applied after they are parsed.
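
As a rough illustration of "parse first, apply meaning afterwards", here is
a minimal Python sketch.  The regex is only my approximation of the one-line
grammar above, and the names (ResultLine, parse_result_line,
semantic_errors) are made up for this example; none of it is an official
TAP grammar or an existing parser's API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern approximating the one-line grammar above.
# Any Test-Result may be combined with any Directive.
RESULT_LINE = re.compile(
    r"^(?P<result>not ok|ok)"
    r"(?: (?P<number>\d+))?"
    r"(?: (?P<description>[^#]*?))?"
    r"(?: # (?P<directive>(?i:SKIP|TODO))(?: (?P<reason>.*))?)?$"
)


@dataclass
class ResultLine:
    ok: bool
    number: Optional[int]
    description: Optional[str]
    directive: Optional[str]
    reason: Optional[str]


def parse_result_line(line: str) -> Optional[ResultLine]:
    """Syntax only: return the parsed pieces, or None if the line does not match."""
    m = RESULT_LINE.match(line)
    if m is None:
        return None
    number = m.group("number")
    directive = m.group("directive")
    return ResultLine(
        ok=(m.group("result") == "ok"),
        number=int(number) if number is not None else None,
        description=m.group("description") or None,
        directive=directive.upper() if directive else None,
        reason=m.group("reason"),
    )


def semantic_errors(result: ResultLine) -> List[str]:
    """Meaning is applied here, after parsing has already succeeded."""
    errors = []
    if not result.ok and result.directive == "SKIP":
        errors.append("failing SKIP test")
    return errors
```

With this split, parse_result_line("not ok 4 # SKIP") succeeds and the
nonsense is only reported by semantic_errors().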

Whereas if "not ok 4 # SKIP" is made a syntax error, that exception has to
be written into the grammar, so we need to spell out all the combinations:

Test-Result-Line = Simple-OK / Simple-Fail / Passing-TODO /
                   Failing-TODO / Passing-SKIP
Simple-OK       = "ok" [SP Test-Number] [SP Description]        ; pass
Simple-Fail     = "not ok" [SP Test-Number] [SP Description]    ; fail
Passing-TODO    = Simple-OK   SP TODO-Directive                 ; pass
Failing-TODO    = Simple-Fail SP TODO-Directive                 ; pass
Passing-SKIP    = Simple-OK   SP SKIP-Directive                 ; pass

You might want to do that to determine passing or failing grammatically, but
it leaves us with no basic definition of a test result line.

* All syntax errors are ignored.

Since we ignore all syntax errors, a failing skip test must be detected and
treated specially.  That means it must be valid syntax!

Failing-SKIP    = Simple-Fail   SP SKIP-Directive               ; error
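
To make the distinction concrete, here is a small Python sketch of a
consumer that silently ignores lines it cannot parse but still flags a
failing skip.  The pattern is deliberately simplified and the function name
classify is hypothetical; it assumes "ignored" is what happens to any line
that is not a test result.

```python
import re

# Simplified, hypothetical pattern: any result may carry a SKIP or TODO directive.
RESULT = re.compile(r"^(not ok|ok)\b[^#]*(?:# ((?i:SKIP|TODO))\b.*)?$")


def classify(line: str) -> str:
    m = RESULT.match(line)
    if m is None:
        return "ignored"  # unparseable line: skipped silently
    failed = m.group(1) == "not ok"
    directive = (m.group(2) or "").upper()
    if failed and directive == "SKIP":
        return "error"    # valid syntax, nonsense meaning: must be reported
    return "result"
```

If the failing skip did not match the grammar at all, classify() would have
no way to tell it apart from stray output.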

* Better error reporting.

This is the most pragmatic reason.  Since "not ok 4 # SKIP" is an
actual error, as opposed to an ignored line, you're going to
have to detect it and parse it anyway to inform the user.

Consider if you see:

TAP Version 14
1..3
ok 1
not ok 2 - no database # SKIP
  ---
  file: foo.t
  line: 42
  ...
ok 3

Now, what would you rather tell the user?  If it's just a syntax
error, then all you get is:

  Syntax error at TAP stream line 4.
  Diagnostics without test at line 5.

But if you parse and understand it, you can use that information to say:

  Nonsense failing skip test "no database" at foo.t line 42.

pointing the user at the problem.  Isn't that better?
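
A minimal Python sketch of that kind of reporting follows.  It assumes the
diagnostic block is a flat list of "key: value" lines between "---" and
"...", which it reads by hand rather than with a real YAML parser, and the
function name report_failing_skips is made up for illustration.

```python
import re
from typing import Dict, List

# Hypothetical, simplified result-line pattern; the leading "- " of a
# description is optional and is not kept as part of the name.
RESULT = re.compile(
    r"^(?P<result>not ok|ok)(?: (?P<number>\d+))?"
    r"(?: (?:- )?(?P<description>[^#]*?))?"
    r"(?: # (?P<directive>(?i:SKIP|TODO)).*)?$"
)


def report_failing_skips(lines: List[str]) -> List[str]:
    messages = []
    i = 0
    while i < len(lines):
        m = RESULT.match(lines[i])
        i += 1
        if m is None:
            continue  # not a test result line
        # Collect a "---" ... "..." diagnostic block directly after the result.
        diag: Dict[str, str] = {}
        if i < len(lines) and lines[i].strip() == "---":
            i += 1
            while i < len(lines) and lines[i].strip() != "...":
                key, sep, value = lines[i].strip().partition(": ")
                if sep:
                    diag[key] = value
                i += 1
            i += 1  # step over the closing "..."
        directive = (m.group("directive") or "").upper()
        if m.group("result") == "not ok" and directive == "SKIP":
            name = m.group("description") or "test " + (m.group("number") or "?")
            where = ""
            if "file" in diag and "line" in diag:
                where = f' at {diag["file"]} line {diag["line"]}'
            messages.append(f'Nonsense failing skip test "{name}"{where}.')
    return messages
```

Fed the example stream above, one line per list element, this yields the
single message shown.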

-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.
