Re: [tap] Parse error vs failure

Michael G Schwern <schwern@pobox.com> Wed, 04 February 2009 06:58 UTC

Message-ID: <49893C64.1050604@pobox.com>
Date: Tue, 03 Feb 2009 22:57:40 -0800
From: Michael G Schwern <schwern@pobox.com>
To: Ovid <publiustemp-tapx@yahoo.com>
References: <4984B200.6060907@pobox.com> <439925.21633.qm@web65707.mail.ac4.yahoo.com>
In-Reply-To: <439925.21633.qm@web65707.mail.ac4.yahoo.com>
Cc: tap@ietf.org
Subject: Re: [tap] Parse error vs failure

Ovid wrote:
> ----- Original Message ----
>
>> From: Michael G Schwern <schwern@pobox.com>
>
>> In Oslo, when we came across an edge case that was considered nonsense, the
>> tendency was to say that it's a parse error.  For example, "not ok 4 # SKIP",
>> because why would you fail a test that didn't run?  Also "1..0" without a
>> skip, because why would you not run any tests and yet not skip the test?
>
> Your arguments made sense, but I have a question: what does 'not ok
> 4 # SKIP' even mean?  To my mind, that's a parse error because
> something has gone horribly wrong or the person writing the TAP
> producer is terribly confused.  There's no way the parser can
> understand what the intent is, so making it a parse error makes
> sense to me because if *I* can't parse it, how can a TAP consumer?
>
> I know I'm the odd man out here and I don't feel strongly enough
> about this to fight it :)

I think the important thing here is the difference between a syntax
error and a semantic error.  Let me illustrate:

"The color eight died" is syntactically correct.  There's the noun
"eight", modified by the adjective "color" and the verb "died" which
"the color eight" did.  There is meaning there to be parsed.  You can
draw a sentence diagram for it.  It is English.

"The color eight died" is a semantic error.  It is nonsense.  Eight is
not a color and numbers don't die.

You can have something which is syntactically correct but semantic
nonsense, and there is still meaning in semantic nonsense.

"not ok 4 # SKIP" is syntactically correct.  It applies to test #4.
That test failed.  The code was not run.

"not ok 4 # SKIP" is semantic nonsense.  You don't fail a test you
never ran.

Pragmatically, something has gone terribly wrong and the consumer
should flag that line.  So the desired result is the same whether it
is a syntax error or a semantic error.  To be clear: it still results
in an error.

This is, to a certain extent, splitting hairs, but that's part of what
formalizing TAP is about.  However, there are practical implementation
advantages to making it a semantic error.  Consider a grammar-based TAP
parser rather than one built from ad-hoc regexes.

* Simpler grammar.

A test result line can be described as:

Test-Result [SP Test-Number] [SP Description] [SP "#" SP Directive [SP Reason]]

This allows all combinations of Test-Results with Directives.  Meaning is
applied after they are parsed.
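
As a rough sketch of that two-phase idea in Python: one permissive
pattern accepts every result/directive combination, and meaning is
assigned in a separate step.  The regex stands in for the grammar rule
and is simplified; the names RESULT_RE, ResultLine, parse and classify
are made up for illustration and are not from any TAP spec or library.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified stand-in for the single test-result rule (an assumption,
# not the official grammar): any result may carry any directive.
RESULT_RE = re.compile(
    r"^(?P<result>ok|not ok)"
    r"(?: (?P<number>\d+))?"
    r"(?: (?P<description>[^#]*?))?"
    r"(?: ?# (?P<directive>(?i:SKIP|TODO))(?: (?P<reason>.*))?)?$"
)

@dataclass
class ResultLine:
    ok: bool
    number: Optional[int]
    description: Optional[str]
    directive: Optional[str]
    reason: Optional[str]

def parse(line: str) -> Optional[ResultLine]:
    """Syntax only: return the parts of a result line, or None."""
    m = RESULT_RE.match(line)
    if m is None:
        return None
    return ResultLine(
        ok=m["result"] == "ok",
        number=int(m["number"]) if m["number"] else None,
        description=m["description"] or None,
        directive=m["directive"].upper() if m["directive"] else None,
        reason=m["reason"] or None,
    )

def classify(line: ResultLine) -> str:
    """Semantics, applied after parsing: 'pass', 'fail' or 'error'."""
    if line.directive == "TODO":
        return "pass"                      # TODO tests never count as failures
    if line.directive == "SKIP":
        return "pass" if line.ok else "error"  # failing skip is nonsense
    return "pass" if line.ok else "fail"
```

With this split, parse("not ok 4 # SKIP") succeeds and classify() is
the only place that knows a failing skip is nonsense.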

If "not ok 4 # SKIP" is instead made a syntax error, that exception has
to be written into the grammar, so we need to spell out all the combinations:

Test-Result-Line = Simple-OK / Simple-Fail / Passing-TODO /
                   Failing-TODO / Passing-SKIP
Simple-OK       = "ok" [SP Test-Number] [SP Description]        ; pass
Simple-Fail     = "not ok" [SP Test-Number] [SP Description]    ; fail
Passing-TODO    = Simple-OK   SP TODO-Directive                 ; pass
Failing-TODO    = Simple-Fail SP TODO-Directive                 ; pass
Passing-SKIP    = Simple-OK   SP SKIP-Directive                 ; pass

You might want to do this to determine passing or failing grammatically, but
it does leave us with no basic definition for a test result line.


* All syntax errors are ignored.

Since we ignore all syntax errors, a failing skip test must be detected and
treated specially.  That means it must be valid syntax!

Failing-SKIP    = Simple-Fail   SP SKIP-Directive               ; error
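
To make the hazard concrete, here is a small Python sketch of a
hypothetical strict consumer: its pattern only allows SKIP after "ok",
and, as described above, it ignores any line that does not parse.  The
pattern is a simplification and the names STRICT_RE and count_results
are invented for this example.

```python
import re

# Shared middle part: optional test number and optional description.
BODY = r"(?: \d+)?(?: [^#]*?)?"

# Strict pattern (an assumption for illustration): "ok" may carry SKIP
# or TODO, but "not ok" may only carry TODO.
STRICT_RE = re.compile(
    r"^(?:"
    + r"ok" + BODY + r"(?: ?# (?i:SKIP|TODO)\b.*)?"
    + r"|"
    + r"not ok" + BODY + r"(?: ?# (?i:TODO)\b.*)?"
    + r")$"
)

def count_results(stream: str) -> int:
    """Count result lines, silently ignoring anything that fails to parse."""
    seen = 0
    for line in stream.splitlines():
        if STRICT_RE.match(line):
            seen += 1
        # Everything else is a "syntax error" and is ignored.
    return seen
```

Fed "ok 1", "not ok 2 - no database # SKIP" and "ok 3", this consumer
counts only two results.  The failing skip has vanished, and the most
it can report is a mismatch with the plan.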


* Better error reporting.

This is the most pragmatic reason.  Since "not ok 4 # SKIP" is an
actual error, as opposed to an ignored line, you're going to
have to detect it and parse it anyway to inform the user.

Consider if you see:

TAP Version 14
1..3
ok 1
not ok 2 - no database # SKIP
  ---
  file: foo.t
  line: 42
  ...
ok 3

Now, what would you rather tell the user?  If it's just a syntax
error, then all you get is:

  Syntax error at TAP stream line 4.
  Diagnostics without test at line 5.

But if you parse and understand it, you can use that information to say:

  Nonsense failing skip test "no database" at foo.t line 42.

pointing the user at the problem.  Isn't that better?
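
A sketch of how a consumer might get to that message, in Python.  It
assumes the simplified line pattern below and reads the indented
diagnostic block as plain "key: value" pairs rather than real YAML;
RESULT_RE and failing_skip_messages are illustrative names only.

```python
import re

# Simplified result-line pattern (an assumption, not the official grammar).
RESULT_RE = re.compile(
    r"^(?P<result>ok|not ok)(?: (?P<number>\d+))?"
    r"(?: (?:- )?(?P<description>[^#]*?))?"
    r"(?: ?# (?P<directive>(?i:SKIP|TODO))(?: (?P<reason>.*))?)?$"
)

def failing_skip_messages(stream):
    """Return one message per failing skip, using its diagnostics if present."""
    problems = []           # one dict per failing-skip result
    current = None          # failing skip that following diagnostics belong to
    in_diagnostics = False
    for raw in stream.splitlines():
        text = raw.strip()
        if in_diagnostics:
            if text == "...":
                in_diagnostics = False
            elif current is not None and ":" in text:
                key, _, value = text.partition(":")
                current[key.strip()] = value.strip()
            continue
        if text == "---" and raw.startswith(" "):
            in_diagnostics = True
            continue
        match = RESULT_RE.match(raw)
        current = None
        if (match and match["result"] == "not ok"
                and (match["directive"] or "").upper() == "SKIP"):
            current = {"description": match["description"] or ""}
            problems.append(current)

    messages = []
    for p in problems:
        msg = f'Nonsense failing skip test "{p["description"]}"'
        if "file" in p and "line" in p:
            msg += f' at {p["file"]} line {p["line"]}'
        messages.append(msg + ".")
    return messages
```

Run over the stream above, this produces the "no database" message
with foo.t line 42, which is only possible because the failing skip
line was parsed instead of discarded.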


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.