Re: [abnf-discuss] [art] [Technical Errata Reported] RFC7601 (5435)

Peter Occil <poccil14@gmail.com> Mon, 23 July 2018 12:19 UTC

Message-ID: <5b55c7e3.1c69fb81.10c36.95dc@mx.google.com>
MIME-Version: 1.0
To: "John R. Levine" <johnl@iecc.com>, "abnf-discuss@ietf.org" <abnf-discuss@ietf.org>
Cc: Murray Kucherawy <superuser@gmail.com>, "ben@nostrum.com" <ben@nostrum.com>, Alexey Melnikov <aamelnikov@fastmail.fm>, "adam@nostrum.com" <adam@nostrum.com>, "art@ietf.org" <art@ietf.org>
From: Peter Occil <poccil14@gmail.com>
Date: Mon, 23 Jul 2018 08:19:48 -0400
In-Reply-To: <alpine.OSX.2.21.1807222231200.19369@ary.qy>
References: <20180723000558.CD758B80F99@rfc-editor.org> <alpine.OSX.2.21.1807222152380.18947@ary.qy> <5b5539e6.1c69fb81.b4b6d.5477@mx.google.com> <alpine.OSX.2.21.1807222231200.19369@ary.qy>
Content-Type: multipart/alternative; boundary="_3DC1092F-2F87-4954-8E0F-FBB2D9E9FFD2_"
Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/Veh1XGGurjXy6DjugU6P9-iMdmU>
Subject: Re: [abnf-discuss] [art] [Technical Errata Reported] RFC7601 (5435)

You have pointed to an unexpected weakness of ABNF: it has no strong concept of matching.  As a result, many ABNF productions in RFCs unfortunately can't be used as is to build a parser.

For example, RFC 5234 sec. 3.2 is silent on the order in which alternatives are to be matched (although I admit this silence is useful, since it lets the incremental alternatives feature, sec. 3.3, work).  The production "received-token" in RFC 5322 is illustrative:

   received-token  =   word / angle-addr / addr-spec / domain

Both "word" and "domain" include the production "atom".  Thus, whether a received-token parser parses an "atom" as a "word" or as a "domain" depends on the context in which the "atom" appears, on the kind of parser (greedy or not), and on whether earlier alternatives take precedence over later alternatives or vice versa.  RFC 5234 defines none of these.  "Obviously", if the "atom" is part of a domain, then the received-token should be the whole "domain", regardless of the details just mentioned.
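
To make the ambiguity concrete, here is a minimal Python sketch.  It is only an illustration under simplifying assumptions: "word" is reduced to a bare atom and "domain" to a dot-atom; CFWS, quoted-string, domain-literal, angle-addr and addr-spec are left out; and the function names first_match and longest_match are my own.  Neither strategy is prescribed by RFC 5234.

```python
import re

# Simplified stand-ins for RFC 5322 rules (assumption: CFWS, quoted-string,
# domain-literal, angle-addr and addr-spec are omitted to keep this short).
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
ATOM = re.compile(ATEXT + "+")                               # word   ~ atom
DOT_ATOM = re.compile(ATEXT + r"+(?:\." + ATEXT + "+)*")     # domain ~ dot-atom

# Alternatives in the order they appear in the (reduced) received-token rule.
ALTERNATIVES = [("word", ATOM), ("domain", DOT_ATOM)]


def first_match(text, pos=0):
    """Ordered choice: return the first alternative that matches at pos."""
    for name, rule in ALTERNATIVES:
        m = rule.match(text, pos)
        if m:
            return name, m.group()
    return None


def longest_match(text, pos=0):
    """Try every alternative and keep the longest match; ties go to the earlier one."""
    best = None
    for name, rule in ALTERNATIVES:
        m = rule.match(text, pos)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best


print(first_match("example.com"))    # ('word', 'example')
print(longest_match("example.com"))  # ('domain', 'example.com')
print(longest_match("localhost"))    # ('word', 'localhost')
```

With ordered choice, "example.com" is cut short to the word "example" and ".com" is left unparsed; with longest match the same input is a domain.  For a lone atom such as "localhost" both alternatives match the same text, so the label depends purely on the tie-breaking rule the implementer happened to pick.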

[ Additional context: https://www.ietf.org/mail-archive/web/art/current/msg00571.html ]

From: John R. Levine
Sent: Sunday, July 22, 2018 10:33 PM
To: Peter Occil
Cc: Murray Kucherawy; ben@nostrum.com; Alexey Melnikov; adam@nostrum.com; art@ietf.org
Subject: RE: [art] [Technical Errata Reported] RFC7601 (5435)

> And also, the erratum I submitted can be correct for handling certain 
> parsers that skip the CFWS in the “pvalue”, set the pointer to after the 
> CFWS (if any), and move on to parsing the next “CFWS propspec” within 
> the (current) “resinfo” production (in other words, the parser does 
> “greedy” matching).

ABNF doesn't have to be LL(k) or LR(k), so greedy parsers won't work, at 
least not without rewriting the rules to factor out ambiguities.
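
The greedy-matching problem can be sketched with a toy grammar in Python.  This is not the RFC 7601 grammar; the rules "item" and "list" below are hypothetical and only mimic the shape under discussion (an element with optional trailing whitespace, followed by whitespace-separated elements):

   item = "a" [SP]
   list = item *(SP item)

```python
def item_ends(text, pos, backtrack):
    """Yield possible end positions for the toy rule: item = "a" [SP]."""
    if not text.startswith("a", pos):
        return
    end = pos + 1
    if text.startswith(" ", end):
        yield end + 1          # optional SP taken (the greedy choice)
        if backtrack:
            yield end          # optional SP left for the enclosing rule
    else:
        yield end


def matches_list(text, backtrack):
    """Return True if the whole text matches: list = item *(SP item)."""
    def rest(pos):
        if pos == len(text):
            return True
        if text.startswith(" ", pos):
            # The separator SP is required before each further item.
            return any(rest(e) for e in item_ends(text, pos + 1, backtrack))
        return False

    return any(rest(e) for e in item_ends(text, 0, backtrack))


print(matches_list("a a", backtrack=False))  # False
print(matches_list("a a", backtrack=True))   # True
```

Without backtracking, the first item swallows the space, so the separator that the list rule requires is no longer there and the match fails.  With backtracking (or after rewriting the rules so that only one of them owns the whitespace), the same input is accepted.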

Regards,
John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly