Re: [abnf-discuss] Are the Core Rules (Appendix B.1 of RFC 5234) always present in an ABNF spec?

Paul Kyzivat <pkyzivat@alum.mit.edu> Wed, 20 September 2023 18:25 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/abnf-discuss/QUHbC7m04TpRufhc-hV1CcWAyOQ>

Dave,

Comment at end

On 9/20/23 12:26 PM, Dave Crocker wrote:
> On 9/20/2023 8:20 AM, Carsten Bormann wrote:
>> The problem here is that RFC 8259 seems to (implicitly) import “DIGIT” and “HEXDIG” from RFC 5234 Appendix B.1, but also defines a rule “char”, which conflicts with “CHAR” in Appendix B.1 (rule names are case-insensitive per Section 2.1 of RFC 5234).
>>
>> Is this a bug in 8259?
> 
> Probably.  To use a rule that is not defined within the document using 
> it has to mean that it comes from somewhere else.  (This is a tautology, 
> but seems to be needed as a foundation for the concern here.)
> 
> If it comes from somewhere else, how is that importation specified?
> 
> If it is implied, how is a reader to reliably and accurately know what 
> is imported and where it comes from?  Readers vary widely in subject 
> matter knowledge and a specification needs to work for all readers.
> 
> In other words, it must not be 'implied'.  It has to be specified 
> explicitly.
> 
>> And, if yes, is that bug that “char” and “CHAR” conflict (i.e., because B.1 is always implicitly imported), or that 8259 forgets to import DIGIT and HEXDIG (i.e., because B.1 is not implicitly imported)?
> 
> This hits some different issues, IMO.
> 
> One is formal specification requirements.  To that, anything clearly and 
> precisely defined is fine.  So if a rulename is common elsewhere, there 
> is nothing wrong with defining it differently, within the 4 walls of a 
> given specification.
> 
> The other is a human factors concern.  If readers are generally already 
> used to using the rulename and that rulename has a particular semantic, 
> defining it differently within a spec creates a cognitive conflict, 
> called proactive inhibition.  That invites error.
> 
> However sometimes it can be useful to note the history of the name and 
> explicitly redefine it.  As long as the rationale for doing this is 
> clear and documented, it might make sense to do. This would be an 
> explicitly 'derivative' specification of the rulename, rather than 
> purely replacement.
> 
> Lastly, while it might make sense to say that a spec citing the ABNF 
> spec can be assumed to be importing rules from it, that seems to me a 
> pretty risky approach to spec writing.  For example:
> 
>   * there is nothing in the ABNF spec that declares these rules to be
>     automatically imported by users of ABNF
>   * having something in an Appendix means it is extra, and therefore not
>     'inherent'
>   * the language at the start of the Appendix is "This appendix contains
>     some basic rules that are in common use." which casts this as
>     informative rather than normative.

I agree with all of what you say.

But because there is a practical need to "import"/"copy" rules from one 
spec to another, and RFC 5234 doesn't provide a mechanism for doing so, 
various ad hoc ways of specifying this have evolved across multiple 
documents. Some are:

1) verbiage in the explanatory text of the document explaining that 
certain rules are to be taken from another document

2) an ABNF comment explaining the same thing

3) use of a prose-val, such as:
	CHAR = <as defined in RFC5234>

I generally prefer (3) when only a few rules are needed and their 
definitions don't reference lots of other rules. If the situation is 
more complex then I generally use (2).

This comes up a lot in documents that extend another document, including 
extending the ABNF in the other document. (Preferably the base document 
will have included "extension points" in the grammar.)

It can be hard to verify the ABNF in a document that extends the ABNF of 
another document. It is necessary to actually cut and paste to create 
the composite ABNF grammar and then check it. (Or implement it.)
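
As a rough illustration of that "paste it together and check it" step, 
here is a minimal Python sketch. It is not a real ABNF parser: it only 
blanks out strings, prose-vals, comments and numeric values, then 
compares the rule names that are used against the ones that are 
defined. The names `undefined_rules`, `BASE` and `CORE` and the two 
grammar fragments are made up for the example.

```python
import re

NAME = r"[A-Za-z][A-Za-z0-9-]*"

# Things that may contain letters but are not rule names:
# quoted strings, prose-vals, comments, numeric values, and the
# %s / %i string prefixes. Matching is per line, left to right.
NOISE = re.compile(
    r'"[^"\n]*"|<[^>\n]*>|;[^\n]*|%[bdxBDX][0-9A-Fa-f.\-]+|%[siSI](?=")'
)

def undefined_rules(*fragments):
    """Concatenate ABNF fragments and list rule names used but never defined.

    Rule names are compared case-insensitively, as ABNF requires.
    """
    cleaned = NOISE.sub(" ", "\n".join(fragments))
    # After cleaning, "=" only appears right after the name being defined.
    defined = {n.lower() for n in re.findall(NAME + r"(?=[ \t]*=)", cleaned)}
    used = {n.lower() for n in re.findall(NAME, cleaned)}
    return sorted(used - defined)

# Hypothetical extension grammar that relies on DIGIT without defining it.
BASE = """
number = [ minus ] int
minus = %x2D       ; -
int = zero / ( digit1-9 *DIGIT )
zero = %x30
digit1-9 = %x31-39
"""

# Hypothetical "imported" fragment supplying the missing rule.
CORE = "DIGIT = %x30-39\n"

print(undefined_rules(BASE))        # names still missing from BASE alone
print(undefined_rules(BASE, CORE))  # names missing from the composite
```

With only `BASE`, the sketch reports `digit` as undefined; once `CORE` 
is pasted in, nothing is left over.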

It's even harder when there are multiple extension documents to a single 
base document. Their rule definitions can potentially conflict. 
Verifying that a new one doesn't conflict requires searching out all 
other extensions.
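
To make the conflict check concrete, here is a small Python sketch, 
assuming the ABNF of each document has already been extracted as text. 
It only looks at the start of each rule and folds names to lower case, 
since rule names are case-insensitive. The function name 
`find_conflicts`, the document labels and the "json-like" fragment are 
invented for the example.

```python
import re
from collections import defaultdict

# A rule starts with a name followed by "=" (new rule) or "=/" (added alternatives).
RULE_START = re.compile(
    r"^[ \t]*([A-Za-z][A-Za-z0-9-]*)[ \t]*(=/?)", re.MULTILINE
)

def find_conflicts(documents):
    """Map each lowercased rule name defined with "=" in more than one
    document to the list of (document label, spelling) pairs defining it."""
    seen = defaultdict(list)
    for label, abnf in documents.items():
        for name, op in RULE_START.findall(abnf):
            if op == "=":  # "=/" extends a rule, so it is not counted as a clash
                seen[name.lower()].append((label, name))
    return {
        name: defs
        for name, defs in seen.items()
        if len({label for label, _ in defs}) > 1
    }

# Hypothetical inputs: two core rules, plus a fragment that defines "char".
docs = {
    "core": "CHAR = %x01-7F\nDIGIT = %x30-39\n",
    "json-like": "char = %x20-21 / %x23-5B\n",
}

print(find_conflicts(docs))
```

Here `CHAR` and `char` are flagged as the same rule defined twice, while 
`DIGIT` is not. A real tool would also have to decide whether two 
identical definitions count as a conflict; this sketch flags them 
regardless.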

I would *like* to see ABNF extended to have a more formal mechanism for 
importation of ABNF, and for extension of ABNF grammars. This would 
facilitate automatic verification of ABNF in documents. But I'm dubious 
whether there is sufficient interest to pursue this.

	Thanks,
	Paul