Re: [Rswg] Archive format

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Fri, 29 July 2022 09:34 UTC

Message-ID: <ac05ae83-18bc-c174-db0d-09b02ce15a92@it.aoyama.ac.jp>
Date: Fri, 29 Jul 2022 18:33:50 +0900
To: Mark Nottingham <mnot@mnot.net>, Brian E Carpenter <brian.e.carpenter@gmail.com>
Cc: rswg@rfc-editor.org
References: <fa101e0e-998e-900c-3a03-824abd1dd0fa@gmail.com> <a4ab3fa7-b59e-14e2-e442-a38d6f87b8c7@gmx.de> <CACOFP=i-85zPcQY3sWrno9dDYbt5e2DogzZ1S9fSjc6=FHBoyA@mail.gmail.com> <483e8a6f-4507-7fe4-5770-43f7773688a8@gmail.com> <BF154431-6A28-44A4-93C9-C92C96729AA5@mnot.net>
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
In-Reply-To: <BF154431-6A28-44A4-93C9-C92C96729AA5@mnot.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rswg/QF9TgKGDlOPia96JWzU-blb4HIg>

On 2022-07-29 12:51, Mark Nottingham wrote:
> I think it depends on what properties we're assuming 'archival' and 'canonical' documents have.
> 
> In my mind, the closest thing to stable has always been the textual representation. The HTML representation changes with e.g., stylesheet changes; the PDF representations sometimes need updating; and as we see, even the XML vocabulary needs to change over time. All of them seem to evolve in such a way as to not change the textual representation in any meaningful way (for some definition of meaningful that probably features whitespace pretty prominently).
> 
> Just food for thought...

Yes indeed. As network engineers, we tend to care a lot about the bits 
on the wire, and so we care a lot about the bits in an RFC. But an RFC 
isn't really bits, it's text. And I don't mean text in the sense of 
ASCII (or UTF-8) text here, but text in the sense of words and sentences.

White space features prominently in there if it's relevant for layout 
that affects meaning, but otherwise it's pretty irrelevant. So is most 
of the CSS styling for the HTML.

Of course, when you start to be sloppy with the bits and bytes, this may 
eventually lead to sloppiness with the textual content, and so we 
shouldn't be sloppy with bits/bytes/markup/... But we shouldn't deal 
with it as if it were the holy grail.

On 2022-07-29 06:48, Brian E Carpenter wrote:
 > On 29-Jul-22 01:56, John C Klensin wrote:

 >> Not speaking for Brian, but there might be at least two reasons:
 >>
 >> (1) The very fact that the vocabulary / schema seems to be
 >> evolving rapidly to accommodate problem fixes and additional
 >> features is problematic from the standpoint of an archival
 >> format that was expected to be completely stable.
 >
 > Exactly. As I understand it (and I have not looked at the details),
 > RFC 8651 and RFC 9286 use different xml grammars. This makes any
 > claim of a stable archive format dubious. Imagine trying to explain
 > that to a jury under hostile cross-examination as an expert witness.

[IANAL] First, we have to distinguish between a stable (e.g. bit-by-bit, 
or at whatever other level, see above) archival version of a single RFC 
or a range of RFCs and a single stable archive format. Of course having 
the latter is desirable, but lawyers will only be interested in the former.

Second, even for the former, if there are versions that differ 
bit-by-bit but have the same textual contents, lawyers (and the court, 
and the jury) will be interested in the textual contents and not in the 
bits, unless the difference in the bits could somehow affect 
the meaning of the text. (*) So again, we should be very careful about 
what we do, but bringing up hostile cross-examination when discussing a 
stable archival format is a red herring.

Regards,   Martin.

(*) I was involved in a legal case (it never reached court because the 
parties settled early) where white space was actually relevant. I was 
deposed as the author of 
https://www.ietf.org/archive/id/draft-duerst-dns-i18n-00.txt and its 
updates, because a company had managed to get a patent on it, and that 
patent had ended up in the hands of some patent troll company which was 
then suing a company that used i18n domain names.

The defendant's lawyer showed me page 6 of 
https://www.ietf.org/archive/id/draft-duerst-dns-i18n-02.txt, 
in particular the table on that page, where the data columns are off 
from the column headings. (The -01 version doesn't have the problem.) I 
don't remember exactly, but I guess I apologized for my sloppiness.

The lawyers also showed me the patent itself, which had the same column 
shift. I expressed surprise at that, because to anybody who had any 
idea about what was discussed in the draft, it should have been obvious 
that the columns were off; that this wasn't corrected in the patent 
seemed to indicate that the people writing the patent didn't really 
understand what they were talking about.