Re: [dispatch] RFC 3896 and 3987 vs WHATWG URL Living Standard

Alwin Blok <alwinb@gmail.com> Tue, 08 June 2021 12:43 UTC

Hello, 

I have recently subscribed to this mailing list after seeing Larry Masinter's message on the issue tracker of the WHATWG URL Standard <https://github.com/whatwg/url/issues/479#issuecomment-855303022>. 

I have done a lot of research on the WHATWG URL standard. Some of the results can be seen on my personal GitHub page here <https://github.com/alwinb/url-specification> and here <https://github.com/alwinb/spec-url>. To prevent confusion: I am doing that work as an individual, and I am not part of a WHATWG steering group.

I believe that an addendum to RFC3986 (URI) and RFC3987 (IRI) that specifies the WHATWG parsing and resolution behaviour would be _very_ valuable. Another, perhaps preferable, option would be to combine RFC3986 (URI) and RFC3987 (IRI) into a new document that specifies elementary operations that can easily be combined to reproduce the WHATWG behaviour exactly.

Personally, I think that the WHATWG standard would improve greatly if the 'basic-url-parser' were refactored and split up into separate parsing, resolving, normalising and encoding operations. I am doubtful that this will happen at all, though I do expect that over time support for relative URLs will have to be added.
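
To illustrate the kind of split I mean, here is a minimal sketch in Python. It is only an illustration of the decomposition, not of the WHATWG behaviour: it leans on Python's urllib.parse, which follows the RFCs rather than the WHATWG standard, and the function names (parse, resolve, normalise, encode, resolve_and_print) are hypothetical names chosen for this example.

```python
from urllib.parse import SplitResult, quote, urljoin, urlsplit, urlunsplit


def parse(text: str) -> SplitResult:
    """Split a (possibly relative) reference into components, nothing more."""
    return urlsplit(text)


def resolve(reference: SplitResult, base: SplitResult) -> SplitResult:
    """Resolve a reference against a base, using the library's RFC-style rules."""
    return urlsplit(urljoin(urlunsplit(base), urlunsplit(reference)))


def normalise(url: SplitResult) -> SplitResult:
    """Lowercase scheme and authority.

    Simplification: this would also lowercase any userinfo in the authority.
    """
    return url._replace(scheme=url.scheme.lower(), netloc=url.netloc.lower())


def encode(url: SplitResult) -> str:
    """Percent-encode the path and serialise; existing '%' escapes are kept."""
    return urlunsplit(url._replace(path=quote(url.path, safe="/%")))


def resolve_and_print(text: str, base_text: str) -> str:
    """Compose the four elementary operations into one convenience function."""
    return encode(normalise(resolve(parse(text), parse(base_text))))
```

The point is only that each step can be specified and tested on its own, and that a combined operation such as resolve_and_print("../b c", "HTTP://Example.COM/x/y/z") is then just a composition of them.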

I have thought a lot about how to resolve this difficult situation, and I have come to the conclusion that the WHATWG and the IETF can actually (I am serious) complement each other well here. The style of the IETF is more suitable for specifying the elementary operations and the deeper structure, whereas the WHATWG is limited in that area and can instead focus on the step-by-step instructions in pseudocode and on the specification of the web APIs.

The most important things will be that:

1) Any new IETF effort at an updated RFC3986/RFC3987 specifies elementary operations that the WHATWG can use to implement the WHATWG behaviour exactly, and

2) The WHATWG acknowledges the new IETF effort and explicitly points out that the behaviour is equivalent to theirs. This is similar to what has been done in the WHATWG Encoding Standard <https://encoding.spec.whatwg.org/#utf-8> for UTF-8. I will add that, personally, I think such a statement should be made a bit more prominent, both for the benefit of readers and as a gesture to the IETF.

All of this is just an outline, of course, and it will still require a lot of effort.
But I believe that it can be done and that it is worthwhile.

Regards,
Alwin Blok