Re: I-D Action: draft-carpenter-6man-rfc6874bis-00.txt

Andrew Cady <andy@cryptonomic.net> Tue, 06 July 2021 15:25 UTC

Date: Tue, 6 Jul 2021 11:25:27 -0400
From: Andrew Cady <andy@cryptonomic.net>
To: 6MAN WG <ipv6@ietf.org>
Subject: Re: I-D Action: draft-carpenter-6man-rfc6874bis-00.txt
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/ocNXw2Tl7YnOXOVjnUJ_VS7PI88>

On Mon, Jul 05, 2021 at 11:08:24AM +0100, Ted Hardie wrote:
>    Hi Nico,
>    Essentially, the % symbol in a URI is an indicator that what
>    follows is percent encoded; see RFC 3986, section 2.1.  Section
>    2.4 also says:  Because the percent ("%") character serves as the
>    indicator for percent-encoded octets, it must be percent-encoded as
>    "%25" for that octet to be used as data within a URI.
> 
>     This proposal treats the % which starts a scope identifier as "data
>    within a URI" and so it percent-encodes the percent symbol.  

That is a bug.  The '%' must be treated as the syntactic delimiter it is.

We don't encode the '[' or the ':' either.  We only allow
percent-encoding in the percent-decoded data parts of the URI.  The '%'
is a delimiter between two separate data parts.  It is not a data part.
The user cannot input different values there.

No other percent-encoded character is allowed there.  It is not a data
part!  It is a constant, not a variable!  It is not a percent-decoded
component!

You are not allowed by RFC3986 to treat it as a data part, because that
document states correctly that it is not one.




Please remember two facts and a corollary that I posted here last week:

  1.  RFC3986 says percent-decoding is done AFTER parsing into components
      and subcomponents

  2.  The IPv6 literal subcomponent is NOT percent-encoded.

  2a. No percent-decoding is done (or allowed) on this parsed subcomponent.

(Ref:<https://mailarchive.ietf.org/arch/msg/ipv6/FPLeDZXqJ1zwE1yF_Qkh7Ldq120/>)
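As a rough illustration of the two facts and the corollary above, here is a
minimal Python sketch.  The helper name split_then_decode is hypothetical, and
the splitting is deliberately simplified (no userinfo, query, or '#' handling);
it is not a full RFC3986 parser.  It only shows the order of operations: split
into components first, percent-decode afterwards, and never decode the
bracketed IP literal.

```python
from urllib.parse import unquote


def split_then_decode(uri: str) -> dict:
    """Hypothetical helper: split a URI into parts first, decode afterwards."""
    scheme, rest = uri.split("://", 1)
    authority, slash, path = rest.partition("/")
    if authority.startswith("["):
        # IP-literal: kept verbatim, never percent-decoded.
        end = authority.index("]")
        host = authority[1:end]
        port = authority[end + 1:].lstrip(":")
    else:
        # Registered names may legitimately carry percent-encoded octets.
        host, _, port = authority.partition(":")
        host = unquote(host)
    # Decoding happens only now, after the split, and only on the path.
    return {"scheme": scheme, "host": host, "port": port,
            "path": unquote(slash + path)}


print(split_then_decode("http://[fe80::1%eth0]:8080/a%20b"))
```

In this sketch the '%' inside the brackets reaches the caller untouched, while
the "%20" in the path is decoded to a space.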






The technical question before us is whether to put a percent-encoded
character into a non-percent-decoded field!






The unfixable problem for implementors is where to strip the
redundantly-added "25".  The URL fragment needs to move like this:

  1.User --> 2.Location Bar --> 3.Web client --> 4.Web server ---,
                                                                 |
                                       ,-- 5.Web application <---'
                                       |
                                       `------------------,
                                                          |
  8.User <-- 7.UA rendered content <-- 6.Web client <-----'

If a point of mangling is specified, then a corresponding point of
de-mangling must be specified.  Neither has been properly specified.




I claim that ONLY if the fragment is mangled at 3 and de-mangled at 4
can it be made to work with the full range of addresses.  Mangle at the
latest possible chance, de-mangle at the first possible chance.  This
protects the user AND the developer of the web application AND the
developer of the OS, insofar as the web application developer depends on
the OS interface.

In the case of {3,4} mangling is harmless -- but serves no purpose.

I claim no other choice of mangle/de-mangle locations even works.

Note: It is NOT acceptable to require the user to handle any mangling or
de-mangling at 1 or 8.
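To make the {3,4} pairing concrete, here is a minimal Python sketch.  The
function names mangle and demangle and the zone name "25eth0" are assumptions
made up for illustration; no browser or server is claimed to work this way.
The point is only that the two steps must be exact inverses, and that applying
one without the other changes the address.

```python
def mangle(literal: str) -> str:
    """Step 3 (web client): escape the scope delimiter for the wire."""
    return literal.replace("%", "%25", 1)


def demangle(wire: str) -> str:
    """Step 4 (web server): undo exactly what step 3 did."""
    return wire.replace("%25", "%", 1)


# Hypothetical zone names, including one that happens to start with "25".
for typed in ["[fe80::1%eth0]", "[fe80::1%25eth0]"]:
    wire = mangle(typed)
    seen_by_app = demangle(wire)
    print(typed, "->", wire, "->", seen_by_app)
    assert seen_by_app == typed  # the paired round trip loses nothing

# De-mangling something that was never mangled silently changes the zone.
assert demangle("[fe80::1%25eth0]") == "[fe80::1%eth0]"
```

With the pair in place, steps 1, 2, 5, 6, 7 and 8 only ever see the literal the
user typed, which is also why the round trip adds nothing.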







>    Other
>    approaches, such as a bare percent symbol within the square brackets
>    have been rejected (see the long Mozilla bug thread Brian posted).

That Mozilla bug thread explains the correct approach to the scopeid
parsing in comment 39:
https://bugzilla.mozilla.org/show_bug.cgi?id=700999#c39


        heinz.repp
        Comment 39 • 5 years ago

        (In reply to Brian  E Carpenter from comment #38)
        > But the real objection from
        > the browser side is that it's a horrible thing to implement.

        I know, this stems from Ryan Sleevi's 2015 comment, but having
        some experience in networks and C programming for decades I can
        hardly understand this. I see only 2 distinct jobs here:

        1. expand the parser of the url-decoded URI that already can
        parse IPv6-address-raw: instead of

        IPv6-literal = '[', IPv6-address-raw, ']';

        parse

        IPv6-literal = '[', IPv6-address-raw, [ '%', IPv6-scope-id ], ']';
        IPv6-scope-id = '0'-'9' | 'a'-'z' | 'A'-'Z', { '0'-'9' | 'a'-'z' | 'A'-'Z' | ' ' | '_' | '-' };

        difficulty: 'create your own parser' course, beginner's lesson




This is the ONLY correct implementation because it is the ONLY
implementation that guarantees that information is not lost in between
the user and the server.  We do not need a special case that loses
information here.  That loss is simply a bug.  A standard that mandates
a bug itself has a bug.

We simply need to fix the bug to get this implemented.  The bug cannot be
accepted.  The bug must be removed from all existing standards.
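
The grammar quoted from comment 39 can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function name
parse_ipv6_literal is hypothetical, the scope-id pattern is transcribed from
the quoted grammar, and the standard ipaddress module stands in for
IPv6-address-raw.

```python
import ipaddress
import re

# IPv6-scope-id from the quoted grammar: one alphanumeric character,
# then any run of alphanumerics, space, underscore, or hyphen.
SCOPE_ID = re.compile(r"[0-9A-Za-z][0-9A-Za-z _-]*")


def parse_ipv6_literal(literal: str):
    """Hypothetical parser for '[' address [ '%' scope-id ] ']'."""
    if not (literal.startswith("[") and literal.endswith("]")):
        raise ValueError("not a bracketed literal")
    # The bare '%' is treated as a delimiter, not as data.
    address, sep, scope = literal[1:-1].partition("%")
    ipaddress.IPv6Address(address)  # raises ValueError if malformed
    if sep and not SCOPE_ID.fullmatch(scope):
        raise ValueError("bad scope id: %r" % scope)
    return address, (scope if sep else None)


print(parse_ipv6_literal("[fe80::1%eth0]"))
print(parse_ipv6_literal("[::1]"))
```

Nothing is percent-decoded anywhere in this sketch, so whatever follows the
'%' is handed on exactly as written.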

We cannot allow some other standard to block this standard and, with it,
the software bugfix.  Instead, the bug in the software must be fixed, and
every standard that mandates the bug must surrender its priority to
the need to fix the software bug itself.

Any standard that claims to have higher priority than fixing the bug in
the code must be laughed at by the gods and looked at with suspicion by
mortals.

>    I think that taking this approach is worth trying, but I believe that
>    consistency is needed.  Making this the valid form but accepting the
>    bare % in some circumstances seems likely to me result in lack of
>    interoperability.  If I can paste when going into browser 1 but not
>    browser 2, the result is confusing for the user.

Yes.  The one correct solution is to fix the percent-encoding bug
entirely.  It does not make sense to try to support any mangling.

However, isn't this just about Microsoft, and isn't theirs the sole
broken name-mangling implementation?  Is it just Microsoft here demanding
that everyone else match their bug?  Just one firm?

Then let Microsoft ignore the standard -- do not let them ruin it.
Their browser can be the one we call confusingly different.  It cannot
be allowed to lead.

Name-mangling is wrong.  It is anti-user.  It does not serve any user
interest.  It is an "anti-feature."

Name-mangling cuts off the user from access to software written by many
developers.

Name-mangling breaks much software written behind the web server _and_
potentially software written behind the /etc/nsswitch.conf module
system, which allows great diversity of supply of names into the system.

That is a system that allows developers to plug in their own new
implementations of new ideas.  Don't try to take that away!





# Solution to Problem #

The way to fix port 80 is to abolish the name-mangling middle-man
forever.

Let the server know exactly what the browser's location bar says.
Let the user tell the location bar exactly what the user will.
Make the unbroken user<->server bond a requirement of HTTP.