Re: [art] Artart last call review of draft-ietf-6man-rfc6874bis-02

Brian E Carpenter <brian.e.carpenter@gmail.com> Sat, 17 September 2022 04:11 UTC

Message-ID: <140572c4-5ca3-0da7-9a86-e51948fad43c@gmail.com>
Date: Sat, 17 Sep 2022 16:10:58 +1200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Content-Language: en-US
To: Martin Thomson <mt@lowentropy.net>, art@ietf.org
Cc: draft-ietf-6man-rfc6874bis.all@ietf.org, ipv6@ietf.org, last-call@ietf.org
References: <166335671066.41888.11681289954866903154@ietfa.amsl.com>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
In-Reply-To: <166335671066.41888.11681289954866903154@ietfa.amsl.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/art/3UKRbQGeZUFxfEDublcEPeRKry4>
Subject: Re: [art] Artart last call review of draft-ietf-6man-rfc6874bis-02

Hi Martin,
On 17-Sep-22 07:31, Martin Thomson via Datatracker wrote:
> Reviewer: Martin Thomson
> Review result: Not Ready
> 
> As a bit of a preface here, I've been aware of this work for some time, but
> have refrained from offering opinion.  As liaison to the W3C, I felt like my
> responsibility was to facilitate the conversation.  

And many thanks for that.

> However, as Barry has asked
> me to do a review, I feel obligated to set that attempt at neutrality aside.
> 
> The biggest issue here is that this document is very much unwanted by its
> target audience, as is its antecedent, RFC 6874.  I share that concern.

In a way that's the whole point and exactly why it *should* be published
as an RFC.

Let's first dispense with RFC 6874. We (its authors, mainly) were careless
or naive, and so we ended up with a lot of stuff in there that was simply
unrealistic. So we should just forget about it.

Yes, we know that most of the browser writing community doesn't want this.
It's a feature request. Programmers very often don't want feature requests.
It's users that want them. True, this particular feature is not wanted by
the average user. But it's very much wanted by an important subset -
technical people who know that this will be needed for ops and diagnostic
reasons in future. Actually average users might also need it. You find
consumer products whose instructions tell you to browse to http://192.168.178.1
or something similar. A future product might tell you http://[fe80::1%eth0].
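
To make the format concrete, here is a minimal Python sketch of how a
client might pull the address, zone ID and port out of a URL written in
the form shown above. The helper name split_host is hypothetical, the
code handles only bracketed IPv6 literals, and it does not validate the
address itself.

```python
def split_host(url):
    """Split a URL with a bracketed IPv6 literal into (address, zone, port).

    Hypothetical helper for illustration only: the zone is whatever follows
    the raw "%" inside the brackets, and is None when no zone is present.
    """
    rest = url.split("://", 1)[1]          # drop the scheme
    authority = rest.split("/", 1)[0]      # drop any path
    if not authority.startswith("["):
        raise ValueError("not a bracketed IPv6 literal")
    literal, _, tail = authority[1:].partition("]")
    address, _, zone = literal.partition("%")
    port = int(tail[1:]) if tail.startswith(":") else None
    return address, (zone or None), port


print(split_host("http://[fe80::1%eth0]"))
```

The zone is returned as an opaque string because its meaning is local to
the host that wrote the URL.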

Just to underline one of the use cases, this is an *active* bug in Windows:

"In Windows 10, a WSD printer will have with a URL like this:
http://[fe80::822b:f9ff:fe6a:d7de%10]:80/WebServices/Device "

(from https://bugzilla.mozilla.org/show_bug.cgi?id=1759739, with
my follow-up at
https://bugzilla.mozilla.org/show_bug.cgi?id=700999#c89 )

We deliberately kept the use cases in the draft at a high level, but they
are real enough. I really don't think it's acceptable for the browser
community to simply dismiss the requirement.

> 
> I have clear, strong indications from folks at three major browsers
> (conveniently, I am at W3C TPAC this week and so was able to talk with a few
> people directly on this topic; inconveniently, I've not had a lot of time for
> this review) that this change to the URI specification is not just something
> that they don't want to implement, but that it is not good for the Web.  Public
> communications from them will be somewhat more polite and circumspect, but it
> has been made clear to me that this change is not wanted.

Since this URI is actually an LRI (local resource identifier) there is indeed
a meta-issue here, but one that already exists and seems entirely metaphysical.
"Not good for the web" is a strange statement for something that *explicitly*
has no meaning outside the host where it's used. Actually this is precisely
where RFC 6874 got itself into trouble, and rfc6874bis doesn't. From what
you say, it's unlikely that the people concerned have read the current draft
with care. At least one person already said off list that they haven't read
it. So this seems like hand-waving to me. I'd be interested to learn how
exactly this local format can damage the Web.

> 
> The IETF does occasionally publish specifications that don't end up being
> implemented, but we usually look for signals that a protocol might be
> implemented before even starting work.  Here, we have a strong signal that a
> specification won't be implemented.

So, you're happy that a requirement from people who define, implement and
manage the lower layers of the protocol stack can be refused on some
rather abstract argument?

> Mark Nottingham asks the same question as
> well as point 1 in [1].  (I don't personally find his second and third points
> to be especially problematic given adequate consultation, but even there, there
> are a few concerns that I will outline below.)
> 
> I do want to give due credit to the authors - Brian in particular - for being
> very open and forthright in their consultation with the affected constituency.
> They have been proactive and responsive in a nearly exemplary fashion.
> 
> Overall, I think that it would be better for the IETF to declare RFC 6874 as
> Historic(al).

That's kind of irrelevant. As far as the use cases go, that would change
nothing. The users would be left with nothing.

> There might be some residual value in RFC 4007 from a diagnostic
> perspective, 

Er, diagnostic use via ping is a tiny part of the value of RFC 4007.
RFC 4007 is fully supported by the POSIX and WinSock APIs and is used
by any piece of software that needs to make use of link local
addresses. I've written some such software myself and it works well.
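
As a rough illustration of what such software does, the sketch below
turns a textual "address%zone" form into the 4-tuple (address, port,
flowinfo, scope_id) that Python's AF_INET6 sockets take. The function
name is hypothetical. A numeric zone is used directly as the interface
index, and a named zone is looked up with socket.if_nametoindex, which
only succeeds if that interface exists on the local host.

```python
import socket


def link_local_sockaddr(host, port):
    """Build an AF_INET6 address tuple from 'address' or 'address%zone'."""
    address, _, zone = host.partition("%")
    if not zone:
        scope_id = 0                            # no zone given: scope unspecified
    elif zone.isascii() and zone.isdigit():
        scope_id = int(zone)                    # numeric zone such as "%10"
    else:
        scope_id = socket.if_nametoindex(zone)  # named zone such as "%eth0"
    return (address, port, 0, scope_id)


print(link_local_sockaddr("fe80::1%10", 80))
```

The resulting tuple could then be passed to connect() on an AF_INET6
socket. The zone never needs to leave the host, since it only selects
the outgoing interface.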

> but the use of zone identifiers in URIs seems fundamentally
> incompatible with the goals of URIs.

Metaphysically, you are correct, of course: http://[fe80::1%eth0]
is not a URI, it's an LRI. But so is http://192.168.178.1, so
this train left the station 20 years ago. Would it help if the
draft pointed this out?

> I do recognize that the Web and HTTP is not the only protocol affected by this
> sort of change.  The goal is to change all URI types.  However, I believe that
> HTTP is pretty important here and I have a fair sense that the sort of concerns
> I raise with respect to HTTP apply (or should apply) to other schemes.

Well, yes, but the "U" versus "L" distinction will stand.
  
> 
> ---
> 
> There are a few technical concerns I have based on reviewing the draft.  Some
> of these - on their own - are significant enough to justify not publishing this
> document.
> 
> Inclusion of purely local information in the *universal* identity of a resource
> runs directly counter to the point of having a URI.  This creates some very
> difficult questions that the draft does not address.

I get rapped over the knuckles whenever I conflate "U" with "Universal",
since it actually stands for "Uniform", but OK.

The other possible conflation is with "Unique". Note that http://192.168.178.1
is far from unique. Millions of instances exist. http://[fe80::1%eth0]
is no different in that respect.

> 
> For instance (1), the Web security model depends on having a clear definition
> for the origin of resources.  The definition of Origin depends on the
> representation of the hostname and it relies heavily both on uniqueness
> (something a zone ID potentially contributes toward) and consistency across
> contexts (which a zone ID works directly against).  Now, arguably the identity
> of resources that are accessed by link-local URIs don't need and cannot
> guarantee either property, but this is an example of the sorts of problem that
> needs to be dealt with when local information is added to a component that is
> critical to web security.

Yes, but when we tried to address this in RFC 6874 by requiring the local part
to be deleted on the wire, the implementers (very reasonably) told us we were
silly. Should we state that the Zone ID should be disregarded for security
purposes?

The thing is, like it or not, when two hosts communicate via their link-local
addresses, using any protocol whatsoever, they have implicitly formed
a limited domain. (Didn't think of that for RFC 8799.)

> 
> For instance (2), in HTTP and several other protocols, servers depend on the
> host component - as it appears in the URI - to determine authority.  If there
> is no rule for stripping the zone ID from URIs, servers' hostname checks will
> depend on the client.  That exposes link-local servers to information that they
> need to filter out.  Some might not be prepared to do that. 

When they receive a link-local packet, they know which interface they received
it on, but they have by definition no way to validate the zone ID, so certainly
they either have to take it on faith or ignore it. I don't see a problem
with ignoring it, but there would be code changes needed to do so.

> Hostname checks
> are critical for security, especially the consistent treatment of the field
> across different components like serving infrastructure, web application
> firewalls, access control modules, and other components.

Sure. That's exactly why we wanted to strip it in RFC 6874, but implementers
said no. So it needs to be ignored at the server side. We should say that
somewhere in the document.
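
One way a server could ignore the zone ID is sketched below, assuming
it normalizes the Host header value before any hostname check. The
function name host_without_zone is hypothetical, and this is not text
from the draft.

```python
def host_without_zone(host_header):
    """Drop a zone ID from a bracketed IPv6 literal in a Host header value.

    Values that are not bracketed IPv6 literals are returned unchanged.
    """
    if not host_header.startswith("["):
        return host_header
    literal, bracket, tail = host_header[1:].partition("]")
    if not bracket:
        return host_header                  # malformed: leave for later validation
    address = literal.split("%", 1)[0]      # keep only the part before any "%"
    return "[" + address + "]" + tail


print(host_without_zone("[fe80::1%eth0]:8080"))
```

Applying the same normalization in every component that inspects the
host (front end, firewall, access control) keeps their checks
consistent with each other.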

> 
> This is a non-backwards-compatible change to RFC 3986.  The only issue related
> to this that is addressed in the draft is the question of document management -
> this updates RFC 3986 - but surely there are other concerns that might need to
> be addressed.  I see some effort to address software backwards-compatibility in
> discussion threads, but I found very little in the draft itself.

It seems like a pretty standard legacy problem to me. It will only work on
updated systems, and it will fail on legacy systems.

> 
> The configuration of zones on a machine could be private information, but
> this information is being broadcast to servers.

Well, unicast actually.


> In HTTP, that is in Host
> header fields; on the Web, in document.location.  This information might
> contribute significant amounts of information toward a fingerprint.  I
> appreciate that the stripping of zone ID was never implemented, but it is a
> useful feature.

But as noted above, vigorously rejected by implementers (and actively
damaging to the CUPS use case).

> 
> Arguments in Section 5 depend on the zone IDs being hard to guess, but that
> isn't true.  Zone IDs are - in practice - low entropy fields.  

Well, they vary, and in some cases could be guessed from the MAC address.
They are not the main defence against scanning attacks.

> More critically,
> they are fields that are sent to servers.

Yes, but only to very local servers, and not actually very useful in themselves.

> 
> Zone ID size is not bounded - most implementations will have a size limit on
> the authority or host portion of a URI (256 octets is sufficient for current
> names), but the implication is that Zone IDs could be arbitrary length.

That's a defect in RFC 4007 that we can't do much about, but we should
note it.
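
Until a bound is specified, an implementation could impose its own. The
sketch below is only an illustration: MAX_ZONE_OCTETS is an arbitrary
value chosen for the example and comes from no RFC, while
MAX_HOST_OCTETS reuses the 256-octet figure quoted in the review.

```python
MAX_HOST_OCTETS = 256   # host size limit mentioned in the review
MAX_ZONE_OCTETS = 32    # arbitrary illustrative cap, not taken from any RFC


def zone_length_ok(host_literal):
    """Check an 'address%zone' string against local length limits."""
    if len(host_literal.encode("utf-8")) > MAX_HOST_OCTETS:
        return False
    _, _, zone = host_literal.partition("%")
    return len(zone.encode("utf-8")) <= MAX_ZONE_OCTETS


print(zone_length_ok("fe80::1%eth0"))
```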
  
> Though percent-decoding is not likely to be a concern from a specification
> perspective (the operative specification from the browser perspective does not
> apply pct-decoding to a v6 address [2]), what work has been done to verify that
> a zone ID won't break existing software?

That's a negative I can't prove. However, my patched version of wget didn't break
the server in my FritzBox, which simply did what I expected it to do.

Regards,
     Brian

> 
> [1] https://mailarchive.ietf.org/arch/msg/last-call/4vEKZosvKvqJ9cufSm5ivsCho_A/
> [2] https://url.spec.whatwg.org/#concept-host-parser
> 
>