Re: [TLS] Narrowing allowed characters in ALPN ?

Viktor Dukhovni <ietf-dane@dukhovni.org> Thu, 20 May 2021 15:26 UTC

Date: Thu, 20 May 2021 11:26:28 -0400
From: Viktor Dukhovni <ietf-dane@dukhovni.org>
To: "tls@ietf.org" <tls@ietf.org>
Message-ID: <YKZ/pMV1oy/UV9od@straasha.imrryr.org>
Reply-To: tls@ietf.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <CAHbrMsD=atgKLr+eN+W2ieuF7YrP3pm-05B=txEoDOMrCGPRvg@mail.gmail.com> <CAErg=HH5DfBpkPx48NKy4air1N1FKiwVCttYz5ddCfw+K3eQuA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/Q4S72Hfeb4bfX3Fqg5fnhBKIQXY>
Subject: Re: [TLS] Narrowing allowed characters in ALPN ?

On Thu, May 20, 2021 at 03:06:06AM -0400, Ryan Sleevi wrote:

> On Thu, May 20, 2021 at 1:56 AM Viktor Dukhovni <ietf-dane@dukhovni.org> wrote:
> 
> > On Wed, May 19, 2021 at 10:29:43PM +0000, Salz, Rich wrote:
> >
> > > I support limiting it.
> >
> > I concur.  These are not strings used between users to communicate in
> > their native language.  They are machine-to-machine protocol
> > identifiers, and use of the narrowest reasonable character set promotes
> > interoperability.
> >
> 
> I'm not sure I understand this. Could you expand further how adding more
> normative restrictions promotes, rather than harms, interoperability?

See below.  The idea is not to ask TLS implementations to reject
non-ASCII ALPN values, but rather for non-ASCII values to not be
defined, facilitating better interoperability among systems that
exchange ALPN values outside of TLS in various serialisation contexts.

> The fact that, as you highlight, they are machine-to-machine, seems like
> the greatest path to interoperability, because they shouldn't be assumed to
> be "human-readable", and because as specified, no other validation needs to
> be performed by either party. They should simply be treated as they're
> specified: an opaque series of bytes.

Yes, in TLS, but once they end up in DNS zones, configuration files,
pasted into web forms (to update DNS HTTPS/SVCB records...) various
pain points arise.
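
To make the pain point concrete, here is a minimal sketch (not taken from any DNS implementation) of what happens when an arbitrary ALPN byte string has to be written as a quoted character-string in a zone file. It assumes RFC 1035 master-file escaping only (backslash before '"' and '\', decimal \DDD for anything outside printable ASCII); the function name is made up for illustration, and the extra comma handling that an SVCB value list would add on top is deliberately left out.

```python
def escape_char_string(value: bytes) -> str:
    """Render bytes as a quoted RFC 1035-style character-string (sketch)."""
    out = []
    for b in value:
        if b in (0x22, 0x5C):
            # '"' and '\' must be preceded by a backslash
            out.append("\\" + chr(b))
        elif 0x20 <= b <= 0x7E:
            # printable ASCII passes through unchanged
            out.append(chr(b))
        else:
            # control characters and non-ASCII bytes become \DDD (decimal)
            out.append("\\%03d" % b)
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    for alpn in (b"h2", b'a"b', "\u00e9".encode("utf-8")):
        print(alpn, "->", escape_char_string(alpn))
```

A plain identifier such as "h2" survives untouched, while anything with quotes or non-ASCII bytes turns into escape sequences that every parser, web form and configuration tool along the way has to get right.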

On Thu, May 20, 2021 at 07:39:59AM -0700, Ben Schwartz wrote:
> On Thu, May 20, 2021 at 6:30 AM Salz, Rich <rsalz=
> 40akamai.com@dmarc.ietf.org> wrote:
> 
> > Look at RFC 7301, it says: the precise set of octet values that identifies
> > the protocol. This could be the UTF-8 encoding of the protocol name.
> >
> > So I changed my mind and think it's okay to leave as-is but wouldn't mind
> > if it became less general or more specific. For example, what if a protocol
> > string matches a truncated UTF-8 string?  It makes me think of SNI which
> > could have any format, but really is "any format as long as it's a DNS name"
> 
> One intermediate option might be to keep the ALPN TLS extension 8-bit
> clean, but change the IANA instructions for the ALPN registry to tighten
> the registration requirements.

Yes, this, but also a commitment to keep it that way, so that e.g.
HTTPS/SVCB can rely on not needing to encode ALPN values that are
non-ASCII or contain control characters, commas, double-quotes, ...

So the below should suffice:

    * lower-case ASCII letters a-z
    * ASCII digits 0-9
    * "+", "-", ".", "/" and "_"
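
For illustration only, a registry-side check for the set above might look like the following sketch. The function name is hypothetical, and the 1..255 length bound is my assumption, borrowed from the ProtocolName wire format in RFC 7301 rather than from the list itself. This is the kind of check a registration tool could run; it is not something a TLS stack would enforce on the wire.

```python
import re

# Assumed reading of the list: one or more bytes drawn from
# a-z, 0-9, "+", "-", ".", "/" and "_".
# The 1..255 length bound is an assumption taken from RFC 7301.
_ALLOWED = re.compile(rb"[a-z0-9+\-./_]{1,255}")


def is_narrow_alpn(protocol_id: bytes) -> bool:
    """Return True if protocol_id uses only the proposed narrow set."""
    return _ALLOWED.fullmatch(protocol_id) is not None


if __name__ == "__main__":
    for candidate in (b"h2", b"http/1.1", b"acme-tls/1", b"H2", b"a,b", b""):
        print(candidate, is_narrow_alpn(candidate))
```

Identifiers such as "h2", "http/1.1" and "acme-tls/1" pass, while upper-case letters, commas, empty strings and non-ASCII bytes are rejected.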

-- 
    Viktor.