Re: [TLS] Narrowing allowed characters in ALPN ?

David Benjamin <> Thu, 20 May 2021 15:23 UTC

From: David Benjamin <>
Date: Thu, 20 May 2021 11:23:15 -0400
Message-ID: <>
To: Ben Schwartz <>
Cc: "Salz, Rich" <>, "" <>

SVCB's syntax would need us to not only exclude non-ASCII characters but
also avoid random delimiters like commas, right? I think that's going a bit
too far. As Ryan notes, complex definitions for allowed strings result in
ambiguities around who is responsible for validating what and subtle
variations in different implementations. That ambiguity can lead to
injection attacks when one component of a system expects some validation,
but the other component disagrees.
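
As a rough illustration of that kind of disagreement, here is a sketch in which one component joins ALPN identifiers with commas and another splits on them. The helper names are hypothetical and do not come from any real TLS or DNS library; the point is only that each side assumes the other did the validation.

```python
# Hypothetical helpers, for illustration only.

def join_alpn_naive(protocol_ids):
    """Serialize ALPN identifiers as a comma-separated list, with no escaping."""
    return b",".join(protocol_ids)


def split_alpn_naive(serialized):
    """Parse the comma-separated list, assuming no identifier contains a comma."""
    return serialized.split(b",")


# A single identifier that happens to contain a comma. ALPN itself allows
# this, since identifiers are opaque byte strings.
offered = [b"foo,h2"]

round_tripped = split_alpn_naive(join_alpn_naive(offered))

# The consumer now sees two identifiers, one of which is "h2",
# even though the producer never offered it on its own.
print(round_tripped)
```

Either side could have rejected or escaped the comma, but neither did, because each assumed it was the other's job.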

I think a system that consistently expects a simple data type is more
robust than carefully maneuvering around commas, spaces, newlines, etc.
Text protocol syntaxes aren't universal syntax, and for every
delimiter-based protocol we dodge, there'll be one we hit. For instance,
ALPN identifiers already cannot be used as filenames because "http/1.1"
includes a slash. More generally, the ship has sailed. Applications already
need to tolerate arbitrary 8-bit ALPN strings out of their TLS libraries.
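
For reference, the wire format in RFC 7301 is a list of length-prefixed opaque byte strings, so nothing at the TLS layer depends on delimiters. Below is a minimal sketch of that framing. The function names are hypothetical, and the sketch covers only the ProtocolNameList body, not the surrounding extension header.

```python
import struct


def encode_protocol_name_list(protocol_ids):
    """Encode identifiers as a 2-byte total length, then 1-byte-length-prefixed names."""
    body = b""
    for pid in protocol_ids:
        if not 1 <= len(pid) <= 255:
            raise ValueError("each protocol name must be 1..255 bytes")
        body += bytes([len(pid)]) + pid
    return struct.pack("!H", len(body)) + body


def decode_protocol_name_list(data):
    """Decode the framing above, returning the identifiers as raw bytes."""
    if len(data) < 2:
        raise ValueError("truncated list length")
    (total,) = struct.unpack("!H", data[:2])
    body = data[2:]
    if total != len(body):
        raise ValueError("list length does not match body")
    ids = []
    i = 0
    while i < len(body):
        n = body[i]
        i += 1
        if n == 0 or i + n > len(body):
            raise ValueError("bad protocol name length")
        ids.append(body[i:i + n])
        i += n
    return ids


# Slashes, commas, newlines, NULs, and non-ASCII bytes all survive unchanged.
ids = [b"http/1.1", b"h2", b"\xff,\n\x00"]
assert decode_protocol_name_list(encode_protocol_name_list(ids)) == ids
```

Because every name carries its own length, no byte value needs special treatment, which is why applications already see arbitrary 8-bit strings.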

That said, documenting some interop risks when allocating values is
reasonable. IIRC, a lot of Java TLS stacks have issues with non-UTF-8
(perhaps even non-ASCII?) identifiers. The getApplicationProtocol() API
reports the ALPN protocol as a String (16-bit code units), and
implementations sometimes decode the bytes with an arbitrary character set
without paying attention.

Note this isn't a fundamental limitation of 16-bit strings. It's possible
to convey 8-bit-clean data in a 16-bit string, if you define a suitable, if
inelegant, encoding.
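
One possible encoding, as a sketch, maps each byte to the code point of the same value (ISO-8859-1). The example uses Python's str as a stand-in for a Java String, which is an assumption for illustration only; it does not describe what any particular Java TLS stack does.

```python
# "h2" followed by two bytes that are not valid UTF-8.
alpn = bytes([0x68, 0x32, 0xFF, 0x80])

# Decoding as UTF-8 with replacement characters loses information:
# the original bytes cannot be recovered from the resulting string.
lossy = alpn.decode("utf-8", errors="replace")
assert lossy.encode("utf-8") != alpn

# ISO-8859-1 (Latin-1) maps byte value N to code point N, so every
# byte string round-trips exactly through a text string.
as_text = alpn.decode("latin-1")
assert as_text.encode("latin-1") == alpn

# The round trip holds for all 256 byte values.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```

The resulting string is not meaningful text for non-ASCII identifiers, which is the inelegant part, but it is lossless.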

On Thu, May 20, 2021 at 10:40 AM Ben Schwartz <bemasc=> wrote:

> On Thu, May 20, 2021 at 6:30 AM Salz, Rich <rsalz=> wrote:
>> Look at RFC 7301, it says: the precise set of octet values that identifies
>> the protocol. This could be the UTF-8 encoding of the protocol name.
>> So I changed my mind and think it's okay to leave as-is but wouldn't mind
>> if it became less general or more specific. For example, what if a protocol
>> string matches a truncated UTF-8 string? It makes me think of SNI, which
>> could have any format, but really is "any format as long as it's a DNS name".
>
> One intermediate option might be to keep the ALPN TLS extension 8-bit
> clean, but change the IANA instructions for the ALPN registry to tighten
> the registration requirements.