Re: Too many GREASE codepoints

Christian Huitema <huitema@huitema.net> Tue, 17 November 2020 23:52 UTC

Return-Path: <huitema@huitema.net>
X-Original-To: quic@ietfa.amsl.com
Delivered-To: quic@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7AF793A105E for <quic@ietfa.amsl.com>; Tue, 17 Nov 2020 15:52:49 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.89
X-Spam-Level:
X-Spam-Status: No, score=-1.89 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, NICE_REPLY_A=-0.001, SPF_HELO_NONE=0.001, T_SPF_PERMERROR=0.01] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id P3P0QSBKK0Eg for <quic@ietfa.amsl.com>; Tue, 17 Nov 2020 15:52:48 -0800 (PST)
Received: from mx36-out10.antispamcloud.com (mx36-out10.antispamcloud.com [209.126.121.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id F00B83A104A for <quic@ietf.org>; Tue, 17 Nov 2020 15:52:45 -0800 (PST)
Received: from xse439.mail2web.com ([66.113.197.185] helo=xse.mail2web.com) by mx18.antispamcloud.com with esmtp (Exim 4.92) (envelope-from <huitema@huitema.net>) id 1kfAm0-0004jf-D1 for quic@ietf.org; Wed, 18 Nov 2020 00:52:40 +0100
Received: from xsmtp22.mail2web.com (unknown [10.100.68.61]) by xse.mail2web.com (Postfix) with ESMTPS id 4CbN5m0L3Kz1YKj for <quic@ietf.org>; Tue, 17 Nov 2020 15:52:12 -0800 (PST)
Received: from [10.5.2.18] (helo=xmail08.myhosting.com) by xsmtp22.mail2web.com with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.92) (envelope-from <huitema@huitema.net>) id 1kfAlf-0000tJ-UP for quic@ietf.org; Tue, 17 Nov 2020 15:52:11 -0800
Received: (qmail 12938 invoked from network); 17 Nov 2020 23:59:26 -0000
Received: from unknown (HELO [192.168.1.103]) (Authenticated-user:_huitema@huitema.net@[172.58.43.45]) (envelope-sender <huitema@huitema.net>) by xmail08.myhosting.com (qmail-ldap-1.03) with ESMTPA for <quic@ietf.org>; 17 Nov 2020 23:59:25 -0000
Subject: Re: Too many GREASE codepoints
To: Martin Thomson <mt@lowentropy.net>, quic@ietf.org
References: <59a19e58-ced4-4a38-89da-e9441df162fb@www.fastmail.com>
From: Christian Huitema <huitema@huitema.net>
Message-ID: <2a594580-dae0-a678-b5e9-81f040cfa264@huitema.net>
Date: Tue, 17 Nov 2020 15:52:11 -0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.3
MIME-Version: 1.0
In-Reply-To: <59a19e58-ced4-4a38-89da-e9441df162fb@www.fastmail.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Originating-IP: 66.113.197.185
X-Spampanel-Domain: xsmtpout.mail2web.com
X-Spampanel-Username: 66.113.197.0/24
Authentication-Results: antispamcloud.com; auth=pass smtp.auth=66.113.197.0/24@xsmtpout.mail2web.com
X-Spampanel-Outgoing-Class: unsure
X-Spampanel-Outgoing-Evidence: Combined (0.15)
X-Recommended-Action: accept
X-Filter-ID: Mvzo4OR0dZXEDF/gcnlw0cThwnZT+ODcFyeCwCHoUjupSDasLI4SayDByyq9LIhVUZbR67CQ7/vm /hHDJU4RXkTNWdUk1Ol2OGx3IfrIJKywOmJyM1qr8uRnWBrbSAGDcnqpk5VeF3xR4kF6iVwRtbgN zB/4Jkrw1eDLcif59fuB0dkgey8U6ifKKFvdZQxUU7Tmz6iKnkQL9gqsxD347235Nhqq+/HvroPq 8GSPg+5hmwN8D4LrepG7AX8WNwY8PM2DmctUkuqITdPZ8a3LuY6jSvfpO+1kZkomjtjB6X5Q5Q9f RUeIpTIC2ySfqvnqLwoxlgatmaBb0rBiK9xbkDrUqzcKIief90MVLZY9LbIZh9+IQ1oS9LBn3VIP 95Jz7ujRlJ9wSMlhvaudJXZ9EIBG/qaR+8r9SKFMmPJLf850OvZYsmoVQuOIhwKLK6IKBNB4LZ0v UHHKTzJX7b1JhLSQQ4vSj0QEim26t/Moy0UPX5E73H1QfrH/5kkrV/Cr0bm2vWdo8usP65i82q1C dZgGrpL44wdx9eXqjQjbvUopOMQJvQ/Ck3iiU+4DQAj3fuQgzT3K9JUHTNiGwfwAmxx/Wk8McinP JEkgAVrOMpbIPHELHGuybVREcKQKsZ6SQvxvQDFut/09iucwtxpgCJSZKKyxYJ3lI3BqPFdSovcv UwPy3x0FYtCNEb10sHyQCLHEvD1OqP6bgZ4L66GcgBg66gs5OuzYxJgw5atIxeNDvjI/CYe5WPy0 +t1RP0az3O8HxoIF7IU4W378KLvMi5xMPnetLBJMh51NiRRoHIDWSkHgwu/8fm9G1PMhdEKqmiK7 x42VjdzChZMe6O/Did+/hGXTmfhE+Dx2/NyzMXpxnoHzo+gJcabKXjn8ZsryvOnCUbNPgcPcQwzM gKHyQxUo+ql2ySTkvEFH/23XMww2BnTTFGX5/yI4Ky+1ZJcbGqc5H4PEZHeoI/d6LWFf332z7LMw LGdoi9FMQ5j9dQUvMi1YKAun15JQSJLyCT5k+MTObVKxHy/dols381l9r9ft9daDonlwd6LnuX+J u10=
X-Report-Abuse-To: spam@quarantine11.antispamcloud.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/WZ-2iqSgJQlhXIoomMNK-_olsgk>
X-BeenThere: quic@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/quic>, <mailto:quic-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/quic/>
List-Post: <mailto:quic@ietf.org>
List-Help: <mailto:quic-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/quic>, <mailto:quic-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Nov 2020 23:52:50 -0000

On 11/17/2020 3:41 PM, Martin Thomson wrote:
> (I've been meaning to write this for a few days, but the IANA review prompted me to sit down and work through it.)
>
> I have come to believe that we are reserving too many codepoints for greasing.
>
> The greasing function we currently use is 27+N*31.  That's about 2^57 values.  148764065110560896 by my estimation.
>
> The waste I can tolerate.  This is a very large number, but it leaves plenty of values free for use.  The problem is that the odds of collision are too low.
>
> This problem was highlighted when draft-iyengar-quic-delayed-ack-02 chose a greased codepoint.  The failure mode there is terrible: occasionally an implementation will generate a greased value that collides with this value.  If an implementation of that draft encounters that collision it will fail.  It will probably fail when trying to parse the transport parameter, but it could instead fail when it receives the frame.  Alternatively, is also plausible that an implementation of the draft could include a greased value with the same codepoint, which would be similarly bad.  Such failures are virtually impossible to reproduce and unless you are lucky enough to have good logging, you might have trouble catching the reason for the error.
>
> The odds that an implementation of that draft would encounter a peer who used a greased codepoint with the same value is minuscule.  There are around 2^57 valid greasing values, so even if every connection ever included greased values, the odds of a given connection containing that value is tiny. (Coincidentally, this is the same as the probability we allow for an attacker to attack the packet protection AEAD.)
>
> I think that we want a higher risk of collision than that.  Then mistakes like this will result in a higher probability of failure when deployed.  Ideally, failures will be discovered in testing.
>
> The reason we have a non-trivial number of reserved codepoints is to avoid having implementations that filter out these reserved values by just enumerating them.  However, for that we don't need so many reserved values.  Even as little as 1000 will create a disincentive to tabulate the values.  That many would add a large 8k table to a simple implementation.
>
> If we were to select 1000 (or 1024) values in a way that is relatively simple to generate from a random number, then that would be better than the current situation.
>
> So I'm going to propose that we define a new function that takes a 10-bit random value and produces a greasing codepoint.  This can be used to produce a greasing codepoint.  I have two requirements:
>
> * It should be a tiny bit more complex to implement a function that tests a value for whether it is reserved.  The current scheme is easy to filter.  (X - 27) % 31 == 0 isn't much code to write.
>
> * The scheme needs to produce values that encode to 1, 2, and 4 bytes in varints with reasonable probability.  An even distribution over 2^62 with 1024 values has a gap of 2^52 between each value, which means that 2 byte and 4 byte values aren't likely to be represented.
>
> The function I've been thinking of looks like this:
>
> def grease(n):
>     v = 27
>     i = 1
>     while n > 0:
>         v = (v << i) | (n & 1)
>         i = i + 1
>         n = n >> 1
>     return v
>
> This is bit spreader, taking the bits of the random value and adding progressively more space between each bit.
Looking for a nerd feast, are we?
>
> When passed 0x3ff, the resulting value is the 60-bit binary value: 0b110111010010001000010000010000001000000010000000010000000001.  Smaller values result in different 1 bits being turned off.  If it is passed 0x1ff, it produces the shorter 50-bit value of 0b11011101001000100001000001000000100000001000000001.
>
> The result is two values that encode to 1 byte, 14 values that encode to 2 bytes, 48 values that encode to 4 bytes, with the remaining 960 values encoding to 8 bytes.
>
> Detecting this is fairly easy, but also probably not worth writing out the code to do so.  This could be made considerably harder by flipping more than one bit at a time.  For example, `v = (v << i) ^ ((n & 1) * 19)`.
>
> What do people think?  Is this a change we could tolerate?  Or is my logic faulty here in some way?

I don't know. I observe that most extension designers pick a 12 to 16
bit code that looks cute, and then prepend something like "ff01" to deal
with draft numbers. A nice property would be to make grease collisions
up to 16 bits easy to evaluate, and then ensure that patterns like
"ff121dea" do not collide if the root pattern "1dea" does not.

-- Christian Huitema