Re: [tcpm] New Version Notification for draft-ietf-tcpm-accurate-ecn-11.txt

Bob Briscoe <ietf@bobbriscoe.net> Thu, 12 March 2020 18:58 UTC

To: "Scharf, Michael" <Michael.Scharf@hs-esslingen.de>, tcpm IETF list <tcpm@ietf.org>
Cc: Richard Scheffenegger <richard.scheffenegger@netapp.com>, Mirja Kuehlewind <ietf@kuehlewind.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/Tvixt-Z3BpCr3jP9oDfulaJXE2I>

Michael,

On 10/03/2020 23:27, Scharf, Michael wrote:
>
> Bob,
>
> Off the top of my head, I am not aware of any IETF protocol that uses TLV 
> encoding with ambiguous message formats, i.e., messages with the same 
> type but different message formats depending on earlier messages. For 
> instance, this can easily get messy during debugging. And a protocol 
> that is hard to debug is **not robust** from my point of view.
>

I understand that the highest bit solution is not ideal. I am merely 
pushing back on the notion that it is completely unacceptable.

> I have already pointed out that IMHO a tool like Wireshark cannot 
> correctly decode the option for a partial PCAP trace. The same applies 
> to any passive monitoring tool that uses sampling. This has 
> significant operational impact. Well, your mileage may vary, but I see 
> no value in obscuring the counters with an "unorthodox" encoding that 
> requires keeping state to figure out what the option format actually is.
>
> And I am not a hardware expert, but if AccECN is really to be 
> successful, hardware offloading in NICs will probably matter in future. 
> Thus, the length of options is not the only tradeoff to be considered 
> to make the protocol successful. As far as I understand, simple 
> stateless message formats help a lot when it comes to hardware offloading.
>
That was indeed why we originally had a fixed field order.

I proposed two alternative ways to allow different field orders: a) two 
option kinds or b) first counter bit.
The WG chose the first counter bit, so I wrote it up.
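
For concreteness, alternative (a) can be sketched as a decoder that needs nothing but the bytes of the option itself. This is only an illustration under stated assumptions: the two kind values are placeholders (not assigned codepoints), the counter names EE0B/ECEB/EE1B and their 24-bit width are assumed here, and the second order is taken to be the reverse of the first. With alternative (b), a function like this would need an extra per-connection input that remembers which order was established earlier in the connection.

```python
# Placeholder option kinds, NOT assigned codepoints (assumption for illustration).
KIND_ORDER0 = 0xF0
KIND_ORDER1 = 0xF1

# Assumed counter names and field orders; each counter is taken to be 24 bits.
FIELD_ORDER = {
    KIND_ORDER0: ("EE0B", "ECEB", "EE1B"),
    KIND_ORDER1: ("EE1B", "ECEB", "EE0B"),
}


def decode_option(option: bytes) -> dict:
    """Decode one option using only its own bytes (no per-connection state)."""
    if len(option) < 2:
        raise ValueError("truncated option header")
    kind, length = option[0], option[1]
    if kind not in FIELD_ORDER:
        raise ValueError(f"unknown option kind {kind:#x}")
    if length != len(option) or (length - 2) % 3 != 0 or not 2 <= length <= 11:
        raise ValueError(f"bad option length {length}")
    counters = {}
    # The option may stop after any whole counter; the kind alone says
    # which counters are present and in what order.
    for i, name in enumerate(FIELD_ORDER[kind][: (length - 2) // 3]):
        start = 2 + 3 * i
        counters[name] = int.from_bytes(option[start:start + 3], "big")
    return counters


if __name__ == "__main__":
    sample = bytes([KIND_ORDER1, 8, 0, 0, 5, 0, 0, 7])
    print(decode_option(sample))  # {'EE1B': 5, 'ECEB': 7}
```

A tool looking at a single sampled packet could run this as is, which is the property being asked for; whether that is worth a second codepoint is the open question.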

I just wish this WG would make up its mind, and have some discipline so 
that people know in advance that a decision will not be reversed at the 
next meeting. Then people know that they have to stop reading their 
email and really think carefully before making a decision.


> It is well understood that many implementers may try to avoid the 
> option altogether. An "unorthodox" encoding will almost certainly not 
> help to get the option widely deployed.
>
> Just to repeat myself, if I had to pick between the current proposal 
> in -11 and two different option codepoints, I would go for the latter. 
> We have >150 TCP option codepoints left. That resource is not so 
> scarce after all. I fail to see any real-world benefit of the proposal 
> in -11 as compared to two separate codepoints.
>
> In a nutshell, I believe you should look for an option design that can 
> be processed by a stateless decoder. In other words, KISS. There are 
> plenty of ways to do that, and adding a byte is only one of them. 
> I have already mentioned several other alternatives, and probably 
> there are more (e.g., one could use a 2-bit type field for each 
> counter to identify it). All of them seem **simpler** to me than the 
> current proposal in -11.
>
> To me, this idea in -11 **is broken** and this should be fixed in -12.
>
I don't think you have answered my questions sufficiently to make such a 
claim. You have given examples of cases where 3rd party monitoring will 
get confused. But calling that "broken" is, I think, an overstatement. 
How does the protocol itself /break/?

You have said it doesn't allow the order to be changed mid connection. 
But that is not "broken". That is a feature that I'm not convinced is 
needed. If the occasional connection would have made more efficient use 
of 3B of option space if it had been able to switch to a different field 
order, is that worth the extra complexity?

That is what I was getting at from my questions: What is actually "not 
robust / broken" about it? In what case(s) can it fail? Have you proved 
these problems exist (whether paper proof or empirical)?

What I don't want to happen (again) is that the WG swings to yet another 
solution, before a proper case has been made for the need to do so.


Bob

> Michael
>
> *From: *Bob Briscoe <mailto:ietf@bobbriscoe.net>
> *Sent: *Tuesday, 10 March 2020 23:00
> *To: *Scharf, Michael <mailto:Michael.Scharf@hs-esslingen.de>; tcpm 
> IETF list <mailto:tcpm@ietf.org>
> *Cc: *Richard Scheffenegger <mailto:richard.scheffenegger@netapp.com>; 
> Mirja Kuehlewind <mailto:ietf@kuehlewind.net>
> *Subject: *Re: [tcpm] New Version Notification for 
> draft-ietf-tcpm-accurate-ecn-11.txt
>
> Michael,
>
> I agree extensibility is nice. And I'm open to a design with a 1 octet 
> field immediately after the length field, for flags and the like. In 
> fact, I encouraged Ilpo to put his idea to the list (which he had put 
> to me off-list). But let's be disciplined about this if we're going to 
> start down this road...
>
> 1) This is not the time for "it would be nice if". I want to go back 
> to your original email and ask you to be much more specific about the 
> problems envisaged. I agree the idea of switching field order based on 
> the first bit is unorthodox, but what is actually "not robust" about 
> it? In what case(s) can it fail? Have you proved these problems exist 
> (whether paper proof or empirical)?
>
> 2) We have a very *inextensible* lack of TCP option space to deal 
> with. And we don't know what other important options might be invented 
> in future. So we shouldn't burn even a single byte unless we have good 
> reason to. Extensibility is good. But it does burn space.
>
> 3) I haven't asked my co-authors, but I know that Mirja wanted the 
> design to be as simple as possible, and now we're moving away from that.
>
> If it ain't broken don't fix it.
>
>
>
> Bob
>
> On 09/03/2020 20:00, Scharf, Michael wrote:
>>
>> With chair hat off, I really wonder if the solution to encode the two 
>> different orders in the TCP Option is an example of good and robust 
>> protocol engineering.
>>
>> For instance, the current design makes it hard to decode the field in 
>> a monitoring tool (such as Wireshark). Also, as far as I understand, 
>> it does not allow switching the encoding during a connection, which 
>> limits flexibility. We almost certainly do not understand **now** all 
>> future use cases of this Standard.
>>
>> Unless I am missing something, there would be several other solutions:
>>
>> First, IMHO, we have enough TCP option codepoints left to spend two 
>> codepoints if there is a good reason for doing so. As compared to the 
>> current design proposal in -10/-11, spending two different option 
>> kinds would look to me like **much** better protocol engineering.
>>
>> Second, if the TCPM community insists on only one option kind 
>> codepoint for whatever reason, IMHO one could add one "sub-type" byte 
>> to the option. The TCP Option field has to be a multiple of 4 bytes, 
>> i.e., if a segment only contains an 11-byte AccECN TCP option, an 
>> additional NOP TCP option is needed for padding, no? So, what 
>> downside do 12 bytes have as compared to 11 bytes? For the shorter 
>> variants, the overhead of a "sub-type" field increases, but it may 
>> still be within reasonable limits. What am I missing?
>>
>> Third, one could use different lengths for the different orders, 
>> e.g., lengths 5/8/11 for type 0 and 6/9/12 for type 12. Is this not 
>> possible?
>>
>> In all these cases, the resulting protocol looks simpler and more 
>> robust to me. What prevents us from using the KISS principle?
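
The arithmetic behind the second and third alternatives can be checked with a small sketch. It assumes, as in the 11-byte case mentioned above, that the AccECN option is the only option in the segment, that each counter occupies 3 bytes after a 2-byte kind/length header (which gives the 5/8/11 lengths), and that padding is to the next multiple of 4 bytes. The helper names `padded` and `order_from_length`, and the 0/1 numbering of the orders, are hypothetical.

```python
def padded(option_len: int) -> int:
    """Round an option length up to the next multiple of 4 bytes."""
    return -(-option_len // 4) * 4


def order_from_length(option_len: int) -> int:
    """Infer the field order from the option length alone.

    Lengths 5/8/11 (2 header bytes + 3n) map to one order,
    lengths 6/9/12 (one extra byte) to the other.
    """
    remainder = (option_len - 2) % 3
    if option_len < 5 or option_len > 12 or remainder == 2:
        raise ValueError(f"unexpected option length {option_len}")
    return remainder  # 0 for 5/8/11, 1 for 6/9/12


if __name__ == "__main__":
    # Compare the on-the-wire cost with and without one extra byte.
    for base in (5, 8, 11):
        print(base, padded(base), base + 1, padded(base + 1))
```

Under these assumptions the extra byte is free for the 5- and 11-byte variants (5 and 6 both pad to 8, 11 and 12 both pad to 12) and costs a full 4 bytes only for the 8-byte variant (9 pads to 12). When other options share the segment, the padding is computed over the total, so the cost can differ.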
>>
>> Michael
>>
>> *From: *Bob Briscoe <mailto:ietf@bobbriscoe.net>
>> *Sent: *Friday, 6 March 2020 04:34
>> *To: *tcpm IETF list <mailto:tcpm@ietf.org>
>> *Cc: *Richard Scheffenegger 
>> <mailto:richard.scheffenegger@netapp.com>; Mirja Kuehlewind 
>> <mailto:ietf@kuehlewind.net>
>> *Subject: *Re: [tcpm] New Version Notification for 
>> draft-ietf-tcpm-accurate-ecn-11.txt
>>
>> tcpm,
>>
>> You will have seen draft-10 then draft-11 in quick succession, as 
>> already explained.
>> The diffs from draft-09 to -10 were those that had built up since Jul'19.
>> The diffs from draft-10 to -11 were solely those for the change from 
>> EXP track to STD track.
>> Draft-10 doesn't seem to display in the list of links to each 
>> version, but you can manually write the URL.
>>
>> The main technical changes in draft-10 were numerous - many will be 
>> recognized from list discussion since Jul'19.
>> Particular thanks to Ilpo Järvinen who identified many niggles (and 
>> their solutions) while writing and testing a full Linux 
>> implementation (based on Olivier Tilmans's, in turn based on Mirja's).
>>
>>   * Allowed 2 different orders of the fields in the AccECN Option
>>   * Reflect IP-ECN field of SYN/ACK only on ACK of SYN/ACK, not also
>>     on first data packet
>>       o greatly simplifies implementation, esp with TFO.
>>       o repeating on first data packet was for reliable delivery,
>>         which is now achieved with ACE counter (see next bullet)
>>   * Increment the ACE counter if CE on SYN/ACK (but still not if CE
>>     on SYN)
>>       o Reliable delivery of feedback of CE on SYN/ACK
>>   * Redefine 'first packet' as first to arrive, not first in sequence
>>     in 2 cases:
>>       o Handshake reflection on the ACK of the SYN/ACK
>>       o In the test for zeroing of ACE
>>       o Reason: greatly simplifies implementation
>>   * if ACE could have wrapped more than once, SHOULD assume “safest
>>     likely case”, not "conservatively assume" it did cycle
>>       o Reason: avoid unnecessary hit on performance
>>   * More robustness (with flexibility) in rules for when to include
>>     an AccECN Option
>>       o Change-triggered AccECN Option as SHOULD, not MUST
>>       o SHOULD follow change-triggered AccECN Option with another
>>         (removes ambiguity if ACK thinning or loss)
>>       o when same counter continues to increment, SHOULD consistently
>>         include it every n ACKs
>>       o Made rule about precedence of SACK conditional (max 2 SACK
>>         blocks)
>>       o MAY exclude counters that have not changed for the whole
>>         connection
>>   * Allowed an AccECN server not to implement RFC3168 ECN (all
>>     clients still have to)
>>   * Precluded mixed capability negotiation from either end
>>       o reduces freedom to choose SYN & SYN/ACK fall-back strategies
>>       o to prevent cases where each end's outcome after handshake
>>         could be inconsistent (in reordering corner-cases)
>>   * Reserved the codepoint combination used by the historic nonce case
>>   * Merged in a number of points from RFC3168 that we hadn't covered
>>       o (a whole new subsection about obligations to do with ECN)
>>   * Explicit about checking "acceptable packets"
>>       o before counting their ECN markings or before counting the ECN
>>         feedback they carry
>>   * Required retransmitted Fallback SYN to use same ISN
>>       o allows servers to detect ECN downgrade SYN attacks
>>   * Handled corner cases like In-window SYN during TIME-WAIT
>>
>>
>>
>> Bob
>>
>> On 06/03/2020 02:24, internet-drafts@ietf.org 
>> <mailto:internet-drafts@ietf.org> wrote:
>>
>>     A new version of I-D, draft-ietf-tcpm-accurate-ecn-11.txt
>>     has been successfully submitted by Bob Briscoe and posted to the
>>     IETF repository.
>>
>>     Name:            draft-ietf-tcpm-accurate-ecn
>>     Revision:        11
>>     Title:           More Accurate ECN Feedback in TCP
>>     Document date:   2020-03-05
>>     Group:           tcpm
>>     Pages:           58
>>     URL:             https://www.ietf.org/internet-drafts/draft-ietf-tcpm-accurate-ecn-11.txt
>>     Status:          https://datatracker.ietf.org/doc/draft-ietf-tcpm-accurate-ecn/
>>     Htmlized:        https://tools.ietf.org/html/draft-ietf-tcpm-accurate-ecn-11
>>     Htmlized:        https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-accurate-ecn
>>     Diff:            https://www.ietf.org/rfcdiff?url2=draft-ietf-tcpm-accurate-ecn-11
>>
>>     Abstract:
>>         Explicit Congestion Notification (ECN) is a mechanism where network
>>         nodes can mark IP packets instead of dropping them to indicate
>>         incipient congestion to the end-points.  Receivers with an ECN-
>>         capable transport protocol feed back this information to the sender.
>>         ECN is specified for TCP in such a way that only one feedback signal
>>         can be transmitted per Round-Trip Time (RTT).  Recent new TCP
>>         mechanisms like Congestion Exposure (ConEx), Data Center TCP (DCTCP)
>>         or Low Latency Low Loss Scalable Throughput (L4S) need more accurate
>>         ECN feedback information whenever more than one marking is received
>>         in one RTT.  This document specifies a scheme to provide more than
>>         one feedback signal per RTT in the TCP header.  Given TCP header
>>         space is scarce, it allocates a reserved header bit, that was
>>         previously used for the ECN-Nonce which has now been declared
>>         historic.  It also overloads the two existing ECN flags in the TCP
>>         header.  The resulting extra space is exploited to feed back the IP-
>>         ECN field received during the 3-way handshake as well.  Supplementary
>>         feedback information can optionally be provided in a new TCP option,
>>         which is never used on the TCP SYN.
>>
>>     Please note that it may take a couple of minutes from the time of submission
>>     until the htmlized version and diff are available at tools.ietf.org.
>>
>>     The IETF Secretariat
>>
>> -- 
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>
>
> -- 
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/