Re: [tcpm] Comments on draft-ietf-tcpm-tcp-edo-13

"Iwashima, Kuniyuki" <kuniyu@amazon.co.jp> Fri, 18 August 2023 23:11 UTC

From: "Iwashima, Kuniyuki" <kuniyu@amazon.co.jp>
To: "touch@strayalpha.com" <touch@strayalpha.com>
CC: "Nishida, Yoshi" <nyoshif@amazon.com>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>, "Iwashima, Kuniyuki" <kuniyu@amazon.co.jp>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/5e0CTDf8VYHsNOixnj9p_rCjmag>

From: "touch@strayalpha.com" <touch@strayalpha.com>
Date: Wed, 16 Aug 2023 18:01:19 -0700
> Hi, Kuniyui (if I read your email address correctly)
> 
> > On Aug 16, 2023, at 3:47 PM, Iwashima, Kuniyuki <kuniyu@amazon.co.jp> wrote:
> > 
> > Hello Joe,
> > 
> > I'm Kuniyuki from AWS and have recently been working on this topic with Yoshi.
> > I have prototyped Extended Data Offset in Linux based on net-next.git and am now
> > extending packetdrill to test EDO.
> > 
> > I have two comments about the EDO fallback mechanism and option processing.
> > 
> > Let's say a middlebox merges the final ACK of 3WHS and the following segment
> > (3rd & 4th packets in the script below).
> 
> At that point, you have a misbehaving middlebox. It’s bad enough that it modifies TCP packets in-flight (that’s not supported by TCP), but that’s especially true if an unrecognized option is seen. At that point, the middlebox should be “hands off” that flow at a minimum.
> 
> > ---8<---
> > 0   socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> > +0  setsockopt(3, SOL_TCP, TCP_EXT_DATA_OFFSET, [1], 4) = 0
> > +0  bind(3, ..., ...) = 0
> > +0  listen(3, 1) = 0
> > 
> > +0  < S 0:0(0) win 1000 <mss 1000,edoOK>
> > +0  > S. 0:0(0) ack 1 <edoOK, mss 1460>
> > +0  < . 1:1(0) ack 1 win 1000 <edo>
> > +0  < . 1:2(1) ack 1 win 1000 <edo>
> > ---8<---
> > 
> > If the client uses the longer variant or the middlebox changes the header length,
> > the server notices that the packet has an invalid EDO option.
> > So, the server falls back to non-EDO mode as mentioned in this part.
> 
> Here’s the flaw with even analyzing this:
> 
> 	TCP options are offered in a SYN by a client
> 		using EDO Supported, where the client goes to SYN_SENT
> 	TCP options are offered by the server in the SYN+ACK
> 		using EDO Supported, where the server goes to SYN_RECEIVED
> 	TCP options are confirmed by the client to the server in the final ACK
> 		using EDO Extension where the client goes to ESTABLISHED
> 		and the client can now start sending data 
> 	TCP options are confirmed at the server upon receipt of the final ACK
> 		where the server goes to ESTABLISHED
> 
> Once in ESTABLISHED, packets without EDO Extension are considered invalid.
> 
> So here’s the case you describe above:
> 	TCP options are offered in a SYN by a client
> 		using EDO Supported, where the client goes to SYN_SENT
> 	TCP options are offered by the server in the SYN+ACK
> 		using EDO Supported, where the server goes to SYN_RECEIVED
> 	TCP options are confirmed by the client to the server in the final ACK
> 		using EDO Extension where the client goes to ESTABLISHED
> 		and the client can now start sending data 
> >> (which it does - the 4th packet)
> >> final ACK and data packets get merged and the result has an invalid EDO Extension 
> 	TCP options are NOT confirmed at the server upon receipt of the final ACK
> 		so the server stays in SYN_RECEIVED
> 
> At this point, the connection will fail. The final ACK goes back to the server but will lack a valid EDO Extension.
> 
> So the client is in ESTABLISHED but the server stays in SYN_RECEIVED. It will try to resend the previous ACK, so the server should resend the final ACK of the TWHS, which will NOT confirm receipt of the data the client sent.
> 
> At this point, no progress is made on any packets that the middlebox alters. So either one side times out or they both sit there and get no data through, at which point one side times out. At least that’s what I think so far…
> 
> Note that the server never “falls back” to non-EDO mode here. Malformed packets are silently ignored.
> 
> > ---8<---
> >>> The EDO Extension option MAY be used only if confirmed when the
> >   connection transitions to the ESTABLISHED state, e.g., a client is
> >   enabled after receiving the EDO Supported option in the SYN/ACK and
> >   the server is enabled after seeing the EDO Extension option in the
> >   final ACK of the three-way handshake. If either of those segments
> >   lacks the appropriate EDO option, the connection MUST NOT use any
> >   EDO options on any other segments.
> > ---8<---
> > 
> > If the server accepts the packet and 3WHS completes,
> 
> It doesn’t (see above).
> 
> > EDO is disabled at 
> > the server but enabled at the client.
> 
> That doesn’t happen; the server backs off only if the ACK gets through without EDO Extension, not if the option is there and malformed.

Thank you for the explanation.
This is what I wanted to confirm and how my implementation behaves.
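
The three outcomes for the final ACK described above can be sketched as follows. This is only an illustration, not the code from my kernel prototype: the names `Verdict`, `EdoExtension` and `classify_final_ack` are hypothetical, and both lengths are assumed to be expressed in the same unit (bytes here).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ENABLE_EDO = "enter ESTABLISHED with EDO enabled"
    FALL_BACK = "enter ESTABLISHED without EDO"
    DROP = "silently drop, stay in SYN_RECEIVED"


@dataclass
class EdoExtension:
    header_len: int                    # EDO Header Length
    segment_len: Optional[int] = None  # present only in the 6-byte variant


def classify_final_ack(edo: Optional[EdoExtension],
                       data_offset_len: int,
                       actual_segment_len: int) -> Verdict:
    """Decide what a server in SYN_RECEIVED does with the final ACK."""
    if edo is None:
        # Option absent: EDO was not confirmed, so back off to plain TCP.
        return Verdict.FALL_BACK
    if edo.header_len < data_offset_len:
        # Option present but Header Length is invalid: drop the segment.
        return Verdict.DROP
    if edo.segment_len is not None and edo.segment_len != actual_segment_len:
        # 6-byte variant whose Segment_Length does not match: drop as well.
        return Verdict.DROP
    return Verdict.ENABLE_EDO
```

With this ordering, a merged segment whose Segment_Length no longer matches is dropped rather than treated as "EDO not negotiated", so the server never falls back on a malformed option.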


> 
> > The server will send an ACK for the (originally 4th) segment without EDO,
> 
> Won’t happen (see above).
> 
> > but the segment will be discarded at the client side.
> > In this case, the client cannot receive any ACK and data from the server.
> > 
> > So, I think the following parts MUST be applied to the final ACK of 3WHS as well.
> > It would be helpful to clarify that we MUST NOT complete 3WHS with an invalid EDO.
> 
> That’s possibly useful to explain, but it’s already in the requirements AFAICT. Please let me know if you see otherwise.

I read "lacks the appropriate EDO option" differently:

  (1) lacks EDO Extension in ACK of 3WHS
  (2) has an invalid EDO Extension in ACK of 3WHS

In the situation above, EDO "has not been negotiated" on the server side
(SYN_RECV) yet because of (2), and I read the following part as
"option MUST be ignored"

---8<---
   >> If EDO has not been negotiated and agreed, the EDO Extension
   option MUST be silently ignored on subsequent segments.
---8<---

and was wondering if it conflicts with this part and which should be
prioritised.

---8<---
   >> ... When the
   EDO Header Length is invalid, the TCP segment MUST be silently
   dropped.
---8<---

If the server ignores only the option (this should not happen though),
the server can process the ACK and transition to the ESTABLISHED state.

Then, the server has not agreed to EDO and "MUST NOT use any EDO options
on any other segments", leading to non-EDO fallback.

---8<---
The EDO Extension option MAY be used only if confirmed when the
connection transitions to the ESTABLISHED state
---8<---
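
To illustrate why the "ignore only the option" reading is risky once the server has fallen back, here is a minimal sketch of how the payload boundary moves. The helper `split_segment` and the byte values are hypothetical, and lengths are assumed to be in bytes.

```python
from typing import Optional, Tuple


def split_segment(segment: bytes,
                  data_offset_len: int,
                  edo_header_len: Optional[int] = None) -> Tuple[bytes, bytes]:
    """Return (header, payload).

    A receiver with EDO enabled uses the EDO Header Length; a receiver
    that ignores the option falls back to the TCP Data Offset.
    """
    header_len = data_offset_len if edo_header_len is None else edo_header_len
    return segment[:header_len], segment[header_len:]


# Hypothetical segment: 60 bytes within Data Offset, 8 bytes of extended
# options (NOPs as stand-ins), then 5 bytes of application data.
base_header = bytes(60)
extended_options = b"\x01" * 8
data = b"hello"
segment = base_header + extended_options + data

# EDO ignored: the extended options are delivered as if they were data.
_, leaked = split_segment(segment, 60)

# EDO honoured: only the application data is delivered.
_, correct = split_segment(segment, 60, 68)
```

Here `leaked` starts with the 8 option bytes, which is the accidental recv() case, while `correct` contains only the application data.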


> 
> > 
> > ---8<---
> >>> The EDO Header Length MUST be at least as large as the TCP Data
> >   Offset field of the segment in which they both appear. When the EDO
> >   Header Length equals the Data Offset length, the EDO Extension
> >   option is present but it does not extend the option space. When the
> >   EDO Header Length is invalid, the TCP segment MUST be silently
> >   dropped.
> > ...
> >>> When an endpoint receives a segment using the 6-byte EDO
> >   Extension option, it MUST validate the Segment_Length field with the
> >   length of the segment as indicated in the TCP pseudoheader. If the
> >   segment lengths do not match, the segment MUST be discarded and an
> >   error SHOULD be logged in a rate-limited manner.
> > ---8<---
> > 
> > Also, after that, the client may send out another packet for retransmission and
> > newly queued data.
> 
> I don’t see that happening, as per above.
> 
> > On the server side, EDO is disabled, but the packet will be processed ignoring
> > only the EDO option.
> > Then, options after 60 bytes would be recv()ed accidentally by the user.
> > 
> > ---8<---
> >>> If EDO has not been negotiated and agreed, the EDO Extension
> >   option MUST be silently ignored on subsequent segments.
> > ---8<---
> > 
> > I think this behaviour is not intended.
> > So, I think it's better to make it clear that what MUST be ignored should be
> > the whole segment, instead of EDO Extension option.
> 
> 
> We did say that elsewhere (the MUST NOT in the first quoted text above), but yes, this one should have said to drop the segment, not just the option.
> 
> Was that the only change?

Another change I made in my kernel is as follows:

  EDO Extension MUST NOT be used more than once in a segment.

In the current draft, a segment can carry multiple EDO Extension options
if all of them meet these conditions:

  (1) The EDO Extension option MUST occur within the space indicated by
      the TCP Data Offset.

  (2) The EDO Header Length MUST be at least as large as the TCP Data
      Offset field of the segment in which they both appear.

For example, we can craft a header like this.

  doff : 15 (60 bytes)
  EDO1 : 61
  EDO2 : 62
  EDO3 : 63
  ...
  <-- 60 bytes -->

This just slows down parsing and could be exploited for DoS.

Also, a later EDO Extension could carry a Header Length smaller than that
of a preceding one, which is confusing.

  EDO1 : 80
  EDO2 : 65
  EDO3 : 70

I do not see a good reason to support these use cases.  So, to make EDO
handling simpler and safer, I think we MUST restrict this kind of EDO 
Extension use.
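
As a rough sketch of the proposed restriction (not the actual kernel change), an option walker can reject a segment as soon as it sees a second EDO Extension. `EDO_EXT_KIND` is a placeholder value, not the draft's assigned kind, and the assumed option layout is kind, length, 16-bit Header Length, optionally followed by a 16-bit Segment_Length. `InvalidSegment` and `find_edo_header_len` are hypothetical names.

```python
from typing import Optional

EDO_EXT_KIND = 253  # placeholder only; not the draft's assigned option kind


class InvalidSegment(Exception):
    """Raised when the whole segment should be silently dropped."""


def find_edo_header_len(options: bytes) -> Optional[int]:
    """Walk the options inside Data Offset and return the EDO Header Length.

    Returns None if no EDO Extension is present. Raises InvalidSegment on
    a malformed option or on a second EDO Extension in the same segment.
    """
    edo_header_len = None
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:          # End of Option List
            break
        if kind == 1:          # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            raise InvalidSegment("truncated option")
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            raise InvalidSegment("bad option length")
        if kind == EDO_EXT_KIND:
            if edo_header_len is not None:
                raise InvalidSegment("duplicate EDO Extension")
            if length not in (4, 6):
                raise InvalidSegment("bad EDO Extension length")
            edo_header_len = int.from_bytes(options[i + 2:i + 4], "big")
        i += length
    return edo_header_len
```

With this check, the crafted header above (EDO1 : 61, EDO2 : 62, ...) is rejected at the second option, so the parser never has to decide which Header Length wins.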