Re: [iccrg] [ippm] [tsvwg] New Internet Draft: Congestion Signaling (CSIG)

Sebastian Moeller <moeller0@gmx.de> Tue, 20 February 2024 20:08 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <CALx6S36xe=aQsXNgBVOjUFAMdX+_=benPPjROiSL_Et00888Bg@mail.gmail.com>
Date: Tue, 20 Feb 2024 21:07:46 +0100
Cc: Sebastian Moeller <moeller0=40gmx.de@dmarc.ietf.org>, tsvwg <tsvwg@ietf.org>, IETF IPPM WG <ippm@ietf.org>, Nandita Dukkipati <nanditad@google.com>, iccrg@irtf.org, Naoshad Mehta <naoshad@google.com>, ccwg@ietf.org, Abhiram Ravi <abhiramr@google.com>
Message-Id: <EA4EB2F1-17CF-44CD-8C86-1FCF588A01CC@gmx.de>
References: <CAF0+TDD+44TAHf7y05GzmCgbau66ey7AU2RaVroim_Tukf=7nQ@mail.gmail.com> <CALx6S35V8xyDBkN0m8kDEcNk0N734Fqq0Ne8ZJ284ZnSSUwV9w@mail.gmail.com> <CALx6S35XNyBe5=gh7JpaCKEkiXaEwPGHrDZe=E-EPkiF5mUCLA@mail.gmail.com> <CAB_+Fg5McYXt=M5MNkuxHrKrXQgZMS6PLRoVeUKiSUe5Qb7LjA@mail.gmail.com> <CALx6S35OHyhWjmkV2jiOqO-sB9Csugx0umB_yF_ann9rB8Tgbw@mail.gmail.com> <CAEsRLK9_bHrhyvFqCz3do=Ax3mKZor4EtqXY2chdfL7fzi1UMw@mail.gmail.com> <500388A6-50D3-4535-84CB-E6EF454960DD@gmx.de> <CALx6S37gOatLC_DZiM4M=e8qrzyE9y1D1i+UqOYXatd7Y6Nauw@mail.gmail.com> <918C1325-EC13-48CF-9B29-50EEB3A0FF1C@gmx.de> <CALx6S37zGrNMai+9khwG2_rpsiQuTd8bSiWbxZK-oiVEB0aimQ@mail.gmail.com> <A68A0319-7942-482D-A395-BB72901B2EA7@gmx.de> <CALx6S36AON6GkPLLcBVaq1uKxaRwgvc-txCkb9PCyX0DGs7ktw@mail.gmail.com> <4E3C7A28-C810-4420-A799-81ACC320A5D2@gmx.de> <CALx6S36xe=aQsXNgBVOjUFAMdX+_=benPPjROiSL_Et00888Bg@mail.gmail.com>
To: Tom Herbert <tom=40herbertland.com@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/OaPQ2ASKXv19CsR4n97UrIElkOc>
Subject: Re: [iccrg] [ippm] [tsvwg] New Internet Draft: Congestion Signaling (CSIG)

Hi Tom,



> On 20. Feb 2024, at 20:03, Tom Herbert <tom=40herbertland.com@dmarc.ietf.org> wrote:
> 
> On Tue, Feb 20, 2024 at 9:55 AM Sebastian Moeller
> <moeller0=40gmx.de@dmarc.ietf.org> wrote:
>> 
>> Hi Tom,
>> 
>> 
>>> On 20. Feb 2024, at 18:04, Tom Herbert <tom=40herbertland.com@dmarc.ietf.org> wrote:
>>> 
>>> 
>>> 
>>> On Mon, Feb 19, 2024 at 11:59 PM Sebastian Moeller <moeller0=40gmx.de@dmarc.ietf.org> wrote:
>>>> 
>>>> Hi Tom,
>>>> 
>>>> 
>>>>> On 19. Feb 2024, at 21:09, Tom Herbert <tom=40herbertland.com@dmarc.ietf.org> wrote:
>>>>> 
>>>>> On Mon, Feb 19, 2024 at 11:31 AM Sebastian Moeller
>>>>> <moeller0=40gmx.de@dmarc.ietf.org> wrote:
>>>>>> 
>>>>>> Hi Tom,
>>>>>> 
>>>>>> 
>>>>>>> On 19. Feb 2024, at 18:53, Tom Herbert <tom=40herbertland.com@dmarc.ietf.org> wrote:
>>>>>>> 
>>>>>>> On Sun, Feb 18, 2024 at 11:34 PM Sebastian Moeller
>>>>>>> <moeller0=40gmx.de@dmarc.ietf.org> wrote:
>>>>>>>> 
>>>>>>>> Hi Matt,
>>>>>>>> 
>>>>>>>>> On 17. Feb 2024, at 20:17, Matt Mathis <mattmathis@measurementlab.net> wrote:
>>>>>>>>> 
>>>>>>>>> I think the L2/L4 split is brilliant.
>>>>>>>> 
>>>>>>>> [SM] Respectfully, the brilliance depends very much on the goal/game plan. If this is purely aimed at data center traffic, this looks like a sweet solution that is 'organically' confined to the domain with appropriately capable L2 elements. Or is the end-game here an (overdue) improvement of end-to-end load/congestion information? In the former case L2/L4 seems a decent solution, in the latter case less so (not that getting a common L3 solution would be guaranteed or easy).
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Putting the forward instrumentation as low as possible in the stack permits easy processing in HW w/o parsing any L3.
>>>>>>>> 
>>>>>>>> [SM] Sweet, but that really means that this solution is unlikely to survive over a full internet path.
>>>>>>>> 
>>>>>>>>> Putting the replies in L4 only requires a handful of implementations to cover all possible paths,
>>>>>>>> 
>>>>>>>> [SM] Mmmh, that might be but partly because the L2 solution noticeably restricts the set of possible paths, no?
>>>>>>>> 
>>>>>>>>> and piggybacks on existing solutions to session layer issues, such as authentication and authorization.
>>>>>>>> 
>>>>>>>> [SM] What is the threat model here? I would guess an attacker that knows the full path might just as well probe the congestion level and an attacker that does not know the path might not be able to do much with the congestion information? (Any attacker that can modify the congestion information might as well drop the packet directly).
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I would consider mentioning but then temporarily excluding alternate placements: either as a shim at the top of L2, sort of like VLAN tags, or within an L3 option. Both of these have their own challenges, but might be extremely valuable in some environments.
>>>>>>>> 
>>>>>>>> [SM] Some environments, like the internet? I know that the I in IETF is not a strict limiter of scope, but still it would be nice if drafts would have a viable path of being implemented over the internet... That said, well possible that the current state does not merit use-over-the-internet yet and so maybe starting with an L2/L4 solution might be considered a safety back-stop?
>>>>>>> 
>>>>>>> Sebastian,
>>>>>>> 
>>>>>>> There's no reason to believe that Congestion Signaling isn't of
>>>>>>> interest to use on an internet (lower case 'i' is explicit here).
>>>>>> 
>>>>>> [SM] Well, in my mind that still would keep this out of scope for the capital I ETCopy of Copy of Enfabrica-SiPandaF,
>>>> 
>>>> [SM2] Not sure how that 'Copy of Copy of Enfabrica-SiPanda' string ended up in that paragraph, I do not recall writing that... either autocompletion run wild or some other issue.
>>>> 
>>>> 
>>>>>> I am interested in improving end to end congestion signalling over the Internet so I desire these signals to sink and source at my endpoints... Again, I understand that my position is in the rough regarding what the IETF should care about.
>>>>>> 
>>>>>>> This is almost certainly beneficial for 6G for instance which is an
>>>>>>> internet composed of various link layer technologies.
>>>>>>> Neither is this
>>>>>>> the only protocol of this nature there are and will be others-- from
>>>>>>> an IETF POV I believe we want an extensible protocol solution that
>>>>>>> benefits multiple use cases and works in different environments.
>>>>>> 
>>>>>> [SM] That way lies madness IMHO. Getting enough routers/switches to support one signal, and hence make it useful, is already almost a Sisyphean task; expecting them to support multiple signals selected individually per packet seems like a recipe for never getting this to work end to end (which is my motivator here). If we do not know which single signal to use here, I guess keeping this private and doing more research seems like a productive way forward.
>>>>> 
>>>>> Sebastian,
>>>>> 
>>>>> I would agree with that if this was the first protocol ever trying to
>>>>> do something like this, but it's not. IOAM is already a published RFC.
>>>> 
>>> 
>>> Hi Sebastian,
>>> 
>>>> [SM2] There are a number of points that are against IOAM being a suitable encapsulation here:
>>> 
>>> Yes, I wasn't suggesting that IOAM could be used without modification. I was suggesting that a compressed format could be derived where we can leverage the underlying mechanisms and implementation.
>> 
>> [SM3] But the same could be achieved outside of the IOAM constraints, no?
> 
> Yes, that's my point.

[SM4] Great, we agree on that point then.


> 
>> 
>> 
>>> 
>>>> 
>>>> https://datatracker.ietf.org/doc/html/rfc9378 :
>>>> "IOAM is focused on "limited domains", as defined in [RFC8799]. IOAM is not targeted for a deployment on the global Internet."
>>> 
>>> That's more of a statement of security and not feasibility. There's simply no security in the Internet, so we cannot trust or validate that anonymous intermediate nodes are going to write correct information. Any plain text in a packet on the Internet is subject to inspection and modification if the data isn't authenticated, and in the worst case this could be a DoS vector by writing bad information.
>> 
>> [SM3] Indeed, but e.g. for TCP you would need to know a lot about the most recent packet to be able to play games, no? So either you are on path and already can drop/duplicate packets at will or you are off path but still need a recent enough veridical packet to be able to cause mischief, no? (I might be insufficiently creative in attack vectors)
> 
> Sure. All the more reason to encrypt Transport Layer headers going
> forward like QUIC.

[SM4] This is orthogonal, but there is a minimal set of information we need to reveal to the network (at the very least the destination IP address) and there is a little information we want from the network. IMHO encrypting these pieces of information makes zero sense, and that is all I am concerned with here.

> 
>> 
>>> 
>>>> 
>>>> IMHO that already disqualifies IOAM here as the goal needs to be a fully end to end method having the Internet as its scope.
>>>> 
>>> 
>>> Because of the fundamental security issues on the global Internet, it's unlikely we'll ever have a usable network-to-host signaling protocol on the Internet.
>> 
>> [SM3] But we already do: ECN. CSIG really is just ECN on steroids, so it will not open qualitatively new attack avenues, no? I am not looking for a generic network to host protocol for any kind of information, my goal is really just to fix "ECN"...
> 
> Are you trying to make ECN resilient against abuse on the Internet?

[SM4] IMHO ECN's biggest flaw is not abuse, but the fact that it is a single bit, and research has shown that multibit congestion information is strictly superior. Sure, L4S tries to use a (badly defined) rate code of the CE bit, but that IMHO is not an evolutionarily stable strategy... But really, I have been using ECN on my home network (and the internet) for over a decade now, and I would say ECN abuse so far is not a big issue. Yes, an anecdote does not make robust data, but it is at least an observation.

> 
>> 
>>> 
>>>> [SM2] This is already handled in the 'IOAM Proof of Transit Type 0' field; the fact that you did not notice that supports my argument that IOAM is too complicated an RFC for this simple use-case, no?
> 
> Right.
> 
>>>> 
>>>> 
>>>>> If that constraint is
>>>>> removed then the only remaining argument against IOAM seems to be that
>>>>> it's easier for hardware to handle L2 rather than L3.
>>>> 
>>>> [SM2] No, it is still way too complicated (offering options all intermediary nodes will need to check before setting values) and way, way too overheady...
>>> 
>>> We can define a fixed length L3 format for the fast path that would be quite efficient. See below.
>> 
>> [SM3] Sure but then we might drop all of the IOAM dressings... at which point doing this inside of IOAM might not be a good fit anymore?
> 
> I view IOAM as an example of network to host signaling. The format I
> proposed isn't IOAM, but it does share some characteristics.

[SM4] Yes this clearly seems a better fit.


> 
>> 
>>> 
>>>> 
>>>>> I don't believe there is currently consensus that that is generally
>>>>> true.
>>>> 
>>>> [SM2] Unclear what the consensus is. IMHO it is not really a question of consensus: if we want this to be a first class end 2 end signal, L2 is simply not an option, independent of our wishes.
>>> 
>>> I agree, but we have seen these sort of statements from vendors before that processing L3 in hardware is overly complex and difficult to make efficient.
>> 
>> [SM3] I am guessing here, but the more conditions need to be checked before reaching a location to modify the more vendors are going to hate it... and with at least some justification...
> 
> I'm not sure there's any more conditions to check than would be needed
> in the CSIG protocol.

[SM4] Let's count: hardware needs to test for VLAN tags anyway and CSIG would hide as an additional VLAN tag, so we are one test above no CSIG; for an L3 header we have to add one more test, and for an extension header one more on top... Sure, as you point out we can run these tests in parallel, but at the end of the day we need to evaluate a bunch of them to make sure we have the correct offset for the congestion information. But do not get me wrong, I do not think putting CSIG into L2 is a good idea (as I want this to operate end to end over the internet).

> 
>> 
>> 
>>> Instead of blindly accepting that, or just dismissing the point, I believe it's in everyone's best interest to openly discuss this, and note there's already a lot of work to address known inefficiencies in IP processing see draft-ietf-6man-hbh-processing for instance.
>>> 
>>>> 
>>>>> And, if this is why IOAM "has such a sparse (or no) support from
>>>>> switch vendors" as Jai claims then it seems like this is maybe
>>>>> something that should be discussed instead of just arbitrarily
>>>>> dismissing IOAM. Why exactly is IOAM in HW such a problem and can it
>>>>> be fixed? (a quick look at ippm archives didn't reveal any
>>>>> discussions).
>>>> 
>>>> [SM2] As shown above IOAM seems to be one of these everything-and-the-kitchen-sink solutions, probably great for its intended use cases, but efficient end/network to end signalling IMHO is not one of these use cases. Sure it could be bent into shape to support that use case, but I would rather see a meaner and leaner signal design for the congestion information aggregation along a path use case. If I had control, I would propose a new IPv6 header type (with a new next header number) mutually exclusive with hop by hop (so that confirming the next header number is the only check required before updating data at a known offset). There is value in simplicity... (I guess this scheme will fail due to firewalls only permitting a limited set of next header values...)
>>> 
>>> We don't need to define a new extension header (and realistically we'd never get a new EH through IETF anyway),
>> 
>> [SM3] Then we should close down IETF for good... because the whole justification for the 8 bit for this field is that we might end up with 256 different instances... if we declare this to be a complete set we will have failed. That said, just because we could define a new number/type here does not mean it would be the best way forward, but it would mean reading the primary next header value would give us an unambiguous offset to the to be RMW word...
> 
> The format I provided does give an unambiguous offset to the RMW word.

[SM4] Only if we enforce that the CSIG option always is the first in the hop by hop TLV list, no? If I understand rfc8200 correctly the hop by hop header might contain multiple TLVs...
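
To illustrate what I mean, here is a minimal Python sketch (purely illustrative, not a proposed implementation). It walks the TLV list of a Hop-by-Hop Options header the way I read RFC 8200: Pad1 is a lone zero byte, every other option is type, length, data. The option type 0x32 is the hypothetical CSIG assignment from the example in this thread, and the function name is made up.

```python
CSIG_OPT_TYPE = 0x32  # hypothetical assignment, taken from the example in this thread

def find_option(hbh: bytes, wanted: int = CSIG_OPT_TYPE):
    """Return the offset of the wanted option's data inside a Hop-by-Hop
    Options header, or None. hbh starts at the header's Next Header byte."""
    if len(hbh) < 2:
        return None
    total = (hbh[1] + 1) * 8          # Hdr Ext Len counts 8 byte units beyond the first
    if len(hbh) < total:
        return None
    pos = 2                           # options start after Next Header and Hdr Ext Len
    while pos < total:
        opt_type = hbh[pos]
        if opt_type == 0:             # Pad1: a single byte without a length field
            pos += 1
            continue
        if pos + 2 > total:
            return None               # truncated option header
        opt_len = hbh[pos + 1]
        if pos + 2 + opt_len > total:
            return None               # option data runs past the end of the header
        if opt_type == wanted:
            return pos + 2            # offset of the option data
        pos += 2 + opt_len            # skip PadN or any other option
    return None
```

With the CSIG option first, the data always sits at offset 4 of the extension header; with anything in front of it, the offset depends on the packet and has to be discovered by the loop.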


> And also the reason there won't be any new extension headers is
> because it's not feasible to add new extension headers-- it would be
> years before routers would reliably forward packets with new extension
> headers (an unfortunate consequence of protocol ossification on the
> Internet).

[SM4] That is on us though. It is clear that network visible fields follow the 'use it, or lose it' rule: if we keep a field constant for too long the network will start to assume this constancy... The solution is IMHO quite simple: just do not keep values constant, but send enough random values to make filtering on a few fixed values a fool's errand. IMHO this is what the IPv6 interface identifier does right (I do not agree that spending 64 bits on that part of the address is the best long term strategy, but filling these with mostly random numbers is for the time being the best approach to conserve these fields as true variables, but I digress).

> 
>> 
>> 
>> 
>>> CSIG can be efficiently done as a Hop-by-Hop Option. For example:
>>> 
>>> 
>>>                                   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>                                   |                               |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
>>>   |                         Src Mac address                       |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |                         Dest Mac address                      |
>>>   +                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |                               |             0x86dd            |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |   6   | Traffic Class |              Flow Label               |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |         Payload Length        |      0        |   Hop Limit   |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |                                                               |
>>>   +                                                               +
>>>   |                                                               |
>>>   +                         Source Address                        +
>>>   |                                                               |
>>>   +                                                               +
>>>   |                                                               |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |                                                               |
>>>   +                                                               +
>>>   |                                                               |
>>>   +                      Destination Address                      +
>>>   |                                                               |
>>>   +                                                               +
>>>   |                                                               |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |      0        |  Next header  |      50       |       4       |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   | Type  |         LM            |               S               |
>>>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>>   |                                                               |
>>>   |              ULP (e.g. TCP header and payload                 |
>>> 
>>> This assumes 0x32 HBH Option type is assigned to CSIG.
>> 
>> [SM3] Now we need to first read the primary next header field and then check the HBH Option type. Sure, we can solve a lot of things by adding layers of indirection, but it will not get cheaper that way...
> 
> I don't see where there's any indirection here.

[SM4] First reading the next header and then reading the HBH Option type are two operations, where the first redirects us into the extension header... The fact that we can evaluate both in parallel does not relieve us from having to evaluate both... Pointing from one data structure (the IPv6 header) into another data structure (the HBH extension header) is what I would call an indirection, but then this is not my field, and I might have the nomenclature wrong, sorry for that.
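
To make the two steps concrete, a rough Python sketch (illustrative only, nobody's proposed implementation). It assumes an untagged Ethernet frame, the 40 byte fixed IPv6 header, the RFC 8200 layout of the Hop-by-Hop Options header (Next Header, Hdr Ext Len, then options), the hypothetical 0x32 option type with length 4 from the example above, and CSIG as the first option. The offsets are simply what falls out of those assumptions, and all names are made up.

```python
ETH_HLEN = 14             # destination MAC, source MAC, EtherType
IPV6_HLEN = 40            # fixed IPv6 header
ETHERTYPE_IPV6 = 0x86DD
NH_HOP_BY_HOP = 0
CSIG_OPT_TYPE = 0x32      # hypothetical, from the example above
CSIG_OPT_LEN = 4

def csig_data_offset(frame: bytes):
    """Return the offset of the 4 CSIG data bytes, or None if the frame does
    not match the fixed layout (untagged Ethernet, CSIG as first option)."""
    hbh = ETH_HLEN + IPV6_HLEN                 # start of the extension header
    if len(frame) < hbh + 4 + CSIG_OPT_LEN:
        return None
    if int.from_bytes(frame[12:14], "big") != ETHERTYPE_IPV6:
        return None
    # Step 1: the IPv6 Next Header field (byte 6 of the IPv6 header).
    if frame[ETH_HLEN + 6] != NH_HOP_BY_HOP:
        return None
    # Step 2: only after step 1 do we know that an option type lives here.
    if frame[hbh + 2] != CSIG_OPT_TYPE or frame[hbh + 3] != CSIG_OPT_LEN:
        return None
    return hbh + 4
```

Both comparisons can be evaluated at the same time in hardware, but the second one only means something if the first one holds, which is the dependency I am calling an indirection.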


> We're just checking
> some values at fixed offsets in the packet and then jumping into
> processing. This is no more complicated than what switches already do
> today (for instance, parsing packets for the purposes of ECMP).

[SM4] But switches do that because the switch's operator believes that this work is worth it; we need to make the case that it is also worth it for the CSIG information.


> 
>> 
>>> 
>>> * Is this encoding efficient? Yes. Compared to the compact format in the CSIG draft there is an additional four bytes of on-the-wire overhead, and it's the same overhead as the Expanded format which is eight bytes.
>> 
>> [SM3] Not sure that is fair... we just increased the minimal encoding from 4 to 8 bytes... this might be OK if the 4 byte format simply was a crutch to make CSIG masquerade as a VLAN tag, but if the 4 byte version was supposed to be the bread and butter encoding we just doubled the overhead (sure we also increased fidelity).
> 
> The overhead difference is 4 bytes, a whole 0.2% of standard 1500 byte
> MTU.

[SM4] And considerably higher for a minimal sized packet... This is death by a thousand cuts: first we double the IP header to 40 bytes, then we add another 8, because when we designed that 40 byte header we forgot to make room for 2 additional bytes of information...
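
A quick back-of-the-envelope check in Python, as a sketch only. The packet sizes are my assumptions: a full 1500 byte packet, and a bare 60 byte IPv6 + TCP packet (40 + 20 bytes, no options, no payload) standing in for "minimal sized".

```python
def overhead_percent(extra_bytes: int, packet_bytes: int) -> float:
    """Extra header bytes as a percentage of the original packet size."""
    return 100.0 * extra_bytes / packet_bytes

# Assumed packet sizes: full MTU and a bare IPv6 + TCP packet.
for size in (1500, 60):
    for extra in (4, 8):
        print(f"{extra} extra bytes on {size}: {overhead_percent(extra, size):.2f}%")
```

That works out to roughly 0.27% and 0.53% at 1500 bytes, but about 6.7% and 13.3% for the 60 byte packet.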

> That will not break anyone's network or application. I'd also
> point out that the CSIG draft has two formats-- that's twice as many
> formats they need to support and
> twice as many opportunities for implementation bugs. IMO, that draft
> would be better off just going with the expanded format-- I don't
> believe anyone will ever complain about the additional four byte
> overhead.

[SM4] Well, if we look at the current extended CSIG formats we see:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             TPID              |              LM               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   T   |                   S                   |       R       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

|0-15| TPID : IEEE allocated Tag Protocol ID for 8 Byte CSIG tag
|16-31| LM : Locator Metadata of bottleneck device / port
|0-3| T : Signal Type (0:min(ABW), 1: min(ABW/C), 2:max(PD))
|4-23| S : Signal Value: Uniformly quantized
|24-31| R : Reserved for future use

This is not going to work for an L3 header over the internet:
a) we do not need the 16 bits of TPID (only needed to disguise the tag as a VLAN tag)
b) we cannot use a 16 bit LM (over the internet that likely needs to be a 128 bit IPv6 address, or nothing at all)
c) T, 4 bits: IMHO this needs to be either a single type or at worst two, but then these two always need to be filled in, so another 4 bits could be recovered, but let's keep them
d) we are left with 20 bits for the value
e) and 8 reserved bits

so in L3 instead of 8 bytes we are down to 4 bytes (T, S and R only), or up to 20 (with a 128 bit LM added)... so respectfully, the additional 4 will not cut it...
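
The bit budget behind that can be sketched in Python. The field widths come from the tag layout quoted above; the two L3 variants, the 128 bit locator and all names are my assumptions, and bit 0 is taken to be the most significant bit as in the diagram.

```python
# Field widths in bits of the 8 byte CSIG tag quoted above.
TAG_FIELDS = {"TPID": 16, "LM": 16, "T": 4, "S": 20, "R": 8}

def size_bytes(fields: dict) -> int:
    """Total size of a set of bit fields, in whole bytes."""
    bits = sum(fields.values())
    assert bits % 8 == 0, "fields must fill whole bytes"
    return bits // 8

# Hypothetical L3 variant without TPID and without any locator.
l3_small = {k: v for k, v in TAG_FIELDS.items() if k not in ("TPID", "LM")}
# Hypothetical L3 variant carrying a 128 bit IPv6 address as locator.
l3_large = dict(l3_small, LM=128)

def pack_tsr(t: int, s: int, r: int = 0) -> bytes:
    """Pack T (4 bits), S (20 bits) and R (8 bits) into one 32 bit word."""
    if not (0 <= t < 1 << 4 and 0 <= s < 1 << 20 and 0 <= r < 1 << 8):
        raise ValueError("field out of range")
    return ((t << 28) | (s << 8) | r).to_bytes(4, "big")

print(size_bytes(TAG_FIELDS), size_bytes(l3_small), size_bytes(l3_large))
```

The three sizes come out as 8, 4 and 20 bytes, which is where the "down to 4 or up to 20" comes from.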


>> 
>>> * Is this friendly to hardware processing in the fast path? Yes. The Ethernet header, IPv6 header, and Hop-by-Hop Options header with a single option for CSIG form a fixed header of 42 bytes which will fit well within a router's parsing buffer and can be quickly processed. The match of this format is Bytes 12-13 == 0x86dd && Byte 20 == 0 && Byte 34 == 0 && Bytes 36-37 == 0x3204. Upon receiving a packet, these fields can be extracted and run through a TCAM.
>> 
>> [SM3] My limited experience tells me that TCAM is not cheap...
> 
> Well, switches already match on EtherTypes and protocol numbers
> anyway, so this should be using similar mechanisms. Besides that, the
> TPID needs to be matched in CSIG so it's not like that's free (this is
> one area where IP got it right, protocol numbers and option types are
> eight bits which means we can use an array for O(1) lookup even in SW.
> Much harder to efficiently lookup a sixteen bit value especially in SW
> where we don't have a CAM).

[SM4] Thanks, learned something new.
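
If I understand the point correctly, it can be sketched in Python like this (illustrative only, handler and table names are made up): an 8 bit key can index a 256 entry table directly, while a 16 bit key would need 65536 slots for the same trick, so software tends to fall back to a hash map or a chain of compares.

```python
def handle_tcp(payload: bytes) -> str:
    return "tcp"

def handle_udp(payload: bytes) -> str:
    return "udp"

def handle_unknown(payload: bytes) -> str:
    return "unknown"

# 8 bit key: one slot per possible protocol number, indexed directly.
PROTO_TABLE = [handle_unknown] * 256
PROTO_TABLE[6] = handle_tcp     # TCP
PROTO_TABLE[17] = handle_udp    # UDP

def dispatch_proto(proto: int, payload: bytes) -> str:
    """O(1) lookup: the protocol number is the array index."""
    return PROTO_TABLE[proto & 0xFF](payload)

# 16 bit key: a direct table would need 65536 slots, so a hash map
# stands in for it here.
ETHERTYPE_TABLE = {0x0800: "ipv4", 0x86DD: "ipv6", 0x8100: "vlan"}

def classify_ethertype(ethertype: int) -> str:
    return ETHERTYPE_TABLE.get(ethertype, "unknown")
```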


> 
>> 
>>> If there's a hit that this is a packet with CSIG then it can be processed. Note that the CSIG option can be processed in parallel with processing the Ethernet header and IP header.
>> 
>> [SM3] Playing devil's advocate, this processing, parallel or not, will come with some power cost, better to keep it simple.
> 
> We have a solution to make parallelism in network protocol processing
> efficient, planning to make it public shortly...

[SM4] Well, that would shoot down my argument nicely (which I am fine with, I would be happy if your proposal turned out to be feasible, all I want is better congestion information end to end ;) )


> 
>> 
>>> 
>>> * Is this format extensible? Yes. The Type field allows CSIG formats with different semantics and structure (the CSIG draft defines three types, so thirteen are remaining for extensibility).
>> 
>> [SM3] This is against making this efficient for hardware, IMHO CSIG needs to be whittled down to a single unambiguous piece of information (or at worst two) extensibility sounds great in theory, but in practise... (see above about getting a new next header ID through the IETF ;) ).
> 
> The EtherType makes Ethernet extensible. IP protocol numbers, like 6
> for TCP, make IP extensible. Switches already have to deal with
> extensible protocols anyway. Even the CSIG draft has the type field.
> The point of making the protocol extensible isn't that we expect to
> have hundreds of variants. It's an insurance policy to future proof
> the protocol. For instance, if we find a major security issue then we
> might be able to create a new variant as opposed to having to throw
> out the whole protocol and start from scratch. In the case of CSIG the
> cost is four bits of overhead and a check on the value in a nibble, in
> the grand scheme of all the processing a router does when forwarding a
> packet, this overhead is negligible.

[SM4] Aye!


> 
>> 
>>> 
>>> * Is the format routable? Yes, The format is contained in the Network layer so it is properly end-to-end data that is routable over a limited domain internet and the Internet (modulo that packets with extension headers might be dropped in the Internet)
>> 
>> [SM3] That last one should be a fixable issue...
> 
> Yes. 6man has been working on that.

[SM4] Yeah, progress is both hard in the now and inevitable in retrospect (or so I hope).

> 
>> 
>>> 
>>> * Can the format be authenticated? Yes. AH header can be used. AH would authenticate the option but not the data since the intent is for routers along the path to write CSIG data in the packet.
>> 
>> [SM3] Yes aggregating information from all nodes along a network path is the whole reason for this exercise...
>> 
>> 
>>> AH prevents someone from inserting or removing the option in flight,
>> 
>> [SM3] Mmmh, removing such an option would be quite noticeable and adding it would probably require playing some MTU/MSS games to avoid creating packets that would need fragmentation, no?
> 
> Removing the option in flight wouldn't be noticeable,

[SM4] For an L3 protocol expecting that information this case should be discoverable.


> The receiver
> doesn't get the option and therefore doesn't reflect it, and so the
> original sender doesn't see the reflected data but wouldn't know why
> (option dropped by network, receiver doesn't understand option, etc.)

[SM4] Yes, the reason would be unknowable, but the fact it is missing should be obvious.

> 
> Someone inserting the option could be more insidious which is why
> firewalls would likely block the option from entering the network
> (like they pretty much block any protocol the network provider doesn't
> trust in their infrastructure).

[SM4] Having read a few 'recommendations' for firewalls (for end users), I am sure that anything will be tried by firewalls no matter how sane or insane.

> 
>> 
>> 
>>> the data written by routers isn't authenticated and there's really no way to do that since we'd have to establish an SA between routers and hosts; the typical answer to this is that we should restrict protocols like this to trusted limited domains (for instance, using CSIG over the Internet is pretty pointless since nothing prevents intermediate devices from writing bad information into a packet-- this could even be a DoS vector).
>> 
>> [SM3] I disagree: end to end congestion control really, really wants timely information for the current network path, and we already do this pretty successfully with the ECN bits. Plus most routers that could fudge a CSIG value might as well just drop the packet, and if someone accidentally/purposefully modifies the true signal to no-congestion (to cause the sender to send too much, the only real DoS vector I can see) there is always the 'the congested node simply drops packets' remedy...
> 
> Dropping the packet is fine because that's easily detectable by the
> sender. Subtly skewing the data that is processed by endpoints could
> be an almost undetectable and potentially effective attack (I am
> reminded of the Stuxnet attack).

[SM] But if you fudge the information, there are only two cases:
- Fake higher congestion: the flow slows down more than merited, but that network node might as well have dropped the packet in the first place, resulting in a similarly harsh window reduction.
- Fake lower congestion: if the flow exceeds its welcome at the bottleneck, packets are dropped and the flow responds through the drops...

This is really a pretty subtle attack vector... not really useful for DoS.
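
A toy sketch of the two cases, assuming a simple AIMD sender (halve the window on a congestion signal or a drop, add one packet otherwise) behind a single bottleneck of fixed capacity. The function name, the `forge` parameter and all numbers are illustrative assumptions, not taken from the CSIG draft, and the model ignores queueing delay and the cost of retransmitting dropped packets:

```python
def run(capacity, rounds, forge=None):
    """Toy AIMD sender behind one bottleneck.

    forge: None (honest signal), "high" (always report congestion),
           or "low" (never report congestion).
    Returns the congestion window (in packets) after each round.
    """
    cwnd = 10.0
    history = []
    for _ in range(rounds):
        overloaded = cwnd > capacity           # bottleneck is actually congested
        if forge == "high":
            signal = True                      # forged: congestion reported every round
        elif forge == "low":
            signal = False                     # forged: congestion never reported
        else:
            signal = overloaded                # honest signal
        dropped = overloaded and not signal    # unsignalled overload ends in loss
        if signal or dropped:
            cwnd = max(1.0, cwnd / 2)          # multiplicative decrease
        else:
            cwnd += 1.0                        # additive increase
        history.append(cwnd)
    return history


honest = run(capacity=20, rounds=50)
faked_low = run(capacity=20, rounds=50, forge="low")
faked_high = run(capacity=20, rounds=50, forge="high")
```

In this model the faked-low run follows the same window trajectory as the honest run, because the bottleneck's drops take over as the signal, while the faked-high run pins the window at its floor, which is what the forging node could also have achieved by dropping packets.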

>> 
>>> * Can this format be encrypted? No. We'd need an SA established between network and hosts, which doesn't seem plausible.
>> 
>> [SM3] We would need ALL nodes along a path to have the key, at which point we lose only a little security by not encrypting at all...
> 
> Yes.
> 
>> 
>>> However, the return path for reflecting the signal could be done in a Destination Option which could be encrypted by an IPsec header.
>> 
>> [SM3] That is an interesting question, currently everyone and their pony packs the return path into L4 or higher, but in theory that information might also be put into an L3 header... I guess the horse has left that barn long ago so this is purely theoretical...
> 
> It's not theoretical and not everyone is putting their information
> into L4 or higher.

[SM4] I am speaking about congestion information here, so ECN and co. where each protocol needs to roll its own reflection solution (or multiple mutually exclusive ones in TCP)

> There's no universal solution in L4 for doing this.

[SM4] Yes, this is why putting this into L3 now would be tempting: it is a greenfield solution allowing for a relatively clean design (for now; complications will surely arise in the future, resulting in hot fixes and less cleanliness).

> We do have TCP options,

[SM4] ECN uses TCP flags, no option required (but that would not hold for multibit CSIG)

> but there is not an equivalent for UDP
> (conceptually, UDP options could do this but those have yet to be
> proven deployable). In lieu of a ubiquitous solution, every protocol
> that runs over UDP will need to be modified to carry the information.

[SM4] As QUIC already does today in the context of ECN...

> If we put this information in a Destination Option, L3, then it "just
> works" with all use cases of any IP protocol: TCP, UDP, DCCP, IPsec,
> IPIP, etc.

[SM4] Yes, that is my point, except I would consider putting this into the HBH option as the forward information...
But for this to work, all endpoint IP stacks will need to learn how to pass this information to the appropriate receivers... (but that is already true for the CE mark in ECN)
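
To make the HBH idea concrete, here is a rough sketch of how a single option could be wrapped in an IPv6 Hop-by-Hop Options header, assuming only the generic option TLV layout from RFC 8200 (2 "act" bits, 1 "chg" bit, 5 remaining type bits; header length counted in 8-octet units excluding the first 8 octets). The option type value 0x3E and the 4-octet payload are placeholders picked for illustration, not the CSIG wire format, and per-option alignment requirements are ignored:

```python
import struct


def option_type(act, chg, rest):
    """Compose an IPv6 option type octet: 2 bits act, 1 bit chg, 5 bits rest."""
    if not (0 <= act <= 3 and chg in (0, 1) and 0 <= rest <= 31):
        raise ValueError("field out of range")
    return (act << 6) | (chg << 5) | rest


def hbh_header(next_header, opt_type, opt_data):
    """Build a Hop-by-Hop Options header with one option, padded to 8 octets."""
    if len(opt_data) > 255:
        raise ValueError("option data too long")
    body = struct.pack("!BB", opt_type, len(opt_data)) + opt_data
    total = 2 + len(body)                 # Next Header + Hdr Ext Len + option TLV
    pad = (-total) % 8
    if pad == 1:
        body += b"\x00"                   # Pad1: a single zero octet
    elif pad > 1:
        # PadN: type 1, length pad - 2, then pad - 2 zero octets
        body += struct.pack("!BB", 1, pad - 2) + bytes(pad - 2)
    total += pad
    return struct.pack("!BB", next_header, total // 8 - 1) + body


# act=00: skip the option if unrecognized; chg=1: data may change en route,
# which is what a router-written signal would need.
placeholder_type = option_type(act=0, chg=1, rest=0x1E)   # 0x3E, placeholder
header = hbh_header(next_header=6, opt_type=placeholder_type, opt_data=bytes(4))
```

The chg bit is the relevant part for the forward direction: it marks the option data as modifiable in flight, so routers can update the value without breaking an integrity check that excludes mutable options.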

Regards
	Sebastian

P.S.: Thanks for the discussion, I learned a lot. I will try to calm down a bit to give the CSIG authors/proponents a chance to comment ;)

> 
> Tom
> 
>> 
>> Regards
>>        Sebastian
>> 
>>> 
>>> Tom
>>> 
>>>> 
>>>> 
>>>> Regards
>>>>        Sebastian
>>>> 
>>>> 
>>>>> 
>>>>> Tom
>>>>> 
>>>>>> 
>>>>>> Regards
>>>>>>       Sebastian
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Tom
>>>>>>> 
>>>>>>>> 
>>>>>>>> Regards
>>>>>>>>      Sebastian
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Sat, Feb 10, 2024 at 7:42 AM Tom Herbert <tom=40herbertland.com@dmarc.ietf.org> wrote:
>>>>>>>>> On Fri, Feb 9, 2024 at 10:53 PM Nandita Dukkipati <nanditad@google.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Tom,
>>>>>>>>>> 
>>>>>>>>>> We updated the draft, correcting some nit errata, and to not let the draft expire. It's not discussed in any other mailing lists.
>>>>>>>>> 
>>>>>>>>> Thanks Nandita.
>>>>>>>>> 
>>>>>>>>> I still have fundamental concerns about the protocol layering in this
>>>>>>>>> draft, please see my previous comments on that. The draft defines a
>>>>>>>>> protocol for end-to-end network to host signaling and IMO, such a
>>>>>>>>> protocol belongs in the network layer but the draft puts the protocol
>>>>>>>>> in L2 and L4 and seems to avoid L3 without explanation. IOAM defines a
>>>>>>>>> very similar method of signaling and RFC9486 is a good model for
>>>>>>>>> network layer protocol that provides network to host signaling.
>>>>>>>>> 
>>>>>>>>> Tom
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Nandita
>>>>>>>>>> 
>>>>>>>>>> On Thu, Feb 8, 2024 at 3:53 PM Tom Herbert <tom@herbertland.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I noticed there is now an -01 version of the draft posted on Feb. 2.
>>>>>>>>>>> Is this draft being discussed on some other list?
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Tom
>>>>>>>>>>> 
>>>>>>>>>>> On Sat, Sep 9, 2023 at 9:09 AM Tom Herbert <tom@herbertland.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi, thanks for draft!
>>>>>>>>>>>> 
>>>>>>>>>>>> The first thing that stands out to me is the carrier of the new packet headers. In the forward path it would be in L2 and in reflection it would be L4. As the draft describes, this would entail having to support the protocol in multiple L2 and multiple L4 protocols-- that's going to be a pretty big lift! Also, L2 is not really an end-to-end protocol (would legacy switches in the path also forward the header?).
>>>>>>>>>>>> 
>>>>>>>>>>>> The signaling being described in the draft is network layer information, and hence IMO should be conveyed in network layer headers. That is L3, which conveniently is the average of L2+L4 :-)
>>>>>>>>>>>> 
>>>>>>>>>>>> IMO, the proper carrier of the signal data is Hop-by-Hop Options. This is end-to-end and allows modification of data in-flight. The typical concern with Hop-by-Hop Options is high drop rates on the Internet, however in this case the protocol is explicitly confined to a limited domain so I don't see that as a blocking issue for this use case.
>>>>>>>>>>>> 
>>>>>>>>>>>> The information being carried seems very similar to that of IOAM (IOAM uses Hop-by-Hop Options and supports reflection). I suppose the differences are that this protocol is meant to be consumed by the transport Layer and the data is a condensed summary of path characteristics. IOAM seems pretty extensible, so maybe it could be adapted to carry the signals of this draft?
>>>>>>>>>>>> 
>>>>>>>>>>>> A related proposal might be FAST (draft-herbert-fast). Where CSIG is network to host signaling, FAST is host to network signaling for the purposes of requesting network services. These might be complementary and options for both may be in the same packet. FAST also uses reflection, so we might be able to leverage some common implementation at a destination.
>>>>>>>>>>>> 
>>>>>>>>>>>> Tom
>>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Sep 8, 2023, 7:43 PM Abhiram Ravi <abhiramr=40google.com@dmarc.ietf.org> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi IPPM folks,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I am pleased to announce the publication of a new internet draft, Congestion Signaling (CSIG): https://datatracker.ietf.org/doc/draft-ravi-ippm-csig/
>>>>>>>>>>>>> 
>>>>>>>>>>>>> CSIG is a new end-to-end packet header mechanism for in-band signaling that is simple, efficient, deployable, and grounded in concrete use cases of congestion control, traffic management, and network debuggability. We believe that CSIG is an important new protocol that builds on top of existing in-band network telemetry protocols.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> We encourage you to read the CSIG draft and provide your feedback and comments. We have also cc'd the TSVWG, CCWG, and ICCRG mailing lists, as we believe that this work may be of interest to their members as well.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thank you for your time and consideration.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Sincerely,
>>>>>>>>>>>>> Abhiram Ravi
>>>>>>>>>>>>> On behalf of the CSIG authors
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>> --MM--
>>>>>>>>> Evil is defined by mortals who think they know "The Truth" and use force to apply it to others.
>>>> 
>>>> 
>> 
> 