[tsvwg] From L4S to SCE+DSCP and RFC-4774 Option 3

Jonathan Morton <chromatix99@gmail.com> Fri, 26 March 2021 10:56 UTC

From: Jonathan Morton <chromatix99@gmail.com>
Date: Fri, 26 Mar 2021 12:56:51 +0200
To: tsvwg IETF list <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/2LrtkGRZPk5H_UNiXULmyIaNR8I>

In light of Koen's musings, and knowing that Low Latency DOCSIS relies on a "dual queue" structure with an explicit classifier as its lowest common denominator, here is a summary of how SCE can be combined with a DSCP to make that work.  The result is similar to DualQ-Coupled in most implementation aspects, so I think it should be feasible to adapt a device designed for the one into the other.

First, a brief review of how SCE itself works:

1: ECT(1) is a congestion signal output (renamed SCE) from the network.  It is not normally emitted by senders.  SCE flows are identified in the ECN field as ECN Capable Transports compliant with RFC-3168 (because they are) by emitting ECT(0) at origin.

2: CE marks are applied by both RFC-3168 and SCE AQMs under similar circumstances, at a relatively deep queue threshold.  They are fed back by the receiver to the sender in the normal way, using ECE and CWR in the case of TCP, and senders are expected to respond with a Multiplicative Decrease compliant with RFC-8511.  Not-ECT traffic receives packet drops instead.

3: SCE marks are applied to packets carrying ECT(0), at a shallower queue threshold.  RFC-3168 middleboxes and receivers ignore the distinction between ECT(0) and ECT(1), so they are not affected by this.  SCE receivers feed SCE marks back to the sender using a spare TCP flag renamed ESCE, or using the detailed ECN feedback mechanisms already present in SCTP, QUIC, etc.  Technically AccECN could also be used for TCP, but I consider the ACE field to be a liability since SCE doesn't need to feed back more than one CE mark per RTT.  RFC-3168 senders ignore ESCE feedback, while SCE senders respond with an Additive Decrease of some sort per ESCE.  The "two marks per RTT is steady state" function of DCTCP is acceptable here.

4: If SCE and non-SCE traffic occupy the same queue and the same SCE AQM instance simultaneously, then the SCE traffic will tend to give way to the non-SCE traffic since there is a range of queue depths where SCE signals are applied but CE signals are not.  Therefore some mechanism to distinguish between the two is needed at SCE-marking nodes in the network.  Nodes which only drop packets or are merely RFC-3168 compliant do not need this.  Nodes which do not expect to receive reliable distinguishing information are free to implement RFC-3168 marking without SCE support, as at present.
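
To make points 1 to 3 concrete, here is a minimal sketch of the per-packet decision at an SCE-marking AQM.  The function name, the use of queue delay as the measure, and both threshold values are assumptions chosen for illustration; a real AQM would normally ramp or randomise its marking rather than use hard steps.

```python
# ECN codepoints as defined in RFC 3168 (two low bits of the TOS byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
SCE = ECT1  # SCE reuses the ECT(1) codepoint as a network-applied signal


def aqm_decide(ecn, queue_delay_ms, sce_threshold_ms=1.0, ce_threshold_ms=5.0):
    """Return (new_ecn, drop) for one packet.

    Thresholds are hypothetical step thresholds, not values from any draft.
    """
    if queue_delay_ms >= ce_threshold_ms:
        if ecn == NOT_ECT:
            return ecn, True      # Not-ECT traffic is dropped instead of marked
        return CE, False          # deep threshold: CE mark, as in RFC 3168
    if queue_delay_ms >= sce_threshold_ms and ecn == ECT0:
        return SCE, False         # shallow threshold: ECT(0) becomes SCE
    return ecn, False             # below both thresholds: forward untouched
```

Between the two thresholds only ECT(0) packets are touched, which is the range of queue depths point 4 refers to.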

Our prototype SCE qdiscs use either FQ to distinguish flows into individual queues and AQMs, or AF to distinguish flows into individual AQMs in the same queue.  Since there is no more space in the ECN field, identifying SCE traffic explicitly to the network, which would be needed for a "dual queue" implementation without per-flow AQMs, needs to be done in some other field.  A natural choice is the Diffserv field, next door in the same byte as the ECN field.
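
For reference, the two fields share the former TOS byte (Traffic Class in IPv6): the upper six bits are the DSCP and the lower two bits are the ECN field.  A small sketch of splitting and joining them follows; the helper names are mine, and the DSCP value is an arbitrary placeholder since no codepoint has been assigned for SCE.

```python
SCE_DSCP_PLACEHOLDER = 42  # hypothetical value, NOT an assigned codepoint
ECT0 = 0b10                # RFC 3168 ECT(0)


def split_tos(tos):
    """Split the 8-bit TOS / Traffic Class byte into (dscp, ecn)."""
    return (tos >> 2) & 0x3F, tos & 0b11


def join_tos(dscp, ecn):
    """Combine a 6-bit DSCP and a 2-bit ECN field into one byte."""
    return ((dscp & 0x3F) << 2) | (ecn & 0b11)


# An SCE sender that chooses to identify itself emits ECT(0) plus the DSCP.
tos = join_tos(SCE_DSCP_PLACEHOLDER, ECT0)
```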

Because Diffserv codes are often changed along paths in the Internet, some SCE-capable traffic will not be so identified, but cooperating networks can arrange to ensure that SCE-identifying DSCPs are preserved within their sphere of influence, in order to benefit from the improved service quality.  This should be the case in the environments where L4S-type service is envisaged, i.e. between an ISP's subscribers and a CDN attached to that ISP.

The SCE DSCP therefore requests a PHB, at SCE dual-queue nodes, that places the packet in the L queue, in which SCE marks are applied at a shallow threshold, rather than the C queue, in which only CE marks are applied at a deeper threshold.  Should the L queue overflow, the spilled traffic goes to the C queue instead of being dropped.
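
A rough sketch of that classification step, assuming packet-count queue limits and a placeholder DSCP value (neither is specified here).  Dropping when the C queue is also full is my assumption, standing in for whatever the C queue's own overflow policy would be.

```python
from collections import deque

SCE_DSCP_PLACEHOLDER = 42  # hypothetical value, NOT an assigned codepoint


class DualQueue:
    """Toy dual-queue classifier: SCE DSCP selects L, everything else C."""

    def __init__(self, l_capacity, c_capacity):
        self.l_queue = deque()
        self.c_queue = deque()
        self.l_capacity = l_capacity  # limits in packets, for simplicity
        self.c_capacity = c_capacity

    def enqueue(self, packet):
        """packet is a dict with a 'dscp' key. Returns 'L', 'C' or 'drop'."""
        wants_l = packet["dscp"] == SCE_DSCP_PLACEHOLDER
        if wants_l and len(self.l_queue) < self.l_capacity:
            self.l_queue.append(packet)
            return "L"
        # Either not SCE-identified, or the L queue overflowed: use C.
        if len(self.c_queue) < self.c_capacity:
            self.c_queue.append(packet)
            return "C"
        return "drop"  # assumption: tail drop once C is full as well
```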

The L and C queues SHOULD be serviced in a deficit-round-robin manner so that one type of traffic does not dominate the other.  If information about flow counts is available, this SHOULD be used to weight the servicing so that throughput is proportional to flow count.  This is the main technical difference from DualQ-Coupled, besides the details of the signalling mechanisms.
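
As an illustration of the servicing rule, here is a small deficit-round-robin sketch over the two queues, with each queue's quantum scaled by its flow count when one is known.  Packets are represented only by their size in bytes, and the function name, the 1500-byte base quantum and the default flow counts of 1 are assumptions.

```python
from collections import deque


def drr_schedule(l_queue, c_queue, l_flows=1, c_flows=1, base_quantum=1500):
    """Drain both queues with DRR; return the send order as (queue, size)."""
    queues = {"L": deque(l_queue), "C": deque(c_queue)}
    # Weight each queue's quantum by its flow count (at least 1).
    quantum = {"L": base_quantum * max(l_flows, 1),
               "C": base_quantum * max(c_flows, 1)}
    deficit = {"L": 0, "C": 0}
    order = []
    while queues["L"] or queues["C"]:
        for name in ("L", "C"):
            q = queues[name]
            if not q:
                deficit[name] = 0  # an idle queue does not bank credit
                continue
            deficit[name] += quantum[name]
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                order.append((name, size))
            if not q:
                deficit[name] = 0
    return order
```

With equal flow counts the two queues alternate; with c_flows=2 the C queue is allowed twice the bytes per round, so throughput tracks flow count.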

When the DSCP of SCE-capable traffic is lost before reaching a dual-queue SCE node, this traffic will go to the C queue and receive RFC-3168 signals (CE marks).  The SCE transport will correctly interpret these because the signalling in the ECN field is not ambiguous, and clearly indicates the type of congestion signal actually applied.  Thus it will compete on an equitable basis with other traffic in the C queue, and losing the DSCP does not create a hazard.
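
The reason this is safe can be seen in a sketch of the sender side: the response is chosen by the type of feedback received, not by the DSCP.  The function name, the 0.8 decrease factor, the half-segment step per ESCE and the minimum window are all illustrative assumptions, not values from any SCE draft.

```python
def on_feedback(cwnd, ce_seen, esce_count,
                md_factor=0.8, ad_step=0.5, min_cwnd=2.0):
    """Return the new cwnd (in segments) after one round of ECN feedback.

    ce_seen:    True if the receiver echoed a CE mark (ECE in TCP).
    esce_count: number of ESCE-flagged acks seen in this round.
    """
    if ce_seen:
        # CE always gets a conventional Multiplicative Decrease, so an SCE
        # flow in the C queue backs off like its RFC-3168 neighbours.
        cwnd = cwnd * md_factor
    else:
        # SCE feedback gets a small Additive Decrease per ESCE.
        cwnd = cwnd - ad_step * esce_count
    return max(cwnd, min_cwnd)
```

An SCE flow that has lost its DSCP and sits in the C queue never sees ESCE, so only the first branch fires and it behaves like any other RFC-3168 flow there.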

The DSCP may also be used by other SCE-marking nodes, which normally use FQ or AF, as a hint to distinguish mixed SCE and non-SCE traffic sharing the same 5-tuple.  This may be helpful when mixed traffic passes through a tunnel that intentionally hides the distinction between flows, but still reveals the TOS byte of the inner packet in the header of the outer packet.  Without this information, the SCE traffic would be disadvantaged under this specific combination of circumstances.

It is not mandatory for an SCE-capable sender to emit the SCE DSCP.  SCE receivers do not need to inspect the Diffserv field before interpreting the ECN field, nor do SCE senders need to check for a correct DSCP on acks before interpreting ECN feedback.  A negotiation to ensure both endpoints of the connection understand SCE should occur before asserting SCE capability using a DSCP, so that traffic not capable of responding to SCE signals is not inadvertently sent into the L queue.  Details of this negotiation are not yet finalised.

It is possible to assign more than one DSCP to indicate SCE capability, with different PHBs requested.  For example, one DSCP could indicate SCE capability with best-effort high-throughput service, another might indicate SCE capability with low-latency service, and a third could indicate SCE capability with low-cost service.  This should be by arrangement between the sender and the network, although an assignment of recommended DSCPs may also be desirable.

Many networks will have no distinction between these classes of service, and will be satisfied if SCE merely does its primary job of settling on a cwnd close to the true BDP - a genuine solution to the long-standing bufferbloat problem.

Because SCE is fully compatible with existing RFC-3168 compliant infrastructure, it does not require the network to be prepared as a whole in advance of endpoint deployment, but can be deployed incrementally as convenient.  This would be a key advantage of converting existing proposals to deploy L4S into proposals to deploy SCE or SCE+DSCP.

 - Jonathan Morton