Re: [tcpm] Updating Proportional Rate Reduction RFC6937 to PS

Mirja Kuehlewind <mirja.kuehlewind@ericsson.com> Thu, 22 October 2020 11:58 UTC

From: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
To: Michael Welzl <michawe@ifi.uio.no>, Matt Mathis <mattmathis@google.com>
CC: tcpm IETF list <tcpm@ietf.org>
Date: Thu, 22 Oct 2020 11:58:47 +0000
Message-ID: <F77B2ABA-0CCB-4C8E-BAF0-0C099F79F90F@ericsson.com>
References: <CAH56bmDXUrJRdnCRq1mug95B16yUQFp4mN4Hur7q9aau-DAk0Q@mail.gmail.com> <2CE9D0F2-88B6-4736-99C8-1533F625ACAA@ifi.uio.no> <CAH56bmDgkVKZvXWr=dL=LptZob+N_nFUP4AkO52J8EiHKvQgvQ@mail.gmail.com> <44DB6DE9-B150-4C2D-B516-A052D325370A@ifi.uio.no> <CAH56bmDfattW-kLd=PHu7684-rYKNKtqjLdY5waq9ZBU2KJePQ@mail.gmail.com> <365A684B-6AC8-4DF4-9312-A934423DE7BA@ifi.uio.no>
In-Reply-To: <365A684B-6AC8-4DF4-9312-A934423DE7BA@ifi.uio.no>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/xd4uZUMk2O_j3ri7_T86MqBh1Cg>
Subject: Re: [tcpm] Updating Proportional Rate Reduction RFC6937 to PS

Hi Michael, hi Matt,

First, I'd like to note that I support moving RFC6937 to PS, and I'm happy to see that there is an additional heuristic to address "double drops".

Years ago, when we worked on RFC6937 in tcpm, I believe I observed a similar problem after slow start, where slow start will not only fill the queue but overshoot to double the cwnd. So when you "only" halve the cwnd, as PRR correctly does, you end up with a full queue after recovery and you will see another congestion event soon after. Rate halving, however, would end up with only 1/4 of the cwnd after recovery and would correspondingly empty the queue. In the end there was not much of a difference (regarding throughput and number of losses); PRR would only keep the queue full slightly longer. Is this case also addressed by this heuristic?
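
To make the arithmetic concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a measurement: the path holds `bdp` segments in flight plus `queue` segments of buffer, slow start overshoots to exactly twice that capacity, PRR ends recovery at half the pre-loss cwnd, and rate halving ends at one quarter of it. The function name `window_after_recovery` is hypothetical.

```python
def window_after_recovery(bdp, queue, scheme):
    """Return (cwnd, standing_queue) in segments right after recovery.

    Assumptions (illustrative only): slow start overshoots to exactly
    twice the path capacity (bdp + queue); 'prr' ends recovery at half
    of the pre-loss cwnd; 'rate_halving' ends at one quarter of it.
    """
    capacity = bdp + queue                 # what the path can hold
    cwnd_at_loss = 2 * capacity            # slow-start overshoot
    factor = {"prr": 0.5, "rate_halving": 0.25}[scheme]
    cwnd = int(cwnd_at_loss * factor)
    # Anything above the BDP sits in the bottleneck queue (capped by its size).
    standing_queue = min(queue, max(0, cwnd - bdp))
    return cwnd, standing_queue


for scheme in ("prr", "rate_halving"):
    print(scheme, window_after_recovery(bdp=100, queue=100, scheme=scheme))
```

With a buffer of one BDP, the PRR case leaves the queue completely full after recovery, while the rate-halving case leaves it empty, which is the difference described above.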

Mirja



On 20.10.20, 08:11, "tcpm on behalf of Michael Welzl" <tcpm-bounces@ietf.org on behalf of michawe@ifi.uio.no> wrote:

    Hi again,

    You don't understand it because it doesn't make sense; I get it now, and I was just wrong, I'm sorry.
    I've recently been thinking too much about a paced, non-ack-clocked behavior, where things might indeed play out as I described, even with an overshoot of only one segment.

    With ACK clocking, as you said, that can't happen. In fact, I can see it becoming a problem only in one case: when the overshoot is so high that setting back the cwnd won't save you, even with ACK clocking. But I think this translates exactly into "pipe < ssthresh", and this is the case that the heuristics cover - which, as you said, were designed to handle these double drops (*). So it all makes sense to me now, sorry for this excursion into the land of nonsense. Case closed, go home everyone, nothing to see here!  :-)

    Cheers,
    Michael
    --
    (*) In practice, I therefore think that these heuristics can play an important role after the very first slow start.
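
For readers following the thread, the two regimes discussed here ("pipe > ssthresh" versus "pipe < ssthresh") can be sketched as a per-ACK send quota. This is a minimal sketch based on my reading of the pseudocode in RFC 6937, counting in segments rather than bytes; the function name `prr_sndcnt`, the `bound` parameter, and the final clamp to zero are assumptions added for illustration. The new heuristic discussed in this thread, which decides when each reduction bound applies, is not reproduced.

```python
import math

MSS = 1  # assumption: count in whole segments for simplicity


def prr_sndcnt(prr_delivered, prr_out, delivered_data, pipe,
               ssthresh, recover_fs, bound="ssrb"):
    """Segments that may be sent in response to one ACK during recovery.

    prr_delivered: data delivered to the receiver since recovery began
    prr_out:       data sent since recovery began
    delivered_data: data newly delivered by this ACK
    recover_fs:    flight size at the start of recovery (RecoverFS)
    bound:         'crb' (conservative) or 'ssrb' (slow-start) reduction
                   bound; a hypothetical switch for this sketch
    """
    if pipe > ssthresh:
        # Proportional part: send roughly ssthresh/RecoverFS of what
        # has been delivered, so less leaves than has left the network.
        sndcnt = math.ceil(prr_delivered * ssthresh / recover_fs) - prr_out
    else:
        if bound == "crb":
            # Packet conservation: never send more than was delivered.
            limit = prr_delivered - prr_out
        else:
            # Slow-start-like: allow one extra segment per ACK.
            limit = max(prr_delivered - prr_out, delivered_data) + MSS
        # Never grow pipe beyond ssthresh.
        sndcnt = min(ssthresh - pipe, limit)
    return max(sndcnt, 0)  # clamp added here for safety (assumption)


# pipe above ssthresh: proportional reduction
print(prr_sndcnt(3, 1, 1, pipe=18, ssthresh=10, recover_fs=20))
# pipe below ssthresh: the two reduction bounds differ
print(prr_sndcnt(3, 1, 1, pipe=4, ssthresh=10, recover_fs=20, bound="crb"))
print(prr_sndcnt(3, 1, 1, pipe=4, ssthresh=10, recover_fs=20, bound="ssrb"))
```

The second branch is where the more aggressive bound can retransmit into a path that is still dropping, which is why a heuristic for choosing between the two bounds matters in the double-drop case.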



    > On 20 Oct 2020, at 07:25, Matt Mathis <mattmathis@google.com> wrote:
    > 
    > I don't understand the situation you are thinking about.  If the bottleneck BW changes and it causes any loss, you would expect to see a whole RTT of periodic losses between the first detected loss and the ACK from the first retransmission.  If inflight (aka pipe) is larger than ssthresh, PRR will always send less data than has left the network.  This is guaranteed by the conservative self clock and the "proportional" part of the algorithm.  Yes, it drains the queue slowly, and yes, cross traffic can cause additional losses, but that is because the cross traffic is being too aggressive (e.g. the cross traffic has to still be increasing its instantaneous rate for some reason).
    > 
    > Thanks,
    > --MM--
    > The best way to predict the future is to create it.  - Alan Kay
    > 
    > We must not tolerate intolerance;
    >        however our response must be carefully measured: 
    >             too strong would be hypocritical and risks spiraling out of control;
    >             too weak risks being mistaken for tacit approval.
    > 
    > 
    > On Mon, Oct 19, 2020 at 2:47 PM Michael Welzl <michawe@ifi.uio.no> wrote:
    > Hi,
    > 
    > In line:
    > 
    > > On Oct 19, 2020, at 7:20 PM, Matt Mathis <mattmathis@google.com> wrote:
    > > 
    > > The heuristic was designed to address the "double drops" that you noted.    Both RFC 6675 and PRR-SSRB can be too aggressive in situations where there are burst losses that cause the flight size to fall below ssthresh.  Sometimes these are ok, but they often cause lost retransmissions.  The new heuristic is to prevent these situations from persisting.
    > 
    > I get that, but what I’m concerned about is a case where pipe is *not* below ssthresh, so the heuristics don’t even apply. It’s enough to consider the “diagram” on page 7, where it’s clear that RFC 6675 takes a break from segment 4 to 11 but PRR (or rate-halving) doesn’t. This is fine when the new rate is smaller than the bottleneck’s capacity, but if it’s not, the queue should just overflow again, I believe.
    > 
    > (again, these are all more suspicions than anything else - I only have indications, from our own experiments and the paper I pointed to, nothing really clear that shows that this really happens; come to think of it, it actually wouldn’t be hard for me to run a check in our testbed… perhaps that would be better than making a fuss here about something that may turn out to be nothing. Then again, maybe it’s a good conversation to have anyway, just to see what the thinking around this type of concern is.)
    > 
    > 
    > > These situations are hard to reason about because they can not be caused by events that are modeled by simple queues.  The Flach paper is about token bucket policers which cause huge losses when they run out of tokens -- all without significant queues.  Other potential events include bursty cross traffic from long RTT or non-responsive flows.
    > 
    > Ok, but that’s not the case I mean, mine is simpler (above). Even just a single packet loss would suffice for the situation I’m talking about.
    > 
    > 
    > > With packet conservation, a single flow can not cause self inflicted losses that last more than one RTT, and thus can't cause lost retransmissions.
    > 
    > Hmmmmmm….. I doubt this; I think the backoff factor and queue also play a role. The point is that the new cwnd may be too much, just like the old was too much.
    > 
    > 
    > >     If there are lost retransmissions something happened that is outside the scope of a simple network model.
    > > 
    > > While I agree that 6675's half RTT of silence reduces queue pressure, the responsibility for that really belongs to congestion control, which should not be controlling against queue full.
    > 
    > I agree with the thought, but in IETF terms I think that a recovery mechanism in a PS RFC has to be compatible with RFC 5681 too.
    > 
    > 
    > >   The goal of PRR is to minimize the opportunities to lose the self clock by accurately controlling flightsize to the target set by the congestion control.
    > > 
    > > The heuristic also helps in the situation where there was a large step change in the available bandwidth, and the CC can not possibly estimate the correct flightsize quickly enough.
    > 
    > I understand; generally, I don’t doubt that there are many good aspects to PRR, and that it will work better than RFC 6675 in many cases. The half-RTT-of-nothing pattern is just odd.
    > So maybe what I’m worrying about is a corner case. But, as we talk about proceeding to PS, I think it deserves a thought.
    > 
    > Cheers,
    > Michael
    > 
