Re: [tcpm] Updating Proportional Rate Reduction RFC6937 to PS

Michael Welzl <michawe@ifi.uio.no> Sun, 18 October 2020 21:07 UTC

From: Michael Welzl <michawe@ifi.uio.no>
Message-Id: <2CE9D0F2-88B6-4736-99C8-1533F625ACAA@ifi.uio.no>
Date: Sun, 18 Oct 2020 23:07:05 +0200
In-Reply-To: <CAH56bmDXUrJRdnCRq1mug95B16yUQFp4mN4Hur7q9aau-DAk0Q@mail.gmail.com>
Cc: tcpm IETF list <tcpm@ietf.org>
To: Matt Mathis <mattmathis=40google.com@dmarc.ietf.org>
References: <CAH56bmDXUrJRdnCRq1mug95B16yUQFp4mN4Hur7q9aau-DAk0Q@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/NU2Hm86s3r5DDd5OsaLmWOTRXPQ>
Subject: Re: [tcpm] Updating Proportional Rate Reduction RFC6937 to PS

Hi,

Are there any known bad experiences with PRR at all?

I like RFC 6937: I appreciate that it's trying to do the right thing, in the right way (and, as a side note, I for one enjoyed its “tutorial” tone) … but I do think that there may be situations where the RFC 6675-style reduction has an advantage, as it allows the queue to drain for half an RTT before sending again.
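
To make the difference concrete, here is a rough sketch of how many packets each approach releases per incoming ACK during recovery. It follows the per-ACK pseudocode of RFC 6937 (with the PRR-SSRB bound) and a much simplified RFC 6675 pipe rule. The assumptions are mine, not from either RFC: units are whole packets (MSS = 1), exactly one packet is lost, every ACK reports one delivered packet, and the fast retransmission itself is ignored. The function name `sends_per_ack` is made up for illustration.

```python
def sends_per_ack(cwnd_at_loss, n_acks):
    """Return (rfc6675_sends, prr_sends): packets released on each ACK.

    Simplified model: packets as units, one loss, one packet delivered
    per ACK, and the retransmission itself is not counted.
    """
    ssthresh = cwnd_at_loss // 2
    recover_fs = cwnd_at_loss          # flight size when recovery starts

    # RFC 6675 style: send only while pipe is below the reduced cwnd.
    pipe = cwnd_at_loss - 1            # the lost packet is not in the pipe
    rfc6675 = []
    for _ in range(n_acks):
        pipe -= 1                      # one packet left the network
        n = max(0, ssthresh - pipe)
        pipe += n
        rfc6675.append(n)

    # PRR: proportional part while pipe > ssthresh, then PRR-SSRB.
    pipe = cwnd_at_loss - 1
    prr_delivered = 0
    prr_out = 0
    prr = []
    for _ in range(n_acks):
        delivered_data = 1
        pipe -= delivered_data
        prr_delivered += delivered_data
        if pipe > ssthresh:
            # CEIL(prr_delivered * ssthresh / RecoverFS) - prr_out
            target = (prr_delivered * ssthresh + recover_fs - 1) // recover_fs
            n = target - prr_out
        else:
            limit = max(prr_delivered - prr_out, delivered_data) + 1
            n = min(ssthresh - pipe, limit)
        n = max(0, n)
        pipe += n
        prr_out += n
        prr.append(n)

    return rfc6675, prr


if __name__ == "__main__":
    rfc6675, prr = sends_per_ack(cwnd_at_loss=20, n_acks=19)
    print("RFC 6675:", rfc6675)
    print("PRR:     ", prr)
```

In this toy model both variants release the same total over one round trip, but the RFC 6675-style sender stays silent for roughly the first half of the ACKs, while PRR sends on about every other ACK from the start. That silent period is what I mean by letting the queue drain.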

We once saw consistent double drops upon entering fast retransmit/fast recovery (FR/FR) from congestion avoidance (CA) in local tests with a large queue (above a BDP), and ended up blaming PRR for it: it would detect loss (just a little loss) and almost immediately set the rate to half of the previous rate, which (in the case of an above-BDP bottleneck queue) was still faster than the capacity limit. Then, it seemed, the queue never really got a chance to drain; it filled up again, and there was another loss. Now, one could argue that such long queues are not a case to optimize for anyway, but e.g. Cubic’s backoff could show the same behavior with a shorter queue.
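
As a back-of-the-envelope check of the "above-BDP queue" argument, the following sketch computes how much data is still sitting in the bottleneck queue once the window has settled at its reduced value. The assumptions are mine: a single flow, loss occurring exactly when the window equals BDP plus queue size, and a steady state after the reduction (so it says nothing about the transient during recovery, and it does not show that a second drop must happen). The function name and parameters are hypothetical.

```python
def standing_queue_after_backoff(bdp_pkts, queue_pkts, beta):
    """Packets left in the bottleneck queue after a multiplicative backoff.

    bdp_pkts   -- bandwidth-delay product of the path, in packets
    queue_pkts -- bottleneck buffer size, in packets
    beta       -- factor the window is multiplied by on loss
                  (0.5 for Reno-style halving, 0.7 for Cubic per RFC 8312)
    """
    cwnd_at_loss = bdp_pkts + queue_pkts   # pipe full and buffer full
    cwnd_after = beta * cwnd_at_loss
    # Anything beyond one BDP in flight has to wait in the queue.
    return max(0.0, cwnd_after - bdp_pkts)


if __name__ == "__main__":
    for queue in (50, 100, 150):
        for beta in (0.5, 0.7):
            left = standing_queue_after_backoff(100, queue, beta)
            print(f"BDP=100 queue={queue} beta={beta}: {left:.1f} packets queued")
```

With halving, the queue only empties if the buffer is at most one BDP; with a backoff factor of 0.7, a standing queue already remains once the buffer exceeds roughly 0.43 BDP, which is the "shorter queue" case for Cubic.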

We didn’t investigate this any deeper, and now I can’t say for sure whether this really was PRR’s fault - but shortly after this experience, I stumbled over this:
https://www.ee.technion.ac.il/~isaac/p/sigcomm16_vcc_extended.pdf
(long version of the VCC Sigcomm’16 paper)

… which also finds double drops occurring as a result of PRR (see appendix B). I have to admit that I don’t fully get the discussion around ECN in this appendix; altogether, I’m not really convinced there are hard facts about this being a “problem” at all, but I thought it’d be worth bringing to the group’s attention. Maybe someone has investigated this more deeply and found out whether this is a real issue or not.

Cheers,
Michael


> On Oct 18, 2020, at 3:56 AM, Matt Mathis <mattmathis=40google.com@dmarc.ietf.org> wrote:
> 
> Following a discussion with the tcpm chairs, the authors of RFC6937 plan to introduce a .bis document to update PRR from Experimental to Proposed Standard.   PRR is supported in one form or another in 3 major operating systems and has come to be very widely deployed over the last several years.
> 
> There have been no changes to the base algorithms for PRR-CRB (Conservative Rate Bound) and PRR-SSRB (Slow Start Rate Bound). However, PRR can be substantially improved by using a heuristic to dynamically switch between algorithms, depending on the presence of additional losses. We plan to present a candidate heuristic; however, there have not been any deep studies of alternatives. This approach was first described in section 5.2 of Flach et al., "An Internet-Wide Analysis of Traffic Policing" <https://dl.acm.org/doi/abs/10.1145/2934872.2934873>, and has already been upstream for several years.
> 
> I could use some editorial advice: RFC 6937 is too long and much too "tutorial" in tone. Does it work for RFC 6937.bis to state the algorithms in normative language, but to have non-normative references to RFC 6937 for context and background? Can somebody point me to a pair of existing RFCs that use this approach? How much explanation should remain in RFC 6937.bis?
> 
> We are aiming to have something ready for the next tcpm meeting.
> 
> Thanks,
> --MM--
> The best way to predict the future is to create it.  - Alan Kay
> 
> We must not tolerate intolerance;
>        however our response must be carefully measured: 
>             too strong would be hypocritical and risks spiraling out of control;
>             too weak risks being mistaken for tacit approval.