Re: [tcpm] Updating Proportional Rate Reduction RFC6937 to PS

Hi,

Are there any known bad experiences with PRR at all?

I like RFC 6937: it tries to do the right thing, in the right way (and, as a side note, I for one enjoyed its “tutorial” tone). Still, I think there may be situations where the RFC 6675 style reduction has an advantage, as it allows the queue to drain for half an RTT before sending again.

We once saw consistent double drops upon entering FR/FR from CA in local tests with a large queue (above a BDP), and ended up blaming PRR for it: it would detect loss (only a little loss) and almost immediately set the rate to half of the previous rate, which (with an above-BDP bottleneck queue) was still faster than the capacity limit. As a result, the queue never really got a chance to drain (or so it seemed); it filled up again, and there was another loss. Now, one could argue that such long queues are not a case to optimize for anyway, but e.g. Cubic’s backoff could show the same behavior with a shorter queue.
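To make the difference in sending pattern concrete, here is a rough Python sketch of what I mean. It is not taken from any stack: it counts in whole segments, assumes exactly one lost segment and one SACKed segment per arriving ACK, ignores the initial fast retransmit, and only covers the first round trip of recovery. The function names and the numbers (20 segments in flight, ssthresh of 10) are made up for illustration. The PRR part follows my reading of the per-ACK pseudocode in RFC 6937 (with the SSRB bound); the other part is a simplified "send only when pipe is below cwnd" model of the RFC 6675 style.

```python
def prr_schedule(recover_fs, ssthresh, acks):
    """Segments sent per ACK under PRR (simplified, MSS = 1 segment)."""
    pipe = recover_fs - 1          # assumption: one segment is deemed lost
    prr_delivered = 0
    prr_out = 0
    sends = []
    for _ in range(acks):
        delivered = 1              # assumption: each ACK SACKs one segment
        pipe -= delivered
        prr_delivered += delivered
        if pipe > ssthresh:
            # Proportional part: CEIL(prr_delivered * ssthresh / RecoverFS) - prr_out
            target = -(-(prr_delivered * ssthresh) // recover_fs)
            sndcnt = target - prr_out
        else:
            # Slow Start Reduction Bound (PRR-SSRB)
            limit = max(prr_delivered - prr_out, delivered) + 1
            sndcnt = min(ssthresh - pipe, limit)
        sndcnt = max(sndcnt, 0)
        prr_out += sndcnt
        pipe += sndcnt
        sends.append(sndcnt)
    return sends


def rfc6675_schedule(recover_fs, ssthresh, acks):
    """Segments sent per ACK when cwnd is cut at once and pipe must fall below it."""
    pipe = recover_fs - 1          # same assumption: one segment is deemed lost
    cwnd = ssthresh                # window is reduced immediately
    sends = []
    for _ in range(acks):
        pipe -= 1                  # one segment leaves the network per ACK
        sndcnt = max(cwnd - pipe, 0)
        pipe += sndcnt
        sends.append(sndcnt)
    return sends


# Hypothetical example: 20 segments in flight, window halved to 10,
# and the 19 surviving segments each trigger one ACK.
prr = prr_schedule(20, 10, 19)
old = rfc6675_schedule(20, 10, 19)
print("PRR      :", prr)
print("RFC 6675 :", old)
```

In this toy setup both variants send the same number of segments over the round trip, but the RFC 6675 style model stays silent for roughly the first half of the ACKs, while PRR starts sending on the very first one, at about one segment per two ACKs. That is the "half an RTT of draining" I was referring to; the sketch says nothing about whether this is what actually caused our double drops.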

We didn’t investigate this any deeper, and now I can’t say for sure whether this really was PRR’s fault. But shortly after this experience, I stumbled over this:
https://www.ee.technion.ac.il/~isaac/p/sigcomm16_vcc_extended.pdf
(long version of the VCC Sigcomm’16 paper)

… which also finds double drops occurring as a result of PRR (see appendix B). I have to admit that I don’t fully get the discussion around ECN in this appendix; altogether, I’m not really convinced there are hard facts showing this to be a “problem” at all, but I thought it’d be worth bringing to the group’s attention. Maybe someone has investigated this deeper and found out whether this is a real issue or not.

Cheers,
Michael

> On Oct 18, 2020, at 3:56 AM, Matt Mathis <mattmathis=40google.com@dmarc.ietf.org> wrote:
> 
> Following a discussion with the tcpm chairs, the authors of RFC6937 plan to introduce a .bis document to update PRR from Experimental to Proposed Standard.   PRR is supported in one form or another in 3 major operating systems and has come to be very widely deployed over the last several years.
> 
> There have been no changes to the base algorithms for PRR-CRB (Conservative Rate Bound) and PRR-SSRB (Slow Start Rate Bound). However, PRR can be substantially improved by using a heuristic to dynamically switch between algorithms, depending on the presence of additional losses. We plan to present a candidate heuristic; however, there have not been any deep studies of alternatives. This approach was first described in section 5.2 of Flach et al "An Internet-Wide Analysis of Traffic Policing" <https://dl.acm.org/doi/abs/10.1145/2934872.2934873> and has already been upstream for several years.
> 
> I could use some editorial advice: RFC 6937 is too long and much too "tutorial" in tone. Does it work for RFC 6937.bis to state the algorithms in normative language, but to have non-normative references to RFC 6937 for context and background? Can somebody point me to a pair of existing RFCs that use this approach? How much explanation should remain in RFC 6937.bis?
> 
> We are aiming to have something ready for the next tcpm meeting.
> 
> Thanks,
> --MM--
> The best way to predict the future is to create it.  - Alan Kay
> 
> We must not tolerate intolerance;
>        however our response must be carefully measured: 
>             too strong would be hypocritical and risks spiraling out of control;
>             too weak risks being mistaken for tacit approval.