Re: [tcpm] [tsvwg] draft-han-tsvwg-cc

Toerless Eckert <tte@cs.fau.de> Fri, 16 March 2018 16:23 UTC

Return-Path: <eckert@i4.informatik.uni-erlangen.de>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 62D2E12D868 for <tcpm@ietfa.amsl.com>; Fri, 16 Mar 2018 09:23:49 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 1.039
X-Spam-Level: *
X-Spam-Status: No, score=1.039 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, GB_SUMOF=5, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, T_RP_MATCHES_RCVD=-0.01] autolearn=no autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zOkY3leOn3-g for <tcpm@ietfa.amsl.com>; Fri, 16 Mar 2018 09:23:44 -0700 (PDT)
Received: from faui40.informatik.uni-erlangen.de (faui40.informatik.uni-erlangen.de [131.188.34.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 61092126C83 for <tcpm@ietf.org>; Fri, 16 Mar 2018 09:23:44 -0700 (PDT)
Received: from faui40p.informatik.uni-erlangen.de (faui40p.informatik.uni-erlangen.de [IPv6:2001:638:a000:4134::ffff:77]) by faui40.informatik.uni-erlangen.de (Postfix) with ESMTP id 499EA58C4E5; Fri, 16 Mar 2018 17:23:39 +0100 (CET)
Received: by faui40p.informatik.uni-erlangen.de (Postfix, from userid 10463) id 2C24AB0DD48; Fri, 16 Mar 2018 17:23:39 +0100 (CET)
Date: Fri, 16 Mar 2018 17:23:39 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: "Scharf, Michael (Nokia - DE/Stuttgart)" <michael.scharf@nokia.com>
Cc: Thomas Nadeau <tnadeau@lucidvision.com>, "tcpm@ietf.org" <tcpm@ietf.org>, Yingzhen Qu <yingzhen.qu@huawei.com>
Message-ID: <20180316162338.GD18047@faui40p.informatik.uni-erlangen.de>
References: <20180316040208.GA9492@faui40p.informatik.uni-erlangen.de> <VI1PR0701MB255800F95712D680DECC977C93D70@VI1PR0701MB2558.eurprd07.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <VI1PR0701MB255800F95712D680DECC977C93D70@VI1PR0701MB2558.eurprd07.prod.outlook.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/8mJyWRT4XmaUpU-16eCjEhX5ng0>
Subject: Re: [tcpm] [tsvwg] draft-han-tsvwg-cc
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tcpm/>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 16 Mar 2018 16:23:49 -0000

On Fri, Mar 16, 2018 at 01:44:48PM +0000, Scharf, Michael (Nokia - DE/Stuttgart) wrote:
> Your points seem not to be about my questions on why draft-han-tsvwg-cc-00 would be 
> a) better as compared to running today's TCP congestion control over the same scenario,
> b) and how the congestion control would be ensured to work properly.

I did say I was not answering all questions but tried to take it
from the top, e.g.: how this fits into the existing IETF
model for PHBs/DSCPs and bandwidth management/admission control.

a) I had previously sent an example use case through which I tried
to explain how we think this improves over today's CC in TCP, but I
think that was not Cc'ed to TCPM, only sent once the discussion started
on TSVWG, so let me repeat it here:

Many applications nowadays are elastic but need a guaranteed minimum amount
of bandwidth. In today's world of a solely "fully elastic from 0 on" congestion
model, the resulting share under congestion is wide and not controllable.
This share depends solely on how "aggressive" the CC is: between the least
aggressive delay-variation approaches and the most aggressive rate-estimation
approaches like BBR, we may see a 10x difference in share. Instead of calling
such ranges fair and stopping there, I think it is more appropriate to figure
out how one can move to a fairness model that is more about the resulting
experience than about the absolute bitrates alone:

What I think is the easiest incremental step we can take is to combine the
CC-based bandwidth share with a minimum bandwidth ensured by admission
control/bandwidth management. As said before, this is not a new concept; it
was proposed in well-received drafts even 10 years ago, but without trying to
apply it to TCP proper. Yet even today, the majority of applications, even in
applicable scenarios such as video streaming, heavily rely on TCP.

Consider simple scenarios such as TV sets/STBs in the home. You have the
$4000 huge living-room TV set where you always want to ensure 4k. And you
have a bunch of other, cheaper TV sets, and you can't get 4k on all of them
at the same time due to the DSL speed limit. So you use an appropriate means
of admission control to figure out what guaranteed bandwidth will fit
into the DSL link if, in the worst case, all TV sets are active, and come up
with, e.g., a minimum guaranteed CIR of the 720p equivalent for all other TVs
plus the 4k equivalent for the living room. During times when there is
no contention, every TV gets the maximum bandwidth it supports (most will
likely not be 4k anyhow), and during peak hours on the weekend, you have
ensured the desired minimums for them.

This approach generalizes well. You typically have to scale up your
bandwidth to support desirable minimums across peak-hour maximum use anyhow,
but today you have no way to easily apply this to existing TCP-using
applications. This scheme introduces that ability.
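
To illustrate with a small Python sketch (all numbers below - DSL speed,
per-resolution bitrates, headroom - are made up for the example, not taken
from any spec):

    # Hypothetical admission check for the TV example above; all numbers are
    # illustrative. The CIR plan is admitted only if the worst-case sum of
    # CIRs fits into the DSL link and still leaves headroom for CIR=0
    # (best-effort) traffic.

    DSL_DOWNSTREAM_MBPS = 50.0   # assumed DSL downstream capacity
    CIR_4K_MBPS = 15.0           # assumed 4k equivalent for the living-room TV
    CIR_720P_MBPS = 3.0          # assumed 720p equivalent for each other TV
    OTHER_TVS = 4

    def admit(link_mbps, cirs_mbps, headroom_fraction=0.2):
        return sum(cirs_mbps) <= link_mbps * (1.0 - headroom_fraction)

    cirs = [CIR_4K_MBPS] + [CIR_720P_MBPS] * OTHER_TVS
    print(admit(DSL_DOWNSTREAM_MBPS, cirs))  # True: 27 Mbps of CIRs fit a 40 Mbps budget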

b) I am sure there will be a lot more details to look into regarding
competing flows, but basically, once a flow goes down to its CIR it
simply turns into an inelastic flow. To any flows operating at rates
higher than their CIR, this will just look as if the competing bandwidth
were reduced by the aggregate CIRs of the CIR>0 flows.

> > Fundamentally, to comply with the existing IETF QoS architecture, this method
> > should be used in traffic classes with DS code points intended to be used
> > with admission control/bandwidth management, such as AF or EF, or
> > (I forgot the others, but we'll look them up). Definitely not BE.
> 
> My reading of the document is that it assumes that CIR and PIR applies to a single TCP connection and CIR is guaranteed *for each connection*. So the admission control and bandwidth management must apply for each TCP connection using this congestion control scheme, not for traffic aggregates.

a) CIR:

Yes. As a starting point, the CIR is guaranteed: this is exactly how
it will look to the TCP layer operating with the CIR > 0 modification.
Nevertheless, this does not mean that the overall bandwidth management needs
to be constrained to just that. There are quite useful, more flexible use
cases in which this CC scheme can be applied.

For one, the admission control / bandwidth management scheme does
not need to actually keep track of every active allocation. It
just needs to make sure that the sum of the CIRs of all flows that
are given one will fit into the available bandwidth - and still leave
headroom for any CIR=0 flows to reach an "experience fair"
bitrate during peak hour.

The notion of knowing all flows by name upfront may look silly
from the average core Internet perspective, but scenarios like
a home network or DetNet can very often live with such simplified
admission control.

Of course, when you know that you cannot support the peak-hour
sum of all possible CIRs, you need actual signaled
admission control. We prefer in-band, but of course any option
(bandwidth broker == off-path PCEC, RSVP, ...) is fair game.

Another vector of flexibility is traffic groups. For example,
in a video conference with N participants, you typically only have
as many flows active as there are screens, e.g., 3 out of N
(e.g., flows going to an MCU). So here you could have
a bandwidth scheme that allocates to the application the bandwidth
for 3 streams (e.g., 3 Mbps), but it does not matter which 3 out
of the N streams are actively sending. Each participant's TCP
stream gets a CIR of 1 Mbps, and the application control ensures
that only 3 of the N flows are actively sending at the same time
(typically the last three who spoke).
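
As a small Python sketch of the group idea (again with made-up numbers, and
a hypothetical select_active_senders() helper on the application side):

    # Hypothetical group allocation for the conferencing example: the network
    # admits one 3 Mbps budget for the group, each participant's TCP stream
    # uses a CIR of 1 Mbps, and the application only lets budget/per-flow-CIR
    # senders (the most recent speakers) be active at the same time.

    GROUP_BUDGET_MBPS = 3.0
    PER_FLOW_CIR_MBPS = 1.0
    MAX_ACTIVE = int(GROUP_BUDGET_MBPS // PER_FLOW_CIR_MBPS)   # 3 out of N

    def select_active_senders(participants_by_recent_speech):
        # Application-level control: only the last MAX_ACTIVE speakers may send.
        return participants_by_recent_speech[:MAX_ACTIVE]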

I don't want to bring any of this admission control complexity
into the poor TCP layer; I am just trying to give some examples
of the flexibility that is feasible in admission / bandwidth
management schemes on top of it.

b) PIR: I think for the purpose of TCP behavior the sender is only
self-constrained to that PIR. As far as the CC draft is
concerned, I think there should be no expectation
of guarantees for bitrates between CIR and PIR other
than the normal CC experience.

Think of an app simply indicating CIR=0, PIR=X. It has
full elasticity (e.g., no natural maximum limit, such as a file transfer)
and uses this simply to be paced by TCP.

Having said this, there are nice ways in which QoS/bandwidth
management mechanisms could provide further guarantees. For example,
you could manage bandwidth by guaranteeing the CIR and
providing per-flow weighted fair queues for the CIR..PIR range,
where the weight is just (PIR-CIR). This would result in
quite controlled relative upspeeding between CIR and PIR for those
flows, so a flow whose fair quality experience should be N times
that of another flow would just need an N times larger
(PIR-CIR) range.
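
A minimal Python sketch of that weighting idea (function name and numbers
are mine; a real scheduler would also redistribute any share capped at a
flow's PIR):

    # Hypothetical per-flow WFQ for the CIR..PIR range: each flow's weight is
    # (PIR - CIR), so the capacity left after serving all CIRs is shared in
    # proportion to how much upspeeding each flow asked for (one-pass
    # approximation, no redistribution of capped surplus).

    def cir_pir_shares(flows, link_mbps):
        """flows: list of (cir_mbps, pir_mbps); returns resulting rate per flow."""
        leftover = link_mbps - sum(cir for cir, _ in flows)
        total_weight = sum(pir - cir for cir, pir in flows) or 1.0
        return [min(pir, cir + leftover * (pir - cir) / total_weight)
                for cir, pir in flows]

    # Flow B has a 4x larger (PIR - CIR) range than flow A and therefore gets
    # roughly 4x the upspeeding above its CIR: [3.2, 6.8] on a 10 Mbps link.
    print(cir_pir_shares([(2.0, 4.0), (2.0, 10.0)], link_mbps=10.0))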

[ Note that per-flow weighted fair queues were used in products
  as late as the end of the '90s, e.g., in Cisco IOS with RSVP;
  it's just that the onset and prevalence of cheap silicon...
  and now silicon is becoming intelligent enough to revive
  these solution scenarios. But as said before, I would like to
  keep the CC spec as independent of these options as possible. ]

> This specifically means that TCP connections will have to be rejected by the network if they intend to use this congestion control mechanism but the CIR cannot be guaranteed, or the network would have to ensure that the congestion control falls back to TCP Reno in that case. This has to be specified; otherwise the congestion control is not safe.

Right. See above for how varied this can be, though. I need to figure
out how to keep this more pithy in the next text revision than in my
probably too extensive examples here.

I think the text should say that a CIR can only be indicated
by the app to TCP if reasonable admission control / bandwidth management
for the class used has succeeded. And TCP itself MUST revert
to CIR=0 behavior or stop if an OAM or circuit breaker
mechanism indicates that the CIR is not actually available.
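
In rough Python pseudocode, the rule I am suggesting would look like this
(a sketch of the suggestion above, not text from the draft):

    # Sketch of the suggested safety rule: only honor a requested CIR after
    # admission control for the class succeeded, and revert to CIR=0 (or stop)
    # as soon as OAM or the circuit breaker indicates the CIR is not available.

    def effective_cir(requested_cir, admission_granted, cir_available_per_oam):
        if not admission_granted:
            return 0.0   # never assume a CIR without admission control
        if not cir_available_per_oam:
            return 0.0   # OAM / circuit breaker says the CIR is gone
        return requested_cir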

> > I think we definitely should define a circuit breaker based on TCP behavior,
> > based on the principles of RFC8084.
> 
> The TCP congestion control already takes care of this, no?

As discussed, when the rate goes below the CIR, the flow becomes
inelastic, and then we need to figure out how to distinguish random loss,
which we just want to correct, from congestion loss, which we would
have to interpret as a circuit breaker condition.

> > Something like:
> > - No TCP replies for more than N secs, break circuit
> > - Only loss indication for more than M secs, fall back
> >   to CIR=0 behavior or break circuit (app based).
> 
> This is basically the TCP RTO.

I think that's just one case.
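
For concreteness, the two rules from my earlier mail (quoted above) would
roughly look like this in Python; N and M are deliberately left as
deployment parameters, and the function name is mine:

    # Sketch of the two detection rules (my illustration, not draft text).

    def circuit_breaker_action(secs_without_acks, secs_with_only_loss,
                               n_secs, m_secs, app_prefers_cir0_fallback):
        if secs_without_acks > n_secs:
            return "break_circuit"           # no TCP replies at all for too long
        if secs_with_only_loss > m_secs:
            # only loss indications for too long: the app chooses between
            # falling back to CIR=0 behavior and breaking the circuit
            return ("fall_back_to_cir0" if app_prefers_cir0_fallback
                    else "break_circuit")
        return "keep_going"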

> > Maybe also based on other TCP-derived error recognition.
> > Suggestions welcome. RFC8084 is so new, I have not seen good practices
> > that could be applied to TCP. And I guess it is unlikely others have thought
> > about this, given how there was no minimum-CIR inelastic behavior for TCP.
> 
> If the network guarantees a minimum CIR to a single TCP connection, the TCP congestion control will fully use this CIR as long as there is no congestion, in particular if one uses a modern TCP stack with high-speed congestion control, efficient error recovery (SACK), and possibly even ECN (to avoid packet drops if PIR is reached, which can make sense for latency-sensitive apps). So an inelastic application that needs at minimum CIR should work perfectly with existing TCP over that path with CIR guarantees, right?

I think there are at least two points.

One is the proposed behavior change when experiencing congestion at a rate > CIR:
falling back to just the CIR, not below it.

The other one is based on how the Diffserv class/PHB actually separates flows.
The goal is to have the CIR>0 TCP flow self-guarantee its CIR even when there
is congestion. This is of course only necessary when the PHB does not separate
flows into separate queues; if it does, then the queue should take care of
that.
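
To make the first point more concrete, here is a rough Python sketch of how
I read the intended reaction; the window formula mirrors the
MinBandwidthWND = CIR * RTT / MSS equation quoted further below, and the 0.5
decrease factor is just an illustrative placeholder, not something the draft
fixes:

    # Sketch of the "back off, but not below CIR" reaction discussed above.
    # CIR is taken in bytes/s so that CIR * RTT / MSS is a window in segments;
    # the 0.5 multiplicative decrease is only a placeholder.

    def cwnd_after_congestion(cwnd_segments, cir_bytes_per_s, rtt_s, mss_bytes):
        min_bandwidth_wnd = cir_bytes_per_s * rtt_s / mss_bytes  # CIR as a window
        return max(cwnd_segments * 0.5, min_bandwidth_wnd)       # floor at CIR window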

> And the same TCP congestion control will also work very well if for some reason CIR cannot be guaranteed in the network. That can e.g. occur even in a network that can guarantee CIR and PIR, e.g., if there is network misconfiguration. In that case, the app will need to throttle down their sending rate, but as the network does not deliver CIR, there is no alternative to reduce the rate anyway.

As said above, it's about the modifications that work towards distinguishing
random loss from congestion loss, and about the ability to support lightweight
DSCP/PHB implementations where there will be no network-based defense of the
CIR for flows that have one.

> > When sending with less than CIR, it should be subject to a (binary) parameter
> > whether or not to use ECN as an explicit congestion OAM indication resulting
> > in a faster circuit breaker/revert to CIR=0, or whether to ignore it completely
> > (and use the above loss/other TCP parameter based circuit breaker).
> 
> That seems to be a different algorithm than what is currently in the draft.

Not different, I just explained how the ECN behavior, not yet included
in -00, would look.
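
For clarity, a rough Python sketch of that behavior (my illustration of the
suggestion quoted above, not anything that is in -00):

    # Sketch of the suggested ECN behavior: above CIR, ECN marks are treated
    # as normal congestion signals; below CIR, a binary deployment parameter
    # decides whether a mark acts as a congestion-OAM indication (faster
    # revert to CIR=0 / circuit breaker) or is ignored in favor of the
    # loss-based circuit breaker.

    def on_ecn_mark(current_rate, cir, ecn_as_oam_below_cir):
        if current_rate > cir:
            return "normal_cc_reaction"
        if ecn_as_oam_below_cir:
            return "revert_to_cir0_or_break_circuit"
        return "ignore_and_rely_on_loss_based_breaker"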

> The starting point for a discussion about congestion control is a complete specification

;-) I can mostly remember early drafts in RMCAT, so I am not sure
about accurate metrics for "complete", but we definitely have good
improvements to make for -01.

> and measurement results.

Did you take a look at the 10-year-old draft-lochin-ietf-tsvwg-gtfrc-02
I pointed you to? Would a presentation of analysis as done in that document
meet your bar for measurement results (obviously, one would need to repeat
this type of analysis for the proposed TCP mechanism instead of TFRC)?

> This could possibly also be done at research conferences.

Given how, e.g., Emmanuel's draft did the same 10 years ago for TFRC, and this
work tries to apply the same scheme to TCP proper, I wonder how interesting
this is to research. To me this is a long-overdue, well-understood,
pragmatic, quite constrained extension to TCP (obviously in support of a
larger framework of improving QoS/admission control systems).


> > Parameter really depends on how DSCP is set up:
> > 
> > When you end up in a single queue / same drop priority with flows sending
> > more than CIR, then you will see congestion even though you're sending
> > below your admission approved CIR, so you need to ignore ECN. This is not
> > an ideal case, but would happen if you only use external,
> > eg: SDN/PCEP admission control schemes without e.g.:
> > our in-band signaling (mileage with RSVP would vary based on how it controls
> > the forwarding plane).
> 
> I cannot follow why it would be appropriate to ignore ECN as in that case a router is clearly congested. 

As said, this is primarily an issue when having to defend the CIR against
competing flows in the same queue, so it is not an issue in the more desirable
deployment options such as with in-band signaling. I am therefore happy to
ultimately arrive at conclusions where we may not want to call that option a
standards option; it would just reduce the possible deployment options.

I am actually not sure if you would have to ignore ECN to defend the CIR in
the same queue when the other competing flows are well behaved and also support
ECN. It seems obvious to me that you will always lose out when you do
support ECN but competing flows do not. As long as you are in the (CIR...PIR)
range, that's basically the price you pay for playing nice against a bully.
It's just that when you reach down to the CIR, you shouldn't want to be nice
in the same way anymore - you've been competing with a bully all the way down
from PIR to CIR and losing to it, and now you want to be assertive ;-)

> As far as I recall as "TCP wizard", the acronyms "SDN" and "PCEP" have not been mentioned in TCPM so far, albeit PCEP is a TCP-based protocol. As these have not come up in TCPM so far, I would assume that the use of "SDN" and "PCEP" technology could also work with the existing TCP standards. What do I miss?

I am probably using these terms too naively as well; the correct
terminology that I think has been used in DiffServ QoS docs is "bandwidth
broker" to indicate off-path bandwidth management. I am using SDN somewhat
cynically for everything off-path. And on second thought, let me withdraw the
term PCEC (PCEP was a typo) and only bring it back when I am more certain
about its applicability as one option.
>
> > I would definitely default ECN=on to ensure some integration with external
> > bandwidth management explicitly turns it off when its known to break the
> > scheme.
> > 
> > To re-summarize:
> > 
> > - Only use on bandwidth management compatible traffic classes
> 
> No, as far as I understand, draft-han-tsvwg-cc requires bandwidth management per TCP connection

"per-flow bandwidth management compatible traffic classes"

> > - Constrain to controlled network otherwise
> 
> No to "otherwise"

But that was the point I was making about the standards RFCs for traffic
classes such as AF. We can go back and check, but as far as I remember none
of these have been defined to be constrained to controlled networks. They are
all applicable to the Internet; they have just almost nowhere been deployed
there. But it is definitely an architecturally saner approach to constrain
applicability to well-behaved and well-defined DSCP/PHB behavior than to the
somewhat vague "controlled network" term.

> > - no redefinition of intended use of traffic classes
> >   (i think, need to revisit some AF details)
> 
> No, off the top of my head per-TCP connection admission control would require changes in AF (but I don't recall for sure)

I've been using AF with RSVP forever, admittedly only with RTP, but I don't
think that the protocol should matter, just the behavior of the protocol. And
not being able to use TCP easily in those classes is one of the drivers here.

> > - circuit breaker making it safe against
> >   ERRONEOUS use across non-controlled/Internet/BE.
> 
> No, a circuit breaker is not needed in TCP as long as applications are elastic. For totally inelastic traffic, the question would first be why to use TCP at all, and I don't see this answered in the drafts.

I did give examples... More and more even video apps rely on TCP (for better or worse).

> > - ECN > CIR, ECN as parameter < CIR
> 
> No, at first sight that does not make sense but I would need to see a full spec, measurement, etc. 
> 
> Thanks
> 
> Michael

Thanks
    Toerless
> 
> 
> > Of course, lots of still unanswered points from your side, but hope this is a
> > good start.
> > 
> > Cheers
> >     toerless
> > 
> > P.S: You said no hat on, but i still see the pointy TCP wizard hat.
> > 
> > Authors, all,
> > 
> > I have read draft-han-tsvwg-cc-00. Below I have listed a number of
> > questions, which I believe would have to be addressed when discussing such
> > a mechanism in the IETF or IRTF.
> > 
> > This e-mail is strictly limited to the content of draft-han-tsvwg-cc-00. As the
> > draft does not specify how the CIR and PIR will actually be guaranteed in the
> > Internet, as well as how OAM signaling will work at Internet scale, I will not
> > comment here on these assumptions, except regarding requirements that
> > strictly follow from the content of the I-D. The technical, economical, and
> > regulation aspects of the assumptions are not in scope of TCPM and they
> > need to be discussed and solved elsewhere.
> > 
> > Questions on draft-han-tsvwg-cc-00:
> > 
> > 1/ The document seems to implicitly assume that network resources are
> > reserved for *every* single TCP connection, right?
> > 
> > 
> >   *   If that assumption is correct, it has to be spelt out explicitly in the text
> > and it has to be noted that the underlying technology has to provide these
> > capabilities *for every single* TCP connection.
> >   *   Otherwise sentences like "after a TCP session is successfully  initiated its
> > congestion window (cwnd) jumps to CIR" would not make sense as multiple
> > TCP connections within a traffic aggregate policed by CIR/PIR could start to
> > all send with CIR in parallel, which would trigger massive congestion.
> >   *   As an example, in my reading draft-han-6man-in-band-signaling-for-
> > transport-qos-00 would also allow reservations, e.g., for aggregates of
> > multiple TCP connections. Such an operation mode seems not to be compatible
> > with the suggested mechanism in this I-D, as far as I understand. So the
> > requirements have to be made explicit.
> >   *   Also, sentences such as "it is assumed that in bandwidth guaranteed
> > networks there have been network resources (bandwidths, queues etc.)
> > dedicated to the TCP flows" have to be corrected to specify that for the
> > mechanism in this draft to work correctly, the resources have to be
> > guaranteed to every single TCP connection, not multiple "flows".
> > 
> > 2/  Why does the document not rely on ECN (and not even reference ECN)?
> > 
> > 
> >   *   For instance, the following requirement "It is important that OAM needs
> > to be able to detect if any device's  buffer depth has exceeded the pre-
> > configured threshold, as this is an indication of potential congestion and
> > packet drop" could possibly be solved by ECN, no?
> >   *   Even in case another OAM mechanism could be used in addition, a
> > comprehensive TCP congestion control specification would also have to cover
> > the reaction to ECN marks, as well as the potential combination of
> > feedback results. Why is this missing?
> >   *   Or would the document mandate that ECN MUST NOT be enabled for
> > TCP connections using this congestion control mechanism?
> > 
> > 3/ Why does the document assume that congestion windows are calculated
> > in segments and not in bytes?
> > 
> > 
> >   *   RFC 5681 as well as many other RFCs calculate CWND in bytes.
> >   *   However, I believe equations such "MinBandwidthWND = CIR *
> > RTT/MSS" or "MaxBandwidthWND = PIR * RTT/MSS" would return a window
> > counted in MSS segments.
> >   *   Apart from the mismatch with the TCP standards, this sort of equation
> > might also require a discussion on how to deal with integer division.
> > 
> > 4/ How does the mechanism deal with IP and TCP header overhead?
> > 
> > 
> >   *   TCP window sizes are about the TCP bytestream, while the actual IP
> > packets sent by a TCP/IP stack will include an IP and TCP header. If one
> > neglects the IP and TCP headers in the congestion window calculation, the
> > resulting IP packet rate will be larger than the CIR and PIR seen in the TCP
> > layer. This could result in packet drops if CIR and PIR are enforced e.g. for IP
> > packet length.
> >   *   How will this problem be solved? Note that TCP (and also IP) can include
> > header options, which results in variable header sizes. The number of TCP
> > options can be different for each TCP segment. How does this congestion
> > control mechanism correctly handle the headers and the options in IP and
> > TCP headers?
> > 
> > 5/ How does the document deal with RTT variations? Is the assumption that
> > the RTT is constant?
> > 
> > 
> >   *   As far as I can tell from experiments, the RTT estimation is important
> > when applying a rate to window-based congestion control, which is what this
> > document does.
> >   *   Equations such "MinBandwidthWND = CIR * RTT/MSS" or
> > "MaxBandwidthWND = PIR * RTT/MSS" only provide a window equivalent to
> > the bandwidth-delay product of the path if the RTT sample is a correct
> > prediction of the actual delay that the segments in flight will experience. How
> > does the mechanism suggested in this document correctly predict the future
> > RTT of the segments that are sent by the sender at a given point in time?
> >   *   As an example, assume that the RTT at time t=10s is determined as 80ms.
> > Assume PIR = 10 Mbps and neglect the questions 3/ and 4/. Then this
> > document would probably assume that MaxBandwidthWND=100 kB is the
> > bandwidth delay product of the path during t=10s and t=10.08s, i.e., the
> > maximum amount of outstanding data that can be sent in that time without
> > drops (or exceeding PIR). But assume that the actual round-trip delay of
> > segments has dropped to 40ms after the last RTT measurement, which means
> > that the maximum bandwidth delay product of the path at t=10s+epsilon is
> > only 50 kB. As a result, 50 kB out of the congestion window would likely be
> > dropped during t=10s and t=10.08s. And, due to the wrong RTT value, the
> > effective data rate of the sender could even be 20 Mbps, if the RTT mismatch
> > is not detected immediately, or, e.g., if EWMA will delay the update of the
> > weighted RTT parameter to the actual value.
> >   *   So how does the proposed scheme indeed determine a window that
> > meets the statement "This means a TCP sender is never allowed to send data
> > at a rate larger than PIR"  if the RTT is not constant? Does this assume rate
> > pacing in the TCP sender for each TCP connection?
> > 
> > 6/ How is it ensured that OAM alarms will reach the TCP sender in time in all
> > possible "random failure" cases?
> > 
> > 
> >   *   As far as I understand, the following statement "When a sender receives
> > the third duplicated ACK, but no previous OAM congestion alarm has been
> > received, then it is considered that a segment is lost due to random failure
> > not congestion.  In this case the cwnd is not changed." mandates that an
> > OAM alarm is received prior to the third duplicate ACK *in all potential cases*
> > of congestion. If the OAM alarm got lost or delayed, this condition would
> > imply that cwnd is not changed even though a segment has been dropped due to
> > congestion, which would be a violation of fundamental Internet congestion
> > control principles.
> >   *   Please expand on how this document ensures that cwnd will be reduced
> > in all potential cases when a packet gets dropped due to congestion, and
> > what requirements on the OAM alarms propagation follow from that. Of
> > specific interest are effects such as packet drops of packets relevant for the
> > OAM information, reordering of packets, asymmetric routing in forward and
> > reverse direction, use of multiple paths in parallel (ECMP), and the like. If the
> > document makes assumption about the path such as in-order packet delivery
> > or the like, these assumptions need to be spelt out explicitly.
> >   *   I understand that the OAM could be solved in different ways and the
> > solution is independent of this document. But this document has to
> > comprehensively specify all technical requirements that the OAM
> > mechanism has to meet in order to ensure that every single packet drop due
> > to congestion always results in a cwnd reduction. Otherwise the algorithm
> > has to change as it does not safely prevent congestion.
> > 
> > 7/ What is the expected performance benefit? Are there situations in which
> > performance will be worse than standard TCP congestion control?
> > 
> > 
> >   *   The document does not contain any data of potential improvements or
> > deteriorations as compared to the TCP standard congestion control. I assume
> > that such data will be presented to explain why this mechanism is proposed,
> > and what benefits it has.
> >   *   As I have experimented with similar mechanisms quite a bit, I believe
> > there will be cases in which this congestion control mechanism will perform
> > worse than a TCP sender fully compliant to RFC 5681, when using the same
> > network path with CIR and PIR guarantees. I believe this document should
> > analyze these cases and reason why a worse performance than standard TCP
> > congestion control will be acceptable. IMHO this issue will specifically apply in
> > cases when PIR is significantly larger than CIR, and if the RTT is large. As far as
> > I can see, this draft mandates to start the data transfer in congestion
> > avoidance at CIR rate, which means that it can take many RTTs until the
> > sender reaches the PIR. In contrast, RFC 5681 will run slow-start, and RFC
> > 5681 states that the "initial value of ssthresh SHOULD be set arbitrarily high".
> > This means that the TCP sender can reach PIR within few RTTs and thus can
> > send with full PIR speed, while a sender using draft-han-tsvwg-cc-00 will
> > send with a much lower speed CIR+epsilon. For short and medium-sized data
> > transfers, IMHO it can happen that the congestion control according to RFC
> > 5681 will significantly outperform the mechanism suggested in this
> > document, i.e., it will complete data transfers orders of magnitude faster
> > even without any knowledge about CIR and PIR. Have the authors compared
> > this mechanism to the performance of RFC 5681?
> >   *   Also, have the authors compared the performance of this mechanism as
> > compared to a modern TCP stack, which often use RFC 6928 (IW 10) and RFC
> > 8312 (CUBIC)? In what cases has the suggested congestion control a better
> > performance? I ask this because I have performed experiments 10 years ago
> > with congestion control schemes that have some similarity to what is
> > suggested here, and they also used knowledge about the path properties. In
> > those experiments, it turned out to be quite difficult to design an algorithm
> > that uses knowledge about the path (such as CIR/PIR) and that outperforms
> > CUBIC in combination with IW 10, even if such a stack is totally unaware of the
> > path. This has been discussed e.g. in ICCRG
> > https://www.ietf.org/proceedings/73/slides/iccrg-2.pdf. As context, "more-
> > start" in this document is somewhat similar to what is proposed in this I-D
> > (but applied to CUBIC), while the "initial-start" graphs somehow correspond
> > to what was later specified in RFC 6928 (IW 10) and RFC 8312 (CUBIC).
> > 
> > 8/ For what application traffic patterns is this mechanism proposed?
> > 
> > 
> >   *   The document states in Section 3 "... with the development of new
> > applications, such as AR/VR". What properties do such applications have to
> > leverage the mechanism suggested in this document? Is it possible to
> > characterize what the "new" requirements are, and how the suggested
> > algorithm meets these requirements?
> >   *   Is it suggested to apply this congestion control to real-time media traffic
> > over TCP? If so, what would be the benefit of using TCP in general and of the
> > specific mechanism compared to congestion control algorithms for such traffic
> > (e.g., the RMCAT working group)?
> > 
> > This list of questions is not comprehensive, but I'll stop here.
> > 
> > Regarding potential next steps of this document in the TCPM working group,
> > I believe that the applicable TCPM charter wording is: "In addition, TCPM may
> > document alternative TCP congestion control algorithms that are known to
> > be widely deployed, and that are considered safe for large-scale deployment
> > in the Internet." Until these prerequisites are fulfilled in the Internet, in my
> > view the document cannot be adopted in TCPM. Research could be
> > performed in the IRTF, e.g., in ICCRG.
> > 
> > Thanks
> > 
> > Michael (with no hat on)

-- 
---
tte@cs.fau.de