Re: [iccrg] Questions on draft-han-tsvwg-cc-00

Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com> Sun, 18 March 2018 23:52 UTC

To: "Scharf, Michael (Nokia - DE/Stuttgart)" <michael.scharf@nokia.com>
Cc: Lin Han <Lin.Han@huawei.com>, Thomas Nadeau <tnadeau@lucidvision.com>, "tcpm@ietf.org" <tcpm@ietf.org>, Yingzhen Qu <yingzhen.qu@huawei.com>, "tsvwg-chairs@ietf.org" <tsvwg-chairs@ietf.org>, "iccrg@irtf.org" <iccrg@irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/Evaz3OSFlGeFAjUNXPqRxIP6QF0>

Hi, Michael,

Just for my information, I know
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.71.4007&rep=rep1&type=pdf
is dated (it appeared in 1999), but since you folks are having a
conversation that is constantly looking back at least a decade, is that
still a useful reference for someone who is thinking about proposing
changes to TCP? Or is there a more modern description that I could be
looking at?

Thanks for the feedback you've provided thus far.

Spencer

On Sun, Mar 18, 2018 at 11:38 PM, Scharf, Michael (Nokia - DE/Stuttgart) <
michael.scharf@nokia.com> wrote:

> Thanks a lot for confirming that in draft-han-6man-in-band-signaling-for-transport-qos
> “more study and research need to be done” for applying the mechanism e.g.
> to Ethernet, which seems to me a pretty common technology to transport TCP
> connections. So there seem to be pretty fundamental gaps that need
> research. Research can e.g. be published at conferences or in journals.
>
>
>
> Furthermore, it is obvious that a rate policer could be applied in the
> host in addition to the proposed algorithm. But I still believe that the
> proposed algorithm itself will result in data rates larger than PIR in the
> scenarios I have explained (drops of the RTT), i.e., it does not meet the
> objective that “a TCP sender is never allowed to send data at a rate
> larger than PIR”. Depending on how it is implemented, an additional rate
> policer for PIR may drop packets locally inside the host, because the
> congestion control algorithm sends out too many packets. That would be
> very inefficient. If your objective is a congestion control algorithm that
> results in a rate of at most PIR, your algorithm needs to change, IMHO.
>
>
>
> Also, I have said before that I believe TCP Reno can outperform your
> algorithm for the same path with CIR/PIR. If you don’t believe it, I
> suggest looking at the transfer times of medium-sized transfers with CIR
> << PIR over a larger RTT, and comparing the performance of TCP Reno with
> your algorithm. IMHO you will find quite a number of cases in which TCP
> Reno will complete such data transfers significantly faster than the
> algorithm in draft-han-tsvwg-cc-00. I have already explained the root cause
> earlier. This effect should occur even if there is no PIR capping. It means
> that a user could still request the guaranteed resources, but just use the
> existing TCP congestion control with them, and would get better throughput
> and/or response times over the resources that were requested. In that case,
> what benefit would a user have from using draft-han-tsvwg-cc-00?
>
>
>
> Finally, I cannot parse the figure in the e-mail below; this is useless
> without labels and the like. A statement “We will have more comparison test
> results (CC vs new CC) in the future” is somewhat concerning. Before
> proposing an algorithm for IETF standardization, I believe that such a
> proposal must be comprehensively tested in simulations and measurements
> with real TCP stacks to ensure that the algorithm is properly designed,
> works as expected, and indeed has benefits over what already exists. None
> of this is clear for draft-han-tsvwg-cc-00.
>
>
>
> Michael
>
>
>
> *From:* Lin Han [mailto:Lin.Han@huawei.com]
> *Sent:* Sunday, March 18, 2018 12:09 PM
> *To:* Scharf, Michael (Nokia - DE/Stuttgart) <michael.scharf@nokia.com>
> *Cc:* tcpm@ietf.org; tsvwg-chairs@ietf.org; 'iccrg@irtf.org' <
> iccrg@irtf.org>; Thomas Nadeau <tnadeau@lucidvision.com>; Yingzhen Qu <
> yingzhen.qu@huawei.com>
> *Subject:* RE: Questions on draft-han-tsvwg-cc-00
>
>
>
>
>
>
>
> *From:* Scharf, Michael (Nokia - DE/Stuttgart) [mailto:michael.scharf@nokia.com]
> *Sent:* Sunday, March 18, 2018 12:09 AM
> *To:* Lin Han <Lin.Han@huawei.com>
> *Cc:* tcpm@ietf.org; tsvwg-chairs@ietf.org; 'iccrg@irtf.org' <
> iccrg@irtf.org>; Thomas Nadeau <tnadeau@lucidvision.com>; Yingzhen Qu <
> yingzhen.qu@huawei.com>
> *Subject:* RE: Questions on draft-han-tsvwg-cc-00
>
>
>
> Some follow-up remarks inline [ms]
>
>
>
> Michael
>
>
>
> *From:* Lin Han [mailto:Lin.Han@huawei.com]
> *Sent:* Sunday, March 18, 2018 12:40 AM
> *To:* Scharf, Michael (Nokia - DE/Stuttgart) <michael.scharf@nokia.com>
> *Cc:* tcpm@ietf.org; tsvwg-chairs@ietf.org; 'iccrg@irtf.org' <
> iccrg@irtf.org>; Thomas Nadeau <tnadeau@lucidvision.com>; Yingzhen Qu <
> yingzhen.qu@huawei.com>
> *Subject:* FW: Questions on draft-han-tsvwg-cc-00
>
>
>
>  Hi, Scharf
>
>
>
> For some reason, I did not receive this thread; maybe my mail filter has a
> problem.
>
>
>
> See below for more clarification with [LH]
>
>
>
> Regards
>
>
>
> Lin
>
>
>
> *From: *"Scharf, Michael (Nokia - DE/Stuttgart)" <michael.scharf@nokia.com>
> *Date: *Thursday, March 15, 2018 at 4:17 PM
> *To: *"draft-han-tsvwg-cc@ietf.org" <draft-han-tsvwg-cc@ietf.org>
> *Cc: *"tcpm@ietf.org" <tcpm@ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>,
> "iccrg@irtf.org" <iccrg@irtf.org>
> *Subject: *Questions on draft-han-tsvwg-cc-00
> *Resent-From: *<alias-bounces@ietf.org>
> *Resent-To: *<yingzhen.qu@huawei.com>, <tnadeau@lucidvision.com>
> *Resent-Date: *Thursday, March 15, 2018 at 4:17 PM
>
>
>
> Authors, all,
>
>
>
> I have read draft-han-tsvwg-cc-00. Below I have listed a number of
> questions, which I believe would have to be addressed when discussing such
> a mechanism in the IETF or IRTF.
>
>
>
> This e-mail is strictly limited to the content of draft-han-tsvwg-cc-00.
> As the draft does not specify how the CIR and PIR will actually be
> guaranteed in the Internet, as well as how OAM signaling will work at
> Internet scale, I will not comment here on these assumptions, except
> regarding requirements that strictly follow from the content of the I-D.
> The technical, economical, and regulation aspects of the assumptions are
> not in scope of TCPM and they need to be discussed and solved elsewhere.
>
>
>
> Questions on draft-han-tsvwg-cc-00:
>
>
>
> 1/ The document seems to implicitly assume that network resources are
> reserved for **every** single TCP connection, right?
>
>
>
>    - If that assumption is correct, it has to be spelt out explicitly in
>    the text and it has to be noted that the underlying technology has to
>    provide these capabilities **for every single** TCP connection.
>    - Otherwise sentences like “after a TCP session is successfully
>    initiated its congestion window (cwnd) jumps to CIR” would not make
>    sense, as multiple TCP connections within a traffic aggregate policed by
>    CIR/PIR could all start to send at CIR in parallel, which would trigger
>    massive congestion.
>    - As an example, in my reading draft-han-6man-in-band-signaling-for-transport-qos-00
>    would also allow reservations e.g. for aggregates of multiple TCP
>    connections. Such an operation mode does not seem to be compatible with the
>    suggested mechanism in this I-D, as far as I understand. So the
>    requirements have to be made explicit.
>    - Also, sentences such as “it is assumed that in bandwidth guaranteed
>    networks there have been network resources (bandwidths, queues etc.)
>    dedicated to the TCP flows” have to be corrected to specify that for
>    the mechanism in this draft to work correctly, the resources have to be
>    guaranteed to every single TCP connection, not multiple “flows”.
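
To make the aggregate case concrete, here is a minimal sketch. It assumes, purely for illustration, that several connections share one aggregate policed at a single CIR and that each connection opens at the full CIR, as the quoted sentence describes. The function names are hypothetical and do not come from the draft.

```python
def aggregate_start_load(num_connections: int, cir_bps: float) -> float:
    """Offered load if every connection in one policed aggregate starts at CIR."""
    return num_connections * cir_bps


def overload_factor(num_connections: int, cir_bps: float) -> float:
    """Ratio of that offered load to the aggregate's committed rate."""
    return aggregate_start_load(num_connections, cir_bps) / cir_bps


if __name__ == "__main__":
    # Five connections in an aggregate with CIR = 10 Mbps (assumed numbers).
    print(aggregate_start_load(5, 10e6), overload_factor(5, 10e6))
```

Under these assumptions the offered load grows linearly with the number of connections, which is why a per-connection reservation has to be stated explicitly.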
>
> [YQ]: No, it doesn’t assume that network resources are reserved for EVERY
> single TCP connection. It assumes that when a TCP connection uses the
> proposed congestion control, network resources need to be reserved. The TCP
> example used in draft-han-6man-in-band-signaling-for-transport-qos-00 is
> per-flow based, and the congestion control draft also assumes the resources
> are reserved per flow. We will add this clarification in the next version of
> the draft.
>
>
>
> [LH] draft-han-tsvwg-cc-00 is about the CC algorithm for a “NEW TCP”
> set up by draft-han-6man-in-band-signaling-for-transport-qos (which I call
> the “in-band signaling method” below) or by any other method that can provide
> a CIR/PIR QoS service for the “NEW TCP”.
>
> As we said in the in-band signaling draft, the new in-band signaling method
> is not supposed to be used for applications whose throughput requirement
> normal TCP (Reno, CUBIC, etc.) can satisfy. This is because the new method
> is more costly than normal TCP and involves network devices. Of course, the
> SP may charge for it, but this is a business issue.
>
> In the “in-band signaling method”, we do talk about three levels of
> granularity for the signaling, but only “flow level in-band signaling”
> is discussed in detail. So, for this new CC draft, we can say each “NEW
> TCP” session or flow will have a PIR/CIR.
>
> Having said the above, we can summarize as follows:
>
>    1. In a real deployment, we will use the in-band signaling method to
>    set up “NEW TCP” sessions. Each “NEW TCP” session will have an associated
>    PIR/CIR, and the CIR is guaranteed end-to-end for the session. For all
>    these “NEW TCP” sessions, we suggest using the CC method described in
>    draft-han-tsvwg-cc-00, since the old CC does not adapt well to the traffic
>    behavior and network characteristics of “NEW TCP”; i.e., the minimum rate
>    is guaranteed at the network level to pass without any congestion, the
>    given maximum rate PIR cannot be exceeded, etc.
>    2. In the same network, there will certainly be many more regular TCP
>    sessions. Those sessions still use the old CC algorithms.
>    3. The bandwidth resource is shared between “NEW TCP” sessions and old
>    TCP sessions in a simple way:
>
>       1. The total reserved bandwidth for all “NEW TCP” sessions is
>       sum(CIR).
>       2. The reserved bandwidth sum(CIR) can still be used by old TCP if
>       the total real bandwidth usage of “NEW TCP” is less than sum(CIR).
>       3. If the total real bandwidth usage of “NEW TCP” is equal to or less
>       than sum(CIR), “NEW TCP” will always grab the required bandwidth from
>       the unused link bandwidth and/or the bandwidth that old TCP sessions
>       are using.
>       4. If the total real bandwidth usage of “NEW TCP” is greater than
>       sum(CIR), how much “NEW TCP” can use depends on the remaining
>       bandwidth of the link.
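
The sharing rules in the list above can be sketched as follows. This is one possible reading and not the authors' implementation: the function name, the demand parameters, and the choice to serve old TCP before any “NEW TCP” traffic above sum(CIR) are assumptions made only for illustration.

```python
def share_link(link_mbps, cirs_mbps, new_tcp_demand_mbps, old_tcp_demand_mbps):
    """Return (new_tcp_share, old_tcp_share) in Mbps for a single link."""
    reserved = sum(cirs_mbps)
    if reserved > link_mbps:
        raise ValueError("sum(CIR) must not exceed the link capacity")
    # "NEW TCP" always gets its demand up to sum(CIR).
    guaranteed = min(new_tcp_demand_mbps, reserved)
    # Reserved but unused bandwidth remains available to old TCP.
    old_share = min(old_tcp_demand_mbps, link_mbps - guaranteed)
    # Assumed reading of the last rule: demand above sum(CIR) only gets
    # whatever bandwidth is still left on the link.
    leftover = link_mbps - guaranteed - old_share
    excess = min(new_tcp_demand_mbps - guaranteed, leftover)
    return guaranteed + excess, old_share


if __name__ == "__main__":
    # 100 Mbps link, two reservations of 20 and 30 Mbps (assumed numbers).
    print(share_link(100, [20, 30], 40, 100))
    print(share_link(100, [20, 30], 80, 30))
```

In a real network the traffic above sum(CIR) would compete with old TCP rather than strictly yield to it, so the last step is the part of this sketch most open to interpretation.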
>
>
>
> [ms] As far as I can see, draft-han-6man-in-band-signaling-for-transport-qos
> alone is not sufficient to guarantee the assumptions of this algorithm. IP
> packets using this congestion control mechanism could be dropped due to
> congestion at devices that are not routers and do not process the IP or TCP
> header (e.g., a simple unmanaged Ethernet switch). So, the draft must
> explicitly mention that an IP signaling mechanism such as
> draft-han-6man-in-band-signaling-for-transport-qos is absolutely not
> sufficient for enabling this TCP congestion control. Depending on the
> technology used below IP on the different IP hops, further mechanisms must be
> used. It could also make sense to exclude, in a deployment section, those
> sub-IP technologies for which there is currently no known technique to meet
> the requirements of deploying the algorithm. Such a list of technologies
> would probably be useful to network administrators, as they have to ensure
> that the network using this algorithm does not include any link with such a
> technology.
>
>
>
> [LH] What you said is correct: “draft-han-6man-in-band-signaling-for-transport-qos”
> only provides a method for an IP network to support QoS for upper layers such
> as TCP, but it does not help if the IP traffic is tunneled through a non-IP
> network, such as MPLS or Ethernet switches. This topic needs further study
> and is mentioned in Section 5.4, “Heterogeneous network”, of
> “draft-han-6man-in-band-signaling-for-transport-qos”.
>
>
>
>
>
> 2/  Why does the document not rely on ECN (and not even reference ECN)?
>
>
>
>    - For instance, the following requirement “It is important that OAM
>    needs to be able to detect if any device's  buffer depth has exceeded the
>    pre-configured threshold, as this is an indication of potential congestion
>    and packet drop” could possibly be solved by ECN, no?
>    - Even in case another OAM mechanism could be used in addition, a
>    comprehensive TCP congestion control specification would also have to cover
>    the reaction to ECN marks, as well as the potential combination of
>    feedback results. Why is this missing?
>    - Or would the document mandate that ECN MUST NOT be enabled for TCP
>    connections using this congestion control mechanism?
>
> [YQ]: This is a good point. We haven’t thought about combining OAM and ECN
> together. Will need to do some research and figure it out.
>
>
>
> [LH] We just don’t want to interfere with the current ECN usage; i.e., on the
> same device, “NEW TCP” can coexist with all current TCP variants, and the new
> CC does not impact them.
>
>
>
> [ms] The “co-existence” would have to be specified, including the
> reaction to ECN marks.
>
>
>
> [LH] We will specify this; we don’t want to interfere with ECN marking
> and its algorithm.
>
>
>
> 3/ Why does the document assume that congestion windows are calculated in
> segments and not in bytes?
>
>
>
>    - RFC 5681 as well as many other RFCs calculate CWND in bytes.
>    - However, I believe equations such as “MinBandwidthWND = CIR * RTT/MSS”
>    or “MaxBandwidthWND = PIR * RTT/MSS” would return a window counted in
>    MSS segments.
>    - Apart from the mismatch with the TCP standards, this sort of
>    equation might also require a discussion on how to deal with integer
>    division.
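
The integer-division concern can be illustrated with a small sketch. It assumes the rate is given in bit/s, the RTT in milliseconds, and the MSS in bytes; the helper names are hypothetical, and the draft does not say which rounding it intends.

```python
def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded down."""
    return rate_bps * rtt_ms // 8000


def window_in_segments(rate_bps: int, rtt_ms: int, mss_bytes: int,
                       round_up: bool = False) -> int:
    """Window in whole MSS-sized segments, rounded down or up."""
    window = bdp_bytes(rate_bps, rtt_ms)
    if round_up:
        return -(-window // mss_bytes)  # ceiling division on integers
    return window // mss_bytes


if __name__ == "__main__":
    rate, rtt, mss = 10_000_000, 80, 1460  # 10 Mbps, 80 ms, 1460-byte MSS
    print(bdp_bytes(rate, rtt))                               # 100000 bytes
    print(window_in_segments(rate, rtt, mss) * mss)           # 99280 bytes
    print(window_in_segments(rate, rtt, mss, True) * mss)     # 100740 bytes
```

With these numbers, rounding down leaves 720 bytes of the byte-based window unused, while rounding up exceeds it by 740 bytes, so the choice of rounding decides whether the configured rate can be overshot.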
>
> [YQ]: You’re right, in this draft the congestion windows are indeed
> calculated in segments. We’ll add some calculations in bytes as well, or
> describe how to make the conversion.
>
>
>
>
>
> 4/ How does the mechanism deal with IP and TCP header overhead?
>
>
>
>    - TCP window sizes are about the TCP bytestream, while the actual IP
>    packets sent by a TCP/IP stack will include an IP and TCP header. If one
>    neglects the IP and TCP headers in the congestion window calculation, the
>    resulting IP packet rate will be larger than the CIR and PIR seen in the
>    TCP layer. This could result in packet drops if CIR and PIR are enforced
>    e.g. for IP packet length.
>    - How will this problem be solved? Note that TCP (and also IP) can
>    include header options, which results in variable header sizes. The number
>    of TCP options can be different for each TCP segment. How does this
>    congestion control mechanism correctly handle the headers and the options
>    in IP and TCP headers?
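
The size of the header effect can be estimated with a short sketch. It assumes that CIR/PIR are enforced on IP packet length, that every segment carries the same payload and header size, and that link-layer overhead is ignored; the function names are hypothetical.

```python
def wire_rate_bps(payload_rate_bps: float, payload_bytes: int,
                  header_bytes: int) -> float:
    """IP-level rate produced by a given TCP payload rate."""
    return payload_rate_bps * (payload_bytes + header_bytes) / payload_bytes


def payload_budget_bps(wire_limit_bps: float, payload_bytes: int,
                       header_bytes: int) -> float:
    """TCP payload rate that keeps the IP-level rate at the given limit."""
    return wire_limit_bps * payload_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    # 1460-byte payload with 40 bytes of IPv4 + TCP headers, no options.
    print(wire_rate_bps(10e6, 1460, 40))
    # Small 100-byte segments make the overhead far more visible.
    print(wire_rate_bps(10e6, 100, 40))
```

For full-sized segments the IP-level rate is about 2.7% above the payload rate, but for small segments or segments with many options the gap is much larger, which is why a per-segment accounting is needed rather than a fixed correction factor.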
>
> [YQ]: Good point. A more accurate way is to use the size of the packets
> actually transmitted on the wire to calculate how many packets can be
> transmitted using the reserved bandwidth. We’ll add this in the next
> version.
>
>
>
> 3/ How does the document deal with RTT variations? Is the assumption that
> the RTT is constant?
>
>
>
>    - As far as I can tell from experiments, the RTT estimation is
>    important when applying a rate to window-based congestion control, which is
>    what this document does.
>    - Equations such as “MinBandwidthWND = CIR * RTT/MSS” or “MaxBandwidthWND
>    = PIR * RTT/MSS” only provide a window equivalent to the
>    bandwidth-delay product of the path if the RTT sample is a correct
>    prediction of the actual delay that the segments in flight will experience.
>    How does the mechanism suggested in this document correctly predict the
>    future RTT of the segments that are sent by the sender at a given point in
>    time?
>    - As an example, assume that the RTT at time t=10s is determined as
>    80ms. Assume PIR = 10 Mbps and neglect the questions 3/ and 4/. Then this
>    document would probably assume that MaxBandwidthWND=100 kB is the
>    bandwidth-delay product of the path between t=10s and t=10.08s, i.e., the
>    maximum amount of outstanding data that can be sent in that time without
>    drops (or exceeding PIR). But assume that the actual round-trip delay of
>    segments has dropped to 40ms after the last RTT measurement, which means
>    that the maximum bandwidth-delay product of the path at t=10s+epsilon is
>    only 50 kB. As a result, 50 kB out of the congestion window would likely be
>    dropped between t=10s and t=10.08s. And, due to the wrong RTT value, the
>    effective data rate of the sender could even be 20 Mbps, if the RTT
>    mismatch is not detected immediately, or, e.g., if an EWMA delays the
>    update of the weighted RTT parameter to the actual value.
>    - So how does the proposed scheme actually determine a window that
>    meets the statement “This means a TCP sender is never allowed to send
>    data at a rate larger than PIR” if the RTT is not constant? Does this
>    assume rate pacing in the TCP sender for each TCP connection?
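
The arithmetic of the example above can be checked with a few lines. The sketch assumes a purely window-limited sender without pacing, and like the example it ignores header overhead and segment rounding; the function names are hypothetical.

```python
def max_bandwidth_wnd_bytes(pir_bps: int, rtt_ms: int) -> float:
    """Window (bytes) that corresponds to PIR at the measured RTT."""
    return pir_bps * rtt_ms / 8000


def sending_rate_bps(window_bytes: float, actual_rtt_ms: int) -> float:
    """Rate of a window-limited sender when the real RTT is actual_rtt_ms."""
    return window_bytes * 8000 / actual_rtt_ms


if __name__ == "__main__":
    window = max_bandwidth_wnd_bytes(10_000_000, 80)  # PIR 10 Mbps, RTT 80 ms
    print(window)                        # 100000.0 bytes
    print(sending_rate_bps(window, 80))  # 10000000.0 bit/s, equal to PIR
    print(sending_rate_bps(window, 40))  # 20000000.0 bit/s, twice PIR
```

A window sized for 80 ms therefore yields twice the PIR once the real RTT is 40 ms, unless something other than the window (for example pacing) limits the sender.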
>
>  [YQ]: No, RTT is not assumed to be constant in this draft. It is
> calculated using
>
>     RTT = a * old_RTT + (1 - a) * new_RTT,   0 < a < 1      (1)
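
How long such a smoothed RTT lags behind a sudden RTT drop can be sketched as follows. The weight a = 0.875 is an assumption borrowed from the SRTT smoothing in RFC 6298; the thread does not state which value the draft uses, and the function names are hypothetical.

```python
def ewma_rtt(old_rtt_ms: float, sample_ms: float, a: float = 0.875) -> float:
    """One update of equation (1): RTT = a * old RTT + (1 - a) * new RTT."""
    return a * old_rtt_ms + (1 - a) * sample_ms


def overshoot_factors(initial_rtt_ms: float, actual_rtt_ms: float,
                      samples: int, a: float = 0.875) -> list:
    """Ratio of smoothed to actual RTT after each new sample.

    A window sized from the smoothed RTT sends at this multiple of the
    intended rate while the real RTT stays at actual_rtt_ms.
    """
    rtt = initial_rtt_ms
    factors = []
    for _ in range(samples):
        rtt = ewma_rtt(rtt, actual_rtt_ms, a)
        factors.append(rtt / actual_rtt_ms)
    return factors


if __name__ == "__main__":
    # RTT drops from 80 ms to 40 ms, as in the example in this thread.
    for n, factor in enumerate(overshoot_factors(80, 40, 23), start=1):
        print(n, round(factor, 3))
```

With the assumed weight, the first sample after the drop still leaves the window at 1.875 times the intended size, and it takes 23 samples before the overshoot falls below 5%.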
>
>
>
> [ms] And, as explained in my example, the statement “This means a TCP
> sender is never allowed to send data at a rate larger than PIR” is not
> met by the current design of the algorithm. I have provided an actual
> example where the algorithm breaks. Please explain how the algorithm
> ensures that a TCP sender never sends more than PIR over links with
> variable RTT.
>
>
>
> [LH] The purpose of not allowing the rate to exceed the PIR is to regulate
> the traffic at ingress and reduce traffic bursts. This will effectively
> reduce the buffer depth and the worst-case latency.
>
> After per-flow shaping, we mark traffic with a color according to its rate:
> GREEN (rate<CIR), YELLOW (CIR<rate<PIR) and RED (rate>PIR). The
> GREEN traffic will always get scheduled at the egress side after the IP
> forwarding process; the YELLOW traffic will be scheduled if the egress buffer
> depth is less than some pre-configured value (e.g., 70%), and will be dropped
> if the buffer depth exceeds the pre-configured value.
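
The coloring and egress rule described above can be sketched as follows. This is a simplification that compares a measured rate against CIR and PIR rather than using token buckets. The treatment of the exact boundary values, the 70% default, and dropping RED traffic are assumptions; the message does not say what happens to RED traffic at egress.

```python
def color(rate_bps: float, cir_bps: float, pir_bps: float) -> str:
    """Color traffic by rate; boundary handling (<=) is an assumption."""
    if rate_bps <= cir_bps:
        return "GREEN"
    if rate_bps <= pir_bps:
        return "YELLOW"
    return "RED"


def egress_action(traffic_color: str, buffer_fill: float,
                  threshold: float = 0.70) -> str:
    """Return 'schedule' or 'drop' for traffic of the given color."""
    if traffic_color == "GREEN":
        return "schedule"  # always scheduled
    if traffic_color == "YELLOW":
        return "schedule" if buffer_fill < threshold else "drop"
    return "drop"  # assumption: RED handling is not described in the thread


if __name__ == "__main__":
    cir, pir = 10e6, 20e6  # assumed example rates
    for rate in (5e6, 15e6, 25e6):
        c = color(rate, cir, pir)
        print(rate, c, egress_action(c, buffer_fill=0.8))
```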
>
> We could easily do nothing once the rate exceeds CIR, but this would make the
> PIR parameter meaningless. We use PIR and CIR instead of CIR only because
> this is the traditional definition from a QoS point of view. We have already
> simplified the definition and removed some parameters like CB (committed
> burst) and ECB (Exceeding Committed burst), etc.
>
> If we want to have the behavior of Reno, the user can easily set the PIR to
> a very high value or even the link bandwidth; the traffic will then reach
> the equilibrium point caused by congestion before reaching its PIR. If the
> community thinks this is a more reasonable solution (do not cap the rate at
> PIR at ingress), we can easily change the rule in hardware and the CC
> algorithm accordingly.
>
>
>
> 4/ How is it ensured that OAM alarms will reach the TCP sender in time in
> all possible “random failure” cases?
>
>
>
>    - As far as I understand, the following statement “When a sender
>    receives the third duplicated ACK, but no previous OAM congestion alarm has
>    been received, then it is considered that a segment is lost due to random
>    failure not congestion.  In this case the cwnd is not changed.”
>    mandates that an OAM alarm is received prior to the third duplicate ACK **in
>    all potential cases** of congestion. If the OAM alarm got lost or
>    delayed, this condition would imply that cwnd is not changed even though a
>    segment has been dropped due to congestion, which would be a violation of
>    fundamental Internet congestion control principles.
>    - Please expand on how this document ensures that cwnd will be reduced
>    in all potential cases when a packet gets dropped due to congestion, and
>    what requirements on the OAM alarm propagation follow from that. Of
>    specific interest are effects such as drops of packets relevant for
>    the OAM information, reordering of packets, asymmetric routing in forward
>    and reverse direction, use of multiple paths in parallel (ECMP), and the
>    like. If the document makes assumptions about the path such as in-order
>    packet delivery or the like, these assumptions need to be spelt out
>    explicitly.
>    - I understand that the OAM could be solved in different ways and the
>    solution is independent of this document. But this document has to
>    comprehensively specify all technical requirements that the OAM mechanism
>    has to meet in order to ensure that every single packet drop due to
>    congestion always results in a cwnd reduction. Otherwise the algorithm has
>    to change as it does not safely prevent congestion.
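
The rule quoted from the draft, and the failure case described above, can be sketched as follows. The window is counted in segments. Halving the window when an alarm was received is an assumption modeled on Reno-style behavior; the thread does not quote the draft's actual reduction, and the function name is hypothetical.

```python
def cwnd_after_third_dupack(cwnd_segments: int, oam_alarm_received: bool) -> int:
    """New cwnd (in segments) when the third duplicate ACK arrives."""
    if oam_alarm_received:
        # Loss is attributed to congestion: reduce the window.
        # Assumption: halve it, but keep at least 2 segments.
        return max(cwnd_segments // 2, 2)
    # Quoted rule: without a prior alarm the loss counts as random failure
    # and cwnd is left unchanged.
    return cwnd_segments


if __name__ == "__main__":
    # Congestion loss with the alarm delivered in time.
    print(cwnd_after_third_dupack(100, oam_alarm_received=True))
    # Same congestion loss, but the alarm was lost or delayed:
    # the sender keeps its full window.
    print(cwnd_after_third_dupack(100, oam_alarm_received=False))
```

The second call shows the problematic case: the outcome depends only on whether the alarm arrived, not on whether the loss was actually caused by congestion.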
>
>  [YQ]: in case an OAM alarm for congestion is lost, currently cwnd is not
> changed. The chance of packet loss due to congestion might be increased, so
> an OAM alarm should be received soon. But I agree more
> design/considerations should be added here.
>
>
>
> [LH] If the OAM alarm message is lost, we lose the ability to distinguish
> random failure from congestion loss, and we will treat the loss as
> congestion loss. This is no worse than current Reno.
>
>
>
> [ms] This is not what is written in the draft, so the algorithm will need
> to change. As a side note, the IETF has tried for many years to find a way
> to distinguish congestion loss from random failures. If the algorithm
> wants to make that distinction, it has to explain how all congestion packet
> losses will always be detected.
>
>
>
> [LH] We will make this clear in the document.
>
>
>
> 5/ What is the expected performance benefit? Are there situations in which
> performance will be worse than standard TCP congestion control?
>
>
>
>    - The document does not contain any data of potential improvements or
>    deteriorations as compared to the TCP standard congestion control. I assume
>    that such data will be presented to explain why this mechanism is proposed,
>    and what benefits it has.
>    - As I have experimented with similar mechanisms quite a bit, I
>    believe there will be cases in which this congestion control mechanism will
>    perform worse than a TCP sender fully compliant to RFC 5681, when using the
>    same network path with CIR and PIR guarantees. I believe this document
>    should analyze these cases and reason why a worse performance than standard
>    TCP congestion control will be acceptable. IMHO this issue will
>    specifically apply in cases when PIR is significantly larger than CIR, and
>    if the RTT is large. As far as I can see, this draft mandates starting the
>    data transfer in congestion avoidance at CIR rate, which means that it can
>    take many RTTs until the sender reaches the PIR. In contrast, RFC 5681 will
>    run slow-start, and RFC 5681 states that the “initial value of ssthresh
>    SHOULD be set arbitrarily high”. This means that the TCP sender can reach
>    PIR within a few RTTs and thus can send at full PIR speed, while a sender
>    using draft-han-tsvwg-cc-00 will send at a much lower speed of CIR+epsilon.
>    For short and medium-sized data transfers, IMHO it can happen that the
>    congestion control according to RFC 5681 will significantly outperform the
>    mechanism suggested in this document, i.e., it will complete data transfers
>    orders of magnitude faster even without any knowledge about CIR and PIR.
>    Have the authors compared this mechanism to the performance of RFC 5681?
>    - Also, have the authors compared the performance of this mechanism
>    to modern TCP stacks, which often use RFC 6928 (IW 10) and RFC
>    8312 (CUBIC)? In what cases does the suggested congestion control perform
>    better? I ask this because I performed experiments 10 years ago
>    with congestion control schemes that have some similarity to what is
>    suggested here, and they also used knowledge about the path properties. In
>    those experiments, it turned out to be quite difficult to design an
>    algorithm that uses knowledge about the path (such as CIR/PIR) and that
>    outperforms CUBIC in combination with IW 10, even if such a stack is
>    totally unaware of the path. This has been discussed e.g. in ICCRG:
>    https://www.ietf.org/proceedings/73/slides/iccrg-2.pdf
>    As context, “more-start” in those slides is somewhat similar to what is
>    proposed in this I-D (but applied to CUBIC), while the “initial-start”
>    graphs roughly correspond to what was later specified in RFC 6928 (IW 10)
>    and RFC 8312 (CUBIC).
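
The slow-start argument above can be explored with a rough round-based model. It is only a sketch under stated assumptions: no losses, one window sent per RTT, slow start doubles the window each RTT from an assumed initial window of 3 segments, the CIR-start scheme grows by one segment per RTT (congestion avoidance), and both are capped at the PIR window. All names are hypothetical.

```python
def rtts_to_send(total_segments: int, start_wnd: int, max_wnd: int, grow) -> int:
    """Number of RTT rounds needed to send total_segments."""
    sent, wnd, rtts = 0, start_wnd, 0
    while sent < total_segments:
        sent += wnd
        rtts += 1
        wnd = min(grow(wnd), max_wnd)
    return rtts


def slow_start_rtts(total_segments: int, pir_wnd: int, iw: int = 3) -> int:
    """Slow start from an initial window, doubling every RTT."""
    return rtts_to_send(total_segments, min(iw, pir_wnd), pir_wnd,
                        lambda w: 2 * w)


def cir_start_rtts(total_segments: int, cir_wnd: int, pir_wnd: int) -> int:
    """Start at the CIR window and add one segment per RTT."""
    return rtts_to_send(total_segments, cir_wnd, pir_wnd, lambda w: w + 1)


if __name__ == "__main__":
    # Assumed path: CIR 1 Mbps, PIR 100 Mbps, RTT 200 ms, MSS 1460 bytes,
    # which gives windows of 17 and 1712 segments.
    cir_wnd, pir_wnd = 17, 1712
    for size in (17, 500, 5000):
        print(size, slow_start_rtts(size, pir_wnd),
              cir_start_rtts(size, cir_wnd, pir_wnd))
```

In this model a 5000-segment transfer takes 12 RTTs with slow start and 85 RTTs when starting at CIR, whereas a 17-segment transfer finishes in 1 RTT when starting at CIR and needs 3 RTTs with slow start. The comparison therefore depends strongly on transfer size and on the CIR/PIR ratio.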
>
> [YQ]: Totally agree. This is not a one-size-fits-all solution; it has its
> usage limitations. Detailed comparisons and suggestions on when to use this
> algorithm will be added later.
>
>
>
> [LH] That cannot be the case. As I said for question one, the new CC is just
> an improvement over the current TCP Reno when used for “NEW TCP”. It adapts
> TCP Reno to the new behavior of “NEW TCP”; the logic is very
> straightforward.
>
>
>
> [ms] As I have shown in my example, there will be cases in which this
> proposed congestion control is **not** an improvement over TCP Reno. There
> will be cases in which it transfers data significantly slower than TCP Reno,
> and this will affect interactive applications that need e.g. very fast
> response times. An application designer will observe that even TCP Reno can
> give better performance over the same network. For some cases I think one can
> even calculate explicitly how much slower this algorithm is compared to
> TCP Reno, and how much worse application performance will be. The effect
> and its root cause are so simple that one does not even need an
> implementation to calculate how much worse this algorithm is than TCP
> Reno. I believe that application developers would be interested in
> understanding when (and why) this algorithm is slower than a
> standard-compliant TCP over the same network.
>
>
>
> [LH] See the explanation for question 3.
>
> I guess this is the major point of misunderstanding in our document; we
> need to make the document clearer in this regard.
>
> The new CC is not meant to compete with Reno in throughput in every case,
> since it does not need to; it tries to make Reno more adaptive to the new
> traffic behavior enforced by the QoS. The QoS itself provides completely
> different behavior from traditional TCP CC; see below (PIR=CIR; B1 and B2 are
> new TCP flows set up by in-band signaling without changes to the CC; n1 and
> n2 are Reno).
>
> It seems that the in-band signaling method can have “better throughput” than
> traditional TCP, but is that surprising? No, of course not; QoS will be
> better than best-effort! The better throughput is not caused by the new CC,
> it is caused by QoS. Directly comparing the new CC with Reno for throughput
> does not make sense :-) The new CC means nothing without the per-flow QoS.
>
> The picture below also shows that the new TCP always provides the expected
> throughput no matter how much bandwidth is used by other traditional TCP
> flows. We will have more comparison test results (CC vs new CC) in the future.
>
>
>
> [Figure not preserved: plots of Rate (Kbps) versus Time (s)]
>
>
>
>
>
>
>
> 6/ For what application traffic patterns is this mechanism proposed?
>
>
>
>    - The document states in Section 3 “… with the development of new
>    applications, such as AR/VR”. What properties do such applications have to
>    leverage the mechanism suggested in this document? Is it possible to
>    characterize what the “new” requirements are, and how the suggested
>    algorithm meets these requirements?
>    - Is it suggested to apply this congestion control to real-time media
>    traffic over TCP? If so, what would be the benefit of using TCP in general,
>    and of this specific mechanism, compared to congestion control algorithms
>    designed for such traffic (e.g., those of the RMCAT working group)?
>
> [YQ] Same as the answer above: the exact traffic patterns that can benefit
> from this algorithm will be studied in more detail and provided in a later
> version of the draft.
>
>
>
> [ms] I assume that “later version” means “next version”. It is pointless
> to further discuss a congestion control algorithm without details about
> workloads and the algorithm performance, resulting from simulations, actual
> measurements from implementations, etc. Such data can also be published at
> research conferences or in research journals prior to presenting it in the
> IRTF or IETF.
>
>
>
> Michael
>
>
>
>
>
> This list of questions is not comprehensive, but I’ll stop here.
>
>
>
> Regarding potential next steps of this document in the TCPM working group,
> I believe that the applicable TCPM charter wording is: “In addition, TCPM
> may document alternative TCP congestion control algorithms that are known
> to be widely deployed, and that are considered safe for large-scale
> deployment in the Internet.” Until these prerequisites are fulfilled in
> the Internet, in my view the document cannot be adopted in TCPM. Research
> could be performed in the IRTF, e.g., in ICCRG.
>
>
>
> Thanks
>
>
>
> Michael (with no hat on)
>
>
>