Re: [6lo] Mirja Kühlewind's Discuss on draft-ietf-6lo-fragment-recovery-13: (with DISCUSS and COMMENT)

Mirja Kuehlewind <ietf@kuehlewind.net> Thu, 19 March 2020 17:41 UTC

From: Mirja Kuehlewind <ietf@kuehlewind.net>
In-Reply-To: <MN2PR11MB3565C75A4707268047E1F90ED8F40@MN2PR11MB3565.namprd11.prod.outlook.com>
Date: Thu, 19 Mar 2020 18:41:32 +0100
Cc: "6lo-chairs@ietf.org" <6lo-chairs@ietf.org>, The IESG <iesg@ietf.org>, "6lo@ietf.org" <6lo@ietf.org>, "draft-ietf-6lo-fragment-recovery@ietf.org" <draft-ietf-6lo-fragment-recovery@ietf.org>, Carles Gomez <carlesgo@entel.upc.edu>, Martin Duke <martin.h.duke@gmail.com>, Benjamin Kaduk <kaduk@mit.edu>
Message-Id: <9823A19B-CA7D-47D1-92FE-8AA436240BD4@kuehlewind.net>
References: <158212059997.17584.9409485384556514167.idtracker@ietfa.amsl.com> <MN2PR11MB356540107CC4FC7F9CB9F412D8E30@MN2PR11MB3565.namprd11.prod.outlook.com><7D77F80D-9D2C-4AF7-AB7E-4EDF58A9258A@kuehlewind.net> <MN2PR11MB3565C75A4707268047E1F90ED8F40@MN2PR11MB3565.namprd11.prod.outlook.com>
To: "Pascal Thubert (pthubert)" <pthubert=40cisco.com@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/6lo/_ma_04loUQZ8gd0A6-ggb-aE3F0>
Subject: Re: [6lo] Mirja Kühlewind's Discuss on draft-ietf-6lo-fragment-recovery-13: (with DISCUSS and COMMENT)

Hi Pascal,

Inline.


> On 19. Mar 2020, at 15:57, Pascal Thubert (pthubert) <pthubert=40cisco.com@dmarc.ietf.org> wrote:
> 
> Hello Mirja
> 
>> Thanks for your updates. Sorry for my late reply. I unfortunately have some
>> more comments. Please see below.
> 
> More comments => more thanks, you'll have to live with it 😊
> 
> As usual I published for your convenience so you can observe the global effect of your review.
> Please see https://tools.ietf.org/rfcdiff?url2=draft-ietf-6lo-fragment-recovery-18.txt 
> (and as usual also I found a typo there that I fixed, this time the duplication of "reduce the")
> 
> Please see below the discussion:
> 
>>>> ----------------------------------------------------------------------
>>>> DISCUSS:
>>>> ---------------------------------------------------------------------
> 
> 
>>> 4.3.  Flow Control
>>> 
>>>  The inter-frame gap is the only protection that [FRAG-FWD] imposes by
>>>  default.  This document enables to group fragments in windows and
>>>  request intermediate acknowledgements so the number of in-flight
>>>  fragments can be bounded.  This document also adds an ECN mechanism
>>>  that can be used to adapt the size of the window, the size of the
>>>  fragments, and/or the inter-frame gap to protect the network.
>>> 
>>>  This specification enables the source endpoint to apply a flow
>>>  control mechanism to tune those parameters, but the mechanism itself
>>>  is out of scope.  In most cases, the expectation is that most
>>>  datagrams will represent only a few fragments, and that only the last
>>>  fragment will be acknowledged.  A basic implementation of the source
>>>  endpoint is NOT REQUIRED to variate the size of the window, the
>>>  duration of the inter-frame gap or the size of a fragment in the
>>>  middle of the transmission of a datagram, and it MAY ignore the ECN
>>>  signal or simply reset the window to 1 (see Appendix C for more) till
>>>  the end of this datagram upon detecting a congestion.
>>> 
>>>  The size of the fragments is typically computed from the Link MTU to
>>>  maximize the size of the resulting frames.  The size of the window
>>>  and the duration of the inter-frame gap SHOULD be configurable, to
>>>  roughly adapt the size of the window to the number of hops in an
>>>  average path, and to follow the general recommendations in
>>>  [FRAG-FWD], respectively.
>>> “
>>> 
>> Thanks for adding this. However, as I said a couple of times in my discuss there
>> must be more guidance. This is not only about flow control but also about
>> congestion control and it is not okay to declare congestion control out of
>> scope. If you only do fragmentation but no retransmission, you don’t need to
>> care about congestion control (but only flow control) as you don’t increase the
>> actual network load by this. However, if you retransmit you are sending more
>> data than the original sender (that hopefully is congestion controlled) and
>> therefore you increase the load on the network and you MUST implement your
>> own congestion control or some fixed rate limiting for that additional load.
>> Saying this is out of scope and you want to do experimentation is not
>> acceptable for a Proposed Standard.
> 
> I agree. I should be a lot more careful with my wording. 
> 
> Understanding that the flow control is pacing from the receiver to protect the receiver and congestion control is protecting the network (ECN etc...); stricto sensu  we are not doing any flow control since the receiver is expected to have a buffer for the whole datagram and he does not need to be protected. All we do is congestion control.
> 
> => I should change "flow control" to "congestion control" throughout, and add text about the above, is that correct?

Yes, thanks!

> 
> Note that all classical wireless interfaces perform ARQ. On one hop, you get the same effect of multiplying the traffic in the air vs. what the transport see. The factor can be high, e.g., 64. On a mesh, we get the additional effect of multiple flows converging on a same node.

Yes, but that applies only to one hop, i.e., the network you are directly connected to, and there is usually also some kind of back-off mechanism that reacts to congestion/collision/contention on that layer.


> 
>> 
>> To be clear the request of this discuss is to give a normative recommendation
>> for the default value of the window size and/or inter-frame gap.
> 
> Yes, and since there is no great expectation that ECN will be implemented, that must be reasonable.
> Also we want to agree on the proposed mechanism to drop the window to 1 in case of congestion notification, or is that behind us already?

Dropping to 1 on CE mark is fine. However, when do you increase the window again? If you want to say something here, you have to specify that as well.
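
To illustrate the simplest behaviour discussed here, the following is a minimal sketch, not text from the draft. It assumes the window is reset to 1 when an acknowledgment carries a congestion mark, stays at 1 until the end of the current datagram, and returns to the configured value for the next datagram. The class and method names are hypothetical, and no increase policy within a datagram is modelled because none has been specified yet.

```python
from dataclasses import dataclass


@dataclass
class FragmentWindow:
    """Hypothetical sender-side window state for one fragmented datagram."""

    configured_size: int   # window used at the start of every datagram
    current_size: int = 0  # window in effect right now

    def __post_init__(self) -> None:
        # The proposed text bounds Window_Size to at least 1 and less than 33.
        if not 1 <= self.configured_size <= 32:
            raise ValueError("Window_Size must be between 1 and 32")
        self.current_size = self.configured_size

    def on_ack(self, congestion_marked: bool) -> None:
        # Assumption: a marked acknowledgment drops the window to 1 and
        # nothing raises it again before the datagram completes.
        if congestion_marked:
            self.current_size = 1

    def on_datagram_done(self) -> None:
        # Assumption: the next datagram starts from the configured window.
        self.current_size = self.configured_size
```

Anything beyond this, such as growing the window again while the datagram is still in flight, would amount to specifying a congestion control scheme.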

> 
> 
>> 
>> Further note: as you allow adapting both the window and the inter-frame gap
>> dynamically, you actually implement two control mechanisms here. I actually
>> recommend to only use the inter-frame gap and not have a window here. You
>> say a couple of times in your reply below that the window determines the ack
>> rate; however, that is not clear to me because the ack rate should be a parameter
>> at the receiver and not at the sender (maybe I don’t remember things correctly
>> because it’s been a while since I read the draft and I couldn’t find anything about
>> this in the draft quickly). If the window size does define the ack rate, however,
>> then maybe you should rename that parameter accordingly.
> 
> The ack is not for flow control (as we do not have it) but in support of ARQ. The possibility to use it for congestion control is a desirable side effect. The fragmenting endpoint FSM may want to wait and see how things went for the fragments that it already sent. E.g., there's the case where the fragmenting endpoint would use an ack on the first fragment for  a number of reasons such as check that a path is available, that the MTU is OK or assess an initial RTT. It may maximize the number of fragments in flight for congestion control. But whether to do any of that is left to implementation (so far).
> 
> 
>> 
>> However, if there is really a need for a window, I still recommend to talk less
>> about adapting this value dynamically and make clear that having a fixed value
>> is the recommended default. Therefore I recommend to remove the parameter
>> MinWindowSize and MaxWindowSize because these should actually not be
>> parameters that can be configured but are actual bounds. If someone decides
>> to implement dynamic window adaptation, they can decide to re-introduce these
>> parameters again and make them configurable but it doesn’t need to be part of
>> this spec.
> 
> I can see that, yes. I still like the idea to drop to 1 in case of ECN. 
> Do you suggest to remove that as well?
> If so, should we augment the inter-frame gap in case of ECN? 
> That may be better though not simpler to specify or implement.

That’s an option as well. Again, when you reduce something you also need to specify when to increase it again, and that means you are basically specifying a congestion control scheme.

> 
> 
> 
>> 
>> So it could be something like:
>> 
>> "Window_Size:  Window_Size MUST be at least 1 and less than 33. If the inter-
>> frame gap is selected large enough to not overload the path and the one-way
> 
> I see the IFG as more efficient for flow control than for congestion control: Increasing IFG slows down the packets but as long as the result is faster than the bottleneck, it does not help much does it? So I'm still  unsatisfied on how to characterize an IFG that does not overload a path. But I'm not sure we can do better. I moved that piece to the IFG definition if that's OK?

So how do you currently set the IFG? Both IFG and window_size can be used for flow control as well as congestion control; it only depends on who generated the signal that is used to adapt the value, either the endpoint/receiver or the network/nodes on the path.

Using a window would be a window-based congestion control. Using the IFG would be a rate-based congestion control. But the principle is the same.
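
To make the comparison concrete, here is a rough sketch of the sending rate that each knob yields. It assumes an idealized path (no loss, no queuing, one acknowledgment per window), and the function names are hypothetical, not taken from the draft.

```python
def window_based_rate(window_size: int, rtt_s: float) -> float:
    """Fragments per second when at most window_size fragments are sent per RTT."""
    if window_size < 1 or rtt_s <= 0:
        raise ValueError("window_size must be >= 1 and rtt_s must be > 0")
    return window_size / rtt_s


def gap_based_rate(tx_time_s: float, ifg_s: float) -> float:
    """Fragments per second when one fragment is sent every tx_time_s + ifg_s."""
    if tx_time_s <= 0 or ifg_s < 0:
        raise ValueError("tx_time_s must be > 0 and ifg_s must be >= 0")
    return 1.0 / (tx_time_s + ifg_s)
```

For example, a window of 4 with a 0.5 s round-trip time gives 8 fragments per second, while a 0.125 s transmission time with a 0.375 s gap gives 2 fragments per second. Either value can be lowered in response to a congestion signal, which is why the principle is the same.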


> 
>> delay is known, the Window_Size SHOULD be set to the one-way trip time
>> divided by the inter-frame gap.  Otherwise a small value of X SHOULD be
>> configured. Note that the Window_Size determines the ack rate. If the
>> window_size is set to 32, this means only the last Fragment is
>> acknowledged in the first round. If it is set to a smaller value, more acks are
>> generated but the load on the forward path will be lower. Window_Size MAY
>> be adapted dynamically to reduce load on the forward path in case of
>> congestion.”
> 
> The assertion that the load on the forwarding path will be lower is usually incorrect in a typical  LLN, since the radios are half duplex. In the example of 6TiSCH, an rfrag_ack consumes the exact same bandwidth as a fragment (one time slot). Also the path of the rfrag_ack is the reverse LSP, so it goes across the exact same links.

Okay. Yes makes sense.
> 
> The last sentence is already present above in the text above it, all quoted below, so I'm also trying to avoid duplication.
> 
>> Still you also need to say more about how to set and dynamically adapt the
>> inter-frame gap because that is probably the real limiting factor to avoid network
>> overload.
>> 
> 
> Yes, I see that tuning IFG impacts the rate and can help alleviate the congestion once you pass below the rate the bottleneck can give you. I've done some adaptive CIR long ago in IBM FR switches and it can be made to work, and though it was a lot of fun, it's not any easier than window-based flow control. And it really depends on the relay doing the right thing, e.g., reacting quickly on growing queue latency and applying fair sharing.
> 
>> Also below you remove the recommendation for using the number of hops as
>> window size but here you added it again. This is just incorrect. There is no
>> dependency between the number of hops and the window size: If there is no
>> bottleneck on the path, you can just send with line rate at the sender. 
> 
> The rationale was: If there is no bottleneck on the path and the window is less than the number of hops then the sender will be blocked and the maximum throughput cannot be achieved. If the rfrag_ack is as slow as the frag, which is reasonable in an LLN, we're talking about a window of twice the number of hops to keep the fragments going.

I see that the idea was to keep the frames flowing, rather than to avoid overload, under the assumption that there is no bottleneck on the path. However, in this case you don’t really need the window at all and using the IFG should be enough.

> 
> I saw the number of hops as a starting point, but I'm (really) happy to stick to RTT/IFG which makes more sense considering the focus that you seem to recommend placing on IFG (and I agree with that too).
> 
>> If there is a bottleneck on the path and you send at a higher rate than the
>> bottleneck, then sooner or later the buffer at that hop will fill up completely.
>> So the window size depends only on the bottleneck rate and end-to-end delay (BDP)
>> (which lets you calculate the number of packets in flight) plus the buffer size at
>> the bottleneck. The number of hops is irrelevant.
> 
> Yes, I understand that model. It was easier to apply some 25 years ago.
> So far the links in a LLN are usually the same and the PHYs are usually the same so it's still usable.
> But that is bound to change rapidly as even LLN radios are going to be agile WRT PHY rate. Meaning that the rate at the bottleneck will become hard to fathom and will change (rapidly) over time (same as Wi-Fi).
> 
> All in all we'd get:
> 
> "
>   An implementation must control the rate at which it sends packets
>   over the same path to allow the next hop to forward a packet before

What does “same" relate to here?

>   it gets the next.  In a wireless network that uses the same frequency
>   along a path, more time must be inserted to avoid hidden terminal
>   issues between fragments (more in Section 4.2).  An implementation
>   should consider the generic recommendations from the IETF in the
>   matter of congestion control and rate management in [RFC5033].  An

Maybe RFC8085 is the better reference?

>   implementation may perform a congestion control by using a dynamic
>   value of the window size (Window_Size), adapting the fragment size
>   (Fragment_Size), and may reduce the load by inserting an inter-frame
>   gap that is longer than necessary.

This is the part that I don’t fully understand. Why do you need three different ways to enable congestion control instead of just having one?

You already have the IFG. What is the benefit of having a window-based rate limit in addition?

>  In a large network where nodes
>   contend for the bandwidth, a larger Fragment_Size consumes less
>   bandwidth but also reduces fluidity and incurs higher chances of loss
>   in transmission.
> 
>   This is controlled by the following parameters:
> 
> 
>   inter-frame gap:  Indicates the minimum amount of time between
>      transmissions.  The inter-frame gap protects the propagation of
>      one transmission before the next one is triggered and creates a
>      duty cycle that controls the ratio of air time and memory in
>      intermediate nodes that a particular datagram will use.  The
>      inter-frame gap controls the rate at which fragments are sent and
>      SHOULD be selected large enough to protect the network.

I think you need to provide some (normative) recommendation for the default configuration of the IFG. If that is specified in draft-ietf-6lo-minimal-fragment, a pointer and some more explanation would be good.

> 
> 
> <snip>
> 
>   Window_Size:  The Window_Size MUST be at least 1 and less than 33.
> 
>      *  If the round-trip time is known, the Window_Size SHOULD be set
>         to the round-trip time divided by the time per fragment, that
>         is the time to transmit a fragment plus the inter-frame gap.
> 
>      Otherwise:
> 
>      *  Setting the window_size to 32 is to be understood as only the
>         last Fragment is acknowledged in each round.  This is the
>         RECOMMENDED value in a half-duplex LLN where the fragment
>         acknowledgment consumes roughly the same bandwidth on the same
>         links as the fragments themselves
> 
>      *  If it is set to a smaller value, more acks are generated.  In a
>         full-duplex network, the load on the forward path will be
>         lower, and a small value of 3 SHOULD be configured.
> “
> 

If having the window is still useful, this is fine I think.
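
For reference, the Window_Size rule in the proposed text above can be sketched as follows. This is only an illustration under stated assumptions: times are integer milliseconds, the quotient is rounded down, and the result is clamped to the 1..32 range. The function and parameter names are hypothetical.

```python
from typing import Optional


def default_window_size(
    rtt_ms: Optional[int] = None,
    tx_time_ms: int = 0,
    ifg_ms: int = 0,
    half_duplex: bool = True,
) -> int:
    """Pick a Window_Size following the proposed text quoted above."""
    if rtt_ms is not None:
        # Round-trip time divided by the time per fragment, where the time
        # per fragment is the transmission time plus the inter-frame gap.
        time_per_fragment_ms = tx_time_ms + ifg_ms
        if time_per_fragment_ms <= 0:
            raise ValueError("time per fragment must be positive")
        # Assumption: round down, then clamp to at least 1 and less than 33.
        return max(1, min(32, rtt_ms // time_per_fragment_ms))
    # RTT unknown: 32 in a half-duplex LLN, 3 in a full-duplex network.
    return 32 if half_duplex else 3
```

With a 1000 ms round-trip time, a 50 ms transmission time and a 50 ms inter-frame gap, this yields a window of 10.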

Mirja


