Re: [6lo] Mirja Kühlewind's Discuss on draft-ietf-6lo-fragment-recovery-13: (with DISCUSS and COMMENT)

Mirja Kuehlewind <ietf@kuehlewind.net> Fri, 20 March 2020 13:48 UTC

From: Mirja Kuehlewind <ietf@kuehlewind.net>
In-Reply-To: <C3771F28-264D-4D0D-9864-D40711EBA0FD@cisco.com>
Date: Fri, 20 Mar 2020 14:48:40 +0100
Cc: "Pascal Thubert (pthubert)" <pthubert=40cisco.com@dmarc.ietf.org>, "6lo-chairs@ietf.org" <6lo-chairs@ietf.org>, The IESG <iesg@ietf.org>, "6lo@ietf.org" <6lo@ietf.org>, "draft-ietf-6lo-fragment-recovery@ietf.org" <draft-ietf-6lo-fragment-recovery@ietf.org>, Carles Gomez <carlesgo@entel.upc.edu>, Martin Duke <martin.h.duke@gmail.com>, Benjamin Kaduk <kaduk@mit.edu>
Message-Id: <D3F4ECB5-3E1D-48EF-83D9-71F3F14629C4@kuehlewind.net>
References: <158212059997.17584.9409485384556514167.idtracker@ietfa.amsl.com> <MN2PR11MB356540107CC4FC7F9CB9F412D8E30@MN2PR11MB3565.namprd11.prod.outlook.com><7D77F80D-9D2C-4AF7-AB7E-4EDF58A9258A@kuehlewind.net> <MN2PR11MB3565C75A4707268047E1F90ED8F40@MN2PR11MB3565.namprd11.prod.outlook.com> <9823A19B-CA7D-47D1-92FE-8AA436240BD4@kuehlewind.net> <C3771F28-264D-4D0D-9864-D40711EBA0FD@cisco.com>
To: "Pascal Thubert (pthubert)" <pthubert@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/6lo/KVNG5sYpf6t2v6MS9rQpsL6AjiQ>
Subject: Re: [6lo] Mirja Kühlewind's Discuss on draft-ietf-6lo-fragment-recovery-13: (with DISCUSS and COMMENT)

Hi Pascal,

Okay! :-)

About the use of ECN, I agree that, as you say, there should only be a few fragments, so not increasing might be okay. However, you would need to clarify that the window is reset for each new datagram, I guess, right? Also, I don’t think you necessarily need to reduce to 1 on CE marking; you could maybe halve the window or something. Or you leave this open, like “If an E flag is received, the window SHOULD be reduced, at least by 1 and at most down to 1. Halving the window for each E flag received could be a good compromise but needs further experimentation.”…
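
To make the suggestion above concrete, a minimal Python sketch of a sender-side window follows. It resets for each new datagram, never grows within a datagram, and halves (down to a floor of 1) for each acknowledgment carrying the E flag. The class name `FragmentWindow`, its methods, and the choice of integer halving are assumptions made for illustration; none of them come from the draft.

```python
class FragmentWindow:
    """Hypothetical sender-side window for one fragmenting endpoint."""

    MIN_SIZE = 1
    MAX_SIZE = 32  # the quoted draft text requires 1 <= Window_Size < 33

    def __init__(self, configured_size: int) -> None:
        if not self.MIN_SIZE <= configured_size <= self.MAX_SIZE:
            raise ValueError("Window_Size must be between 1 and 32")
        self.configured_size = configured_size
        self.size = configured_size

    def start_datagram(self) -> int:
        """Reset the window to its configured value for a new datagram."""
        self.size = self.configured_size
        return self.size

    def on_ack(self, e_flag: bool) -> int:
        """Update the window after an acknowledgment; never increase it."""
        if e_flag:
            # Assumption: halve per E flag, but never go below 1.
            self.size = max(self.MIN_SIZE, self.size // 2)
        return self.size


window = FragmentWindow(configured_size=8)
trace = [window.on_ack(e) for e in (False, True, True, False, True, True)]
print(trace)                    # [8, 4, 2, 2, 1, 1]
print(window.start_datagram())  # 8
```

Integer halving always reduces a window of 2 or more by at least 1, so it stays within the "at least by 1 and at most down to 1" range; other reduction rules would fit that range too.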

I wonder if it would be good to say a bit more about the recommended values for the window size. As I understand it, a window of 32 will in most networks not limit transmission (the limiting value will be the IFG), while a window of 3 is very conservative so as not to overload the network (and will be slower than the limit induced by the IFG). Is my understanding correct?
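
The question of which parameter is the limiting one can be illustrated with a rough Python sketch. It assumes a deliberately simplified model: one fragment leaves every transmission time plus IFG, and the sender stalls only if the whole window has been sent before an acknowledgment can return after one round-trip time. The function name and all numbers are hypothetical.

```python
def limiting_factor(window_size: int, rtt_ms: int, tx_time_ms: int, ifg_ms: int) -> str:
    """Say whether the window or the inter-frame gap bounds the sending rate.

    Simplified model (an assumption): one fragment leaves every
    tx_time_ms + ifg_ms, and the sender stalls only if the whole window
    has been sent before the acknowledgment can return (one RTT).
    """
    time_per_fragment_ms = tx_time_ms + ifg_ms
    time_to_send_window_ms = window_size * time_per_fragment_ms
    return "window" if time_to_send_window_ms < rtt_ms else "ifg"


# Hypothetical numbers: 10 ms per fragment on air, 40 ms gap, 400 ms RTT.
for size in (32, 3):
    print(size, limiting_factor(size, rtt_ms=400, tx_time_ms=10, ifg_ms=40))
# 32 ifg
# 3 window
```

With these made-up values a window of 32 never stalls the sender, so the IFG sets the pace, whereas a window of 3 is exhausted after 150 ms and the sender then waits for the acknowledgment.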

Mirja



> On 19. Mar 2020, at 20:12, Pascal Thubert (pthubert) <pthubert@cisco.com> wrote:
> 
> Hello Mirja
> 
> 
> 
>> 
>>> 
>>> 
>>> Please see below the discussion:
>>> 
>>>>>> ----------------------------------------------------------------------
>>>>>> DISCUSS:
>>>>>> -----------------------------------------
>> 
>>> 
>>> Note that all classical wireless interfaces perform ARQ. On one hop, you get the same effect of multiplying the traffic in the air vs. what the transport sees. The factor can be high, e.g., 64. On a mesh, we get the additional effect of multiple flows converging on the same node.
>> 
>> Yes, but only on “one hop”, i.e., the network you are connected to directly, and there is usually also some kind of back-off mechanism that reacts to congestion/collision/contention on that layer.
>> 
>> 
>>> 
>>>> 
>>>> To be clear the request of this discuss is to give a normative recommendation
>>>> for the default value of the window size and/or inter-frame gap.
>>> 
>>> Yes, and since there is no great expectation that ECN will be implemented, that must be reasonable.
>>> Also we want to agree on the proposed mechanism to drop the window to 1 in case of congestion notification, or is that behind us already?
>> 
>> Dropping to 1 on CE mark is fine. However, when do you increase the window again? If you want to say something here, you have to specify that as well.
> 
> 
> If we keep things really simple, it would not increase. Note that this applies to a single datagram, and that’s usually just a few fragments.
> 
> We could double at each round trip but by the time it takes effect the datagram will be done...
> 
>> 
>>> 
>>> 
>>>> 
>>>> Further note, as you allow adapting both the window and the inter-frame gap
>>>> dynamically, you actually implement two control mechanisms here. I actually
>>>> recommend using only the inter-frame gap and not having a window here. You
>>>> say a couple of times in your reply below that the window determines the ack
>>>> rate; however, that is not clear to me because the ack rate should be a parameter
>>>> at the receiver and not at the sender (maybe I don’t remember things correctly
>>>> because it’s been a while since I read the draft, and I couldn’t find anything about
>>>> this in the draft quickly). If the window size does define the ack rate, however,
>>>> then maybe you should rename that parameter accordingly.
>>> 
>>> The ack is not for flow control (as we do not have it) but in support of ARQ. The possibility to use it for congestion control is a desirable side effect. The fragmenting endpoint FSM may want to wait and see how things went for the fragments that it already sent. E.g., there's the case where the fragmenting endpoint would use an ack on the first fragment for a number of reasons, such as checking that a path is available, checking that the MTU is OK, or assessing an initial RTT. It may also bound the number of fragments in flight for congestion control. But whether to do any of that is left to implementation (so far).
>>> 
>>> 
>>>> 
>>>> However, if there is really a need for a window, I still recommend talking less
>>>> about adapting this value dynamically and making clear that having a fixed value
>>>> is the recommended default. Therefore I recommend removing the parameters
>>>> MinWindowSize and MaxWindowSize because these should actually not be
>>>> parameters that can be configured but are actual bounds. If someone decides
>>>> to implement dynamic window adaptation, they can decide to re-introduce these
>>>> parameters and make them configurable, but it doesn’t need to be part of
>>>> this spec.
>>> 
>>> I can see that, yes. I still like the idea to drop to 1 in case of ECN. 
>>> Do you suggest to remove that as well?
>>> If so, should we augment the inter-frame gap in case of ECN? 
>>> That may be better though not simpler to specify or implement.
>> 
>> That’s an option as well. Again, when you reduce something you also need to specify when to increase it again, and that means you are basically specifying a congestion control scheme.
> 
> My goal for this doc was to keep it dead simple, build it so we have the necessary basis with ECN and windowing, and then play with it and learn from it. Only then can we do a valuable spec.
> 
> Can we keep it at what the doc has now?
>> 
>>> 
>>> 
>>> 
>>>> 
>>>> So it could be something like:
>>>> 
>>>> "Window_Size:  Window_Size MUST be at least 1 and less than 33. If the inter-
>>>> frame gap is selected large enough to not overload the path and the one-way
>>> 
>>> I see the IFG as more efficient for flow control than for congestion control: increasing the IFG slows down the packets, but as long as the result is faster than the bottleneck, it does not help much, does it? So I'm still unsatisfied with how to characterize an IFG that does not overload a path. But I'm not sure we can do better. I moved that piece to the IFG definition if that's OK?
>> 
>> So how do you currently set the IFG? Both IFG and window_size can be used for flow control as well as congestion control; it only depends on who generates the signal that is used to adapt the value, either the endpoint/receiver or the network/nodes on the path.
>> 
> 
> This is specified in minimal fragments. The IFG ensures that the previous fragment is beyond interference range. In a single frequency mesh that is multiple hops away.
> 
>> Using a window would be a window-based congestion control. Using the IFG would be a rate-based congestion control. But the principle is the same.
>> 
>> 
> I’d love to chat about that at the next IETF to get your view. On paper, the rate-based approach does not guarantee the amount of bytes that this node will pack at the bottleneck. What I found implementing it years ago was that it is sensitive to when and how the congested node sets ECN. I ended up adjusting only once per RTT...
> 
>>> 
>>>> delay is known, the Window_Size SHOULD be set to the one-way trip time
>>>> divided by the inter-frame gap.  Otherwise a small value of X SHOULD be
>>>> configured. Note that the Window_Size determines the ack rate. If the
>>>> window_size is set to 32, this means only the last Fragment is
>>>> acknowledged in the first round. If it is set to a smaller value, more acks are
>>>> generated but the load on the forward path will be lower. Window_Size MAY
>>>> be adapted dynamically to reduce load on the forward path in case of
>>>> congestion.”
>>> 
>>> The assertion that the load on the forward path will be lower is usually incorrect in a typical LLN, since the radios are half duplex. In the example of 6TiSCH, an rfrag_ack consumes the exact same bandwidth as a fragment (one time slot). Also the path of the rfrag_ack is the reverse LSP, so it goes across the exact same links.
>> 
>> Okay. Yes makes sense.
>>> 
>>> The last sentence is already present in the text above it, all quoted below, so I'm also trying to avoid duplication.
>>> 
>>>> Still, you also need to say more about how to set and dynamically adapt the
>>>> inter-frame gap because that is probably the real limiting factor to avoid network
>>>> overload.
>>>> 
>>> 
>>> Yes, I see that tuning IFG impacts the rate and can help alleviate the congestion once you pass below the rate the bottleneck can give you. I've done some adaptive CIR long ago in IBM FR switches and it can be made to work, and though it was a lot of fun, it's not any easier than window-based flow control. And it really depends on the relay doing the right thing, e.g., reacting quickly on growing queue latency and applying fair sharing.
>>> 
>>>> Also below you remove the recommendation for using the number of hops as
>>>> window size but here you added it again. This is just incorrect. There is no
>>>> dependency between the number of hops and the window size: If there is no
>>>> bottleneck on the path, you can just send at line rate at the sender. 
>>> 
>>> The rationale was: if there is no bottleneck on the path and the window is less than the number of hops, then the sender will be blocked and the maximum throughput cannot be achieved. If the rfrag_ack is as slow as the frag, which is reasonable in an LLN, we're talking about a window of twice the number of hops to keep the fragments going.
>> 
>> I see that the idea was to get the frames flowing (rather than to avoid overload) under the assumption that there is no bottleneck on the path. However, in this case you don’t really need the window at all, and using the IFG should be enough.
> 
> 
> Yep, it is actually more constrained since IFG usually covers transmission over multiple hops. You should find that I removed hop count throughout this time (hopefully) and followed your recommendation.
> 
>> 
>>> 
>>> I saw the number of hops as a starting point, but I'm (really) happy to stick to RTT/IFG which makes more sense considering the focus that you seem to recommend placing on IFG (and I agree with that too).
>>> 
>>>> If there is a bottleneck on the path and you send at a higher rate than the
>>>> bottleneck, then sooner or later the buffer at that hop will fill up completely.
>>>> So the window size depends only on the bottleneck rate and end-to-end delay
>>>> (BDP) (which lets you calculate the number of packets in flight) plus the
>>>> buffer size at the bottleneck. The number of hops is irrelevant.
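
The bandwidth-delay-product argument quoted above can be sketched in Python as follows. The function name, the units (fragments per second, milliseconds, fragments of buffer), and the example numbers are all assumptions chosen for illustration; note that the hop count does not appear anywhere in the calculation.

```python
def bdp_window(bottleneck_fragments_per_s: int, rtt_ms: int, buffer_fragments: int) -> int:
    """Upper bound on useful fragments outstanding, per the BDP reasoning.

    in-flight fragments = bottleneck rate x end-to-end round-trip delay,
    plus whatever the bottleneck node can queue before it has to drop.
    """
    in_flight = bottleneck_fragments_per_s * rtt_ms // 1000
    return in_flight + buffer_fragments


# Hypothetical path: bottleneck forwards 20 fragments/s, 400 ms RTT,
# and the bottleneck node can queue 4 fragments.
print(bdp_window(20, 400, 4))  # 12
```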
>>> 
>>> Yes, I understand that model. It was easier to apply some 25 years ago.
>>> So far the links in an LLN are usually the same and the PHYs are usually the same, so it's still usable.
>>> But that is bound to change rapidly as even LLN radios are going to be agile WRT PHY rate. Meaning that the rate at the bottleneck will become hard to fathom and will change (rapidly) over time (same as Wi-Fi).
>>> 
>>> All in all we'd get:
>>> 
>>> "
>>> An implementation must control the rate at which it sends packets
>>> over the same path to allow the next hop to forward a packet before
>> 
>> What does “same" relate to here?
> 
> Same next hop (in TSCH) and possibly multiple hops, but usually it does not know (that requires link state, which we usually do not have).
> 
> I can change “over the same path” to “via the same next hop” to keep it simple?
>> 
>> 
>>> it gets the next.  In a wireless network that uses the same frequency
>>> along a path, more time must be inserted to avoid hidden terminal
>>> issues between fragments (more in Section 4.2).  An implementation
>>> should consider the generic recommendations from the IETF in the
>>> matter of congestion control and rate management in [RFC5033].  An
>> 
>> Maybe RFC8085 is the better reference?
> 
> Happy to change 
> 
>> 
>>> implementation may perform a congestion control by using a dynamic
>>> value of the window size (Window_Size), adapting the fragment size
>>> (Fragment_Size), and may reduce the load by inserting an inter-frame
>>> gap that is longer than necessary.
>> 
>> This is the part that I don’t fully understand. Why do you need three different ways to enable congestion control instead of just having one?
> 
> To enable experimenting. The text above is true: all of this is possible. But the only thing that we mandate is resetting the window.
> 
> 
>> 
>> You already have the IFG. What's the benefit of having a window-based rate limit in addition?
>> 
> 
> Tuning IFG is complex. Implementations may prefer window-based over rate-based control. Once we have experience, we’ll build the necessary specs.
> 
> 
>>> In a large network where nodes
>>> contend for the bandwidth, a larger Fragment_Size consumes less
>>> bandwidth but also reduces fluidity and incurs higher chances of loss
>>> in transmission.
>>> 
>>> This is controlled by the following parameters:
>>> 
>>> 
>>> inter-frame gap:  Indicates the minimum amount of time between
>>>    transmissions.  The inter-frame gap protects the propagation of
>>>    one transmission before the next one is triggered and creates a
>>>    duty cycle that controls the ratio of air time and memory in
>>>    intermediate nodes that a particular datagram will use.  The
>>>    inter-frame gap controls the rate at which fragments are sent and
>>>    SHOULD be selected large enough to protect the network.
>> 
>> I think you need to provide some (normative) recommendation for the default configuration of the IFG. If that is specified in draft-ietf-6lo-minimal-fragment, a pointer would be good, along with more explanation.
>> 
> 
> It is and I will. There’s already such a reference but I can probably do better.
> 
> 
> 
>>> 
>>> 
>>> <snip>
>>> 
>>> Window_Size:  The Window_Size MUST be at least 1 and less than 33.
>>> 
>>>    *  If the round-trip time is known, the Window_Size SHOULD be set
>>>       to the round-trip time divided by the time per fragment, that
>>>       is the time to transmit a fragment plus the inter-frame gap.
>>> 
>>>    Otherwise:
>>> 
>>>    *  Setting the window_size to 32 is to be understood as only the
>>>       last Fragment is acknowledged in each round.  This is the
>>>       RECOMMENDED value in a half-duplex LLN where the fragment
>>>       acknowledgment consumes roughly the same bandwidth on the same
>>>       links as the fragments themselves
>>> 
>>>    *  If it is set to a smaller value, more acks are generated.  In a
>>>       full-duplex network, the load on the forward path will be
>>>       lower, and a small value of 3 SHOULD be configured.
>>> “
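
The Window_Size rules in the proposed text above can be read as the following Python sketch. The function name and parameters are hypothetical, and rounding down with a clamp to the 1..32 range is an assumption, since the quoted text does not say how to round.

```python
from typing import Optional


def choose_window_size(rtt_ms: Optional[int], tx_time_ms: int, ifg_ms: int,
                       half_duplex: bool) -> int:
    """Pick a Window_Size following the proposed text (illustrative only)."""
    if rtt_ms is not None:
        # RTT known: round-trip time divided by the time per fragment,
        # i.e. the time to transmit a fragment plus the inter-frame gap.
        size = rtt_ms // (tx_time_ms + ifg_ms)
        return max(1, min(32, size))  # Window_Size MUST be in 1..32
    # RTT unknown: 32 in a half-duplex LLN, a small value of 3 otherwise.
    return 32 if half_duplex else 3


print(choose_window_size(400, 10, 40, half_duplex=True))    # 8
print(choose_window_size(None, 10, 40, half_duplex=True))   # 32
print(choose_window_size(None, 10, 40, half_duplex=False))  # 3
```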
>>> 
>> 
>> If having the window is still useful, this is fine I think.
>> 
> 
> I don’t know but we’ll learn. Please let me know if we agree on the above and I’ll update tomorrow (it’s family time now in CET).
> 
> Take care,
> 
> Pascal 
> 
>> Mirja
>> 
>> 
>> 
>>> 
>>