Re: [6tisch] Benjamin Kaduk's Discuss on draft-ietf-6tisch-msf-12: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Fri, 03 April 2020 03:37 UTC

Date: Thu, 02 Apr 2020 20:37:14 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: "Pascal Thubert (pthubert)" <pthubert@cisco.com>
Cc: Tengfei Chang <tengfei.chang@gmail.com>, The IESG <iesg@ietf.org>, "draft-ietf-6tisch-msf@ietf.org" <draft-ietf-6tisch-msf@ietf.org>, 6tisch <6tisch@ietf.org>, "6tisch-chairs@ietf.org" <6tisch-chairs@ietf.org>
Message-ID: <20200403033714.GF88064@kduck.mit.edu>
References: <158394932747.1671.4699004253009791924@ietfa.amsl.com> <CAAdgstSMOf7wDSfbWMv5tEzpx1=otQZX_TZ+Xevm77f-1ZztNw@mail.gmail.com> <20200324192510.GE50174@kduck.mit.edu> <CAAdgstTzBTwncFgaEao62X2s4J610_spy4oFsTYNkB1KUKhHww@mail.gmail.com> <20200331235734.GU50174@kduck.mit.edu> <MN2PR11MB3565D75EADFF279B8A03B577D8C60@MN2PR11MB3565.namprd11.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <MN2PR11MB3565D75EADFF279B8A03B577D8C60@MN2PR11MB3565.namprd11.prod.outlook.com>
User-Agent: Mutt/1.12.1 (2019-06-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/6tisch/UzQa0IoZQPn3YdDr9SmAJayOQPk>
Subject: Re: [6tisch] Benjamin Kaduk's Discuss on draft-ietf-6tisch-msf-12: (with DISCUSS and COMMENT)

Hi Pascal,

On Thu, Apr 02, 2020 at 09:57:07AM +0000, Pascal Thubert (pthubert) wrote:
> Hello Benjamin and authors:
> 
> Thomas is really the MAC expert, (virtually) collocated with him, and more fluent in English than I'll ever be. But I'll take a chance at it.
> 
> I understand that the problem you're concerned with happens when the automatic slot offset for node A ends up the same as that for node B. Note that it does not take a hash collision, because the problem is independent of the channel offset. All it takes is a collision in time, because if I'm busy transmitting I cannot receive on a half-duplex radio. The chance of this occurring is 1/slotframe_size for a random pick, which the hash approximates.
> 
> In that case (same auto slot for A and B), an arbitrary node C (which could be A or any other node) will fail to talk to B if B has something to transmit to A, because B's transmit takes precedence, and while B's radio is busy transmitting it cannot receive from C, even on a different channel (it's half duplex). So if B keeps talking a lot to A, that will make it mostly deaf. But if B is a relay as opposed to the generator of traffic, the transmit queue will dry out and B will listen again.
> 
> If C is A, meaning that A sends to B and B sends to A at the same time, and in the absence of other traffic, the retries will happen at different times and will not collide, so all is good. If the traffic in both directions becomes intense, the retries will cause even more collisions and we'll end up in congestion collapse.
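
As a rough check of the 1/slotframe_size estimate quoted above, here is a minimal Python sketch. It is only an illustration under stated assumptions: SHA-256 stands in for the hash the draft actually specifies, the 8-byte addresses are synthetic, and slot offset 0 is assumed to be reserved, so the result should come out close to 1/(slotframe_len - 1). The names `auto_slot_offset` and `collision_rate` are hypothetical.

```python
import hashlib


def auto_slot_offset(eui64: bytes, slotframe_len: int, seed: int = 0) -> int:
    """Map an address to a slot offset in 1..slotframe_len-1.

    Assumption: SHA-256 is a stand-in, not the draft's hash function.
    """
    digest = hashlib.sha256(bytes([seed]) + eui64).digest()
    return 1 + int.from_bytes(digest[:4], "big") % (slotframe_len - 1)


def collision_rate(num_pairs: int, slotframe_len: int) -> float:
    """Fraction of synthetic (A, B) pairs whose auto slot offsets coincide."""
    hits = 0
    for i in range(num_pairs):
        a = (2 * i).to_bytes(8, "big")      # synthetic address for node A
        b = (2 * i + 1).to_bytes(8, "big")  # synthetic address for node B
        if auto_slot_offset(a, slotframe_len) == auto_slot_offset(b, slotframe_len):
            hits += 1
    return hits / num_pairs


if __name__ == "__main__":
    print(collision_rate(20_000, 101))
```

Only the slot offset is compared here, never the channel offset, which matches the point that a collision in time alone is enough on a half-duplex radio.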

The "retries will happen at different times" is the part that I was
missing, and Tengfei just added a description of that in the -16, so I
think I'm all set as far as this goes.

Thanks for the detailed explanation!
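
To make the "retries will happen at different times" point concrete, here is a minimal Python sketch of two peers that each hold one packet for the other and share the same autonomous slot. It is a simplified model, not the exact IEEE 802.15.4 procedure: the exponent bounds `min_be` and `max_be`, the order in which the exponent is updated, and the function name are assumptions made for illustration.

```python
import random


def slotframes_until_both_delivered(rng, min_be=1, max_be=7, limit=10_000):
    """Count slotframes until A and B have each delivered one packet.

    Simplified model: a node whose back-off counter is 0 transmits,
    otherwise it listens on the same cell (the Rx role).
    """
    be = [min_be, min_be]    # back-off exponent per node
    wait = [0, 0]            # shared cells each node still has to skip
    pending = [True, True]   # does the node still hold a packet?
    for frame in range(1, limit + 1):
        tx = [pending[n] and wait[n] == 0 for n in (0, 1)]
        if tx[0] and tx[1]:
            # Both transmit and nobody listens: both fail and back off
            # with independently drawn windows.
            for n in (0, 1):
                be[n] = min(be[n] + 1, max_be)
                wait[n] = rng.randint(0, 2 ** be[n] - 1)
        else:
            for n in (0, 1):
                if tx[n]:
                    pending[n] = False  # the peer was listening
                elif wait[n] > 0:
                    wait[n] -= 1        # one shared cell skipped
        if not any(pending):
            return frame
    return None  # not resolved within the limit


if __name__ == "__main__":
    print(slotframes_until_both_delivered(random.Random(1)))
```

Because each side draws its window independently, the two nodes usually stop colliding after a few slotframes. The model does not cover the heavy-traffic case, where new packets keep arriving and collisions keep recurring.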

> Potential defenses include:
> - MSF could be more aggressive at establishing cells to nodes that end up with the same automatic slotOffset as self
> - In densely used networks (which are unusual for battery-operated networks), define multiple timeslots with different auto slotOffsets for each node, using different seeds for the hash (like a number 1..n), and have the sender use a random one for Tx.
> 
> Is that what you are after?

It's not clear to me that we need to mandate either of these as part of a
*minimal* scheduling function, though having a description of the potential
issues would be reasonable.  (Should we expect higher-layer protocols to
eventually back off and remedy the congestion collapse?)
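
For reference, the second defense quoted above (several seeded slot offsets per node, with the sender picking one at random) could be sketched as follows. This is an illustration of the suggestion only, not anything specified by MSF: SHA-256 stands in for the real hash, and `seeded_slot_offsets` and `pick_tx_offset` are hypothetical names.

```python
import hashlib
import random


def seeded_slot_offsets(eui64: bytes, slotframe_len: int, n: int) -> list:
    """Return n candidate slot offsets (1..slotframe_len-1), one per seed 1..n.

    Assumption: SHA-256 with a one-byte seed prefix is a stand-in hash.
    """
    offsets = []
    for seed in range(1, n + 1):
        digest = hashlib.sha256(bytes([seed]) + eui64).digest()
        offsets.append(1 + int.from_bytes(digest[:4], "big") % (slotframe_len - 1))
    return offsets


def pick_tx_offset(receiver_eui64: bytes, slotframe_len: int, n: int, rng) -> int:
    """Sender side: choose one of the receiver's n listening slots at random."""
    return rng.choice(seeded_slot_offsets(receiver_eui64, slotframe_len, n))


if __name__ == "__main__":
    receiver = bytes.fromhex("0011223344556677")  # synthetic address
    print(seeded_slot_offsets(receiver, 101, 4))
    print(pick_tx_offset(receiver, 101, 4, random.Random(0)))
```

The trade-off is that the receiver then has to listen in n slots per slotframe instead of one, which costs energy on battery-operated nodes.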

I'm going to go clear my Discuss now, as those points are resolved, which
of course does not preclude making further changes.

-Ben

> Thomas (or Simon): I'm sure I'm missing stuff, what more is there?
> 
> You all keep safe
> 
> Pascal
> 
> 
> 
> > > > >      DISCUSS:
> > > > >      ----------------------------------------------------------------------
> > > > >
> > > > >      I'm concerned that the scheduling function for autonomous cells can
> > > > >      cause an infinite loop in the case of hash collision -- Section 3
> > > > >      specifies that AutoTxCell always takes precedence over AutoRxCell, but
> > > > >      if those two cells collide, the corresponding cells on the peer in
> > > > >      question will also collide.  If both peers try to send at the same time
> > > > >      and the hashes collide, they will both attempt to transmit indefinitely
> > > > >      and never be received.
> > > > >
> > > > >
> > > > >    > Notice that the AutoTxCell is a shared cell, to which the back-off
> > > > >    > mechanism is applied.
> > > > >    > In case there is a collision on that cell, a back-off with a different
> > > > >    > exponent will be used on each side.
> > > > >    > The cell will be used as an AutoTxCell on each side at different times.
> > > >
> > > > Ah, it seems I was misinterpreting "take precedence over" to apply
> > > > to the entire local scheduling, not merely the case when independent
> > > > tx and rx scheduling land on the same cell.  Thanks for clarifying
> > > > here; is there anything useful to say in the document about how even
> > > > if there is a collision in the assigned slot there's still a Tx
> > > > backoff, so the cell is usable for Rx some of the time?
> > > >
> > >
> > > * RESPONSE: * We could add the following sentence right after the
> > > hashing collision statement:
> > >
> > > Notice that AutoTxCell is a shared cell, to which the back-off mechanism applies.
> > > When the AutoTxCell and AutoRxCell collide, AutoTxCell takes
> > > precedence if there is a packet to transmit.
> > > During a back-off period, AutoRxCell is used.
> > 
> 
>