Re: [babel] Mirja Kühlewind's Discuss on draft-ietf-babel-rfc6126bis-12: (with DISCUSS and COMMENT)

Mirja Kuehlewind <ietf@kuehlewind.net> Wed, 14 August 2019 15:11 UTC

Hi Juliusz,

Thanks for the work and replies. Please see inline.

> On 9. Aug 2019, at 20:06, Juliusz Chroboczek <jch@irif.fr> wrote:
> 
> Dear Mirja,
> 
> The replies in this mail concern -13, which I haven't submitted yet (still
> working on it).  My working copy is on
> 
>  https://github.com/jech/babel-drafts
> 
>> DISCUSS:
>> ----------------------------------------------------------------------
> 
>> I have a couple of points that need addressing before this document can move
>> forward. Most of them should be straightforward to address. My main point is
>> about network load.
> 
> I agree, that's an important point, especially when running over 802.11.
> 
>> Note that RFC8085 recommends a minimal interval of 3 seconds which
>> probably is also a good hard boundary here.
> 
> As mentioned in my previous mail, RFC 8085 deals with UDP packets sent
> across the Internet.  What we are discussing here is a link-local
> protocol, where there are no intermediary routers to congest.  Thus, on
> robust link technologies (such as Ethernet) it is enough to ensure that we
> don't overwhelm sender and receiver queues; on more fragile technologies
> (such as 802.11), we need to ensure we don't overwhelm the MAC.
> 
> In short, if the 3 second limitation is to be enforced, then OSPF cannot exist.

I really don’t want to request enforcing a 3-second limit. However, I would like this draft to specify a limit, or at a minimum discuss suitable values for specific scenarios. I was only referring to the 3 seconds in the absence of a better value; I’m sure you know better what would make sense.

> 
>> More concretely I think there are these cases that need more guidance:
> 
> I agree.  I've added a short discussion of packet pacing at the end of
> 3.1, and I refer to it at suitable places.

Thanks. I was also hoping that you could make a recommendation on how to implement that, e.g. a fixed delay of a certain (default) value, or a delay based on some other knowledge. If that is a SHOULD and no further implementation example is given, I am afraid the risk is high that people simply won’t implement this part.
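
To illustrate the kind of implementation example meant here, a minimal sketch of fixed-delay pacing follows. The function name `pace_schedule` and the 10 ms default gap are assumptions made for illustration only; neither comes from the draft.

```python
def pace_schedule(num_packets, start=0.0, gap=0.010):
    """Return the send time (in seconds) of each packet in a burst.

    Packets are spaced by a fixed gap instead of being sent back to back,
    so the local queue is not filled in one go.
    The 10 ms default is a hypothetical value, not taken from the draft.
    """
    if num_packets < 0:
        raise ValueError("num_packets must be non-negative")
    if gap < 0:
        raise ValueError("gap must be non-negative")
    return [start + i * gap for i in range(num_packets)]
```

A sender would schedule packet i at the i-th returned time. A gap derived from link speed or queue depth would be the "other knowledge" variant.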
> 
>> - Section 3.7.2. (Triggered Updates) advises to send a message multiple
>> times for redundancy in case of loss. 5 and 2 are mentioned as example
>> values. Please provide a normative default value and a normative maximum
>> value here. Moreover the spec should also require to pace out these
>> messages and avoid "tail loss" by overloading the local queue.  (See
>> also section 3.8.2.1)
> 
> Done for the normative max and recommendation to avoid tail loss.
> I haven't made the default values normative.

Why not? 

> 
>> - Section  3.8.1.1.  (Route Requests) says: "Full route dumps MAY be
>> rate-limited, especially
>>   if they are sent over multicast."
>> I think this should at least be a SHOULD.
> 
> Agreed.

Good.
> 
>> Please also provide further guidance about how to appropriately rate limit
>> and think about other cases where a recommendation to implement rate-limiting
>> could make sense.
> 
> Done.

Thanks!
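
For readers of the archive, one common way to rate-limit replies such as full route dumps is a token bucket. The sketch below is only an illustration: the class name, the parameters and the idea of charging one token per dump are assumptions, not something the draft specifies.

```python
class TokenBucket:
    """Hypothetical limiter: one token is consumed per full route dump."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate        # tokens added per second
        self.burst = burst      # maximum number of stored tokens
        self.tokens = burst     # start with a full bucket
        self.last = now         # time of the last refill

    def allow(self, now):
        """Return True if a dump may be sent at time `now` (seconds)."""
        # Refill in proportion to the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `TokenBucket(rate=1.0, burst=2.0)`, two requests arriving at the same instant are answered, a third is refused, and another is answered one second later.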
> 
>> - In section 4.1.1 the update interval needs a lower limit (e.g. 3 seconds)
> 
> I strongly disagree.  Sub-second convergence after a mobility event is
> required in some networks.
> 
> To put things into perspective, a full-size Ethernet frame is able to
> carry over 60 Babel updates (assuming 50% IPv4 + 50% IPv6 and reasonably
> successful IPv6 prefix compression).  Thus, in a network with 1000 routes,
> a full update occupies 16 packets.  With an update interval of 0.1
> seconds, we are sending an average of 160 packets per second, which is
> very reasonable for a number of link technologies.

(See above) Maybe then 0.1 seconds is a suitable minimum value…?
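
As a rough check of the arithmetic quoted above, the following sketch computes the packet count and packet rate of a full update. It assumes that every packet is filled completely and ignores all header overhead. The function name is made up for this example.

```python
def update_packet_rate(routes, updates_per_packet, interval_s):
    """Return (packets per full update, average packets per second)."""
    if routes < 0 or updates_per_packet <= 0 or interval_s <= 0:
        raise ValueError("invalid parameters")
    # Integer ceiling division: a partly filled packet still counts as one.
    packets = (routes + updates_per_packet - 1) // updates_per_packet
    return packets, packets / interval_s
```

With 1000 routes and exactly 60 updates per packet this gives 17 packets per full update. The quoted figure of 16 packets corresponds to 63 or more updates per packet, which is consistent with "over 60". At a 0.1 second interval, 16 packets per update is about 160 packets per second.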
> 
>> and a recommended default value would be good as well (Note that there
>> are other parts in section 3 where the update value is discussed as
>> well).
> 
> Appendix B.

I think this needs normative language in the body of the document.

> 
>> - Section 3.8.2.4. mentions network load when requests are sent to all
>> neighbours after reboot. Please provide more guidance about how to pace out
>> these requests.
> 
> I've removed this section altogether.

Why?

> 
>> - Section 3.8.1.2.  (Seqno Requests) discusses hop count values but
>> could maybe also give more concrete guidance. I would assume that the
>> hop count value of the current active route is usually known. Maybe that
>> knowledge could be used to pick an appropriate value?
> 
> The hop-count is a last resort mechanism intended to save your network
> from catastrophic failure in case everything goes horribly wrong.  It
> never triggers in normal usage.
> 
> Any value will do.  I've made that clear, and suggested the value 64
> (non-normatively).

Okay.

> 
>> Two other smaller discuss points/questions/comments:
> 
>> 1) Sec 4.6.8. (Next Hop): If I interpret this correctly, address compression is
>> allowed for the next hop field and therefore this TLV would actually not be
>> self-terminating. What do I miss?
> 
> Address compression is only allowed in Update TLVs (the only compression
> mechanism allowed in NH is AE 3).  I've clarified that.

Okay. Thanks!
> 
>> 2) This document needs to specify a registration policy also for each of  the
>> already existing registries given this document obsoletes RFC7557.
> 
> Ok.

Great!

> 
>> ----------------------------------------------------------------------
>> COMMENT:
>> ----------------------------------------------------------------------
> 
>> 1) While this point might not raise discuss-level, it would probably also be
>> good to provide more concrete advice on how to implement jitter: Sec 3.1.: “  
> 
> Expanded this in 3.1.  Removed from Section 4.

Similar to my comment on pacing, I’m wondering if you can be even more concrete, to make it easy for people to implement this correctly.
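
As an example of how concrete this could be, here is a minimal sketch of jitter applied to a periodic timer. It draws each interval uniformly around the nominal value. The 25% spread and the function name are assumptions for illustration, not values from the draft.

```python
import random


def jittered_interval(interval, fraction=0.25, rng=random):
    """Return a random delay within +/- `fraction` of `interval` (seconds).

    `fraction` is a hypothetical default; `rng` may be a seeded
    random.Random instance to make tests reproducible.
    """
    if interval < 0:
        raise ValueError("interval must be non-negative")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    low = interval * (1.0 - fraction)
    high = interval * (1.0 + fraction)
    return rng.uniform(low, high)
```

Re-drawing the delay before every transmission keeps neighbouring routers from synchronising their periodic messages.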
> 
>> 2) Sec 4.1.2. (Router-Id) should probably state again that the router-id is
>> assumed to be unique within a domain.
> 
> No, this section only defines the datatype, which is carried by Router-ID
> TLVs.  It does not define the local Router-ID field, which is part of the
> data structures.

Okay. Maybe just provide a pointer then?

> 
>> 3) Sec 4: “The most-significant bit of the sub-TLV, called the mandatory bit,
>>   indicates how to handle unknown sub-TLVs.”
> 
> This has been clarified.

Thanks!

> 
>> I would recommend to also indicate this bit in the image.
> 
> The mandatory bit is part of the TLV Type (see the discussion with
> Alvaro), this has been clarified.  I am not aware of a way to describe
> that in a packet diagram.

Ah I entirely missed that.
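
For the record, here is how I now read this: a small sketch of testing the mandatory bit as the top bit of a one-octet sub-TLV Type. The function names and the returned strings are invented for this example. The handling of unknown sub-TLVs reflects my understanding of the draft and should be checked against its text.

```python
MANDATORY_BIT = 0x80  # most-significant bit of the one-octet sub-TLV Type


def is_mandatory(subtlv_type):
    """Return True if the mandatory bit is set in the sub-TLV Type."""
    if not 0 <= subtlv_type <= 0xFF:
        raise ValueError("sub-TLV type must fit in one octet")
    return (subtlv_type & MANDATORY_BIT) != 0


def action_for_unknown(subtlv_type):
    """Describe what a parser does with a sub-TLV type it does not know.

    Assumption: an unknown mandatory sub-TLV causes the whole enclosing
    TLV to be ignored, while an unknown non-mandatory one is skipped.
    """
    if is_mandatory(subtlv_type):
        return "ignore-enclosing-tlv"
    return "skip-sub-tlv"
```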

> 
>> 4) Sec 4.4: “If a TLV has a self-terminating format, then it MAY allow
>> a sequence of sub-TLVs to follow the body.”  Initially I wasn’t quite
>> sure what you wanted to say here. I guess you say that the length would
>> indicate a larger value than needed for the body and therefore a subTLV
>> might be present? I recommend to clarify this here a bit.
> 
> Done.

Great.
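
To spell out the reading discussed above: when the Length field covers more octets than the self-terminating body needs, the remainder is where sub-TLVs would sit. The sketch below only separates the two parts. The function name is hypothetical, and how `body_len` is derived depends on the TLV type.

```python
def split_tlv_payload(payload, body_len):
    """Split a TLV payload into (body, trailing sub-TLV octets).

    `payload` holds the octets covered by the TLV Length field;
    `body_len` is the size the self-terminating body works out to.
    """
    if body_len < 0 or body_len > len(payload):
        raise ValueError("body does not fit within the TLV length")
    return payload[:body_len], payload[body_len:]
```

An empty second element means that no sub-TLVs follow the body.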

Thanks!
Mirja


> 
> Thanks,
> 
> -- Juliusz
> 
>