Re: [babel] Zaheduzzaman Sarker's Discuss on draft-ietf-babel-rtt-extension-05: (with DISCUSS and COMMENT)

Zaheduzzaman Sarker <zahed.sarker.ietf@gmail.com> Mon, 19 February 2024 11:44 UTC

References: <170787401277.9987.12424865727760301020@ietfa.amsl.com> <87ttm8wqbe.wl-jch@irif.fr>
In-Reply-To: <87ttm8wqbe.wl-jch@irif.fr>
From: Zaheduzzaman Sarker <zahed.sarker.ietf@gmail.com>
Date: Mon, 19 Feb 2024 12:44:13 +0100
Message-ID: <CAEh=tce+G6ddYeZQxVHwHjQVsBs-_BPBSRJqh4OH5BuoFAHxNQ@mail.gmail.com>
To: Juliusz Chroboczek <jch@irif.fr>
Cc: The IESG <iesg@ietf.org>, draft-ietf-babel-rtt-extension@ietf.org, babel-chairs@ietf.org, babel@ietf.org, Donald Eastlake <d3e3e3@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/babel/5HOPyCUelteGife-VODrhy29QSU>
Subject: Re: [babel] Zaheduzzaman Sarker's Discuss on draft-ietf-babel-rtt-extension-05: (with DISCUSS and COMMENT)

Hi Juliusz,

Thanks for your response. It seems my discuss points can be easily
resolved, either by adding emphasis in the extension specification
or by pointing to the right section of RFC 8966.

Please find some more reflections inline.

//Zahed

On Fri, Feb 16, 2024 at 1:23 PM Juliusz Chroboczek <jch@irif.fr> wrote:

> >  # I support Rob's discuss that it is not clear why this is published as
> >  standard track document. Apart from what Rob pointed out, there is another
> >  place where the experimental nature of this specification is obvious. In
> >  section 1 it says -
> >
> >     "We believe that this protocol may be useful in other situations than the
> >     one described above, such as when running Babel in a congested wireless
> >     mesh network or over a complex link layer that performs its own routing;
> >     the fine granularity of the timestamps used (1µs) should make it possible
> >     to experiment with RTT-based metrics on this kind of link layers."
>
> I'm very confused by this argument.
>
> As David explained, this document proposes an algorithm that we know to be
> suitable for a specific application, large overlay networks, as described
> in Section 1 of the document.  We have a lot of experience running this
> algorithm in such environments (extensive experimentation in simulation
> followed by 10 years of production deployment).
>
> What we say in the paragraph that you quote is that we do not know whether
> the algorithm has other applications, but we believe it might have.  As
> David explained, it does not apply to the application for which we propose
> the algorithm, it is just a side note.
>
> If you think that this paragraph hampers intelligibility by those who have
> only read the document superficially, I'm open to removing the whole
> paragraph, even though it will make the document less informative.  Please
> let me know what to do.
>

I don't see that paragraph as an essential part of this specification
and suggest removing it. It certainly gave me the impression that more
experiments are necessary even for this extension to be used.


>
> >    This shows lack of confidence on the results
>
> I'm very confused by this sentence. Please explain how you came to this
> conclusion, given that the paragraph you quote is just a side comment
> about potential future research.
>

I think this is because of how it was put in the section; one could
interpret it the way I did.

>
> >    RTT-based route selection can end up having negative impacts by
> >    overloading and congesting low RTT routes,
>
> I don't see how this is different from hop-count routing, which runs the
> risk of overloading low-hop-count routes.  This is why we perform
> congestion control at the transport layer: so that the network layer is
> not responsible for congestion avoidance.
>

It would be good to emphasize that congestion control at the
transport layer helps avoid congesting the low-RTT routes. But
after reading RFC 8966 again, I thought that this feature is built into
the Babel algorithm, so we can point to that.


>
> >  # This specification does not specify the relation to other loss-based metric
> >  and hop-count metric based strategies. I can imagine a network where low RTT
> >  can be emitted at the cost of packet loss. Will this RTT-based strategy be
> >  safe to use?
>
> It will be safe to use as long as the resulting metric satisfies the
> properties in Section 3.5.2 of RFC 8966.
>

I would then point to that section of RFC 8966.


>
> >  # How will this RTT-based strategy co-exist with other strategies that
> >  are already deployed, as claimed in this specification? This specification
> >  needs to guide the implementers about what to consider when selecting the
> >  routing strategy and how the strategies can co-exist.
>
> That's what Sections 3.5.1 and 3.5.2 of RFC 8966 do: they describe the
> general conditions under which a combination of cost and metric
> computation strategies are safe in Babel.
>

As I wrote in the response to David, we can just mention that the RTT-based
strategy does not change what is already described in RFC 8966 for the
implementers.


>
> >  # The periodicity of HELLO message is not clear to me.
>
> The document says:
>
>    the only change to Babel's message scheduling is the requirement that
>    a packet containing an IHU also contains a Hello.
>
> Recommended scheduling of Hellos is described in Appendix B of RFC 8966.
>
> >  This is an important piece of information that should be derived from
> >  proper experiments as we don't want the HELLO message to overload the
> >  route or path.
>
> Appendix B of RFC 8966 recommends a default of one Hello/IHU exchange
> every 4s.  This default is deliberately very conservative, so that the
> protocol works well on poor wireless links.  Please let me know if you
> need links to publications about Babel's behaviour in hostile environments.
>

OK, then a pointer to Appendix B would be great here.


>
> >  The discussion on when to stop sending those HELLO messages is
> >  required.
>
> I'm very confused by this sentence.  Please see Section 2.5 of RFC 8966,
> which says:
>
>   A Babel node periodically sends Hello messages to all of its neighbours;
>   it also periodically sends an IHU ("I Heard You") message to every
>   neighbour from which it has recently heard a Hello.
>
> The exact specification is in Sections 3.4.1 and 3.4.2 of RFC 8966.
>
> >  Also the frequency of the HELLO message might help adjusting the clock
> >  drift, as it is an important aspect of the accuracy of the algorithm.
>
> The document says:
>
>    However, t2' - t1' is usually on the order of seconds, and significant
>    clock drift is unlikely to happen at that time scale.
>

I would say we can point to the sections in RFC 8966 for clarity here.

//Zahed


>
> A typical low-cost crystal oscillator has drift under 30ppm.  30ppm of 4s
> is 120 microseconds.
>
> -- Juliusz
>
>
>
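
The drift figure quoted above can be checked with a line of arithmetic:
parts per million multiplied by seconds gives microseconds. The sketch
below is only an illustration of that calculation, assuming the 30 ppm
oscillator and the 4 s Hello interval mentioned in the thread; the
function name is hypothetical and does not come from the draft or from
RFC 8966.

```python
def drift_error_us(drift_ppm: float, interval_s: float) -> float:
    """Worst-case clock error, in microseconds, accumulated over interval_s.

    Hypothetical helper: 1 ppm of drift is 1 microsecond of error per
    second, so the product of ppm and seconds is already in microseconds.
    """
    return drift_ppm * interval_s


# Values taken from the thread: 30 ppm drift, 4 s between Hello and IHU.
error_us = drift_error_us(30, 4)
print(f"{error_us:.0f} microseconds of drift over 4 s")
```

With a shorter Hello interval the same oscillator contributes
proportionally less error to the measurement.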