Re: [Ntp] comments on draft-mlichvar-ntp-ntpv5-03 / Extension fields

Dan Drown <dan-ntp@drown.org> Tue, 30 November 2021 05:04 UTC

Date: Mon, 29 Nov 2021 23:04:01 -0600
From: Dan Drown <dan-ntp@drown.org>
To: Miroslav Lichvar <mlichvar@redhat.com>
Cc: ntp@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/yUklyIojHMeARe-_eYZ9y2n4mNY>

Quoting Miroslav Lichvar <mlichvar@redhat.com>:
> On Fri, Nov 26, 2021 at 09:48:20PM -0600, Dan Drown wrote:
>> After a day of thinking about my proposal above, I have the following
>> concerns:
>>
>> * The added packet size, 20+4 bytes is very large for NTP
>
> What would you consider the maximum acceptable size? Please note that
> only clients serving time to other servers would need to use it.
> Clients that don't serve time, which I think is the most common case,
> don't need to care about loops.

Taking an embedded NTP server with a 100M Ethernet link as an example, my concern is that adding 24 bytes reduces the maximum packet rate from 109.6 kpps to 90.6 kpps (17%). The number of NTP servers on a 100M link running near the maximum packet rate is probably a short list, but it is a tradeoff to consider. I don't have a specific maximum acceptable size in mind, and I am open to increasing the size by some amount for this functionality.
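
For reference, here is where my numbers come from (a rough sketch in
Python; the assumptions are mine: 48-byte NTPv4 payload, IPv4 + UDP
headers, and full Ethernet overhead including preamble and interframe
gap):

  LINE_RATE = 100_000_000          # bits per second
  ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + interframe gap
  IP_UDP = 20 + 8                  # IPv4 + UDP headers
  NTPV4 = 48                       # base NTPv4 packet

  def pps(extension_bytes):
      frame_bits = 8 * (ETH_OVERHEAD + IP_UDP + NTPV4 + extension_bytes)
      return LINE_RATE / frame_bits

  base = pps(0)                    # ~109.6 kpps
  for ext in (12, 24):
      rate = pps(ext)
      print(f"{ext:2d} extra bytes: {rate / 1e3:.1f} kpps "
            f"({100 * (1 - rate / base):.0f}% fewer packets)")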

I agree that only NTP servers would need to care about loop prevention, so that would help keep the average packet size down given a mix of clients and downstream servers.

> As you pointed out, it should be even much larger if we want a lower
> false positive rate. But it doesn't have to be transmitted all at
> once. The client could get it in multiple parts over multiple requests
> (e.g. 4 or 8). This would add a delay to the loop detection.

I'd be worried about keeping that data in sync between systems. Having  
a "generation" identifier would be one option to handle that. If we  
went that route, why not just have static IDs for each stratum instead  
of a bloom filter?

Something like:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Type              |             Length=12         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| generation    | stratum       | bits47..32                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| bits31..0                                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A request would set everything to 0 except the stratum field, which would hold the stratum it wanted (1..16).

A response would set a generation ID that changes every time the upstream clock changes. The generation ID changing would signal to the downstreams that they need to refetch all the refIDs; the downstreams could cache the data as long as the generation ID stayed the same. Each stratum would have a random 48-bit number identifying it.
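
To make the idea concrete, here is a rough sketch of the downstream side in Python (the names and details are mine, not from the draft; the reserved value matches the one I suggest further down):

  import secrets

  RESERVED_UNKNOWN = 0xFFFFFFFFFFFF       # "don't know this stratum yet" / NTPv4 hop

  class RefIDCache:
      def __init__(self):
          self.my_id = secrets.randbits(48)   # or taken from a config file
          self.generation = None
          self.ids = {}                       # stratum -> 48-bit ID

      def update(self, generation, stratum, ref_id):
          # Process one response extension field.
          if generation != self.generation:
              # Upstream changed its reference: drop the cache and refetch.
              self.generation = generation
              self.ids = {}
          if ref_id != RESERVED_UNKNOWN:
              self.ids[stratum] = ref_id

      def next_request(self, upstream_stratum):
          # Pick the next stratum (1..16) we still need to fetch, if any.
          for stratum in range(1, upstream_stratum + 1):
              if stratum not in self.ids:
                  return stratum
          return None

      def loop_detected(self):
          return self.my_id in self.ids.values()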

For the generation ID size, I picked 8 bits arbitrarily. That risks wrapping when the requestor is polling at 1024 seconds and the responder is changing references every 4 seconds. Stratum only needs 4 bits (0..15), so generation could be 12 bits instead, trading longer wrap times for slightly more implementation complexity.
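
(Worked out: with an 8-bit generation, 256 changes x 4 seconds = 1024 seconds, so the counter could wrap exactly once within a single poll interval and the downstream would never notice the change. A 12-bit generation pushes a full wrap out to 4096 x 4 seconds, a bit over 4.5 hours.)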

If I properly understand the birthday problem in statistics, 1-e^(-n*(n-1)/(2^(48+1))) > 0.001 is the estimate for how many servers you need before there is a 0.1% chance of two of those servers picking the same 48-bit number (assuming fully random ID selection). That works out to about 750,488 servers, so it seems unique enough to me.
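
A quick check of that estimate (my own verification of the approximation above):

  import math

  def p_collision(n, bits=48):
      # birthday-problem approximation used above
      return 1 - math.exp(-n * (n - 1) / 2 ** (bits + 1))

  print(p_collision(750_488))   # ~0.0010, i.e. about a 0.1% chance of a collision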

A 12-byte extension field is a 10% reduction in maximum packets per second, versus 17% for 24 bytes.

There would have to be some sort of reserved ID for the case where a  
responder didn't know the data of a given stratum yet (or NTPv4 was  
involved in the path). Maybe 0xffffffffffff.

I'm not sure if I've fully explained this; let me know if there are any questions.

>> * I need to consider compatibility between NTPv4 refid and v5 extension.
>> Specifically, what translations happen when switching between versions
>
> Most of the information would be lost if an NTPv4 server was in the
> chain. This would be common in the beginning. For NTPv4->NTPv5
> compatibility, we could specify a mapping of the 32-bit refids to the
> bloom filter.

It could be mapped directly onto bits31..0.
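
Something like this (my guess at the detail, not in the draft): put the 32-bit NTPv4 refid into bits31..0 and leave bits47..32 as zero, so a mapped v4 refid stays recognizable:

  def map_v4_refid(v4_refid):
      # NTPv4 refid (32 bits) -> 48-bit ID with bits47..32 cleared
      return v4_refid & 0xFFFFFFFF

  print(hex(map_v4_refid(0xC0A80001)))   # refid for 192.168.0.1 -> 0xc0a80001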

>> * Packet logging for monitoring and management would be challenging
>
> You mean manually checking whether a server is in the filter?

For diagnosing "why did things break" after the fact, it would be nice to have a log of this data somewhere.

>> * Operators would probably want some way to make this more deterministic,
>> maybe a configured local ref id
>
> It could be saved to disk to not change across service restarts.

I guess I'd rather see it in a config file than in a generated state file. If no ID is configured, it would be automatically generated each time.