Re: [Ntp] NTPv5 draft

Warner Losh <imp@bsdimp.com> Wed, 09 December 2020 05:30 UTC

From: Warner Losh <imp@bsdimp.com>
Date: Tue, 08 Dec 2020 22:29:58 -0700
Message-ID: <CANCZdfqT7Wo5jFCU1+emDpAmdTq1XeMuZEJgu9Y4_uBLLtbZag@mail.gmail.com>
To: Magnus Danielson <magnus@rubidium.se>
Cc: NTP WG <ntp@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/VfFGB7sofRVgt0BPIBR8gStmkTA>
Subject: Re: [Ntp] NTPv5 draft

On Tue, Dec 8, 2020 at 4:43 PM Magnus Danielson <magnus@rubidium.se> wrote:

> Warner,
> On 2020-12-08 21:11, Warner Losh wrote:
>
> Magnus,
>
> On Tue, Dec 8, 2020 at 8:53 AM Magnus Danielson <magnus@rubidium.se>
> wrote:
>
>> Agree fully. You bring up a point which I think not everyone appreciates.
>> That we develop protocols for the Internet does not mean that the systems
>> running them actually have access to the network, or that such access is
>> open in any normal sense. For such air-gapped networks, update cycles can
>> also be long enough that, for instance, a leap-second file cannot be sent
>> along with each upgrade to keep the leap-second list correct that way.
>>
>
> When I implemented the LORAN-C timing system refresh at Timing Solutions,
> this was the biggest issue we had. We could get leap seconds from GPS, but
> had to resort to a number of ad-hoc methods to distribute them from there.
> In particular, we needed to cope with the scenario where a computer fails
> and is replaced by a spare that had been on the shelf for 5 years. The
> spare still could not get the UTC time of the atomic clock wrong when
> setting it (since LORAN TOC was timed against the UTC time scale, a wrong
> time meant wrong timing signals to the LORAN transmitter), nor could it
> wait for a full GPS almanac to download.... We used a number of ad-hoc
> methods to do this since there was nothing standard, and in practice they
> proved to be less robust than one would have liked. These machines were
> networked, but on a private network that existed only at each LORAN
> station, with minimal connectivity to the outside world (if any).
>
> Leap seconds sound easy and simple, but the logistical issues surrounding
> them are legendary, especially when one must engineer for discontinuous use
> cases.
>
> Leap seconds are trivial if you have perfect knowledge. Sadly, history has
> shown this knowledge to be somewhat less than perfect...
>
> It's not only that; it also depends on what system they are conveyed
> over. Some work better than others. Some clearly show that they were not
> designed with leap-second handling up front, and the patchwork creates
> various pains. As we look forward for NTP, for instance, we should make sure
> that the leap-second difference and related mechanisms go in with it up
> front.
>
> As I've done work recently in a related effort, I've found that while an
> operating system may have the needed capabilities, knowing what comes in
> the box and how to configure it properly to build a capable system
> remains the challenge. What we can learn is to remove as much detailed
> configuration as possible: proper operation should be the default behavior,
> needing only minimal configuration. As you step the major version, you are
> allowed to break some old behaviors in order to establish new default
> behavior.
>
I'd love that. Too few things care to get leap seconds right, leading to
the balkanization of efforts and generally a dog's breakfast of what works
and what doesn't. Bravo!

> Once TAI, UTC, GPS, PTP or NTP time is known alongside the leap-second
> info, producing the other time-scales is fairly simple and
> straight-forward, including leap-second steps. What keeps confusing people
> is that UNIX and Linux time-scales jump differently.
>

Yea, you need both. The trouble is thinking you can get by with only one when
you can't quickly lay hands on the other. It's that last bit I ran into
trouble with: even though the systems I worked on were redundant, sharing that
information in a pinch was a pain. Whose copy was right, what to do when
they disagreed, etc. got off into the weeds quickly...
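
To make the "you need both" point concrete, here is a minimal sketch of the
conversion, assuming you already hold a trusted copy of the leap-second
history. The names (LEAP_TABLE, tai_minus_utc, utc_to_tai, utc_to_gps) are
hypothetical and only for illustration, and the table is a short excerpt, not
a complete list. It is not how any of the systems discussed here did it.

```python
from datetime import datetime, timedelta, timezone

# Excerpt of the published leap-second history: (UTC instant from which the
# offset applies, TAI-UTC in seconds). A real system needs the full,
# current list; this excerpt is an assumption made for the example.
LEAP_TABLE = [
    (datetime(2009, 1, 1, tzinfo=timezone.utc), 34),
    (datetime(2012, 7, 1, tzinfo=timezone.utc), 35),
    (datetime(2015, 7, 1, tzinfo=timezone.utc), 36),
    (datetime(2017, 1, 1, tzinfo=timezone.utc), 37),
]

GPS_MINUS_TAI = -19  # fixed offset: GPS time is always 19 s behind TAI


def tai_minus_utc(utc):
    """Return TAI-UTC in seconds for a UTC datetime covered by LEAP_TABLE."""
    if utc < LEAP_TABLE[0][0]:
        raise ValueError("instant predates this table excerpt")
    offset = LEAP_TABLE[0][1]
    for start, value in LEAP_TABLE:
        if utc >= start:
            offset = value  # keep the latest entry that is already in effect
    return offset


def utc_to_tai(utc):
    """Return the TAI clock reading for a UTC datetime.

    The tzinfo on the result is only a carrier; the value is a TAI reading.
    Note that datetime cannot represent 23:59:60, so the leap second itself
    is out of scope for this sketch.
    """
    return utc + timedelta(seconds=tai_minus_utc(utc))


def utc_to_gps(utc):
    """Return the GPS-time clock reading for a UTC datetime."""
    return utc_to_tai(utc) + timedelta(seconds=GPS_MINUS_TAI)
```

The arithmetic is the easy part. Everything hard lives in LEAP_TABLE: if the
table on a shelf spare is years stale, the code runs happily and returns the
wrong answer without any indication.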

> The only IRIG-B version I've seen that does leap-second handling fairly well
> is the IEEE C37.118.1 variant. Several of the IRIG variants are not
> even close to handling leap seconds well. The power-grid folks are
> very aware of it, and testing has been done to improve the situation. That
> remains the key point: sufficient support and testing.
>
Yes. I've only ever seen it in an ancient product at Timing Solutions that
seemed to be either a one-off or a checklist item, since it never seemed to
work. Even if it did, long-term clients didn't get enough warning because
the lead time was < 1000 seconds...

 Warner