Re: [Ntp] NTPv5 draft
Magnus Danielson <magnus@rubidium.se> Wed, 09 December 2020 12:26 UTC
Cc: magnus@rubidium.se, NTP WG <ntp@ietf.org>
To: Warner Losh <imp@bsdimp.com>
From: Magnus Danielson <magnus@rubidium.se>
Message-ID: <05cd5f02-91fd-025e-0cb2-cef53e32ee4b@rubidium.se>
Date: Wed, 09 Dec 2020 13:26:28 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:78.0) Gecko/20100101 Thunderbird/78.5.1
MIME-Version: 1.0
In-Reply-To: <CANCZdfqT7Wo5jFCU1+emDpAmdTq1XeMuZEJgu9Y4_uBLLtbZag@mail.gmail.com>
Content-Type: multipart/mixed; boundary="------------2DC7B7C08DCD2A1A9FE64694"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/ncqQYHNIKllh5PLw-f-szG3HDYM>
Subject: Re: [Ntp] NTPv5 draft

Warner,

On 2020-12-09 06:29, Warner Losh wrote:
> On Tue, Dec 8, 2020 at 4:43 PM Magnus Danielson <magnus@rubidium.se>
> wrote:
>
> > Warner,
> >
> > On 2020-12-08 21:11, Warner Losh wrote:
> > > Magnus,
> > >
> > > On Tue, Dec 8, 2020 at 8:53 AM Magnus Danielson
> > > <magnus@rubidium.se> wrote:
> > >
> > > > Agree fully. You bring up a point which I think not everyone
> > > > appreciates. That we develop protocols for the Internet does
> > > > not mean that systems actually have access to the network, or
> > > > that such access is open in any normal sense. For such
> > > > air-gapped networks, update cycles can also be so long that,
> > > > for instance, a leap-second file cannot be sent along with
> > > > each upgrade to ensure a correct leap-second list that way.
> > >
> > > When I implemented the LORAN-C timing system refresh at Timing
> > > Solutions, this issue was the biggest issue we had. We could get
> > > leap seconds from the GPS, and had to resort to a number of
> > > ad-hoc methods to distribute from there. Especially since we
> > > needed to cope with the scenario where a computer fails and is
> > > replaced by a spare that had been on the shelf for 5 years, but
> > > still had the requirement that it couldn't set the UTC time of
> > > the atomic clock wrong (since LORAN TOC was timed vs the UTC time
> > > scale, so a wrong time meant wrong timing signals to the LORAN
> > > transmitter), nor wait for a full GPS almanac to download.... We
> > > used a number of ad-hoc methods to do this since there was
> > > nothing standard, and in practice they proved to be less robust
> > > than one would have liked. These machines were networked, but on
> > > a private network that existed at each LORAN station only, with
> > > only minimal connectivity to the outside world (if any).
> > >
> > > Leap seconds sound easy and simple, but the logistical issues
> > > surrounding them are legend, especially when one must engineer
> > > for discontinuous use cases.
> > >
> > > Leap seconds are trivial if you have perfect knowledge. Sadly,
> > > history has shown this knowledge to be somewhat less than
> > > perfect...
> >
> > It's not only that; it also depends on what system they are
> > conveyed over. Some work better than others. Some show for sure
> > that they were not designed with leap-second handling up-front,
> > and the patchwork creates various pains. As we look forward for
> > NTP, for instance, we should make sure that the leap-second
> > difference and related mechanisms go in with it up front.
> >
> > As I've done work recently in a related effort, I've found that
> > while the operating system may have the needed capabilities,
> > knowing what comes in the box, and configuring it properly to
> > build a capable system, remains the challenge. What we can learn
> > is to remove as much as possible of the detailed configuration
> > needed to make it work properly, and instead make it the default
> > behavior, needing minimal configuration to work. As you step the
> > major version, you are allowed to break some old behaviors in
> > order to establish new default behavior.
>
> I'd love that. Too few things care to get leap seconds right, leading
> to the balkanization of efforts and generally a dog's breakfast of
> what works and what doesn't. Bravo!

Thanks. So, if you look at Linux today, you have the NTP nano-kernel
support, and you have POSIX support for a monotonic clock as well as a
TAI clock. You can get the TAI-UTC leap-second difference in and out of
the kernel over the nano-kernel interface. The code in the kernel is
able to perform the leap second. What remains for us to do is to make
sure we provide the leap-second information to it, and that other
applications request time in such a way that they get the leap-second
information they need. This is not so much a technology issue as a
documentation and education issue. Long gone is the time when I needed
to patch the kernel, recompile it and stuff like that.
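
As a rough illustration of reading the TAI-UTC difference back out of
the kernel, here is a minimal Python sketch. It assumes a Linux host and
Python 3.9 or later, where the time module exposes CLOCK_TAI. The helper
names tai_minus_utc and read_kernel_tai_offset are made up for this
example. A result of 0 usually just means that nothing has told the
kernel the offset yet.

```python
import time


def tai_minus_utc(tai_seconds, utc_seconds):
    """Round the difference between two clock readings to whole seconds."""
    return round(tai_seconds - utc_seconds)


def read_kernel_tai_offset():
    """Return the kernel's TAI-UTC offset in seconds, or None if unavailable.

    Hypothetical helper: it compares CLOCK_TAI with CLOCK_REALTIME, which
    only differ by the offset that has been loaded into the kernel.
    """
    clock_tai = getattr(time, "CLOCK_TAI", None)  # Linux-only constant
    if clock_tai is None:
        return None
    try:
        utc = time.clock_gettime(time.CLOCK_REALTIME)
        tai = time.clock_gettime(clock_tai)
    except OSError:
        return None
    return tai_minus_utc(tai, utc)


if __name__ == "__main__":
    print("kernel TAI-UTC:", read_kernel_tai_offset())
```

An application that needs TAI can then read CLOCK_TAI directly, provided
the time daemon has loaded the offset into the kernel.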

The technology support we could potentially need is for libc to read
out the TAI-UTC difference and output the right format of time as a
result. The needed logic isn't all that hard. From there there is a
ripple effect, but not too hard.

The one thing to fix in NTP is that the TAI-UTC leap-second difference
should never have been tied so hard to the authentication mechanism as
it was. There were actually three components in the same RFC: the
authentication, the NTP extension field mechanism and the TAI-UTC
leap-second extension. As the authentication was being thrown out, so
were the extension field mechanism and the TAI-UTC leap-second
extension. I did see a draft for saving the extension field mechanism,
and I think it ended up as an RFC, but I can't recall the TAI-UTC
leap-second extension being "saved". We need to ensure we maintain the
TAI-UTC leap-second extension, and I would even prefer it to become a
mandatory component as we go forward, as I see it as a crucial
component. Systems such as UTC and PTP provide it, and it is a
game-changer in being able to properly handle things. The prewarning
approach to a step helps, and is actually needed in parallel for best
function, as repeatedly demonstrated, but it ends up being vulnerable to
corner cases. I would make sure that a future NTPv5 definitely has the
TAI-UTC leap-second extension included.

We can improve the state of things through documentation, potentially as
an informational RFC, on best current practices for how systems and
implementations should handle time-scales. I've even found such RFCs
that are to some degree misleading, so providing an update for such an
RFC may be a side-consequence of the process.

> > Once TAI, UTC, GPS, PTP or NTP time is known alongside the
> > leap-second info, producing the other time-scales is fairly simple
> > and straight-forward, including leap-second steps. What keeps
> > confusing people is that UNIX and Linux time-scales jump
> > differently.
>
> Yea, you need both.
> It's thinking you can get by with only one if you can't quickly lay
> hands on the other. It's the last bit I ran into trouble with, even
> though the systems I worked on were redundant, sharing that
> information in a pinch was a pain. Whose copy was right, what to do
> when they disagreed etc got off into the weeds quickly...

Yes, that is exactly my experience too. Essentially, when you are caught
by the moment of surprise it will be hard to backtrack nicely, and this
only comes down to not having prewarning and proper offset values
available. If you have those values, it's not too hard.

Another thing which keeps confusing people is that they think UNIX or
POSIX clock behavior also applies to the Linux clock, and it doesn't.
They have different leap-second behavior, but if you have the
information in advance and know how they behave, properly traversing the
leap second becomes fairly dull algebra. I made a small piece of code
(attached for whoever is interested) that outputs this:

ntp 3692217598 ptp 1483228834 gps 1167264015 tai 00:00:34 utc 23:59:58 gps 00:00:15 unix 23:59:58 linux 23:59:58 tai-utc 36 leapsec 0 tai-utc(post) 36
ntp 3692217599 ptp 1483228835 gps 1167264016 tai 00:00:35 utc 23:59:59 gps 00:00:16 unix 23:59:59 linux 23:59:59 tai-utc 36 leapsec 0 tai-utc(post) 36
ntp 3692217600 ptp 1483228836 gps 1167264017 tai 00:00:36 utc 23:59:60 gps 00:00:17 unix 00:00:00 linux 23:59:59 tai-utc 37 leapsec 1 tai-utc(post) 36
ntp 3692217600 ptp 1483228837 gps 1167264018 tai 00:00:37 utc 00:00:00 gps 00:00:18 unix 00:00:00 linux 00:00:00 tai-utc 37 leapsec 0 tai-utc(post) 37
ntp 3692217601 ptp 1483228838 gps 1167264019 tai 00:00:38 utc 00:00:01 gps 00:00:19 unix 00:00:01 linux 00:00:01 tai-utc 37 leapsec 0 tai-utc(post) 37

It builds on a number of relationships. In the process, I found that
there was an unfortunate wording in the NIST leapsecond file, and Judah
Levine thanked me for the observation and said he would adjust the
wording for the next update.
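
The relationships behind a table like the one above can be sketched in a
few lines of Python. This is not the attached program, only a minimal
reconstruction under stated assumptions: the input is PTP seconds, a
single positive leap second is known in advance, and the "unix" and
"linux" behaviours are simply read off the corresponding columns of the
table rather than taken from any specification. All names here
(timescales, hms, leap_ptp, offset_before and the constants) are
invented for the illustration.

```python
NTP_MINUS_UNIX = 2_208_988_800   # seconds from 1900-01-01 to 1970-01-01
GPS_EPOCH_IN_PTP = 315_964_819   # 1980-01-06 00:00:00 UTC as PTP seconds
                                 # (Unix 315964800 plus TAI-UTC of 19 s)


def hms(seconds, extra=0):
    """Format seconds as an hh:mm:ss time of day; extra=1 gives a :60 label."""
    h, rem = divmod(seconds % 86_400, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s + extra:02d}"


def timescales(ptp, leap_ptp, offset_before):
    """Derive the other time-scales from PTP seconds around one leap second.

    leap_ptp is the PTP second that coincides with UTC 23:59:60, and
    offset_before is TAI-UTC before the event. Both are assumed known.
    """
    in_leap = ptp == leap_ptp
    # "linux" column: 23:59:59 repeats, so the offset grows as the leap
    # second starts.
    offset = offset_before + (1 if ptp >= leap_ptp else 0)
    linux = ptp - offset
    # "unix" column: 00:00:00 repeats, so the step shows one second later.
    unix = ptp - offset_before - (1 if ptp > leap_ptp else 0)
    return {
        "ntp": unix + NTP_MINUS_UNIX,
        "ptp": ptp,
        "gps": ptp - GPS_EPOCH_IN_PTP,
        "tai": hms(ptp),
        "utc": hms(linux, extra=1 if in_leap else 0),
        "unix": hms(unix),
        "linux": hms(linux),
        "tai-utc": offset,
        "leapsec": int(in_leap),
    }


LEAP_PTP = 1_483_228_836  # 2016-12-31 23:59:60 UTC, TAI-UTC going 36 -> 37
for ptp in range(LEAP_PTP - 2, LEAP_PTP + 3):
    print(timescales(ptp, LEAP_PTP, offset_before=36))
```

The point of the sketch is that every column is plain integer
arithmetic once the leap-second instant and the offset before it are
available ahead of time.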
The issue with that wording is that one needs to understand it as being
related to the NTP context, and thus to the behavior of the NTP
time-stamp. Some useful relationships are found in the leapsecond file,
some in the NTP documentation and books, and some in the IEEE Std 1588
definition and its annex B. This scattered set of references may need to
be collected in some more proper place.

> > The only IRIG-B version I've seen that does leap-second handling
> > fairly well is the IEEE C37.118.1 variant. Several of the IRIG
> > variants are not even close to doing leap-second handling very
> > well. The power-grid folks are very aware of it, and testing has
> > been done to improve the situation. That remains the key point:
> > sufficient support and testing.
>
> Yes. I've only ever seen it in an ancient product at Timing Solutions
> that seemed to either be a one-off or a checklist item since it never
> seemed to work and even if it did long term clients didn't get enough
> warning because the lead time was < 1000 seconds...

IEEE C37.118.1 provides an extension that carries the prewarning
mechanism and the local clock offset. That is much more helpful than
suddenly finding yourself at 23:59:60 or 00:00:60 with an unknown local
clock offset: you get a pre-warning (albeit not very far in advance)
that a leap second is coming, with a +1:00 hour local time offset you
can expect 00:00:60+1:00 to occur, and you also get the information that
this second is the leap-second event. It's needed because, as you
measure and time-stamp the phase of the 50 or 60 Hz power grid, just a
tad of frequency offset of the grid and a leap second here and there
will create a false indication of massive phase error. Those interested
in evidence of that are recommended to check out NASPI, and in
particular the NASPI TSTF report from 2017 that I helped with. So, while
not perfect, it illustrates the importance of providing the key
information.
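
To put a number on the false phase indication described above, here is a
small worked example in Python. It is only a sketch under simple
assumptions: the grid runs at a constant frequency over the interval,
and the time-stamp is wrong by exactly the one second of a mishandled
leap second. The function name false_phase_error_deg and its parameters
are invented for this illustration.

```python
def false_phase_error_deg(grid_hz, timestamp_error_s=1.0):
    """Apparent phase shift in degrees caused by a time-stamp error.

    A time-stamp error of timestamp_error_s seconds moves the measured
    waveform by grid_hz * timestamp_error_s cycles. Whole cycles are
    invisible in a phase measurement, so only the fractional part
    remains, wrapped to roughly +/-180 degrees.
    """
    cycles = grid_hz * timestamp_error_s
    fraction = cycles - round(cycles)
    return fraction * 360.0


if __name__ == "__main__":
    for hz in (50.0, 50.02, 59.98):
        print(hz, "Hz ->", round(false_phase_error_deg(hz), 3), "degrees")
```

At exactly 50 Hz a one-second error hides in whole cycles, but with the
grid at 50.02 Hz the same error shows up as roughly 7.2 degrees of phase
that was never there.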

IRIG-B has the other significant drawback that the time may, for
regulatory or operational reasons, be set to local time rather than UTC.
One therefore also needs to carry the local time offset so that the
underlying UTC time can be retrieved, but also to indirectly follow TAI,
as the phase comparison is actually against TAI. This goes to show how
an application can need all three times, and how crucial full support
has become for a number of applications.

I see in fact more and more applications where leap-second smearing
creates worse problems than just addressing the leap-second problem
up-front and being done with it. This turns out to be just about the
same problem as when UTC was using frequency corrections prior to 1972,
so we end up suffering from the same basic problem.

The work I had to do related to the AES67 standard and is applicable to
WAN and also cloud environments. We concluded that PTP access cannot
generally be assumed to be available, either because the network is not
able to convey PTP, which you cannot assume in a WAN setting, or for
that matter because the host environment is not able to provide access
to PTP or a similar time service, which can be the case on mobile phone
devices as well as on virtual hosts in a cloud environment. Thus the
need to be able to access the PTP time-scale indirectly through other
means, as it forms the basis for the AES67 media clock. So, I ended up
having to revisit this particular rabbit hole.

I think we can improve the state of things to guide people right. I
think we can also learn from real-life challenges what we need to
consider as we design these things, and then document them so that we
get people up to speed more quickly. At the same time, I notice that for
each leap second the consequences become smaller: bugs are being fixed
and testing has improved.

Cheers,
Magnus