Re: [Ntp] CLOCK_TAI (was NTPv5: big picture)

Magnus Danielson <magnus@rubidium.se> Fri, 08 January 2021 01:38 UTC

Cc: magnus@rubidium.se, ntp@ietf.org
To: Martin Burnicki <martin.burnicki@meinberg.de>, Miroslav Lichvar <mlichvar@redhat.com>
References: <20210102081603.1F63C40605C@ip-64-139-1-69.sjc.megapath.net> <cecaf661-92af-8b35-4c53-2f025c928144@rubidium.se> <20210104164449.GE2992437@localhost> <b1e61f7d-6cea-5e99-69f0-7eae815d9e19@rubidium.se> <20210105083328.GA3008666@localhost> <ba5d2cde-6b5e-d9b6-1877-c4060bf43e80@rubidium.se> <f8a1b9fa-887f-3402-d6e9-19dd4fa98e33@meinberg.de> <75348282-d6aa-e1f1-0ab1-4dfbc1379ff4@rubidium.se> <39e28d2c-454d-43f1-ee58-b136187212b1@meinberg.de> <f1592fa2-3922-e2ac-a9d4-6dfccaa17c36@rubidium.se> <b835a9bf-510d-c1a4-52f7-29607cff3a5b@meinberg.de>
From: Magnus Danielson <magnus@rubidium.se>
Message-ID: <881dd23a-39a4-c5a8-04f3-bc8686aa7ccb@rubidium.se>
Date: Fri, 08 Jan 2021 02:38:38 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:78.0) Gecko/20100101 Thunderbird/78.6.0
MIME-Version: 1.0
In-Reply-To: <b835a9bf-510d-c1a4-52f7-29607cff3a5b@meinberg.de>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/gytaIATBWAru2siIIZdkmH0LHRs>
Subject: Re: [Ntp] CLOCK_TAI (was NTPv5: big picture)

Martin,

On 2021-01-07 17:20, Martin Burnicki wrote:
> Magnus Danielson wrote:
>> Martin,
>>
>> On 2021-01-06 11:24, Martin Burnicki wrote:
>>> Magnus Danielson wrote:
>>>> Martin,
>>>>
>>>> On 2021-01-05 17:24, Martin Burnicki wrote:
>>>>> Magnus Danielson wrote:
>>>>>> I agree that NTPv5 cannot support only TAI.
>>>>>>
>>>>>> However, there comes the first challenge: should the core time-stamp
>>>>>> format be a single format or that of any timescale? I think we agree
>>>>>> that formats having the POSIX, Linux and NTPv4 type of leap-second jumps
>>>>>> are to be avoided.
>>>>> I really wonder what the "NTPv4 type of leap second jumps" is. If the
>>>>> system supports leap seconds at all, at least ntpd just passes the leap
>>>>> second announcement to the kernel. And the daemon relies on the
>>>>> timestamps returned by the kernel.
>>>>>
>>>>> If the kernel handles this in a way we don't consider "appropriate",
>>>>> what could an NTP (or PTP or whatever) daemon do to fix it? Applications
>>>>> will still suffer from limitations of the operating system.
>>>> Sure. It's a multi-front battle.
>>>>
>>>> With "NTPv4 type leap-second jump" I mean the way that the NTPv4 clock
>>>> ticks in combination with the leap second flag.
>>> Hm, isn't that a limitation of the system clock, depending on the
>>> operating system?
>> It's both. It has affected both the on-wire protocol and how some
>> operating systems happen to behave. I think we need to treat those as
>> separate problems. It will be a benefit if we can abandon it in the
>> on-wire protocol while ensuring we can maintain the mapping over to
>> whatever operating system mechanism is used in a particular machine.
>>
>>>> It's not a very neat
>>>> format, even if it works. Alone it does not give us the TAI-UTC
>>>> difference, but if we set that up correctly, we have padded the
>>>> leap-second and a
>>>> good implementation can get things right.
>>> Right, and if we stick with the timestamps in the scale that is
>>> currently used (as I proposed earlier), the protocol is useful for
>>> applications that want/need both UTC and TAI, as well as for UTC-only
>>> applications of which there are zillions.
>> I do not really agree with that. It does come with a burden, and
>> whether we want to keep bearing it is actually being challenged, as it
>> makes many parts of the core processing more complex, as well as the
>> adaptation processing.
>> There is a will to simplify, and if you want to simplify you can do
>> that, but then one has to face some tough decisions. Personally, I think
>> those steps are worth taking, and I think there can be good ways to
>> mitigate many of the concerns, but I do understand that it can be hard
>> to take in all the various aspects.
> My opinion is that if you simplify here, you just shift the problem
> elsewhere, and you probably get even more problems if you have to care
> about that.

But that is the whole point of the exercise! For sure the issue does not
disappear, but the hope is that it can move into a corner where it does
less harm.

>
>>>>>> Will that satisfy all needs? Maybe not. OK, but will this provide a
>>>>>> vehicle for more variants? That seems likely if we have a standard
>>>>>> way of augmenting the core timing with additional parameters and
>>>>>> users add the mapping.
>>>>> That's also fine, but by default it should IMO be possible to provide
>>>>> simple clients with time in a way as compatible as possible.
>>>> Indeed. I am confident that the client can be very simple and provide
>>>> the right time, even with leap seconds occurring.
>>> Of course. But the limitation is in the server. Right now, it is
>>> sufficient to have a time source like DCF77. If you enforce the use of
>>> TAI, that's not sufficient anymore.
>>>
>>> You only need the UTC/TAI offset if your system supports TAI, but there
>>> are many applications where this isn't supported, and where it's not
>>> even a requirement.
>> Actually, your logic is NTPv4 centric.
> No, my logic is based on the application and usability, for systems with
> high requirements as well as for simple systems, of which there are lots.
How, then, are leap seconds handled easily? I fail to see that being
done easily.
>
>> Now, I tried to make very clear
>> it would push requirements onto servers if you would take the suggested
>> path. I'm not going to hide that, rather, I want people to understand
>> that this would be the logical consequence. However, once that is taken,
>> the other parts would become much easier.
> At the server side, if you want or need to provide time synchronization
> for TAI-based systems and for systems that don't need/use it, in any
> case you have to take care to get a timestamp and TAI/UTC offset
> *consistently*.
>
> The only question is whether you put TAI into the base packet and apply
> the offset to yield UTC, or vice versa, where IMO the latter is much
> more appropriate for simple systems.

Except that when you do your core processing, it needs to handle the
occurrence of leap seconds in all its processing. If you use a
TAI-like time-base in the core, there is a whole line of checks and
balances in that core which never get that added complexity... in a
simple system. That is the actual point of doing it. But it will
come at a cost: it has the side consequence that servers need to know
what to do. Simple clients will shift some of their complexity from
core processing to the output adaptations, sure. It is only by looking
at all those checks and balances that we can make an informed decision.
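
To make the core-processing point concrete, here is a minimal Python
sketch (not taken from any NTP implementation) of measuring an interval
that straddles the leap second inserted at the end of 2016. The helper
name posix_to_tai_like and the choice of a TAI-like count sharing the
POSIX epoch are assumptions made only for illustration.

```python
# 2017-01-01T00:00:00Z as POSIX seconds. A leap second was inserted just
# before it, taking TAI-UTC from 36 s to 37 s.
POSIX_AT_2017 = 1483228800


def posix_to_tai_like(posix_seconds, tai_minus_utc):
    """Hypothetical helper: continuous TAI-like count on the POSIX epoch."""
    return posix_seconds + tai_minus_utc


before = POSIX_AT_2017 - 1  # 2016-12-31T23:59:59Z, TAI-UTC = 36
after = POSIX_AT_2017       # 2017-01-01T00:00:00Z, TAI-UTC = 37

# POSIX-style timestamps hide the inserted second 23:59:60.
posix_interval = after - before

# The TAI-like count keeps ticking through the leap second.
tai_interval = posix_to_tai_like(after, 37) - posix_to_tai_like(before, 36)

print(posix_interval, tai_interval)
```

The POSIX-style difference reports 1 s for what was 2 s of elapsed
time, so core code working on such timestamps needs a special case,
while code working on the TAI-like count does not.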

>
>> If you attempt to go the other route, in which you have a proliferation
>> of time-scales that servers can support, you then push the hurdles out
>> to intermediary and client nodes to handle. Many parts of the
>> algorithms we are used to using will become complex, or you would have
>> to push users to choose only servers with compatible time-scales, which
>> would be even worse as it would deepen the division we are already
>> seeing.
>>
>>>>>> Sure, I am advocating for a particular part of the solution space. But I
>>>>>> think it is doable without any of the parts becoming cumbersomely
>>>>>> complex to test and verify. An example is how PTP has extension fields
>>>>>> extended by SMPTE 2059-2 and the various output mappings documented
>>>>>> separately in SMPTE 2059-1 to show how the transported parameters
>>>>>> generate all the legacy timing things. SMPTE 2110-10 then extends SMPTE
>>>>>> 2059-2 and 2059-1 for the application within the SMPTE 2110 transport
>>>>>> protocols and timing model of media used there. Anyway, I think it is a
>>>>>> fair model that seems to work.
>>>>> If a *simple* client needs to do this just to derive UTC (which is
>>>>> probably the case for most non-telco devices) then it's the wrong
>>>>> approach to provide TAI and derive UTC from it.
>>>> Well, it may seem so at first, but if we take on the burden of getting
>>>> TAI and TAI-UTC and transport that, the client can be very simple and
>>>> provide TAI, UTC, POSIX, LINUX, PTP, GPS time scale replicas using very
>>>> little code or complexity. The mappings provided will not require many
>>>> pages and it will be easy to implement them all.
>>>>
>>>> Check your inbox for a separate example.
>>> I've seen it. I know conversion between different time scales can be
>>> done easily *if* all required information is available. I've also
>>> written lots of functions that convert between different time scales.
>> Yes, I know you know that field very well. I just wanted to be explicit
>> so we were agreeing on what we were talking about.
>>> AFAIK, there's no current OS that doesn't support UTC, but there are
>>> systems that don't support TAI. So why not use the time scale that is
>>> most supported, and support other scales if they are supported/required?
>> Exactly what do you mean by "support UTC"? Do all OSes you know
>> support time as 23:59:60Z? If they don't, they do not fall into "support
>> UTC" in my book. Some do support mechanisms that enable the user layer
>> to print 23:59:60Z, but far from all. Those that just renumber the UTC
>> leap second to either 23:59:59 or 00:00:00 with no other indication do
>> not support UTC in my book.
> OK, then let us call it POSIX time. We all know that you'll never see a
> second 60 in the kernel clock because it only counts binary seconds anyway.
POSIX time is one such time, yes.
> A huge part of the whole problem is that an inserted leap second is
> originally defined as a numbering of seconds like 58, 59, 60, 0 in
> human-readable format, which is totally unsuitable for timekeeping in an
> OS kernel.
Agree.
> However, you can convert this properly from the binary format to a
> second "60" on a display *if* you have some associated status
> information available with the timestamps, *and* the conversion routines
> in the runtime library evaluate that status.
Completely agree.
> For example, the API calls for Meinberg PCI cards return seconds in
> POSIX format *and* an associated status that tells you the current
> timestamp *is* part of the leap second.
Which is just what a well-designed interface does, and I am not surprised
you did it well; rather, I expect it without checking the details.
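
As a sketch of the status-information point, assume a hypothetical
source that repeats the POSIX value of 23:59:59 during an inserted leap
second and sets a flag. This is not the Meinberg API; the function name
and the flag are invented for illustration. A conversion routine that
evaluates the flag could then render second 60:

```python
import time


def format_utc(posix_seconds, in_leap_second):
    """Render a POSIX timestamp as UTC text, showing :60 when flagged.

    Assumption: during an inserted leap second the source repeats the
    value of 23:59:59 and sets in_leap_second.
    """
    t = time.gmtime(posix_seconds)
    if in_leap_second:
        # Keep date, hour and minute; replace the seconds field by 60.
        return time.strftime("%Y-%m-%dT%H:%M:", t) + "60Z"
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", t)


print(format_utc(1483228799, False))  # ordinary second
print(format_utc(1483228799, True))   # same count, flagged as leap second
```

Without the flag the two calls are indistinguishable, which is the
problem being described.
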
> If that status information is not available, you just push the problem
> to a different location, but you don't solve it.

Which is part of the problem that needs to be solved one way or another.
I am very aware of these problems.

Now, the trouble is that people take POSIX time_t to represent UTC
and think that any POSIX time_t-like system will do UTC on its own, and
that does not work out unless you one way or another have that extra
information. Also, you need a heads-up. Different systems allow or
require (with a maximum and minimum time) advance information about a
pending leap second. Some have a limited ability to indicate one without
a heads-up. So, if you need a heads-up, and NTPv4 gave one day of
heads-up while for NTPv5 we seem to opt for allowing more, you need to
know that the TAI-UTC number is about to shift. For the NTP on-wire
protocol we have to recall that the leap second can occur between a pair
of packets, so we need to know in advance. So, if you need to know of
the shift, you have almost the same problem as knowing which TAI-UTC you
have and when it will step.
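
A rough Python sketch of what such a heads-up amounts to: the current
TAI-UTC value plus the announced next value and the instant it takes
effect. The LeapInfo structure and its field names are hypothetical and
do not correspond to any NTPv4 or NTPv5 wire format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeapInfo:
    tai_minus_utc: int                        # offset in effect now, seconds
    next_tai_minus_utc: Optional[int] = None  # announced offset after step
    step_at_tai: Optional[int] = None         # TAI-like second of the step


def offset_at(info, tai_seconds):
    """Return the TAI-UTC offset that applies at a given TAI-like second."""
    if info.step_at_tai is not None and tai_seconds >= info.step_at_tai:
        return info.next_tai_minus_utc
    return info.tai_minus_utc


# Announcement received ahead of time: 36 s -> 37 s, effective at the
# TAI-like second matching 2017-01-01T00:00:00Z (1483228800 + 37).
info = LeapInfo(36, 37, 1483228837)

# A pair of packets straddling the step each gets the right offset.
# The leap second itself (1483228836) would additionally need a flag.
print(offset_at(info, 1483228835), offset_at(info, 1483228837))
```

A node that only learns the new offset after the step has no way to
place the second packet correctly, which is why the announcement has to
arrive in advance.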

The whole problem of leap-second handling to support UTC thus requires
a heads-up already at the server, and at all the intermediary nodes all
the way to the client, so that the client can do the right thing for
its kernel and user processes when the leap second does occur. There is
no way around that part.

If you say that only non-leap-second time-scales are supported, you can
do much simpler things. Then again, you have ended up ruling out proper
UTC support. So either we split these properties fully, or we fuse
them together in such a way that we agree it will be the simplest and
most robust way, and then certain UTC-specific consequences will end up
defining a number of properties.

Then, there is a strong wish to keep the core time-processing as clean
as possible, and the properties of a TAI-like timestamp format are
attractive. That will create challenges for servers and clients.

However, making all servers provide compatible time has additional
advantages, as one wants to entertain the capability of combining
responses from servers. Now, traditionally that has not been an issue.
However, we see servers doing UT1 and others doing smeared leap seconds
in parallel with those providing UTC-based time (more or less good,
depending on implementation, driver etc). In fact, people wish for there
to be a more open bag of time-scales. That will end up driving further
division rather than improving strength.
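
To illustrate why combining such responses is a problem, here is a
rough Python sketch of a server that smears an inserted leap second
linearly over a window centered on the leap. The 24-hour window, the
function name and the linear shape are assumptions for illustration;
real smearing servers can differ in all three.

```python
def smear_absorbed(utc_seconds, leap_at, window=86400.0):
    """Fraction of the inserted second a smearing server has absorbed.

    Assumption: a linear smear over `window` seconds centered on
    `leap_at`. A stepping server absorbs nothing until the leap itself.
    """
    start = leap_at - window / 2
    if utc_seconds <= start:
        return 0.0
    if utc_seconds >= start + window:
        return 1.0
    return (utc_seconds - start) / window


leap_at = 1483228800  # 2017-01-01T00:00:00Z as POSIX seconds
for hours_before in (12, 6, 0):
    t = leap_at - hours_before * 3600
    print(hours_before, smear_absorbed(t, leap_at))
```

Just before the leap such a server has absorbed about half a second
while a stepping server has absorbed none, so a client combining the two
sees them disagree by roughly 0.5 s.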

So, we either have to take favorite topics off the table, or find a way
to fuse them into a system that achieves all the goals. With the wish
list I have seen, using a TAI-like base and providing mappings in and
out is the only approach I can foresee being fruitful. Even that will
come with its costs, where the servers will hurt most, but I consider
that well worth the cost. Yes, a simple client will need to do a little
more to convert time, but that is a small pain compared to the server
side, which I still consider worth having.
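
As a minimal sketch of how small the client-side mapping can be, assume
the transported values are a TAI-like second count on the 1970-01-01 TAI
epoch (the PTP epoch) plus the current TAI-UTC. The function and
dictionary keys are hypothetical, and the POSIX value is only valid
outside the leap second itself.

```python
# GPS epoch 1980-01-06T00:00:00Z is POSIX 315964800; TAI-UTC was 19 s then,
# so on the TAI-like count the GPS epoch sits 19 s later.
GPS_EPOCH_TAI = 315964800 + 19


def to_timescales(tai_seconds, tai_minus_utc):
    """Map a TAI-like count to several output time-scale replicas."""
    return {
        "TAI": tai_seconds,
        "PTP": tai_seconds,                    # PTP timescale follows TAI
        "POSIX": tai_seconds - tai_minus_utc,  # UTC-like, no leap flag here
        "GPS": tai_seconds - GPS_EPOCH_TAI,    # seconds since the GPS epoch
    }


out = to_timescales(1483228837, 37)  # 2017-01-01T00:00:00Z
print(out)
```

All the leap-second knowledge sits in the single tai_minus_utc input,
which is the burden placed on the server side.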

Cheers,
Magnus