Re: [Ntp] Timescales, leapseconds and smearing
Martin Burnicki <martin.burnicki@meinberg.de> Tue, 08 December 2020 17:29 UTC
To: Kurt Roeckx <kurt@roeckx.be>, ntp@ietf.org
References: <X86sVykHUqlkXP96@roeckx.be>
From: Martin Burnicki <martin.burnicki@meinberg.de>
Organization: Meinberg Funkuhren GmbH & Co. KG, Bad Pyrmont, Germany
Message-ID: <f809b97a-91e1-751d-889d-cf832625f052@meinberg.de>
Date: Tue, 08 Dec 2020 18:29:25 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0
MIME-Version: 1.0
In-Reply-To: <X86sVykHUqlkXP96@roeckx.be>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/bXiNFrpZk_HYj4PL2Bcl1RzkeQg>
Subject: Re: [Ntp] Timescales, leapseconds and smearing
Kurt Roeckx wrote:
[...]
> One problem that NTP currently isn't very good at is dealing with
> the leap second.

Hm, I think the problems aren't due to the protocol but due to the
environments where NTP implementations are expected to run.

If you consider NTP just like a kind of transportation, it has to rely
on the information coming in from different sources, e.g.

- leap second schedule and TAI offset from a leapsecond file
- leap second schedule and TAI offset from a GPS receiver, if the
  protocol (e.g. NMEA) supports this at all
- leap second warning but no TAI offset, e.g. from DCF77
- leap second warning only very shortly before it happens, e.g. 1 hour
  for DCF77, or 10 seconds (IIRC from IRIG, only with IEEE extensions)

There are e.g. GPS receivers out there which provided faulty leap second
warnings, e.g. at the end of September. What should a stratum 1 server
do when it receives such a warning from its trusted reference clock?

At the other side of the transport are the operating systems. Most Unix
systems just step the time back by 1 second to insert a leap second.
Other operating systems (e.g. Windows, except very current versions)
didn't know about leap seconds at all.

If a stratum 1 server receives a valid leap second warning early enough,
it can pass this down to its clients, but it doesn't know what the
clients do with it.

If a PTP slave receives a leap second warning from its grandmaster and
passes it down to its Unix kernel which runs on UTC, the same problems
will occur if the system time is stepped back by the kernel.

> The current draft proposal doesn't address the
> problems. The major problems with it are:
> - The NTP timescale is non-continues in case of a leap second,
>   without an indication of on which scale you are. It also
>   doesn't define which second should get repeated.

As said above, it's not a decision of an NTP daemon how a leap second is
to be handled.
Whether the kernel just steps the time back, and whether it repeats the
last or the first second, is up to the kernel, as long as the kernels
don't provide any other API.

> This means there
> is a 2 second window where it's unclear what the time is.

ntpd from ntp.org sends "not synchronized" for a short interval across a
leap second. This has been proposed by Miroslav some time ago, and is
IMO a very good way to get across the leap second.

> - There is a need to distribute information about when a leap second
>   will happen, which for can happen over NTP or some other way.

This was a feature of the Autokey protocol and extensions, even though I
have to admit that it's not really related to Autokey. Not sure if there
is a replacement in the new extension fields.

> Experience shows that a lot of NTP servers get this wrong,
> resulting in synchronization problems when some servers change
> and others not. When distributed over NTP, a majority of the
> servers need to indicate that it will happen. There is no way
> to indicate that you don't know a leap second will happen or
> not, making it harder to get it correct.

See my comments above. What should an NTP server do if a trusted source
provides invalid information? IMO you can't blame NTP servers for this.

The majority vote of a client is a good thing to discard such faulty
announcements.

> You can fix the first problem by moving to a scale that is
> continues, like TAI. But I'm not sure if it's better or worse
> because of the 2nd problem, it will probably be about the same.
> In TAI it would always be clear what the time means, even if some
> servers know about the leap seconds and others not. It would avoid
> marking some servers as false tickers.

I'm not sure it is so easy. If you switch to TAI you always need a
reliable source for the TAI offset, and the TAI offset has to be
forwarded down the time synchronization chain.
If your trusted source for the TAI offset provides faulty information
(as with the leap second warning above), you will run into the same
problems down the synchronization chain.

> The current proposed draft
> supports working in TAI and smeared NTP. I'm not sure about the
> UT1 scale in the draft but assume it's non-continues.

If you talk about smeared leap seconds, the questions are still

- what shape is used? cosine, linear, ...
- what interval? 2 hours? 24 hours?
- is smearing half before / half after the leap second, or fully before
  the leap second?

> An other way to fix the non-continues problem is to have some
> indication of on which scale you are. It needs to be able to say
> in which scale each of the timestamps is. The proposed draft has a
> TAI-UTC offset if you use the NTP scale that could be used for this,
> but it would then apply to both timestamps. If that is what we
> want to do, it needs to be more clear. But for the UT1 scale there
> is no way to indicate it.

One possible way that comes to my mind is to use an arbitrary time
(whichever), and provide a UTC or TAI offset (including fractions of a
second) in an extension field.

That would be similar to the times used in emails, where the main
timestamp is some local time, but an offset allows the associated UTC
time to be determined.

> NTP could distribute information about difference between
> timescales. A leap second will change the offsets, so we already
> do this in a very limited way. TAI-UTC and UT1-UTC are mentioned in
> the proposed draft, but it depends on which timescale you're using
> which offset you can get. I'm not sure NTP is the best way to
> distribute it. But for a lot of devices NTP is the only source of a
> leap second information.

There is also tzdist, which could be used for this. IMO this could be a
good companion to NTP (or even PTP) because it can provide leap second
warnings and TAI offsets, but also time zone rules, which are also very
important for user space applications.
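
To make the open smearing questions listed earlier more concrete (shape,
interval, and placement relative to the leap second), here is a minimal
Python sketch. It is only an illustration under stated assumptions, not
how any particular server implements smearing: the function name
smear_fraction and its parameters are hypothetical, only an inserted
(positive) leap second is considered, and the sign with which the result
is applied to the served time is deliberately left open.

```python
import math


def smear_fraction(t, interval, shape="linear", placement="centered"):
    """Return the fraction (0.0 .. 1.0) of one inserted leap second
    that has been absorbed at time t.

    t         -- seconds relative to the leap second (negative = before)
    interval  -- total smear duration in seconds, e.g. 7200 or 86400
    shape     -- "linear" or "cosine"
    placement -- "centered" (half before, half after the leap second)
                 or "before" (smear completes at the leap second)

    All names and conventions here are assumptions for illustration.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if placement == "centered":
        start = -interval / 2.0
    elif placement == "before":
        start = -float(interval)
    else:
        raise ValueError("unknown placement: %r" % (placement,))

    # Normalized position inside the smear window, clamped to [0, 1].
    x = (t - start) / interval
    x = min(1.0, max(0.0, x))

    if shape == "linear":
        return x
    if shape == "cosine":
        # Smooth start and end: the rate of change is zero at both edges.
        return (1.0 - math.cos(math.pi * x)) / 2.0
    raise ValueError("unknown shape: %r" % (shape,))
```

With a 24 hour centered smear, half of the leap second has been absorbed
at the leap second itself, whereas with the "before" placement all of it
has. Two servers that pick different answers to the three questions
above therefore disagree by up to a large fraction of a second during
the smear window, which is why mixing them (or mixing them with
non-smearing servers) is a problem.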

> The document also has a smeared NTP option. It doesn't actually
> say which time you put in the fields, the NTP time, or the smeared
> NTP time. It then has an offset to the NTP time, without being
> clear about the sign. The offset is also optional, which means you
> might not be able to combine servers that smear, that smear
> differently and that don't smear.
>
> I'm currently not sure if we should do something with smearing. We
> could for instance say that even if the server is smearing, NTP
> should always contain unsmeared time, and that smearing is an
> implementation detail. Or we could standardize how it should be
> smeared. Or like the current draft that you have smear offset.

AFAIK, the original reason to do smearing is to completely hide a leap
second from the clients.

For example, we had customers running a certified Linux system with a
known bug that the kernel locks up when it inserts a leap second. Due to
the certification, the customer was not even allowed to update the Linux
kernel, even though a kernel with a bug fix was available. So they used
a smearing server to hide the leap second and prevent the kernel from
locking up.

With this in mind, I think the best approach would be to let a server
provide smeared time, with optional information about the current smear
offset, so clients could optionally compensate for the smearing.

ntpd from ntp.org currently provides the current smear offset in a
special interpretation of the refid field, if smearing is in progress.

Martin
-- 
Martin Burnicki

Senior Software Engineer

MEINBERG Funkuhren GmbH & Co. KG
Email: martin.burnicki@meinberg.de
Phone: +49 5281 9309-414
Linkedin: https://www.linkedin.com/in/martinburnicki/

Lange Wand 9, 31812 Bad Pyrmont, Germany
Amtsgericht Hannover 17HRA 100322
Geschäftsführer/Managing Directors: Günter Meinberg, Werner Meinberg,
Andre Hartmann, Heiko Gerstung
Websites: https://www.meinberg.de https://www.meinbergglobal.com
Training: https://www.meinberg.academy