Re: [Ntp] The bump, or why NTP v5 must specify impulse response

Miroslav Lichvar <mlichvar@redhat.com> Tue, 14 April 2020 10:35 UTC

Date: Tue, 14 Apr 2020 12:35:43 +0200
From: Miroslav Lichvar <mlichvar@redhat.com>
To: Doug Arnold <doug.arnold@meinberg-usa.com>
Cc: Daniel Franke <dfoxfranke@gmail.com>, Watson Ladd <watsonbladd@gmail.com>, NTP WG <ntp@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/aXeHIxXoq0UTVveSyYDZ2ipNO9M>

On Mon, Apr 13, 2020 at 09:22:33PM +0000, Doug Arnold wrote:
> If we define a default clock steering algorithm, the most important property is that it always converges, even if it is not the most accurate for any specific network conditions.  It probably has to be fairly simple to have this property.  Complex algorithms usually have weird corner cases.

Simple algorithms may not be useful in real-world applications.
Complex algorithms are difficult to analyze, and it is hard to show
that they always work under given conditions. I'm not sure how much
sense it makes for the draft to have some general requirements.

I'd like to see an example of the issue with a long chain of PLLs. If
anyone has code for a susceptible PLL, I can try to simulate a long
chain.
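
To make the question concrete, here is a minimal sketch of what such a
simulation could look like. It is an assumption-laden toy, not the
servo of ntpd, chrony, or any other implementation: each node runs a
generic discrete-time PI (type-2) loop that tracks the phase of the
node above it, and the source makes a unit phase step at t = 0. The
function name simulate_chain and the gains kp and ki are hypothetical
values picked only for illustration.

```python
def simulate_chain(n_nodes=16, n_steps=4000, kp=0.1, ki=0.005):
    """Return the peak phase reached by each node after a unit phase
    step at the source. Generic PI loop; gains are illustrative only."""
    phase = [0.0] * n_nodes  # phase (time offset) of each node's clock
    freq = [0.0] * n_nodes   # integrator state (frequency correction)
    peak = [0.0] * n_nodes   # largest phase seen so far at each node

    for _ in range(n_steps):
        upstream = 1.0       # the source stepped by one unit at t = 0
        for i in range(n_nodes):
            err = upstream - phase[i]        # measured offset to upstream
            freq[i] += ki * err              # integral term
            phase[i] += kp * err + freq[i]   # proportional term + frequency
            peak[i] = max(peak[i], phase[i])
            upstream = phase[i]              # next node follows this one
    return peak


if __name__ == "__main__":
    for i, p in enumerate(simulate_chain(), start=1):
        print(f"node {i:2d}: peak {p:.3f}")
```

A peak above 1.0 is overshoot. With a loop that has gain peaking, the
peak should grow with the position in the chain; whether a real
implementation behaves like this toy is exactly what would need to be
checked with its actual code.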

> As for Kalman filters, they work really well for things like GPS receivers where the noise properties are relatively constant.  Then you can tune them for those noise properties.  Queuing noise in a network varies wildly, and often exhibits a highly skewed client clock error measurement probability density with a peak near zero followed by a long tail in one direction.  I suspect that Kalman filters will perform poorly on highly skewed data unless you have a non-linear prefilter to make the noise probability density closer to Gaussian.  I know of people who have gotten good results for PTP servo loops using Kalman filters, but with a generalized lucky packet prefilter to eliminate the long tail in the distribution, and with the prefilter also determining the Kalman filter parameters.  That is a complicated algorithm which probably has some catastrophic corner cases, so it wouldn't be a good candidate for a default servo loop which always works at least ok.

In the context of NTP, which is supposed to synchronize computer
clocks, the clock may be an even bigger problem than the network.
Without a good model of the clock, a Kalman filter (at least in its
basic form) cannot perform well. The problem is random changes in
the frequency of the clock caused by temperature changes, which in
turn come from random changes in the load of various components in the
computer. A Kalman filter that works well on an idle computer likely
won't work well on a moderately loaded server. It would need to
include data from a temperature probe. Making the loop adaptive is not
that easy. In NTPv3 and NTPv4 it is done by adjusting the polling
interval, to which the PLL/FLL time constant is tied.
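
As a rough illustration of that kind of adaptation, the sketch below
adjusts a log2 polling interval with a hysteresis counter. It is
loosely modeled on the poll adjustment in the RFC 5905 example code,
written from memory; the names PollState and update_poll and the
constants MINPOLL, MAXPOLL, GATE and LIMIT are assumptions for this
example and should be checked against the RFC before relying on them.

```python
from dataclasses import dataclass

MINPOLL, MAXPOLL = 4, 10  # assumed bounds, log2 seconds (16 s .. 1024 s)
GATE = 4.0                # assumed: offset is "small" if < GATE * jitter
LIMIT = 30                # assumed hysteresis limit for the counter


@dataclass
class PollState:
    poll: int = 6         # current polling interval, log2 seconds
    count: int = 0        # hysteresis counter


def update_poll(state: PollState, offset: float, jitter: float) -> int:
    """Update the polling interval from one offset/jitter sample."""
    if abs(offset) < GATE * jitter:
        # Offset is within the noise: slowly move to a longer interval.
        state.count += state.poll
        if state.count > LIMIT:
            state.count = LIMIT
            if state.poll < MAXPOLL:
                state.count = 0
                state.poll += 1
    else:
        # Offset stands out from the noise: move to a shorter interval,
        # twice as fast as in the other direction.
        state.count -= 2 * state.poll
        if state.count < -LIMIT:
            state.count = -LIMIT
            if state.poll > MINPOLL:
                state.count = 0
                state.poll -= 1
    return state.poll
```

In a loop built this way the PLL/FLL time constant would be derived
from state.poll, so a longer polling interval also means a slower
loop. Note that this reacts only to the measured offsets; it has no
direct knowledge of temperature or load.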

-- 
Miroslav Lichvar