Re: [Ntp] Frequency transfer in NTP

Magnus Danielson <magnus@rubidium.se> Mon, 01 February 2021 12:18 UTC

Cc: magnus@rubidium.se, ntp@ietf.org
To: Miroslav Lichvar <mlichvar@redhat.com>
References: <20210128143137.GA1205378@localhost> <f60202de-d53f-4dea-6e2b-d59dbb0e1143@rubidium.se> <20210201093709.GF1205378@localhost> <a22737e3-05d0-e681-e32f-daada351e51c@rubidium.se> <20210201113856.GI1205378@localhost>
From: Magnus Danielson <magnus@rubidium.se>
Message-ID: <ea775bc0-25dc-5821-3231-00c013f8a39b@rubidium.se>
Date: Mon, 01 Feb 2021 13:18:15 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/SDFoXLVug7Db33od0LMiL4l3crg>
Subject: Re: [Ntp] Frequency transfer in NTP

Miroslav,

On 2021-02-01 12:38, Miroslav Lichvar wrote:
> On Mon, Feb 01, 2021 at 12:00:50PM +0100, Magnus Danielson wrote:
>> On 2021-02-01 10:37, Miroslav Lichvar wrote:
>>> On Sun, Jan 31, 2021 at 11:30:03PM +0100, Magnus Danielson wrote:
>>> Yes, you can dampen the loops to avoid the overshoot, but that will
>>> have a negative impact on the timekeeping performance. You can
>>> minimize the phase error, or frequency error, but not both at the same
>>> time.
>> Turns out that as you measure the phase stability in MTIE and TDEV, as
>> well as frequency stability in ADEV, you need to remove that overshoot /
>> resonance for best stability. This becomes especially true as you build
>> a chain of them, as it will grow worse along the chain.
> Right. You cannot minimize the phase error higher up in the chain
> without disturbing the clocks further down in the chain. A sacrifice
> needs to be made to keep the chain stable unless the frequency is
> transferred separately. That's what I tried to explain in the post.

Once the phase error is small enough, other performance characteristics
tend to dominate, such as frequency stability (ADEV for random noise,
MAFE for systematic noise) and phase stability (TDEV for random noise,
MTIE for systematic noise). It turns out that doing this works really
well in practice for getting very low offsets, as this is what I have
spent a very long time perfecting. You did not really achieve the goal
in the post, as it lacked a number of considerations that I tried to
point out.

Also, the overshoots tend to extend the time before equilibrium is
achieved, so it takes longer to get that stable offset.
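
A minimal sketch of how damping affects the overshoot, under stated
assumptions: a noise-free clock steered by a discrete PI loop, with
gains derived from a damping factor zeta and a natural frequency omega_n
as kp = 2*zeta*omega_n and ki = omega_n^2 (the usual second-order
mapping, valid while omega_n times the poll interval is small). The
function names and parameter values are illustrative only and are not
taken from ntpd or chrony.

```python
def step_response(zeta, omega_n, poll=1.0, steps=2000, x0=1.0):
    """Phase error of a PI-steered clock after an initial offset x0.

    Assumed model (noise-free): every poll the loop measures the phase
    error x, updates an integrator and steers the clock frequency.
    """
    kp = 2.0 * zeta * omega_n      # proportional gain, 1/s
    ki = omega_n ** 2              # integral gain, 1/s^2
    x, integ, trace = x0, 0.0, []
    for _ in range(steps):
        trace.append(x)
        integ += ki * x * poll     # integrator term scaled by the poll interval
        x -= (kp * x + integ) * poll
    return trace


def overshoot(trace):
    """Largest excursion past zero, in the same units as the initial offset."""
    return max(0.0, -min(trace))


# Compare a lightly damped, a critically damped and a heavily damped loop.
for zeta in (0.5, 1.0, 3.0):
    print(zeta, round(overshoot(step_response(zeta, omega_n=0.05)), 3))
```

In this model the overshoot never disappears completely, but it shrinks
steadily as zeta grows.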

>
>> While extremely simple, difficult to tune has not been my experience.
>> Once one has got the gain scaling right, properly orthogonal setting
>> of damping and frequency has been straightforward. Getting optimum
>> performance given a certain condition is as always a bit trickier, but
>> not impossible.
> The tricky part is making it adaptive. It's easy to tune a PI loop for
> a well-characterised clock and network, but making it work well over a
> wide range of conditions, as is common in NTP, is hard.
For sure. That's why you end up having to consider adaptive filters.
However, that adaptive filter needs to respect the basic rules
regardless, so one needs to get the resonance part right to start with
and only then make it auto-tunable. Those are orthogonal properties.
The PI loop is sufficient to understand the basics. We then only want to
tune the frequency for the best cut-over between link noise and local
oscillator noise (and drift).
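
As a rough illustration of keeping damping and frequency orthogonal, the
sketch below maps a (zeta, omega_n) design to PI gains and back, using
the continuous-time approximation x'' + kp*x' + ki*x = 0. It also shows
one way the scaling can go wrong: if the integrator adds a fixed amount
per update with no poll-interval factor, the effective damping drifts
when the poll interval changes. All names here are hypothetical, and the
"unscaled" integrator is an assumed example of a mistake, not a
description of any particular implementation.

```python
import math


def gains_from_design(zeta, omega_n):
    """PI gains (kp in 1/s, ki in 1/s^2) for a chosen damping and frequency."""
    return 2.0 * zeta * omega_n, omega_n ** 2


def design_from_gains(kp, ki):
    """Inverse mapping: recover (zeta, omega_n) from the PI gains."""
    omega_n = math.sqrt(ki)
    return kp / (2.0 * omega_n), omega_n


def effective_design_unscaled(kp, ki_per_update, poll):
    """(zeta, omega_n) when the integrator adds ki_per_update * error per
    update, without multiplying by the poll interval (assumed mistake)."""
    return design_from_gains(kp, ki_per_update / poll)


kp, ki = gains_from_design(zeta=3.0, omega_n=0.01)
# Integrator tuned for a 16 s poll interval, then reused at 64 s.
print(effective_design_unscaled(kp, ki * 16.0, 16.0))
print(effective_design_unscaled(kp, ki * 16.0, 64.0))
```

With the integrator increment scaled by the poll interval instead, the
same (zeta, omega_n) pair holds at every poll interval, as long as
omega_n times the poll interval stays small.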
>
>>> If you wanted to be constructive, it would be best if you showed us an
>>> example of a loop that doesn't overshoot in the step response and
>>> otherwise performs similarly to ntp or chrony.
>> I could provide an example of scaling with poll-rate, sure. I just don't
>> have the setup to do measurement of a chain, so that would take time and
>> effort that I might not be able to pull together, but others may have that.
> I'd like to see an example of a loop (it doesn't have to be PI) that
> performs similarly or better than an implementation using the proposed
> frequency transfer. I don't think that is possible. You say it is, so
> an example would be a good way to show that.

Let me see if I can whip up a little simulator for you then. I've done
that before and made it match up with reality quite nicely.

There is one thing you can do to improve things, but it does not involve
frequency but rather phase. That's where there is room for possible
improvements in my mind. I also said that before we can look at
benefits, we need to fix the basic problems of the setup. Only after
that can we measure things. I did point out that frequency aiding
alters the damping factor, which is why we see damping-factor
improvements, but the initial damping factor was way off what it should
be for a chain network, so it is not a relevant comparison. If I make an
effort, you will have to make one too and fix your extremely unfair
comparisons, as you end up not proving an improvement over a known-good
situation. What you have ends up being a meaningless comparison for me;
it needs to start from a well-tuned setup and then show further
improvements.
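
To illustrate why the damping factor matters so much for a chain, here
is a small sketch under simplifying assumptions: every hop is an
identical second-order PI loop with closed-loop transfer function
H(s) = (2*zeta*omega_n*s + omega_n^2) / (s^2 + 2*zeta*omega_n*s + omega_n^2),
the model is linear, and the worst-case gain peaking of the hops simply
adds in dB. The function names are illustrative.

```python
import math


def jitter_gain(zeta, u):
    """|H(jw)| of one loop at normalized frequency u = w / omega_n."""
    num = 1.0 + (2.0 * zeta * u) ** 2
    den = (1.0 - u * u) ** 2 + (2.0 * zeta * u) ** 2
    return math.sqrt(num / den)


def peaking_db(zeta, points=4000):
    """Peak closed-loop gain in dB, from a log-spaced scan of u in [1e-3, 10]."""
    peak = max(
        jitter_gain(zeta, 10.0 ** (-3.0 + 4.0 * i / (points - 1)))
        for i in range(points)
    )
    return 20.0 * math.log10(peak)


def chain_peaking_db(zeta, hops):
    """Worst-case peaking of `hops` identical loops in series (dB adds)."""
    return hops * peaking_db(zeta)


for zeta in (0.5, 0.707, 3.0):
    print(zeta, round(peaking_db(zeta), 2), round(chain_peaking_db(zeta, 10), 2))
```

In this model a single loop always has some peaking above 0 dB, so the
only way to keep a long chain well behaved is to make the per-hop
peaking small by using a high damping factor.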

Also, what you show is transient behavior, not long-term errors, so it
is not the usual measure of stability. You need to show that as well. My
experience is that transient behaviors need to be taken way down, and
the first thing to fix is damping. Only once damping, and the associated
scaling with polling rate, is done does it become meaningful to consider
fine-tuning of frequency, which is the adaptive filter trap. However,
even a non-adaptive setup should not be too bad, and the chain problems
show up.

I recommend looking at TimeLab and Stable32 for analysis. Stable32 can
be downloaded for free from IEEE UFFC; it used to be commercial
software.
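
For a quick first look at long-term stability, the standard overlapping
Allan deviation estimator can be sketched in a few lines. The input is
assumed to be evenly spaced phase (time error) samples in seconds; the
function name and arguments are illustrative, and this is not meant to
reproduce the exact processing of either tool.

```python
import math


def overlapping_adev(phase, tau0, m):
    """Overlapping Allan deviation at tau = m * tau0.

    phase: evenly spaced time-error samples in seconds
    tau0:  sample spacing in seconds
    m:     averaging factor (integer >= 1)
    """
    n = len(phase)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("need at least 2*m + 1 phase samples")
    tau = m * tau0
    total = 0.0
    for i in range(n - 2 * m):
        # Second difference of phase over a span of 2*tau.
        d = phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]
        total += d * d
    return math.sqrt(total / (2.0 * (n - 2 * m) * tau * tau))


# Sanity check: a linear frequency drift D gives ADEV = D * tau / sqrt(2).
drift = 1e-9
phase = [0.5 * drift * float(i) ** 2 for i in range(1000)]
print(overlapping_adev(phase, tau0=1.0, m=10))
```

A constant frequency offset (a linear phase ramp) contributes nothing to
this estimate, which is why it says little about transient offsets and
a lot about how the clock behaves once it has settled.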

>
> Just post the code of the loop. In any language you like. I can
> integrate it with NTP and compare it with the existing
> implementations.
>
Notice that no "code" will just work when you drop it in. The gain
factors need to be corrected for any implementation. Building code to
tune that is more or less painful depending on the implementation.
Things need to be normalized and parametrized for this to have a chance
of working. I don't have time to make it foolproof.

Cheers,
Magnus