Re: [Ntp] NTP over PTP

Miroslav Lichvar <mlichvar@redhat.com> Wed, 30 June 2021 10:06 UTC

Date: Wed, 30 Jun 2021 12:06:37 +0200
From: Miroslav Lichvar <mlichvar@redhat.com>
To: Heiko Gerstung <heiko.gerstung@meinberg.de>
Cc: "ntp@ietf.org" <ntp@ietf.org>
Message-ID: <YNxCLd3vvm3yMTl7@localhost>
References: <YNRtXhduDjU4/0T9@localhost> <36AAC858-BFED-40CE-A7F7-8C49C7E6782C@meinberg.de> <YNnSj8eXSyJ89Hwv@localhost> <D32FAF20-F529-496C-B673-354C0D60A5AF@meinberg.de> <YNrDGy2M2hpLz9zc@localhost> <C5D99A22-84B8-4D27-BE74-D8267FB1DCB0@meinberg.de> <YNrqWjHPtC7ToAL8@localhost> <125F908E-F80D-4873-A164-A460D96316E5@meinberg.de>
MIME-Version: 1.0
In-Reply-To: <125F908E-F80D-4873-A164-A460D96316E5@meinberg.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/wUgliMEHKEbPOrFC0zWS8CghVMw>

On Tue, Jun 29, 2021 at 01:17:45PM +0200, Heiko Gerstung wrote:
> > On Tue, Jun 29, 2021 at 09:15:22AM +0200, Heiko Gerstung wrote:
> >> > Without full on-path support NTP should generally perform better than
> >> > PTP as it doesn't assume network has a constant delay.
> >> Why do you think PTP assumes a constant network delay? PTP is measuring the
> >> delay constantly in both directions and calculates the round trip.
> > 
> > Yes, it does, but it is separate from the offset calculation.
> Which is not the same as claiming that PTP assumes the "network has a constant delay". 

Ok, let's try again. The delay is assumed to be constant in the
interval between its own measurement and the offset measurement. It is
measured periodically in order to adapt to changes in network
configuration and topology. It is not assumed to change frequently.

> > The calculation is described in section 11.2 of 1588-2019. It uses
> > <meanDelay> and there is only the TX and RX timestamp of the sync
> > message like in the NTP broadcast mode. If the distribution of the
> > actual delay is not symmetric, as is common without full on-path
> > support, the average error of the measurements will not even get close
> > to zero. PTP relies on full hardware support. Without that, it
> > generally cannot perform as well as NTP.
> Wrong, even without full on-path support unicast PTP uses delay requests/responses to take the client-to-server delay into consideration as well. See IEEE1588-2019 Subsection 11.3 for a description of how this works.

Well, yes, but the delay measurement is separate from the offset
measurement. If you log and plot the offsets measured by PTP and
NTP in a network without PTP support, you will see how the
distributions differ due to the different calculations.
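
A minimal sketch of the two calculations in Python may make the
difference concrete. It assumes the standard four-timestamp formulas
(NTP as in RFC 5905, PTP delay request-response as in the sections of
1588-2019 cited above), ignores correction fields, and uses made-up
timestamps in microseconds. The function names are illustrative only.

```python
def ntp_offset(t1, t2, t3, t4):
    """Server-minus-client offset from a single NTP exchange.

    t1: client TX, t2: server RX, t3: server TX, t4: client RX.
    The delay is implicitly taken from the same exchange.
    """
    return ((t2 - t1) + (t3 - t4)) / 2


def ptp_mean_delay(t1, t2, t3, t4):
    """PTP meanDelay (correction fields ignored).

    t1: master TX of Sync, t2: client RX of Sync,
    t3: client TX of Delay_Req, t4: master RX of Delay_Req.
    """
    return ((t2 - t1) + (t4 - t3)) / 2


def ptp_offset(t1, t2, mean_delay):
    """Client-minus-master offset from one Sync and a stored meanDelay."""
    return (t2 - t1) - mean_delay


# Hypothetical numbers in microseconds: the client clock is 5 us ahead.
TRUE_OFFSET = 5.0

# PTP: delay measured while the one-way delay was 10 us in each direction.
mean_delay = ptp_mean_delay(t1=0.0, t2=15.0, t3=100.0, t4=105.0)
# A later Sync is queued in a switch and takes 30 us instead of 10 us.
ptp = ptp_offset(t1=1000.0, t2=1035.0, mean_delay=mean_delay)

# NTP: one exchange during the same congestion, 30 us in each direction.
ntp = ntp_offset(t1=1000.0, t2=1025.0, t3=1026.0, t4=1061.0)

print("PTP error:", ptp - TRUE_OFFSET)   # client-minus-master convention
print("NTP error:", -ntp - TRUE_OFFSET)  # sign flipped to the same convention
```

In this example the delay in the NTP exchange is symmetric, so the NTP
error is zero. With an asymmetric exchange the NTP error would be half
of the asymmetry, while the PTP value carries the full difference
between the delay of the current Sync and the stored meanDelay.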

> > Another issue with using PTP in network without PTP support is RX
> > timestamping fixed to the beginning of the message. If the server is
> > on a 1Gb/s link and the PTP client is on a faster link, there will be
> > an asymmetry of hundreds of nanoseconds due to the asymmetric delay
> > in forwarding of messages between different link speeds.
> Yes, there are implementations which take that into account by applying static correction values to compensate for link speed asymmetry. I believe this also affects NTP, but in most cases hundreds of nanoseconds are not a problem for applications relying on NTP synchronization. 

NTP is much less impacted as it timestamps the end of the reception.
A software timestamp is captured after the packet is received.
Hardware timestamps are transposed as described in this document:
https://www.eecis.udel.edu/~mills/stamp.html

In my testing with several different switches the ideal point of
the transposition was around the beginning of the FCS or a couple of
octets before it, instead of the end. I think the explanation is that
it compensates for the preamble. In either case the error was much
smaller than if it was not transposing at all.
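
As a rough illustration of the transposition, here is a sketch in
Python. It assumes the hardware timestamp is captured at the start of
the frame, an NTP packet over UDP/IPv4 in an untagged Ethernet frame,
and a transposition point at the beginning of the FCS. PHY delays are
ignored and the names are hypothetical.

```python
NS_PER_S = 1_000_000_000


def serialization_ns(octets, link_speed_bps):
    """Time needed to receive the given number of octets on the link."""
    return octets * 8 * NS_PER_S / link_speed_bps


def transpose_rx_timestamp(start_ts_ns, octets, link_speed_bps):
    """Move a start-of-frame RX timestamp forward by the given octets."""
    return start_ts_ns + serialization_ns(octets, link_speed_bps)


# Ethernet (14) + IPv4 (20) + UDP (8) + NTP (48) octets, FCS excluded
NTP_FRAME_OCTETS = 14 + 20 + 8 + 48

at_1g = serialization_ns(NTP_FRAME_OCTETS, 1_000_000_000)
at_10g = serialization_ns(NTP_FRAME_OCTETS, 10_000_000_000)

# Difference between the corrections applied on 1 Gb/s and 10 Gb/s links
print(at_1g, at_10g, at_1g - at_10g)
```

Under a simple store-and-forward model, a difference of this size
between the two directions is what shows up as asymmetry when the
timestamps are left at the beginning of the frame.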

> There are more challenges I see for NTS-over-PTP. You need to synchronize the clock of the hardware timestamper itself, i.e. getting the time into the silicon that creates the timestamp. PTP timestamps are TAI (not UTC), which itself is not a problem as long as you know the TAI-UTC offset. On a server (PTP Grandmaster) this is typically done by using some form of hardware sync for the timestamper engine, e.g. setting the ToD to the upcoming TAI second and then use the PPS to zeroize the fractions. In reality the solution is typically more sophisticated as you do not want to see micro timesteps at the start of every second. 

The same approach can be used with an NTP implementation. The hardware
can keep time in TAI as long as the TAI-UTC offset is known.
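
The conversion itself is trivial once the offset is known. A small
sketch in Python, assuming nanosecond timestamps and the TAI-UTC
offset of 37 seconds that has been in effect since 1 January 2017
(the constant and function names are made up for illustration):

```python
NS_PER_S = 1_000_000_000

# Assumption: offset valid since 2017-01-01. It has to be updated
# whenever a leap second is inserted.
TAI_UTC_OFFSET_S = 37


def tai_ns_to_utc_ns(tai_ns, tai_utc_offset_s=TAI_UTC_OFFSET_S):
    """Convert a hardware timestamp kept in TAI to UTC."""
    return tai_ns - tai_utc_offset_s * NS_PER_S


def utc_ns_to_tai_ns(utc_ns, tai_utc_offset_s=TAI_UTC_OFFSET_S):
    """Convert a UTC timestamp to TAI, e.g. before setting the hardware clock."""
    return utc_ns + tai_utc_offset_s * NS_PER_S
```

This ignores the handling of the leap second itself, which an NTP
implementation has to deal with separately.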

> On a client you have to synchronize your system time with the time of the hw timestamper (e.g. the NIC). That time is synchronized by the hardware itself to the PTP server. ptp4l uses phc2sys for this, but I am not sure about the accuracy with which you can read out the PHC clock and correct the OS clock with it. There is a delay when accessing a NIC over the PCI(e) bus, but this is affecting PTP in the same way. So for the client, you should be on par with PTP in this regard.

I don't see a difference between PTP and NTP in this aspect. You can
use the protocol to synchronize the NIC or the system clock, directly
or indirectly. The PCIe latency is an issue for the system clock
either way. Some NICs support PTM (Precision Time Measurement), an
NTP-like protocol for PCIe with hardware timestamping, which can be
used to avoid the error due to asymmetric PCIe latency.
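
Without PTM, the common software approach is to bracket each reading
of the NIC clock with two readings of the system clock and to trust
the reading with the shortest delay. A sketch in Python, assuming the
readings are already available as (sys_before, phc, sys_after) tuples
in nanoseconds; the function name and the sample values are
hypothetical:

```python
def phc_offset(samples):
    """Estimate the PHC-minus-system offset from bracketed readings.

    Each sample is (sys_before, phc, sys_after) in nanoseconds.
    Returns (offset_ns, read_delay_ns) for the reading with the
    shortest delay between the two system clock readings.
    """
    sys_before, phc, sys_after = min(samples, key=lambda s: s[2] - s[0])
    delay = sys_after - sys_before
    # Assume the PHC was read in the middle of the interval.
    midpoint = sys_before + delay / 2
    return phc - midpoint, delay


samples = [
    (1_000, 51_900, 3_400),     # 2400 ns read delay
    (10_000, 60_700, 11_200),   # 1200 ns read delay (shortest)
    (20_000, 70_800, 21_800),   # 1800 ns read delay
]
print(phc_offset(samples))
```

The midpoint assumption is where the asymmetric PCIe latency turns
into an error of the system clock, for both NTP and PTP.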

> But for a server you have to find a NIC that supports feeding the PPS of your GNSS receiver (for example) to it, not impossible but also not an easy task for someone who is responsible for maintaining highly accurate synchronization for an entire corporate network. 

The same applies to both NTP and PTP. The Intel I210 is a popular NIC
for these use cases.

> The next challenge is on the server, which for unicast PTP requires a certain timestamp queue size to support a usable number of clients. A lot of NICs that claim they have IEEE1588 hardware support have small to tiny ts queue sizes; one common example is 4 timestamps. That means you have to be able to read out the hardware timestamps very quickly and you will not really have a chance on high speed links with hundreds and thousands of incoming NTS-over-PTP requests per second. Those hardware timestamping engines have been designed to be used for PTP clients only, and even then not for the high packet rates that PTP supports (and sometimes requires to improve accuracy over partial on-path-support networks). They cannot be used for servers expecting to handle a high packet rate.

Isn't that an issue for both NTP and PTP unicast using high-rate sync
and delay requests?

> Finally, I am not sure if IEEE1588 would be happy about an IETF standard "hijacking" one of their protocols, but most probably they cannot do anything about it. Personally I think it is a hack and should not be standardized, but that's just me. I would rather like to see some standard way of flagging an Ethernet frame that I send out to trigger a hardware timestamping engine to timestamp that frame. Such a universal approach could be used by NTP, PTP and other protocols and applications as well (not only time sync protocols), for example to measure network propagation delays etc. It is incredibly hard to get support for this into the silicon of companies like Intel or Broadcom etc., but if it would be universal enough, the chances are higher that it will make its way into products eventually.

I think the best approach is for the hardware to timestamp all packets
as many NICs already do. The problem is with existing hardware that
cannot do that. I agree it doesn't look great when you have to run NTP
over PTP, but that seems to be the only way to get the timestamping
working on this hardware.

-- 
Miroslav Lichvar