Re: [Ntp] WGLC: draft-ietf-ntp-interleaved-modes

Harlan Stenn <stenn@nwtime.org> Fri, 14 December 2018 21:03 UTC

To: Miroslav Lichvar <mlichvar@redhat.com>, ntp@ietf.org
References: <2C2DBD6F-727F-48DB-BB48-14CE7F7F8B95@isoc.org> <A113A752-6CDA-4772-9720-A0AABFD9B450@isoc.org> <AM0PR0602MB373031DC961E9B4E10F07C38FFA90@AM0PR0602MB3730.eurprd06.prod.outlook.com> <7b7402ee-8e6b-1e3e-ea18-6d2f689318fd@nwtime.org> <20181210160548.GB27901@localhost> <3b10cdea-59af-15c8-dade-e92d6a6652fc@nwtime.org> <20181211124802.GF26705@localhost>
From: Harlan Stenn <stenn@nwtime.org>
Message-ID: <7fbf59ff-45ea-c1f5-3845-077c6a120f63@nwtime.org>
Date: Fri, 14 Dec 2018 13:03:29 -0800
In-Reply-To: <20181211124802.GF26705@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/QjiOG3zbcv1MMQndRK9uKtBE5Rc>

Miroslav,

I'm happy to work with you on this, if you're interested and think it
would be useful.

I note:

- 18ns corresponds to a precision of about -26.  While that's pretty
special, I remain curious about how long the delay is between pulling
the transmit stamp and the packet going out over the wire.  If the basic
precision of the system clock is only -19, we're not going to do better
than 2 microseconds.  For situations like this, I'm really curious if
we'll even see any benefit to client/server interleave mode.

- I'd like to explore a "follow up" message as an alternative.

- I'm wondering if we can do all of this with "plain" packets, or if
there is benefit to using an extension field for some of these exchanges.
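
For reference, the arithmetic behind the first point can be checked with a short sketch. It assumes the usual convention that precision is a power-of-two exponent in seconds; the function names are mine, purely for illustration.

```python
import math

def precision_exponent(resolution_seconds: float) -> int:
    """Nearest power-of-two exponent for a resolution given in seconds."""
    return round(math.log2(resolution_seconds))

def precision_seconds(exponent: int) -> float:
    """Resolution in seconds for a given precision exponent."""
    return 2.0 ** exponent

# 18 ns expressed as a precision exponent.
print(precision_exponent(18e-9))

# A clock with precision -19, expressed in microseconds.
print(f"{precision_seconds(-19) * 1e6:.3f} us")
```

So 18ns sits between -25 and -26, closer to -26, and a precision of -19 is a little under 2 microseconds.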

H

On 12/11/18 4:48 AM, Miroslav Lichvar wrote:
> On Mon, Dec 10, 2018 at 05:47:27PM -0800, Harlan Stenn wrote:
>> On 12/10/18 8:05 AM, Miroslav Lichvar wrote:
>>> The document says "The server SHOULD discard old timestamps to limit
>>> the amount of memory needed to support clients using the interleaved
>>> mode." and "The server MAY always respond in the basic mode."
>>
>> I'm not talking about old timestamps.
>>
>> I'm talking about a case where there are LOTS of current clients.
> 
> Maybe I'm not using the word "old" correctly? If there is a large
> number of clients trying to use the interleaved mode, the server needs
> to keep a large number of timestamps to be able to respond in the
> interleaved mode to all of them.
> 
>> The client/server portion of this proposal changes the daemon from
>> stateless to stateful.
> 
> Yes. State is needed to save the transmit timestamp. There is no way
> around that. If it was sent immediately in a separate message, it
> would amplify the traffic, or the length of the response wouldn't be
> symmetric to the request.
> 
> It's a bit like a TCP-based service. If there is a large number
> of clients, or there are DoS attacks, there is no guarantee that a
> client will get an interleaved response.
> 
> An important difference from TCP is that the protocol gracefully falls
> back to the basic mode. Clients can still synchronize to the server.
> They must not expect the server to be able to respond in the
> interleaved mode.
> 
>>> There is a minimal client and server implementation in Python. Please
>>> feel free to test it and report if anything breaks.
>>>
>>> https://github.com/mlichvar/draft-ntp-interleaved-modes
>>
>> You have this in chronyd then? 
> 
> No, in chrony it's implemented differently. The server support was
> basically bolted onto the monitoring facility. That was the easiest and
> safest way. In ntpd it might be the same.
> 
> There is only one timestamp per IP address, which means it doesn't
> work over NAT (as documented in the man page). Some of the code is
> shared with the symmetric mode and there are optimizations, so it may
> be difficult to see what is actually going on in the client/server
> interleaved mode.
> 
> For implementors interested in the client/server interleaved mode, I'd
> recommend first looking at the Python example.
> 
>> Do you have any results on how much of a difference it makes?
> 
> It corresponds to the difference in accuracy and symmetry of the
> timestamps that the server has. In absolute terms it's usually a few
> microseconds or tens of microseconds. In relative terms, it can be up
> to about two orders of magnitude.
> 
> Basically, it makes the accuracy of NTP comparable to PTP. In networks
> where switches/routers support PTP, it's worse as the devices
> typically don't have HW support for NTP. In networks without PTP
> support it's better than PTP, because PTP is very sensitive to
> variability in network delay. Of course, it also depends on the
> implementation.
> 
> If you just want to see an example of a difference in ideal
> conditions, here are two sourcestats comparing the same server to a
> PTP clock. In the first one it's using the basic mode, in the second
> it's using the interleaved mode.
> 
> Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
> ==============================================================================
> PTP                         8   5     7     -0.000      0.003     -0ns     3ns
> 192.168.33.1               21  13    22     +0.004      0.031  +3175ns   240ns
> 
> Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
> ==============================================================================
> PTP                        13   7    12     -0.000      0.001     -0ns     4ns
> 192.168.33.1               22  10    21     +0.001      0.001    +18ns     9ns
> 
> So, in this case the stability of NTP measurements improved by a
> factor of about 20 and the accuracy relative to the PTP clock by a
> factor of about 150.
> 
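The factors can be recomputed from the 192.168.33.1 rows of the two tables. This is only a rough check. It assumes "stability" refers to the Std Dev column and "accuracy" to the Offset column, which is my reading rather than something stated above.

```python
def improvement(before_ns: float, after_ns: float) -> float:
    """Ratio of a basic-mode figure to the matching interleaved-mode figure."""
    return before_ns / after_ns

# Values copied from the 192.168.33.1 rows of the two sourcestats tables.
std_dev_ratio = improvement(240, 9)    # Std Dev: 240ns -> 9ns
offset_ratio = improvement(3175, 18)   # Offset: +3175ns -> +18ns

print(f"Std Dev ratio: {std_dev_ratio:.1f}")
print(f"Offset ratio:  {offset_ratio:.1f}")
```

Both ratios come out somewhat above the rounded "about 20" and "about 150", so the quoted factors look like conservative approximations.
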
>> If you're testing this with chrony I would imagine the results would be
>> slightly different from ntpd, as I don't believe chrony implements the
>> same internal algorithms and I don't know how significant that would be
>> to the results.
> 
> That shouldn't matter much if you just want to see a difference. The
> problem is that ntpd doesn't support hardware timestamping or kernel
> transmit timestamping. The interleaved mode makes a difference only if
> the server has a more accurate transmit timestamp after the
> transmission.
> 
>>> The receive and transmit timestamps in server's responses, which are
>>> copied to the origin timestamp in client requests, are fully under the
>>> control of the server. A server with a sane clock will not respond
>>> with a duplicate receive timestamp, so each client using interleaved
>>> mode sends a unique origin timestamp.
> 
>> I'm not talking about server responses, I'm talking about the server
>> properly matching stored timestamps when there are an increasing number
>> of clients behind a single IP.
> 
> Well, unique receive timestamps in server responses are what enable
> the server to properly match them with stored transmit timestamps. It
> doesn't matter how many clients there are behind a single IP. Each
> should get a unique timestamp from the server.
> 
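To make the matching concrete, here is a minimal sketch of a server-side table keyed by receive timestamp, with a capped size and a fallback to the basic mode. It is not the code from the Python example or from chronyd. The class name, the use of plain integers for timestamps, and the evict-oldest policy are all assumptions for illustration.

```python
from collections import OrderedDict
from typing import Optional

class TimestampTable:
    """Hypothetical bounded store of (receive -> transmit) timestamp pairs."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._pairs: "OrderedDict[int, int]" = OrderedDict()

    def save(self, rx: int, tx: int) -> None:
        """Remember the transmit timestamp of a response, keyed by its receive timestamp."""
        self._pairs[rx] = tx
        self._pairs.move_to_end(rx)
        # Discard the oldest pairs once the cap is reached.
        while len(self._pairs) > self.capacity:
            self._pairs.popitem(last=False)

    def lookup(self, origin: int) -> Optional[int]:
        """Return the saved transmit timestamp whose receive timestamp matches
        the origin timestamp of a request, or None if it is no longer stored."""
        return self._pairs.get(origin)

def choose_mode(table: TimestampTable, origin: int) -> str:
    """Respond interleaved only when a matching pair is still stored."""
    return "interleaved" if table.lookup(origin) is not None else "basic"

table = TimestampTable(capacity=2)
table.save(rx=1001, tx=1002)
table.save(rx=2001, tx=2002)
table.save(rx=3001, tx=3002)       # evicts the pair saved for rx=1001
print(choose_mode(table, 3001))    # pair still stored
print(choose_mode(table, 1001))    # pair was evicted
```

Because the key is the receive timestamp rather than the client address, the number of clients behind one IP does not affect the lookup. Only the cap does.
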
>> This brings up the question of what happens if a server response is lost
>> - in that case the server will have sent its interleaved reply and will
>> switch the client's saved transmit/origin timestamp.  When the client
>> doesn't get the response, it will then send the next packet with the
>> previous origin timestamp, the one the server no longer has because it
>> already replied using that one?
> 
> It depends on the implementation. It may drop the timestamp to which
> it already responded, but it doesn't have to. The Python example
> would keep it, chronyd would not.
> 
>> Yes, and one way to manage this is with a capped queue length, and
>> another way to manage it is with client limits (if there is a reasonable
>> way to manage this).  My point goes more towards the belief that there
>> is benefit to removing old/expired timestamps from the queue sooner
>> rather than later.  So it may make sense to keep a timestamp for, say, 2
>> seconds past 2x(poll interval), as we don't know if the client might
>> decide to increase its poll interval based on whatever it learned from
>> our most recent response.
> 
> It might work, but it will need more memory per client. That memory
> could instead hold more timestamps, which would be simpler. One pair
> of timestamps is 16 bytes, or just 12 bytes in an optimized
> implementation. So, even a little bit of extra state can decrease the
> number of concurrent clients significantly.
> 
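A quick way to see the trade-off is to count how many clients fit in a fixed memory budget. The 16-byte and 12-byte sizes are from the paragraph above. The 1 MiB budget and the 24-byte entry (a 16-byte pair plus 8 bytes of hypothetical per-client expiry state) are assumptions picked only for illustration.

```python
MIB = 1024 * 1024

def clients_per_budget(budget_bytes: int, bytes_per_client: int) -> int:
    """Number of whole per-client entries that fit in the given budget."""
    return budget_bytes // bytes_per_client

# 16 and 12 bytes are the sizes quoted above; 24 bytes is a hypothetical
# entry that adds 8 bytes of expiry state to a 16-byte pair.
for size in (16, 12, 24):
    print(f"{size:2d} bytes/client -> {clients_per_budget(MIB, size)} clients per MiB")
```

With these numbers, adding 8 bytes of state to a 16-byte pair cuts the number of concurrent clients by about a third.
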
> In any case, I think we would need to specify a requirement for
> interleaved clients to not zero the "poll" field.
> 

-- 
Harlan Stenn <stenn@nwtime.org>
http://networktimefoundation.org - be a member!