Re: [Ntp] NTPv5: big picture

Philip Prindeville <philipp@redfish-solutions.com> Thu, 07 January 2021 05:58 UTC

From: Philip Prindeville <philipp@redfish-solutions.com>
In-Reply-To: <9b129a5f-eec0-1f9d-f4f9-0027f86ae964@rubidium.se>
Date: Wed, 06 Jan 2021 22:58:04 -0700
Cc: ntp@ietf.org
Message-Id: <FB6CD763-E62B-4D62-91E8-B0DBAC5D44AD@redfish-solutions.com>
References: <20210101025440.ECE3340605C@ip-64-139-1-69.sjc.megapath.net> <155b7ae6-c668-f38f-2bbd-fd98fa4804db@rubidium.se> <16442E9F-DD22-4A43-A85D-E8CC53FEA3E5@redfish-solutions.com> <66534000-c3ba-8547-4fb1-1641689c6eba@rubidium.se> <E6F9312A-2080-4D13-9092-935080859750@redfish-solutions.com> <1086ffe6-234a-d2d4-13d6-6031c263f4cd@rubidium.se> <B4E8F8D4-95D8-4ACB-9770-FCFEBFE002A0@redfish-solutions.com> <093df8ba-548d-b488-4780-f28d69150884@rubidium.se> <16792971-F622-47BE-BF28-B522925734BD@redfish-solutions.com> <9b129a5f-eec0-1f9d-f4f9-0027f86ae964@rubidium.se>
To: Magnus Danielson <magnus@rubidium.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/RW2Mr_ASfMrq8tjy0RDk8YHkYOg>

> On Jan 5, 2021, at 6:34 AM, Magnus Danielson <magnus@rubidium.se> wrote:
> 
> Philip,
> 
> On 2021-01-05 05:53, Philip Prindeville wrote:
>> 
>>> [snip]
>>> It's the thing we need to do.
>> 
>> And there’s that conflation.  No, it’s the thing you want to do, because it kills two birds with one stone.
>> 
> You may think so, but I end up seeing that what you are trying to push is
> going to be problematic to operate properly, and I try to keep things
> together when they belong together. NTP provides the time service, and the
> time service requires us to achieve certain things. Spreading things out
> to other places is like breaking layers; it complicates things. This is
> where trying to make it simpler than it needs to be becomes a burden.


“Related to each other” isn’t “inter-dependent”, “intertwined”, etc. They just happen to be about the same topic.

“The time-service requires us to achieve certain things.”  Again, that’s vague.  What “things” does “it require”?

I think NTPv5 at its core is a secure clock-synchronization protocol.  Period.

Calendaring (converting groups of seconds into larger human-readable units) is an orthogonal issue.  “Related” only in that it also deals with “time”, but the similarity ends there.

The whole point of the “layering” model is explicitly to reduce each layer’s knowledge of, or dependence on, what happens in other layers to a minimum.  Saying they should be intermingled just because they deal with a commonality is NOT layering.
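To make the layering concrete: here is a minimal sketch, with purely illustrative names and table values (the epoch and the (TAI − UTC) entries are mine, not from any NTP document), of a continuous-seconds core with calendaring layered on top:

```python
# Illustrative only: a continuous count of SI seconds is the "core"
# timescale; converting it to civil UTC is a separate presentation layer
# driven by a (TAI - UTC) offset table.  Epoch and table values are
# hypothetical, chosen for the example, not taken from any specification.
from datetime import datetime, timedelta, timezone

# (effective-from count, TAI - UTC in seconds) -- abbreviated, illustrative
TAI_MINUS_UTC = [
    (0,          10),
    (1483228837, 37),   # roughly the 2017-01-01 leap second, for flavor
]

def tai_to_utc(tai_seconds: float) -> datetime:
    """Calendaring layer: map the continuous count to a civil UTC datetime."""
    offset = next(off for start, off in reversed(TAI_MINUS_UTC)
                  if tai_seconds >= start)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=tai_seconds - offset)
```

The point is that `tai_to_utc()` — and anything like it — lives above the synchronization protocol; the protocol only has to carry the continuous count and, at most, the offset table.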


[snip]

>> 
>> Okay, well, we’ve found something else to agree on then.
> I do not really understand how you came to disagree with me, but ah well.
>> Converging on an acceptable normative standard is not unlike eating an elephant.
> I do not eat elephant, nor uneat one. If I had to, I would make an overall
> plan and then eat small chunks at a time. Then again, discussing elephant
> eating is way off topic.


Not any more off-topic than bikesheds.  Both are metaphors about how progress is made (or not made).


>>>>>>>> [snip]
>> 
>> 
>>>>>>>> The next step is to have consumers of time migrate… perhaps starting with logging subsystems, since unambiguous time is a requirement for meaningful forensics.
>>>>>>>> 
>>>>>>> They will do what the requirements tell them. Many have hard
>>>>>>> requirements for UTC.
>>>>>>> 
>>>>>> Many will also migrate to whatever provides them with unassailable (unambiguous) timestamps in the case of litigation.  I’ve worked on timing for natural gas pipelines so that catastrophic failures (i.e. explosions) could be root caused by examining high precision telemetry… not the least of which was to protect the operator in the case of civil suits of criminal negligence, etc.
>>>>>> 
>>>>> There are many ways to show traceability to UTC while technically
>>>>> avoiding problems. This, however, often involves knowing the TAI-UTC
>>>>> difference one way or another so that the relationship to UTC is
>>>>> maintained. The requirement to be traceable to UTC will still stand in
>>>>> many, many cases. The way that one technically avoids problems is to some
>>>>> degree orthogonal to that.
>>>>> 
>>>> I disagree with this premise:  we don’t need to be traceable to UTC.  UTC needs to be derivable from our timescale, so that it’s meaningful to applications (and eventually, humans).
>>>> 
>>> For many uses I agree with you. However, there are other uses at which it becomes a requirement. The question then becomes if we engineer NTPv5 to be able to deliver sufficient properties to fulfill that, or not.
>> 
>> Again, this is a generalization that I have a hard time grounding in anything concrete.
>> 
>> Can you explain when/where/what this might be?  What are these “other uses”?
> 
> OK, so let's start with some case which I think you agree with.
> 
> Let's say you have a router or a server, and it produces logs. We want
> time-stamps on these logs to know when things happen. Fine, we get time
> from some NTP server and the box seems to be ticking away, and for many
> purposes we have sufficiently correct time that we can correlate the logs
> across several boxes. If this time is off by some arbitrary amount it does
> not matter, as long as it is consistent between what you see; but it's
> good that it matches time of day, and UTC seems good enough, probably with
> some local time-zone tossed in.
> 
> Then, let's consider a mobile base station. You have the same needs for
> logs there as in any operational environment, and for most uses the
> original description works. However, now law enforcement makes key events
> relating to a phone's interaction with the network critical, as they use
> these to record the presence of a phone at a particular base station
> (antenna); they record when voice and text messages are sent, etc. The
> regulator then puts the requirement on the operators that the time of
> those things needs to be traceable to UTC, and they can put further
> requirements on that. In practice, a lot of those base stations get their
> time using NTP (this will shift with 5G, as PTP is expected to take over
> that role, but that does not diminish the example).
> 
> The regulators and lawmakers have chosen to include these requirements
> on more and more systems, and their gold standard is to say "traceable
> to UTC". Notice that this is the UTC as kept by BIPM, with traceability
> achieved through any of the signatory national labs, so none of the
> ambiguous UTC derivatives we talked about.
> 
> Another example is how some regulators check that the time in call data
> records (classically a hated term in the IETF, I know) is correct, such
> that the correct fare for the time-span is put on the customer's record.
> Again, traceability to UTC.
> 
> Depending on the application, the requirements for timing range from +/- 15
> min to +/- 100 ns (if I ignore the needs of metrology labs). Some of
> these requirements will for sure be outside of NTP's reach, because of
> its limitations. Some of them can be met with very weak solutions, even
> SNTP. However, there are only so many parallel systems that you want to
> operate, so it makes sense to handle most requirements with one
> simpler system (NTP, that is) and then the more stringent ones in more
> dedicated systems. The border-line between these is roughly in the 1-10
> ms span, in my judgement.
> 
> Now, traceability could mean we run djungle time and provide corrections
> on the side, we covered that separately.


Sorry, no idea what “djungle time” is.
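If by this you mean a free-running, unsteered clock whose raw readings are mapped to UTC after the fact using calibration records, then the model is simple enough to sketch (the function and all parameter names below are mine, purely illustrative):

```python
# Hypothetical sketch of "corrections on the side": the clock itself is
# never steered; instead each calibration against a UTC reference records
# an offset and a rate, and readings are converted to UTC on paper.
# Names, sign conventions, and values are illustrative, not a standard.

def to_utc(raw: float, cal_time: float, cal_offset: float, cal_rate: float) -> float:
    """Map a raw free-running clock reading to estimated UTC seconds.

    cal_time:   raw reading at the last calibration
    cal_offset: (UTC - raw) measured at that calibration
    cal_rate:   measured drift of (UTC - raw) per raw second
    """
    # Linear model: the offset grows at cal_rate between calibrations.
    return raw + cal_offset + cal_rate * (raw - cal_time)
```

A clock read at 1000.0 s, last calibrated at 900.0 s with a 2.5 s offset and 1 µs/s drift, maps to 1002.5001 s of estimated UTC: the clock is never touched, only the conversion.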


> However, it becomes messy to
> keep a side-system to correct after the fact for most things, so it is
> again practical to make continuous adjustments to keep the system in
> line. The traceability records are still collected for the paper-work
> side of things, to prove one achieves it. Keeping the actual system
> adjusted close enough to UTC within that kind of span means that we need
> to follow UTC; it makes leap-second smearing not useful for those
> applications (not to say there are not other applications where it may be
> a sensible solution). I end up concluding that it makes sense to do it
> properly.
> 
> I used the mobile case as one particular demonstrator, but there are more
> of them. These mobile networks do not sit on public networks (aka the
> Internet) but on separate networks. These are huge separate networks.
> Exactly how much of the public Internet is allowed "in" depends on many
> things. The operation of these is quite different, and changes to the
> network can be cumbersome.


Things using “Internet protocols” are not The Internet, or the purview of the IETF, any more than a 9V alkaline battery is the concern of my local electric power company.

The IETF normally goes out of its way to avoid non-Internet-connected scenarios, the obvious exception being RFC 1918, and there only to specifically call out that a chunk of Internet number-space was being carved out of the Internet and divorced from it.


> 
>>>>> Then again, many seem to misunderstand what the term "traceable" means.
>>>>> It does not mean "locked to".
>>>>> 
>>>> Maybe I’m one of the many misunderstanding then.  For me “traceable to” is synonymous with “originating from”.
>>>> 
>>> It is a very common misunderstanding, yes. The term "traceable" refers to the unbroken chain of calibrations to the SI units and the traceability record that produces, as each calibration will measure the deviation and precision achieved. I can use a complete djungle clock that is unsteered and then, through continuous calibrations, be able to convert the readings of my clock into the readings of UTC. This is a mapping with defined parameters from that calibration. Adjustment of the clock during a calibration is about re-setting the clock so that the calibration parameters compensate directly, which is more a practicality in how the conversion is made, but not strictly necessary. In practice, the actual clocks building up EAL/TAI/UTC are free-running, and the laboratory replicas of TA and UTC are steered to be near TAI and UTC, but the actual clocks are not steered. Also, the trouble is that the gravity pull at the various labs needs compensation, which is done in the ALGOS algorithm as post-processed by BIPM, and not by the labs. Nevertheless, all clocks have traceability to TAI/UTC. Derived clocks then show their traceability to the lab replica of UTC. This is just to show how traceability actually is a bit different in metrology than you would first want to believe. The Vocabulary of International Metrology (VIM) is free to download from BIPM.
>>> 
>>> One may then wonder why we need to follow the VIM and metrology use of the terms. Well, the trouble is that if we don't, we end up creating confusion, because they end up doing calibration and dissemination of time using NTP. It is their time-scales of TAI and UTC that we use. I try to be strict in order to avoid confusion, and I think there is enough alternative vocabulary to use that we do not need to add that confusion.
>> 
>> Okay, thanks for clearing that up for me.
> Happy to assist.
>> 
>> 
>>>> 
>>>> 
>>>>>>>>>> I'd like the answer to be authenticated.  It seems ugly to go through NTS-KE 
>>>>>>>>>> if the answer is no.
>>>>>>>>>> 
>>>>>>>>> Do not assume you have it; prefer the authenticated answer when you can
>>>>>>>>> get it. I am not sure we should invent yet another authentication scheme.
>>>>>>>>> 
>>>>>>>>> Let's not make the autokey-mistake and let some information be available
>>>>>>>>> only through an authentication scheme that ended up being used by very
>>>>>>>>> few. You want to have high orthogonality as you do not know what lies ahead.
>>>>>>>>> 
>>>>>>>>> So, we want to be able to poll the server for capabilities. Remember that
>>>>>>>>> this capability list may not look the same on an un-authenticated poll as
>>>>>>>>> on an authenticated poll. It may advertise authentication methods; hopefully
>>>>>>>>> one framework fits them all, but we don't know. As you ask again you can
>>>>>>>>> get more capabilities available under that authenticated view. Another
>>>>>>>>> configuration or implementation may provide the exact same capabilities
>>>>>>>>> regardless of authentication.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Maybe we should distribute the info via DNS where we can 
>>>>>>>>>> use DNSSEC.
>>>>>>>>>> 
>>>>>>>>> Do not assume you have DNS access; the service cannot rely on that. It
>>>>>>>>> can, however, be one supplementary service. NTP is used in some crazy
>>>>>>>>> places. Similarly with DNSSEC: use and enjoy it when it's there, but do not
>>>>>>>>> depend on its existence.
>>>>>>>>> 
>>>>>>>> Good point.
>>>>>>>> 
>>>>>>>> As someone who works in security, I’ve seen a fair share of exploits that arise when protocols make tacit assumptions about the presence and correctness of other capabilities and then these turn out not to be valid under certain critical circumstances.
>>>>>>>> 
>>>>>>>> Doing X.509 when you don’t have Internet connectivity for CRL’s or OCSP is a good example.
>>>>>>>> 
>>>>>>> I've seen many networks where the normal rules of the Internet apply, yet they
>>>>>>> are critical, need their time, and are fairly well protected.
>>>>>>> 
>>>>>> Not quite following what you’re saying.  Are you alluding to operating a (split-horizon) border NTP server to hosts inside a perimeter, that in turn don’t have Internet access themselves (effectively operating as an ALG)?  Or something else?
>>>>>> 
>>>>> There are indeed a lot of scenarios where you operate NTP inside setups
>>>>> which have no, or very limited, Internet access. Yet it is used because
>>>>> it is COTS "Internet technology" that fits well with what is used. As we
>>>>> design things we also need to understand that there is a wide range of
>>>>> usage scenarios for which our normal expectation of what we can do on
>>>>> the "Internet" is not necessarily true or needed.
>>>>> 
>>>> That should be a tipoff right there:  We’re working on “Internet standards”.  Not “Intranet standards”.
>>>> 
>>>> We don’t need to solve every problem in every scope.  There’s a reason the term “local administrative decision” appears in many, many standards.
>>>> 
>>>> Again, protocol, not policy.  Deciding that we need to accommodate rare/one-off usage scenarios in isolated cases is a policy decision.  It’s choosing to insert ourselves into an environment that’s not strictly within our purview.
>>>> 
>>>> What people do on their own intranets with the curtains drawn (and the perimeter secured) is very much their decision…
>>>> 
>>> If only it were that easy. The success of Internet protocols and standards forces things to be used where normal Internet rules do not apply.
>> 
>> I’m not sure anything is being forced.  I think people are lazy.  And when a hammer is the closest tool to you, everything in your reach becomes a nail…
>> 
>> We can’t stop them, but nor do we have to overly concern ourselves with enabling or facilitating this behavior.
>> 
>> People use self-signed certificates even though it’s a travesty (read: insecurity) and certificates should be rooted to valid, well-known root CA’s.  Knowing this to be the case, I’m not going to bother asking myself “what happens to my protocol when self-signed certs are used”.  The answer is: “not my problem”.
> Well, for some of these things I agree we should not fix them;
> self-signed certificates is just one among many. However, I'm not saying
> we should solve all those problems, but realizing just the diversity of
> uses that the protocols we design have can be a good exercise, as we ask
> ourselves what we can learn from it and take a step back from just
> considering what is the right thing for the case of "the Internet". Some
> design decisions we make can turn out to be unwise even for that case,
> but we may not see them. I've always found it good to understand
> multiple scenarios as things scale up and down, as it illustrates
> assumptions being made, which may not be what we want even for our
> primary scenario. Some of the reasonable relaxations may not be
> that expensive, and we can gain a more modular and workable protocol
> that is more future-resistant. It's somewhat more effort, but in
> my humble experience, it's well worth the effort and exercise.


And thinking about a service operating without working DNS, operating in the case of a partial network partition, etc. is reasonable.

Designing for an isolated Intranet is out-of-scope.
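A sketch of the kind of reduced-connectivity handling I do consider in-scope: prefer DNS discovery, degrade to static configuration when resolution fails. (The names below are hypothetical; the fallback addresses are from the RFC 5737 documentation range.)

```python
# Illustrative sketch, not from any RFC: prefer DNS to discover time
# sources, but never let a DNS outage take the time service down with it.
import socket

# Statically configured fallbacks (RFC 5737 documentation addresses).
STATIC_FALLBACK = ["192.0.2.10", "192.0.2.11"]

def resolve_time_sources(name: str) -> list[str]:
    """Return candidate server addresses; never raise on DNS failure."""
    try:
        infos = socket.getaddrinfo(name, 123, proto=socket.IPPROTO_UDP)
        return sorted({info[4][0] for info in infos})
    except OSError:
        # Partial partition / no resolver: fall back to local configuration.
        return list(STATIC_FALLBACK)
```

The same pattern applies to NTS-KE: attempt it when reachable, but don't make the basic time service contingent on it.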


>> [snip]
>> 
>> “Require components and behaviors as needed for each such scenario” sounds dangerously like “dictate policy” to me.
>> 
>> Not sure I agree with that generalization.
>> 
>> MIME was added to Mail to add multi-media capability, and it worked.  It’s not pretty, but it’s highly functional.
>> 
>> Telnet options were from the very beginning open-ended.  When local line-editing was needed because remote editing over high-latency lines was just too painful, it “dropped right in”.  As did TN3270 interoperability for people who needed to talk to IBM Mainframes.
> As you overinterpret the generalization, it indeed becomes dangerous,
> and way, way outside of what I mean, to the degree that I wonder why you
> keep mocking me all the time.


I’m not mocking anyone.  Telnet options are a reasonable parallel to NTP extension fields, and we can learn from its success.


> You end up sounding very disrespectful,
> and I do not think you intended to.


You might be reading too much into what I’m writing.


> However, the way you try to put
> limits on it, that too becomes dangerous, as you can end up with a too
> restricted thing, and the end result risks becoming more complex as a
> result.


I think that being a minimalist and an adherent of KISS is a perfectly valid position to hold; and you’re asserting that my advocating for simplicity will eventually lead to greater complexity, without substantiating that claim.


>>> [snip]
>> 
>> Sorry, but you keep waving your hands on this issue.  Who says we have to solve these problems?  How does it end up being “our problem”?  If they’re on an isolated intranet, with no possible connection to us, it’s a bit like a tree falling in the forest with no one around, isn’t it?  Whether it makes a sound or not changes nothing from my perspective.
> 
> No, you keep waving your hands around "the Internet", when in reality the
> same technology is used in many large networks, with more or less
> connection to "The Internet". Quite large portions of the boxes are used
> in more or less hidden applications, while built on the same technology.


Again, that someone uses Internet protocols somewhere that’s not the Internet isn’t really our problem, any more than it’s the fault of a Pharma company when their drugs aren’t used as prescribed… or someone using the grip of a loaded pistol as a nutcracker and then being surprised when it discharges into their ceiling.

The fact that you’re describing it as “hidden applications” tells you everything you need to know: that whether it works or doesn’t the result (as perceived by us) is the same.  Both are “hidden”.


> 
>>> Internet technology is being used outside of what you would call the Internet; it's just a fact of life. It is assumed that COTS boxes can operate there, and when they cannot it becomes a problem, and because it's "Internet technology" it is assumed to work together regardless, in a multi-vendor setup. You end up with a proliferation of variants, and it's a pure pain to handle.
>> 
>> Again, I’m not ending up with anything here, because what you describe is happening on an Intranet that bears no connection or relevance to me.
>> 
>> And I don’t assume that COTS boxes that require the Internet operate anywhere that the Internet is not connected.  If software updates can’t be pushed to them, for instance, then they will eventually have a critical software vulnerability which can’t be patched, and I want nothing to do with them.
>> 
>> But again, none of this is “a pure pain to handle” because it’s happening somewhere that I am insulated from, isolated to, and blissfully ignorant of.
> 
> Yes, well, that is why I bring it up. What may seem like a good decision
> for the Internet may not be so for other cases, which then forces them to
> break away and solve it by some other means. That means more divide in the
> features used, and, well, "IETF can do what they feel like; we cherry pick
> the good stuff and do what we want to do". That ends up just deepening the
> divide there is.


What the IETF does is already succeeding on tens of billions of devices.

That doesn’t seem like a “cherry-pick” nor an insignificant accomplishment.

Its success is NOT, however, a reason for it to exceed its mandate.

What we “want to do” is efficiently and in a timely fashion satisfy our charter.  Spending more time worrying about things outside of our charter defeats both of these goals.



> Rather, if one can look at other
> scenarios and see what can be useful for them, one can at least reduce
> the divide a little. IETFers can see the same thing as they look at the
> IEEE 802 group, where the IETF cherry-picks some of the IEEE 802 output
> but then ignores some of the things.


Wait… so I draw a parallel to Telnet options or MIME mail, and that’s a mockery, but…

In the case of IEEE 802, they ignored the Ethernet v2 (DIX) framing and came up with 802.3 LLC/SNAP encapsulation… which almost no one uses.


> 
> Now, I think there are some lessons to be learned by looking at different
> scenarios and realizing that not all solutions fit all. I think one can do
> reasonable generalizations, that is, generalizations not going overboard,
> but that provide means for variations that may be useful already for the
> base scenario.


Saying “let’s design an Internet protocol that works not-on-the-Internet" is an overboard generalization.

We’re not even talking about an Internet with reduced connectivity (a network partition or asymmetrical paths), or service outages such as no operable DNS… we’re talking about “off the Internet” (i.e. an isolated Intranet).



> It's a little more than the bare minimum, but maybe
> more future-proof. Adding extensions afterwards ends up being a little
> bit messy. MIME is one example; ARP, multicast group mapping, and IGMP
> with IGMP snooping trend towards the uglier side of things.


ARP was a layer 2 mechanism.  The Internet starts at layer 3.  I fail to see any equivalence here.


> 
>> 
>>> Easing some of that pain will always be welcome, such as understanding that you may not always be able to count on, say, DNS to save the day in all cases. At the same time, the best current practice for Internet devices is for sure on topic, and has particular requirements we do not want to get wrong.
>>> 
>>> Cheers,
>>> Magnus
>> 
>> It sounds like you’re sending a mixed message, but maybe I’m just not understanding.  What I’m hearing you say is, “be sparse [or ‘economical’] in your assumptions about where things might be used”, but I’m also hearing you say “make sure that, as an Internet protocol, it operates properly in the total absence of Internet connectivity”… which is a huge burden to assume.
> 
> If the message were as simple as you would like, it would rule out necessary
> parts; then the message would be overly simple and not useful.


You’re conflating “convenient” or “expedient” with “necessary”.


> I can reduce
> the message a lot, but not below the most basic complexity it needs, and
> some of the things are contradictory. If you try to think I have one
> simple message, you read me all wrong.
> 
> Cheers,
> Magnus


No, I’m well aware that there are contradictions in some of the things that you say, like “let’s intertwine unrelated things in the interest of preserving ‘layering’”.

-Philip