Re: [Ntp] NTPv5: big picture

Magnus Danielson <magnus@rubidium.se> Tue, 05 January 2021 13:35 UTC

Cc: magnus@rubidium.se, ntp@ietf.org
To: Philip Prindeville <philipp@redfish-solutions.com>
References: <20210101025440.ECE3340605C@ip-64-139-1-69.sjc.megapath.net> <155b7ae6-c668-f38f-2bbd-fd98fa4804db@rubidium.se> <16442E9F-DD22-4A43-A85D-E8CC53FEA3E5@redfish-solutions.com> <66534000-c3ba-8547-4fb1-1641689c6eba@rubidium.se> <E6F9312A-2080-4D13-9092-935080859750@redfish-solutions.com> <1086ffe6-234a-d2d4-13d6-6031c263f4cd@rubidium.se> <B4E8F8D4-95D8-4ACB-9770-FCFEBFE002A0@redfish-solutions.com> <093df8ba-548d-b488-4780-f28d69150884@rubidium.se> <16792971-F622-47BE-BF28-B522925734BD@redfish-solutions.com>
From: Magnus Danielson <magnus@rubidium.se>
Message-ID: <9b129a5f-eec0-1f9d-f4f9-0027f86ae964@rubidium.se>
Date: Tue, 05 Jan 2021 14:34:46 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/ntp/7CUTsFXyX7wNvsuf1V5JN-0MaU8>

Philip,

On 2021-01-05 05:53, Philip Prindeville wrote:
>
>> On Jan 4, 2021, at 5:26 PM, Magnus Danielson <magnus@rubidium.se> wrote:
>>
>> Philip,
>>
>> On 2021-01-04 21:20, Philip Prindeville wrote:
>>>> On Jan 4, 2021, at 9:27 AM, Magnus Danielson <magnus@rubidium.se>
>>>>  wrote:
>>>>
>>>> Philip,
>>>>
>>>> On 2021-01-02 03:49, Philip Prindeville wrote:
>>>>
>>>>> Replies…
>>>>>
>>>>>
>>>>>
>>>>>> On Jan 1, 2021, at 7:01 PM, Magnus Danielson <magnus@rubidium.se>
>>>>>>  wrote:
>>>>>>
>>>>>> Philip,
>>>>>>
>>>>>> On 2021-01-02 01:55, Philip Prindeville wrote:
>>>>>>
>>>>>>> Replies…
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Dec 31, 2020, at 8:35 PM, Magnus Danielson <magnus@rubidium.se>
>>>>>>>>  wrote:
>>>>>>>>
>>>>>>>> Hal,
>>>>>>>>
>>>>>>>> On 2021-01-01 03:54, Hal Murray wrote:
>>>>>>>>
>>>>>>>>> Do we have a unifying theme?  Can you describe why we are working on NTPv5 in 
>>>>>>>>> one sentence?
>>>>>>>>>
>>>>>>>>> I'd like to propose that we get rid of leap seconds in the basic protocol.
>>>>>>>>>
>>>>>>>> Define "get rid of". Do you mean you want the basic protocol to use a
>>>>>>>> monotonically increasing timescale such as a shifted TAI? If so, I think
>>>>>>>> it would make a lot of sense.
>>>>>>>>
>>>>>>>> If it is about dropping leap second knowledge, it does not make sense.
>>>>>>>>
>>>>>>> I think “handle separately” makes sense here.  It shouldn’t be a blocking problem and how we handle it, assuming we handle it correctly, is orthogonal to everything else.  Or should be.  See “assuming we handle it correctly”.
>>>>>>>
>>>>>> I think the core time should have a very well known property, and then
>>>>>> we can provide mappings of that and provide mapping parameters so that
>>>>>> it can be done correctly.
>>>>>>
>>>>> Jon Postel would frequently tell us, “protocol, not policy”.
>>>>>
>>>>> The mapping doesn’t have to be embedded in the protocol if it’s well understood and unambiguous.
>>>>>
>>>> You are mixing the cards quite a bit here. Policy is not being discussed
>>>> at all here.
>>>>
>>> It very much is.  It’s a “policy” decision to state, for example:
>>>
>>> "NTP needs to support UTC and it needs to announce leap seconds before they happen.”
>>>
>>> No, we could have a leapless monotonic timescale and leave it to the application layers or runtime environment to convert kernel time to UTC, etc.
>>>
>>> Just as it’s a policy decision to have a machine’s clock be UTC or local time, or to have the epoch start at 1900, 1970, or 1980, etc.
>>>
>> It's not a policy decision of ours. It's a requirement the surrounding world brings us. If we do not find a way to provide what others need that will work for them, they will be forced to use another solution.
>>
>
> Well, you’re half right.  It’s a requirement that a lot of the world has, but as long as the timescale is convertible AT SOME POINT somewhere between the packets on-the-wire and returning to a library call in libc, THEY DON’T REALLY CARE AND THEY SHOULDN’T HAVE TO.
I wonder how you came to the conclusion that I was arguing for something
else?
>
> This is why we have layers of abstraction, instead of a single monolithically linked image running on bare metal in a single context.
>
> I think we did away with that with batch entry systems in the early 1960’s.
Now you are preaching to the choir again; what have you misunderstood? I
never claimed anything that requires what you mention.
>> The only policy here is to make it relevant enough, and then try to make it come off as cheap as possible through engineering.
>
> I’ll take “correct” over “cheap” most days.  Incorrect ends up being quite costly.
Indeed. Incorrect is very expensive, so the "cheap solution" is the very
expensive solution. The trouble is that sometimes the cost does not show
up where the cost reduction was made.
>
>>>> "protocol" is missleading as well. Also, in the old days it was a lot of
>>>> protocols, but lacking architectural support providing needed technical
>>>> guidance. We moved away from that because more was needed.
>>>>
>>> Sorry, this is a bit of a generalization.  What prior parallels that apply to us can you point to so I have a better sense of what you’re referring to?
>>>
>> For instance RTP. When RTP was built, it did not really provide a model for how the clock was related to the media transported, especially when spread over multiple streams. It was only over several generations of specifications that the NTP clock was replaced by a common media clock, and then related to the timing between the streams etc. and their meaning.
>
> Are we talking about multiple streams from the same source to the same endpoint?  Or converging streams from multiple sources?
Actually both these days.
>> In the context it was discussed, we talked about the NTPv5 time-scale. That needs to have some known relationship to some other known time-scale, such that the set of gears needed to convert time-stamps in and out of the NTPv5 time-scale is known.
>
> Not disagreeing that the conversion needs to be well-understood (and unambiguous).
>
> But it doesn’t need to happen inside NTP itself.  Or even the kernel.
I never said it needed to be done inside NTP itself or the kernel. I
only said it needed to be well understood for the implementation. The
details of the implementation, daemon, kernel, libc or whatever are
really not what we should care about, except that we care about
providing the needed information for them to do their job properly. I
do not know where you got the notion that it necessarily needed to be
done in NTP or the kernel itself; I never claimed that. I did however
claim the specifications needed to be clear, which is a completely
different thing.
>> It could be TAI-like, but it would not be the actual TAI, just like the PTP timescale, for instance.
>
> My understanding is that PTP uses TAI as the default timescale.

It doesn't really. Similar enough, but not really. See further down.


>> Whatever it is, it needs to be known, so that implementations A, B and C of the protocol know how to convert it to the same time so they become consistent. So the mapping becomes an important part of the protocol behavior such that the intended service is achieved; leaving it as an implementation issue to figure out the mapping is not good standards making. It needs to be known, either specified directly or through reference.
>
> You’re conflating things: NTP as a user-space process (as most daemons are) just needs to use the same run-times that query the kernel for the time and serve it up in the requested timescale of choice, converting from the kernel’s internal canonical representation of time.
Well, that was tried and found problematic, so the nanokernel API was
added. David Mills has written about that. The timing interfaces of
many kernels have developed much, much further since. However, this is
again handy information to discuss to aid developers, but not related
to the actual NTP protocol. We need to know the models for how things
are translated, but not really how they are implemented. I kept those
very distinct here, whereas you seem to confuse what I mean with it.
>
>
>>>> The mappings I talk about need to be not only known, but referenced, or
>>>> else the "protocol" will not be able to be implemented in a consistent
>>>> way, which we badly need. If a new relationship is introduced, it needs
>>>> to be specified.
>>>>
>>> We can “reference” the mapping without needing to embed it in the protocol as we’ve previously done, yes.  Agreed.
>>>
>>>
>>>
>>>> None of this has anything to do with policy as such; that is at best a
>>>> secondary concern that has nothing to do with what we discuss here.
>>>>
>>> What goes into a protocol is in itself a policy decision.
>>>
>> Well, there is already a policy that it needs to be clear enough that multiple implementations can cooperate. Thus, it needs to have all the necessary details to make that feasible and demonstrable, but that policy applies not only to NTP. It's a requirement for it to be meaningful.
>
> Not to split hairs, but that’s our ratification process.  The policy is that the protocol needs to be sufficiently clear that an implementation can be written “in the blind” using just the text of the standard, to prove out the completeness and accuracy of the standard.
>
> The above is the process that validates adherence to this policy.
Sure, and as we do the protocol documentation, we need to make sure we
fulfill everything that is needed to pass that test. Thus, it becomes
a requirement on the protocol writing, which needs regular internal
testing before release to meet the validation process.
>>>>>>>>> Unfortunately, we have a huge installed base that works in Unix time and/or 
>>>>>>>>> smeared time.  Can we push supporting that to extensions?  Maybe even a 
>>>>>>>>> separate document.
>>>>>>>>>
>>>>>>>> Mapping of NTPv5 time-scale into (and from!) NTP classic, TAI, UTC,
>>>>>>>> UNIX/POSIX, LINUX, PTP, GPS time-scales is probably best treated
>>>>>>>> separately, but needs to be part of the standard suite.
>>>>>>>>
>>>>>>> I think TAI makes sense, assuming I fully understand the other options.
>>>>>>>
>>>>>> Do notice, the actual TAI timescale is not very useful for us. We should
>>>>>> have a binary set of gears that has its epoch at some known TAI time.
>>>>>> One such may for instance be 2021-01-01T00:00:00 (TAI). As one can
>>>>>> convert between the NTPv5 timescale and the TAI timescale, we can then
>>>>>> use the other mappings from TAI to other timescales, assuming we have
>>>>>> definitions and other parameters at hand. One such parameter is the
>>>>>> TAI-UTC difference (and its upcoming change).
>>>>>>
>>>>> I’m hoping to avoid further proliferation of timescales if possible.
>>>>>
>>>> It's a lost cause because it builds on a misconception. It's not
>>>> creating a timescale that competes with TAI and UTC, it's the technical
>>>> way that those time-scales are encoded for communication in order to
>>>> solve practical problems and be adapted to those technical challenges.
>>>> Very few use the actual TAI or UTC encoding as they communicate, for
>>>> good technical reasons. That will continue to be so, and there is no
>>>> real case for proliferation as such.
>>>>
>>> Again, that sounds to me like a generalization.  Or perhaps it’s an assertion that’s well understood by everyone else but me… so maybe you can enlighten me?
>>>
>>> PTP uses TAI.  It doesn’t seem to have been an impediment for them.  What am I missing?  And how did these “good technical reasons” not apply here?
>>>
>> PTP does not use TAI. PTP has a defined mapping of TAI into a PTP timescale, which is different. You can convert PTP time to and from TAI. The PTP epoch, i.e. when the PTP time-stamp was 0, is defined in clause 7.2.2 of IEEE1588-2008 as:
>>
>>
>> "7.2.2 Epoch
>> The epoch is the origin of the timescale of a domain.
>>
>> The PTP epoch is 1 January 1970 00:00:00 TAI, which is 31 December 1969 23:59:51.999918 UTC.
>>
>> NOTE 1—The PTP epoch coincides with the epoch of the common Portable Operating System Interface (POSIX) algorithms for converting elapsed seconds since the epoch to the ISO 8601:2004 printed representation of time of day; see ISO/IEC 9945:2003 [B16] and ISO 8601:2004 [B17].
>>
>> NOTE 2—See Annex B for information on converting between common timescales."
>
> Okay, I should have said, "PTP uses a direct 1:1 mapping to TAI without divergence."

Maybe. The thing is, there are multiple mappings that have TAI-like
properties. PTP is one and GPS is another, and there are more. The
"there are more" part is also part of the objections to TAI
proliferation over use of UTC. That is a wide discussion that you may
not have seen, but it's there anyway. I think it is better to avoid
talking about PTP using TAI, and instead consider that PTP uses a
TAI-like or TAI-based timescale of its own.
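
To make this concrete, here is a minimal C sketch. My assumptions: the
PTP epoch from the clause quoted above, a TAI-UTC difference of 37 s
(valid since 2017), and an arbitrarily picked timestamp. Away from a
leap event, the PTP-style TAI-like count and POSIX time differ by a
plain offset:

  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>

  int main(void)
  {
      /* Dynamic parameter, e.g. from the PTP currentUtcOffset field. */
      int64_t tai_utc = 37;
      /* Example PTP timestamp: TAI-like seconds since 1970-01-01 TAI. */
      int64_t ptp_seconds = 1609853687 + tai_utc;

      /* POSIX time_t ignores leap seconds, so away from a leap event
       * the mapping is a plain offset by the TAI-UTC difference. */
      time_t posix = (time_t)(ptp_seconds - tai_utc);

      char buf[32];
      strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", gmtime(&posix));
      printf("PTP %lld -> UTC %s\n", (long long)ptp_seconds, buf);
      return 0;
  }

The point being that the PTP count is a timescale of its own; TAI and
UTC replicas fall out of it only once the mapping parameters are known.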

Anyway, so now I think it should be clearer why I think it is better
to say that it forms a time-scale in its own right. It has known
mappings to other time-scales, in particular TAI or UTC. Some of these
mappings can be made correct whereas others will not achieve accurate
mapping.

As we go forward, I think there is already a rough consensus that
NTPv5 will use a new internal time-scale that has TAI-like properties.
I think that is very, very wise, and I strongly recommend it because
it has proven to work well in many cases, while keeping
implementations fairly "linear" in that regard.

>> TAI just simply counts seconds.
>
> Right.  Which goes back to my point that a time protocol doesn’t need to care about how you slice time up into larger units of measure like minutes, hours, days, years, and leap-seconds.
>
> That’s a “presentation” issue, which is more applicable to other protocols like logging (which requires human readable timestamps) or calendaring (which deals with purely human notions of time).

Which is what I have been trying to point out all along. We could
choose an arbitrary epoch for NTPv5, make it behave like TAI, and that
will keep the core protocol handling clean. Then provide models to map
that out to deliver the user time-scales, and if necessary provide the
needed mapping parameters; the TAI-UTC difference is one such
parameter. If you want a smoothed leap-second, and there seem to be
people who want that, well then we can describe another model that
provides that out of the same core time.
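
As a sketch of what those two models could look like; everything here
is an assumption of mine (a hypothetical core counting TAI-like
seconds, a server-supplied TAI-UTC difference with a heads-up for the
next leap event, and an arbitrarily chosen 1000 s smear window), not a
proposed design:

  #include <stdint.h>
  #include <stdio.h>

  typedef struct {
      int32_t tai_utc;   /* current TAI-UTC difference in seconds      */
      int64_t next_leap; /* core time of the next leap event, -1: none */
      int32_t leap_step; /* +1 (insertion) or -1 (deletion)            */
  } leap_info_t;

  /* Plain UTC model: subtract the offset valid at the given instant. */
  static int64_t core_to_utc(int64_t core, const leap_info_t *li)
  {
      int32_t off = li->tai_utc;
      if (li->next_leap >= 0 && core >= li->next_leap)
          off += li->leap_step;     /* the offset changes at the event */
      return core - off;
  }

  /* Smoothed model: spread the step linearly over a window ending at
   * the event, instead of stepping. */
  static double core_to_smeared_utc(int64_t core, const leap_info_t *li)
  {
      const int64_t window = 1000;  /* assumed smear length in seconds */
      if (li->next_leap < 0 || core <= li->next_leap - window)
          return (double)core_to_utc(core, li);
      if (core >= li->next_leap)
          return (double)(core - li->tai_utc - li->leap_step);
      double frac = (double)(core - (li->next_leap - window)) / window;
      return (double)(core - li->tai_utc) - frac * li->leap_step;
  }

  int main(void)
  {
      leap_info_t li = { 37, 1000000, +1 };  /* made-up example values */
      for (int64_t t = 999998; t <= 1000002; t++)
          printf("core %lld -> utc %lld, smeared %.3f\n", (long long)t,
                 (long long)core_to_utc(t, &li),
                 core_to_smeared_utc(t, &li));
      return 0;
  }

Both outputs come from the same core count; which one a consumer wants
is a mapping decision, not a protocol decision.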

Providing alternative time-scales in the core timing protocol would be
confusing to combine: consider server A providing TAI, server B
providing UTC and server C providing leap-second-smoothed UTC; being
able to do any of the fancy multiple-server combinations then becomes
very troublesome. If the servers instead keep a common core time
format that is also TAI-like, thus with no jumps etc., then that core
becomes very easy to process. Then, out of that core time, regardless
of which set of servers it was derived from, we can map it out to a
number of user formats. TAI and UTC both seem to be formats we need to
support, and, even if it is to my disliking, the leap-second-smoothed
UTC-like time. Other such user formats are the UNIX/POSIX time_t and
the Linux time_t. Being able to produce the PTP time-scale is needed
in modern media production, as SMPTE 2110 does based on the SMPTE
2059-1/2 mappings. This is also used in AES 67, and in particular the
lack of PTP presence while running AES 67 prompted a very long
deviation away from audio transport protocols, because the time needed
was not well described.

>> Similarly, the GPS time-scale has its epoch defined to be at 1980-01-06T00:00:00Z, thus locking in the TAI-UTC offset of 19. This is referred to as proliferation of TAI-variants, but really isn't, as it's not attempting to fill the position of either TAI or UTC. Both the GPS and PTP internal time-scales are just technical scales for internal representation. Regardless of whether your GPS receiver outputs GPS time or UTC time (as the GPS-UTC difference is transported), you will never get the actual GPS or UTC time, but local replicas. From a metrology calibration stand-point, they have unknown properties, due to the lack of calibration and of the traceability achieved through calibration. Similarly with PTP, you can out of the PTP time produce a TAI replica as well as a UTC replica, as PTP transports the TAI-UTC difference.
>
> But it doesn’t have to.  That’s a convenience for devices that don’t have external sources of conversion information.
Which then turns the issue over to the feasibility of doing it at
large scale. I think it does not take much trouble to transport the
needed parameters while not obstructing the core processing. The
additional parameters are very low-bandwidth data that, with a bit of
thought put into it, can be taken sufficiently off the critical timing
path, even if there are a few "loose" real-time requirements.
>
> An E3 digital cross-connect doesn’t typically receive updates of Zoneinfo, for example… even though it very much cares about accurate time (or it will have frame-slips, etc).

PDH and SDH/SONET do not use time in any such sense; at best you
attempt to "synchronize" (actually syntonize) just to keep quality
good enough. Having the time and date on the device has really nothing
to do with the switching itself, as that is controlled by framing and
control information, but with the management putting time-stamps on
events when things fail etc., and that has far lower requirements on
precision. As it happens, I've spent a few decades designing equipment
and sorting out synchronization for telecom.

>> So, these conversions or mapping between timescales is practical engineering things we do, but not really proliferation issues.
>
> Or conversely it’s mixing up unrelated topics because of some intangible and incorrectly perceived benefit that it *may* provide, and we overcomplicate what should be a lean protocol unnecessarily.
>
> We could, if we wanted to, come up with a protocol to distribute the Zoneinfo database to “dumb” devices like video conferencing systems, which would then mean that they had a separate mechanism for knowing when to apply leap-seconds, and we could remove that from what should be a strictly clock synchronization protocol… instead of laboring under the misconception that we need to solve all these problems in a common place, just because they’re both time-related.
I am not in favour of using the zoneinfo database for that; I think it
complicates matters and becomes troublesome to make the overall
function operate as needed, at least for what I consider to be the
core service.
>> It's the thing we need to do.
>
> And there’s that conflation.  No, it’s the thing you want to do, because it kills two birds with one stone.
>
You may think so, but I end up seeing that what you try to push is
going to be problematic to operate properly, and I try to keep things
together when they belong together. NTP provides the time-service, and
the time-service requires us to achieve certain things. Spreading
things out to other places is like breaking layers; it complicates
things. This is where trying to make things simpler than they need to
be becomes a burden.

Splitting it off is extremely ill-advised in my bitter experience.
It's drawing the wrong conclusions from previous troubles.

>
>> Just using TAI alongside UTC has been an issue, as some feel it should be UTC and only UTC to rule them all.
>
> And yet UTC isn’t inherently unambiguous, especially if we believe that a day contains 86,400 seconds, 24 hours of 60 minutes, 60 minutes of 60 seconds, and each second being an SI…
UTC is unambiguous. You keep talking about what ambiguous mappings of
UTC produce, but that result will not be compliant with UTC. Do not
confuse the properties of UTC with those of the derivative time-scales
that NTPv4, POSIX and Linux produce. The underlying UTC is unambiguous
itself. It's very simple: if you cannot represent a leap second as
:60, you do not have UTC. NTPv4, POSIX time_t and Linux time_t fail
that test, so you cannot talk about them as being UTC. At best, they
can represent a UTC-esque time. For some uses, that's good enough, and
that's fine if that is all you need. If you need better consistency or
you need the actual UTC, you need to do more. This goes back to what I
said earlier: I see many uses needing (as in requiring, sometimes by
law) UTC.
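
A tiny demonstration of that :60 test, using the real leap second at
the end of 2016. timegm() is a BSD/GNU extension and the normalization
of tm_sec = 60 is up to the libc, so take it as a sketch:

  #define _GNU_SOURCE
  #include <stdio.h>
  #include <time.h>

  int main(void)
  {
      struct tm leap = { .tm_year = 2016 - 1900, .tm_mon = 11,
                         .tm_mday = 31, .tm_hour = 23, .tm_min = 59,
                         .tm_sec = 60 };          /* 2016-12-31T23:59:60Z */
      struct tm next = { .tm_year = 2017 - 1900, .tm_mon = 0,
                         .tm_mday = 1 };          /* 2017-01-01T00:00:00Z */

      time_t t_leap = timegm(&leap);
      time_t t_next = timegm(&next);

      /* Both print 1483228800: time_t cannot represent the :60 second,
       * so the leap second and the second after it collapse into one. */
      printf("%lld\n%lld\n", (long long)t_leap, (long long)t_next);
      return 0;
  }
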
>> Well, in practice we need TAI properties alongside the UTC properties. We kind of actually do not need the actual TAI, but it is handy if we can provide the replica. We will never get the actual UTC, but it is very handy if we can produce a replica. Since we need to be better than 1 second in our achieved precision for several applications, we end up needing to handle leap-seconds better than the POSIX or Linux mappings do, which are just other mappings. The NTPv4 time-scale is just one more of those mappings of UTC.
>>
>> BTW, Annex B in IEEE1588-2008 is a handy informative annex of the IEEE1588 standard.
>
> I’ll look it over when I have some time.
It's a fairly quick read, not even a full 3 pages, but it is useful
and handy.
>
>
>>>>>>> If NTP v5 sticks around as long as NTP v4 has to date, I think we can’t underestimate the implications in both autonomous flight (the unpiloted taxis that are being certified right now come to mind), as well as the proliferation of commercial space flight… space flight has been commoditized (in part) by the use of commercial-off-the-shelf technologies such as metal 3D printing for generating bulkheads and structural panels.
>>>>>>>
>>>>>>> Why shouldn’t the time standard/format used for space flight also be COTS?
>>>>>>>
>>>>>>> It seems increasingly probably over the next 20 years that interplanetary flight will become common.
>>>>>>>
>>>>>> Things can be installed in places where we just don't have access to
>>>>>> normal network.
>>>>>>
>>>>> Can you say what a “normal network” will be in 20 years?  I can’t.
>>>>>
>>>>> When I wrote RFC-1048 in 1988 I hardly thought there would be more than 4B IP-enabled smartphones with 2-way video capability less than 30 years later.
>>>>>
>>>>> I’m not even sure what 10 years down the road looks like.  And I’ve been trying for almost 4 decades.  My track record is nothing to brag about.
>>>>>
>>>>> What’s the shelf-life on WiFi6 or 5G?
>>>>>
>>>>> Will residential US ISP’s finally have rolled out IPv6?
>>>>>
>>>> I think you completely missed my point, to the level that you missed
>>>> what I was really saying: there is a wide range of different
>>>> scenarios already today, and as we design a core protocol, no single one
>>>> of them will provide the full truth.
>>>>
>>> The corollary of that is that whatever we design, there will be use cases where we don’t apply well… or at all.
>>>
>>> I think it’s a fool’s errand to pursue a “one size fits all” solution to a highly technical problem.
>>>
>> Which isn't what I say, if you listen a little more carefully to what I say. I actually say the opposite. One size does not fit all, and the trouble is that I see that the same protocol design may end up being used in such diverse settings that we might consider these as just that, different scenarios. I think most of the things we want to do will be the same, but some aspects will be quite different. However, if we treat them with sufficient care so they can be added and removed for the various scenarios, we can make sure the core protocol remains the same and that the set of things needed for the "Internet" scenario is clear, required as we implement that, and wise to do in that scenario.
>
> If people try to use the wrong protocol in their given circumstances, there’s not much we can do to stop them.  At least not in any practicable way.
>
> Worrying about how our protocol operates in outlying cases beyond our scope or control is a waste of time.  And there will always be more such permutations than we can foresee.
It tends to be a useful exercise, even if one does not complete all
assignments.
>>>> You only illustrate that you do not know me when you attempt to
>>>> challenge me like that.
>>>>
>>> I’m not challenging anyone.  I’m acknowledging that there are unknowables, particularly about the future.
>>>
>>> The larger a period one considers, the more numerous and significant in magnitude the unknowables are in retrospect.
>>>
>> As if I were not already alluding to that.
>
> Okay, well, we’ve found something else to agree on then.
I do not really understand how you came to disagree with me, but ah well.
> Converging on an acceptable normative standard is not unlike eating an elephant.
I do not eat elephant. If I had to, I would make an overall plan and
then eat small chunks at a time. Then again, discussing elephant
eating is way off topic.
>>>>>>> Further, assuming we start colonizing the moon and Mars… stay with me here… will the length of a terrestrial day still even be relevant?  Or will we want a standard based not on the arbitrary rotation of a single planet, but based on some truly invariant measure, such as a number of wavelengths of a stable semiconductor oscillator at STP?
>>>>>>>
>>>>>> You can set up additional time-scales and supply parameters to convert
>>>>>> to them. If you need a Mars time, that can be arranged. JPL did that
>>>>>> because they could.
>>>>>>
>>>>> KISS
>>>>>
>>>>> The Mars Climate Orbiter impacted the planet in 1999 because NASA was using imperial units while everyone else was using metric.  That’s a catastrophic mistake arising from the existence of a SECOND scale of measurement.
>>>>>
>>>>>
>>>>> https://everydayastronaut.com/mars-climate-orbiter/
>>>> You are now very much of the mark here.
>>>>
>>> Sorry, typo?  “On the mark”?  “Off the mark”?
>>>
>> Off
>>
>> The only relevance to that story is with Mars. It was not a timing issue; it was a huge engineering debacle, complex enough that it's hard to see its relevance here.
>
> The relevance is that some people were thinking in metric while others were thinking in imperial.
>
> Had metric been the single common unit of measure, there never would have been any confusion about the values being discussed.
The actual flaw was much more problematic, because first of all you
had a very asymmetric bird for economic reasons, and the balance setup
required remote control. It turns out that the command and control
interface specifications were not checked between the satellite and
the ground control segment, so the lack of conversion on the ground
became apparent only in the investigation, where it was uncovered that
data was interpreted as being in a different unit than sent, because
the interface specifications had not been fully reviewed by all
relevant parties.
> Similarly, it would be nice to have a single, unambiguous timescale from which other timescales (including those which are inherently ambiguous due to leap-seconds, etc) could be derived…
Sure, that part is exactly what I have been advocating for here.
>>>>>>>>> --------
>>>>>>>>>
>>>>>>>>> Part of the motivation for this is to enable and encourage OSes to convert to 
>>>>>>>>> non-leaping time in the kernels.  Are there any subtle details in this area 
>>>>>>>>> that we should be aware of?  Who should we coordinate with?  ...
>>>>>>>>>
>>>>>>>> I think that would be far too ambitious to rock that boat.
>>>>>>>>
>>>>>>> Divide and conquer.
>>>>>>>
>>>>>>> I think POSIX clock_* attempted this by presenting mission requirements-based API’s.
>>>>>>>
>>>>>> That was only part of the full interface.
>>>>>>
>>>>> Yes.  So?
>>>>>
>>>> If one argues based on what the POSIX CLOCK_* does, then one will not
>>>> get the full view.
>>>>
>>> The “view” that I am going for is that API’s can be extended to accommodate shortcomings that weren’t understood previously, but later come to light.
>>>
>> Sure they can. Also, there can be API's that are already there which can be considered if they are not sufficient and already implemented.
>
> Sorry, if they “are not” sufficient and already implemented?  Or if they “are”?
Now, I have come to believe there is a sufficient interface already
implemented; it's just not included in what you referenced. Not all
implementations have this. I think we can look at what is already
there and see how it can be used. However, as per our other
discussion, this is more of an implementation issue. It is however
relevant to some degree if people think there is no support, and I
think there is.
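
For instance, on Linux (kernel 3.10 and later) there is already
CLOCK_TAI next to CLOCK_REALTIME. It only differs from realtime once
something, typically the NTP daemon via adjtimex(), has told the
kernel the current TAI-UTC offset; otherwise the two clocks read the
same. A quick check, as a sketch:

  #include <stdio.h>
  #include <time.h>

  int main(void)
  {
      struct timespec utc, tai;

      clock_gettime(CLOCK_REALTIME, &utc); /* UTC as POSIX renders it   */
      clock_gettime(CLOCK_TAI, &tai);      /* TAI, if the offset is set */

      printf("CLOCK_REALTIME %lld, CLOCK_TAI %lld, offset %lld s\n",
             (long long)utc.tv_sec, (long long)tai.tv_sec,
             (long long)(tai.tv_sec - utc.tv_sec));
      return 0;
  }
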
>
>
>>>>>>> The next step is to have consumers of time migrate… perhaps starting with logging subsystems, since unambiguous time is a requirement for meaningful forensics.
>>>>>>>
>>>>>> They will do what the requirements tell them. Many have hard
>>>>>> requirements for UTC.
>>>>>>
>>>>> Many will also migrate to whatever provides them with unassailable (unambiguous) timestamps in the case of litigation.  I’ve worked on timing for natural gas pipelines so that catastrophic failures (i.e. explosions) could be root caused by examining high precision telemetry… not the least of which was to protect the operator in the case of civil suits of criminal negligence, etc.
>>>>>
>>>> There are many ways to show traceability to UTC, while technically
>>>> avoiding problems. This however often involves knowing the TAI-UTC
>>>> difference one way or another such that the relationship to UTC is
>>>> maintained. The requirement to be traceable to UTC will still stand in
>>>> many many cases. The way that one technically avoids problems is to some
>>>> degree orthogonal to that.
>>>>
>>> I disagree with this premise:  we don’t need to be traceable to UTC.  UTC needs to be derivable from our timescale, so that it’s meaningful to applications (and eventually, humans).
>>>
>> For many uses I agree with you. However, there are other uses for which it becomes a requirement. The question then becomes whether we engineer NTPv5 to be able to deliver sufficient properties to fulfill that, or not.
>
> Again, this is a generalization that I have a hard time grounding in anything concrete.
>
> Can you explain when/where/what this might be?  What are these “other uses”?

OK, so let's start with a case I think you will agree with.

Let's say you have a router or a server that produces logs. We want
time-stamps on these logs to know when things happen. Fine, we get
time from some NTP server and the box seems to be ticking away, and
for many purposes we have sufficiently correct time that we can
correlate the logs of several boxes. If this time is off by some
arbitrary amount nobody cares, as long as it is consistent between
what you see, but it's good that it matches time of day, and UTC seems
good enough, probably with some local time-zone tossed in.

Then, let's consider a mobile base station. You have the same needs
for logs there as in any operational environment, and for most uses
the original description works. However, law enforcement now makes key
events relating to phones' interaction with the network critical: they
use these to record the presence of a phone at a particular base
station (antenna), and they record when voice and text messages are
sent, etc. The regulator then puts the requirement on the operators
that the time of those things needs to be traceable to UTC, and can
put further requirements on that. In practice, a lot of those
base-stations get their time using NTP (this will shift with 5G, as
PTP is expected to take over that role, but it does not diminish the
example).

The regulators and law makers have chosen to include these
requirements on more and more systems, and their gold standard is to
say "traceable to UTC". Notice that this is the UTC as kept by BIPM,
with traceability achieved through any of the signatory national labs,
so none of the ambiguous UTC derivatives we talked about.

Another example is how some regulators check that the time in call
data records (classically a hated term in the IETF, I know) is
correct, such that the correct fare for the time-span is put on the
customer's record. Again, traceability to UTC.

Depending on the application, the requirements for timing range from
+/- 15 min to +/- 100 ns (if I ignore the needs of metrology labs).
Some of these requirements will for sure be outside of NTP's reach,
because of its limitations. Some of them can do with very weak
solutions, even SNTP. However, there are only so many parallel systems
that you want to operate, so it makes sense to handle most
requirements in one simpler system (NTP, that is) and the more
stringent ones in more dedicated systems. The border-line between
these is roughly in the 1-10 ms span, in my judgement.

Now, traceability could mean we run jungle time and provide
corrections on the side; we covered that separately. However, it
becomes messy to keep a side-system to correct after the fact for most
things, so it is again practical to make continuous adjustments to
keep the system in line, with the traceability records still being
collected for the paper-work side of things to prove one achieves it.
Keeping the actual system adjusted close enough to UTC with that kind
of span means that we need to follow UTC, which makes leap-second
smearing not useful for those applications (not to say there are not
other applications where it may be a sensible solution). I end up
concluding that it makes sense to do it properly.

I used the mobile case as one particular demonstrator, but there are
more of them. These mobile networks do not sit on public networks (aka
the Internet) but on separated networks. These are huge separate
networks. Exactly how much of the public Internet is allowed "in"
depends on many things. The operation of these is quite different, and
changes to the network can be cumbersome.

>>>> Then again, many seem to misunderstand what the term "traceable" means.
>>>> It does not mean "locked to".
>>>>
>>> Maybe I’m one of the many misunderstanding then.  For me “traceable to” is synonymous with “originating from”.
>>>
>> It is a very common misunderstanding, yes. The term "traceable" refers to the unbroken chain of calibrations to the SI units and the traceability record that produces, as each calibration will measure the deviation and precision achieved. I can use a complete jungle clock that is unsteered and then, through continuous calibrations, be able to convert the readings of my clock into the readings of UTC. This is a mapping with defined parameters from that calibration. Adjustment of the clock during a calibration is about re-setting the clock so that the calibration parameters compensate directly, which is more a practicality in how the conversion is made, but not strictly necessary. In practice, the actual clocks building up EAL/TAI/UTC are free-running, and the laboratory replicas of TA and UTC are steered to be near TAI and UTC, but the actual clocks are not steered. Also, the trouble is that the gravity pull on the various labs needs compensation, which is done in the ALGOS algorithm as post-processed by BIPM, and not by the labs. Nevertheless, all clocks have traceability to TAI/UTC. Derived clocks then show their traceability to the lab replica of UTC. This just to show how traceability actually is a bit different in metrology than you would at first want to believe. The International Vocabulary of Metrology (VIM) is free to download from BIPM.
>>
>> One may then wonder why we need to follow the VIM and metrology use of the terms. Well, the trouble is that if we don't, we end up creating confusion, because they end up doing calibration and dissemination of time using NTP. It is their time-scales of TAI and UTC that we use. I try to be strict in order to avoid confusion, and I think there is enough alternative vocabulary to use that we do not need to add that confusion.
>
> Okay, thanks for clearing that up for me.
Happy to assist.
>
>
>>>
>>>
>>>>>>>>> I'd like the answer to be authenticated.  It seems ugly to go through NTS-KE 
>>>>>>>>> if the answer is no.
>>>>>>>>>
>>>>>>>> Do not assume you have it; prefer the authenticated answer when you can
>>>>>>>> get it. I am not sure we should invent yet another authentication scheme.
>>>>>>>>
>>>>>>>> Let's not make the autokey-mistake and let some information be available
>>>>>>>> only through an authentication scheme that ended up being used by very
>>>>>>>> few. You want to have high orthogonality as you do not know what lies ahead.
>>>>>>>>
>>>>>>>> So, we want to be able to poll the server for capabilities. Remember that
>>>>>>>> this capability list may not look the same on un-authenticated poll as
>>>>>>>> for authenticated poll. It may provide authentication methods, hopefully
>>>>>>>> one framework fits them all, but we don't know. As you ask again you can
>>>>>>>> get more capabilities available under that authentication view. Another
>>>>>>>> configuration or implementation may provide the exact same capabilities
>>>>>>>> regardless of authentication.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Maybe we should distribute the info via DNS where we can 
>>>>>>>>> use DNSSEC.
>>>>>>>>>
>>>>>>>> Do not assume you have DNS access, the service cannot rely on that. It
>>>>>>>> can however be one supplementary service. NTP is used in some crazy
>>>>>>>> places. Similarly with DNSSEC, use and enjoy it when there, but do not
>>>>>>>> depend on its existence.
>>>>>>>>
>>>>>>> Good point.
>>>>>>>
>>>>>>> As someone who works in security, I’ve seen a fair share of exploits that arise when protocols make tacit assumptions about the presence and correctness of other capabilities and then these turn out not to be valid under certain critical circumstances.
>>>>>>>
>>>>>>> Doing X.509 when you don’t have Internet connectivity for CRL’s or OCSP is a good example.
>>>>>>>
>>>>>> I've seen many networks where the normal rules of the Internet do not
>>>>>> apply, yet they are critical, need their time, and are fairly well protected.
>>>>>>
>>>>> Not quite following what you’re saying.  Are you alluding to operating a (split-horizon) border NTP server to hosts inside a perimeter, that in turn don’t have Internet access themselves (effectively operating as an ALG)?  Or something else?
>>>>>
>>>> There are indeed a lot of scenarios where you operate NTP inside setups
>>>> which have no, or very limited Internet access. Yet it is used because
>>>> it is COTS "Internet technology" that fits well with what is used. As we
>>>> design things we also need to understand that there is a wide range of
>>>> usage scenarios for which our normal expectation of what we can do on
>>>> "Internet" is not necessarily true or needed.
>>>>
>>> That should be a tipoff right there:  We’re working on “Internet standards”.  Not “Intranet standards”.
>>>
>>> We don’t need to solve every problem in every scope.  There’s a reason the term “local administrative decision” appears in many, many standards.
>>>
>>> Again, protocol, not policy.  Deciding that we need to accommodate rare/one-off usage scenarios in isolated cases is a policy decision.  It’s choosing to insert ourselves into an environment that’s not strictly within our purview.
>>>
>>> What people do on their own intranets with the curtains drawn (and the perimeter secured) is very much their decision…
>>>
>> If it were that easy. The success of Internet protocols and standards forces things to be used where normal Internet rules do not apply.
>
> I’m not sure anything is being forced.  I think people are lazy.  And when a hammer is the closest tool to you, everything in your reach becomes a nail…
>
> We can’t stop them, but nor do we have to overly concern ourselves with enabling or facilitating this behavior.
>
> People use self-signed certificates even though it’s a travesty (read: insecurity) and certificates should be rooted to valid, well-known root CA’s.  Knowing this to be the case, I’m not going to bother asking myself “what happens to my protocol when self-signed certs are used”.  The answer is: “not my problem”.
Well, for some of these things I agree we should not fix them;
self-signed certificates are just one in a row. However, I'm not
saying we should solve all those problems, but realizing the sheer
diversity of uses that the protocols we design have can be a good
exercise, asking ourselves what we can learn from it as we take a step
back from just considering what is the right thing for the case of
"Internet". Some design decisions we make can turn out to be unwise
even for that case, but we may not see them. I've always found it good
to understand multiple scenarios as things scale up and down, as it
illustrates issues with the assumptions being made, which may not be
what we want even for our primary scenario. Some of the reasonable
relaxations of concerns may not be that expensive, and we can gain a
more modular and workable protocol that is more future resistant. It's
somewhat more effort, but in my humble experience it's well worth the
effort and exercise.
>
>
>> This is very much so with NTP. I think there are things we can relatively easily do to identify some other usage scenarios that can be supported without too much work. Then again, I'm not saying that it needs to be the main focus, but rather that allowing for it and thinking it through may also be a good vehicle to support future changes to NTP itself as it is adapted for that unknown future.
>
> If I were redoing SMTP again, I would spend no time at all trying to accommodate the existence of X.400, Bitnet, Decnet architecture, Microsoft MAPI/Exchange, etc.  Refrain: not my problem.
Which is not really on the same scale here.
>> Part of this was "I want this being authenticated". Well, the vehicle to transport the TAI-UTC difference in NTP ended up being wrapped into the NTP Autokey specification, which to work needed to create the extension-field mechanism for Autokey to operate in, and that then also allowed for the TAI-UTC difference transport, which needed that extension mechanism too. Also, it was felt that we want that authenticated, and sure, the TAI-UTC difference is kind of important. Now, a couple of years down the line the security of the Autokey mechanism was found faulty, and with that the TAI-UTC difference transport got thrown out. So, this is why I said that we should be careful not to couple it too hard, and the same about feature capabilities, because in the future we may lose them if we do not treat them a bit more orthogonally. If we 5-15 years down the line need to replace the authentication mechanism, we should not lose the things wrapped and secured by it. We might actually want to have ways of knowing capabilities when we first knock on the NTP server door, just to know which authentication mechanisms are there, so we can transition and change foot. The other extensions there are should move over as we migrate. We failed to do that well in NTPv4. And then there is the realization that some of this may not be applicable for all uses, with specific requirements for others.
>
> Cautiously, it sounds like we found yet another thing to agree on: that we should decouple as much as we can from unrelated features, rather than unnecessarily munging them together.
>
>
> So while I don’t mind the notion of carrying the TAI-to-UTC conversion and traceability information as an optional field, I don’t think that argues for (i.e. justifies) making UTC be the canonical time format that NTP uses natively.

I have been advocating the ability to carry such data as extension
fields, and to use a simple TAI-like time-scale in the core. Then, as
a separate notion, I think that for many it will end up being a
requirement to get that TAI-UTC offset, so the field may not be as
optional as one thinks, but that is a separate issue from not even
having the extension field as such to transport it in.

So, do the decoupling, but keep the ability to transport the needed
dynamic parameters.
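
Just to illustrate the size of what I mean by "needed dynamic
parameters", here is a purely hypothetical sketch of what such an
extension field could carry; every name and width is an assumption of
mine, not any draft's wire format:

  #include <stdint.h>

  struct ntpv5_tai_utc_ef {   /* all fields big-endian on the wire */
      uint16_t type;          /* extension field type code         */
      uint16_t length;        /* total field length in octets      */
      int16_t  tai_utc;       /* current TAI-UTC difference, s     */
      int8_t   leap_step;     /* +1 or -1 at next event, 0: none   */
      uint8_t  reserved;
      uint64_t next_event;    /* core-time of the next leap event  */
  };

Sixteen octets per response; low-bandwidth indeed, and it can ride
along without touching the critical timing path.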

>
>
>
>>>> What is a very wise thing
>>>> to do for "Internet" may be overly burdensome and even prohibitive in
>>>> other scenarios, and vice versa, things that we may think are wise to do
>>>> in those scenarios can be wildly bad in the "Internet" usage scenario.
>>>>
>>> Again, we’re not the “Other scenarios Engineering Task Force”.
>>>
>> Well, if life were that simple.
>
> First step: don’t add new complexity.  Second step: purge existing unnecessarily complexity wherever you find it.
>
Only true if you achieve the minimum complexity the problem requires,
and there comes the hard part: agreeing on what is in that scope. I've
found that people cut out and fail to add the complexity that is
needed. Thus, you need to understand what is needed, and then add the
capability for it with the least complexity.

For NTP, in NTPv4 you have an unnecessarily complex thing due to the
basic property that the core time-scale jumps around. That's based on
some history and how things kind of work. We can replace that with a
simpler time-scale to do the core things in. Then, we seem to need a
few other outputs, and we need to add the capability for those while
trying to have the least complexity. In my experience the answer is a
TAI-like core time, a mapping of core time into TAI, and then further
mappings through known definitions.

Along the discussions there has been some confusion as to what
represents TAI and UTC; I think we have now sorted some of that out. I
think it is fair to transport the TAI-UTC difference, complete with
heads-up information about the next upcoming event. With that, the
core time can be easy to process, and we can produce a TAI replica, a
UTC replica, an NTPv4 replica, a PTP replica, a UNIX/POSIX time_t
replica, a Linux time_t replica and, if we so like, even a GPS time
replica. All of those are within the precision of the core time-scale,
if implemented correctly. We have then not provided a
leap-second-smoothed UTC replica or a UT1 replica (which achieves a
similar property). We can discuss whether that should be done through
a static mapping (as is possible for the leap-second-smoothed case) or
through additional dynamic parameters. As long as there are optional
fields to support such dynamic parameters for particular
transformations, we could support future formats, even if intermediary
servers do not understand these formats, as it will be a service to
those end nodes that require them.
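
To show how mechanical those replicas are once the core is TAI-like
and the TAI-UTC difference is transported, a sketch; I assume, purely
for illustration, a core counting TAI-like seconds from the PTP epoch,
away from any leap event, with the GPS constant folding in the fixed
19 s TAI-GPS difference:

  #include <stdint.h>
  #include <stdio.h>

  #define TAI_UTC         37           /* transported dynamic parameter */
  #define GPS_EPOCH_CORE  315964819LL  /* 1980-01-06T00:00:00Z in core
                                          seconds; TAI-GPS = 19 s       */
  #define NTP_UNIX_OFFSET 2208988800LL /* 1900-to-1970 era offset       */

  int main(void)
  {
      int64_t core  = 1609853687LL + TAI_UTC;  /* example core time     */

      int64_t posix = core - TAI_UTC;          /* UNIX/POSIX time_t     */
      int64_t ntpv4 = posix + NTP_UNIX_OFFSET; /* NTPv4 era-0 seconds   */
      int64_t gps   = core - GPS_EPOCH_CORE;   /* GPS seconds from 1980 */
      int64_t ptp   = core;                    /* PTP equals core here  */

      printf("posix=%lld ntpv4=%lld gps=%lld ptp=%lld\n",
             (long long)posix, (long long)ntpv4, (long long)gps,
             (long long)ptp);
      return 0;
  }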

Now, one could discuss the merits of transporting these dynamic
parameters either inside NTP or through side-channel forms, and I tend
to think that it becomes troublesome to use side-channel forms for a
number of reasons, including operational concerns. This is why I have
a strong preference for keeping it in an NTP extension field.

>
>>>> This is why I try to point out that we might consider keeping the core
>>>> behaviors independent of such usage scenarios, but then enable the
>>>> ability to drop in or even require components and behaviors as needed
>>>> for each such scenario.
>>>>
>>> We might actually be agreeing here, though for different motivations.  I like to keep things simple and keep policy out of protocol, which has the end effect of making the protocol inherently more widely applicable (including beyond the originally mandated scope).
>>>
>> The trouble here is that we want to keep things simple, but not so simple that it no longer solves the problems that need to be solved. Whenever one tries to simplify too much, what looks simple and good ends up being an engineering nightmare, as workaround upon workaround is needed to achieve what is needed.
>
> “Require components and behaviors as needed for each such scenario” sounds dangerously like “dictate policy” to me.
>
> Not sure I agree with that generalization.
>
> MIME was added to Mail to add multi-media capability, and it worked.  It’s not pretty, but it’s highly functional.
>
> Telnet options were from the very beginning open-ended.  When local line-editing was needed because remote editing over high-latency lines was just too painful, it “dropped right in”.  As did TN3270 interoperability for people who needed to talk to IBM Mainframes.
As you overinterpret the generalization it indeed becomes dangerous,
and way outside of what I mean, to the degree that I wonder why you
keep mocking me all the time. You end up sounding very disgraceful,
and I do not think you intended to. However, the way you try to put
limits on it, that too becomes dangerous, as you can end up with a too
restricted thing, and the end result risks becoming more complex. I've
made both mistakes multiple times myself, and seen them made many more
times than I care to explain.
>
>>>>>>>> When DNS is available, what additional values can be put there to aid
>>>>>>>> validation and service?
>>>>>>>>
>>>>>>> Uh… Available “where”?  The internet is relativistic.  What the server sees and what the client sees can be wildly different, even from moment to moment.
>>>>>>>
>>>>>> I didn't say it was on Internet at all times. I can see things like a
>>>>>> point-to-point wire, I can see a heavily fortified network air-gapped
>>>>>> from Internet, I can see networks heavily fortified with very indirect
>>>>>> access to Internet. I can see things with a NAT or just a simple
>>>>>> firewall to the Internet. For some of these scenarios DNS will be
>>>>>> available, for some not. They all use the same "Internet technology"
>>>>>> because that is COTS. So, designing an Internet technology also entails
>>>>>> designing for these other scenarios. A successful design also
>>>>>> understands and respects the problems of the more isolated scenarios.
>>>>>>
>>>>> FWIW, a border forwarding/caching recursive DNS server is a degenerate case of an ALG.
>>>>>
>>>> Sure, but not always available, and sometimes the operation of those may
>>>> prohibit uses further into the network because, well, politics within an
>>>> organization or even between organizations. So, you may not always have
>>>> it available as you would like. Thus, you have another usage scenario.
>>>>
>>>> Cheers,
>>>> Magnus
>>>>
>>> Um…  So what’s to stop such users from operating a GPS clock and driving their time from that, etc, etc?
>>>
>>> If they’re choosing to NOT participate in the Internet at large then I’m not really sure how the Internet Engineering Task Force bears any obligation to accommodate them.
>>>
>>> In other words, “not my problem”.
>>>
>> Well, it may not be your problem, but it ends up being our problem as we then need to solve those issues.
>
> Sorry, but you keep waving your hands on this issue.  Who says we have to solve these problems?  How does it end up being “our problem”?  If they’re on an isolated intranet, with no possible connection to us, it’s a bit like a tree falling in the forest with no one around, isn’t it?  Whether it makes a sound or not changes nothing from my perspective.

No, you keep waving your hands around "Internet", when in reality the
same technology is used in many large networks, with more or less
connection to "The Internet". Quite large portions of the boxes are
used in more or less hidden applications, while built on the same
technology.

>> Internet technology is being used outside of what you would call the Internet; it's just a fact of life. It is assumed that COTS boxes can operate there, and when they cannot it becomes a problem, and because it's "Internet technology" it is assumed to work together regardless, in a multi-vendor setup. You end up with a proliferation of various variants and it's a pure pain to handle.
>
> Again, I’m not ending up with anything here, because what you describe is happening on an Intranet that bears no connection or relevance to me.
>
> And I don’t assume that COTS boxes that require the Internet operate anywhere that the internet is not connected.  If they can’t be pushed software updates, for instance, then they will eventually have a critical software vulnerability which can’t be patched for and I want nothing to do with them.
>
> But again, none of this is “a pure pain to handle” because it’s happening somewhere that I am insulated from, isolated to, and blissfully ignorant of.

Yes, well, that is why I bring it up. What may seem like a good
decision for the Internet may not be that for other cases, which then
forces them to break away and solve it by some other means. That means
more divide in the features used, and, well, "IETF can do what they
feel like, we cherry-pick the good stuff and do what we want to do".
That just ends up deepening the divide there is. Rather, if one can
look at other scenarios and see what can be useful for them, one can
at least reduce the divide a little. IETFers can see the same thing as
they look at the IEEE 802 group, where IETF cherry-picks some of the
IEEE 802 output, but then ignores some of the things.

Now, I think there are some lessons to be learned by looking at
different scenarios, realizing that not all solutions fit all. I think
one can make reasonable generalizations, that is, generalizations not
going overboard, but that provide means for variations that may be
useful already for the base scenario. It's a little more than the bare
minimum, but maybe more future proof. Adding extensions afterwards
ends up being a little bit messy: MIME is one example, with ARP,
multicast group mapping and IGMP with IGMP snooping going towards the
uglier side of things.

>
>> Easing some of that pain will always be welcome, such as understanding that it may not always be that you can count on, say, DNS to save the day in all cases. At the same time, the best current practice for Internet devices is for sure on topic, and has particular requirements we do not want to get wrong.
>>
>> Cheers,
>> Magnus
>
> It sounds like you’re sending a mixed message, but maybe I’m just not understanding.  What I’m hearing you say is, “be sparse [or ‘economical’] in your assumptions about where things might be used”, but I’m also hearing you say “make sure that, as an Internet protocol, it operates properly in the total absence of Internet connectivity”… which is a huge burden to assume.

If the message were as simple as you would like, it would rule out
necessary parts; then the message is overly simple and not useful. I
can reduce the message a lot, but not below the most basic complexity
it needs, and some of the things are contradictory. If you think I
have one simple message, you read me all wrong.

Cheers,
Magnus