Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Tue, 12 June 2012 13:05 UTC

Message-ID: <4FD73E76.7050105@it.aoyama.ac.jp>
Date: Tue, 12 Jun 2012 22:04:54 +0900
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.9) Gecko/20100722 Eudora/3.0.4
MIME-Version: 1.0
To: Stephen Farrell <stephen.farrell@cs.tcd.ie>
Subject: Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
References: <4FCDD499.7060206@it.aoyama.ac.jp> <4FCDE96E.5000109@cs.tcd.ie> <4FD7083A.6080502@it.aoyama.ac.jp> <4FD712E8.7010506@cs.tcd.ie>
In-Reply-To: <4FD712E8.7010506@cs.tcd.ie>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Cc: Graham Klyne <GK@ninebynine.org>, IETF discussion list <ietf@ietf.org>, "draft-farrell-decade-ni@tools.ietf.org" <draft-farrell-decade-ni@tools.ietf.org>
Precedence: list

Hello Stephen,

On 2012/06/12 18:59, Stephen Farrell wrote:
>
> Hi Martin,
>
> On 06/12/2012 10:13 AM, "Martin J. Dürst" wrote:
>> Hello Stephen,
>>
>> This mail responds to your points on the main technical issue that I
>> have identified.
>>
>> On 2012/06/05 20:11, Stephen Farrell wrote:
>>
>>> On 06/05/2012 10:42 AM, "Martin J. Dürst" wrote:
>>>> Hello everybody,

>>>> Major design issue:
>>>>
>>>> The draft defines two schemes, which differ only slightly, and mostly
>>>> just gratuitously (see also editorial issues).
>>>> These are the ni: and the nih: scheme. As far as I understand, they
>>>> differ as follows:
>>>>                                       ni:                nih:
>>>> authority:                          optional           disallowed
>>>> ascii-compatible encoding:          base64url          base16
>>>> check digit:                        disallowed         optional
>>>> query part:                         optional           disallowed
>>>> decimal presentation of algorithm:  disallowed         possible
>
> I'll note in passing that the two schemes differ in all those
> respects. You may disagree with our design, but basically you're
> showing that the two differ in pretty much all possible ways
> other than that both include a hash value.
>
>>>>
>>>> The usability of URIs is strongly influenced by the number of different
>>>> schemes, with the smaller a number, the better. As a somewhat made-up
>>>> example, if the original URIs had been separated into httph: for HTML
>>>> pages and httpi: for images, or any other arbitrary subdivision that one
>>>> can envision, that would have hurt the growth and extensibility of the
>>>> Web. Creating new URI schemes is occasionally necessary, and the ideas
>>>> that lead to this draft definitely seem to warrant a new scheme (*), but
>>>> there's no reason for two schemes.
>>>> [(*) I know people who would claim that the .well-known http/https thing
>>>> is completely sufficient, no new scheme needed at all.]
>>>>
>>>> More specifically, if the original URIs had been separated into httpm:
>>>> (for machines) and httph: (for humans), the Web for sure wouldn't have
>>>> grown at the speed it did (and does) grow. In practice, there are huge
>>>> differences in human 'speakability' for URIs (and IRIs, for that
>>>> matter); compare e.g. http://google.com with
>>>> http://www.google.co.jp/#sclient=psy-ab&hl=en&site=&source=hp&q=hash&oq=hash&aq=f&aqi=g4&aql=
>>>>
>>>>
>>>> (which I have significantly shortened to hopefully eliminate potential
>>>> privacy issues), or compare the average mailto: URI with the average
>>>> data: URI. However, what's important is that there never has been a
>>>> strong dividing line between machine-only and human-only URIs or
>>>> schemes, the division has always been very gradual. Short and mainly
>>>> human-oriented URIs have of course been handled by machines, and on the
>>>> other hand, very long URIs have been spoken when really necessary.
>>>> "Speakability" has been maintained to some extent by scheme designers,
>>>> and to some extent by "survival of the fittest" (URIs that weren't very
>>>> speakable (or spellable/memorizable/guessable/...), and their Web sites,
>>>> might just die out slowly).
>>>>
>>>> It should also be noted that the resistance against multiple URI schemes
>>>> may have been low because there are so many different ways to express
>>>> hashes in the draft anyway, and one more (the nih: section is the last
>>>> one before the examples section) didn't seem like much of a deal
>>>> anymore. But when it comes to URIs, one less is a lot better than one
>>>> more.
>>>>
>>>> In the above ni:/nih: distinction, nih: seems to have been added as an
>>>> afterthought after realizing that reading an ni: URI aloud over the
>>>> phone may be somewhat suboptimal because there is a need for repeated
>>>> "upper case" - "lower case" (surely very quickly shortened to "upper" -
>>>> "lower" and then to "up" - "low" or something similar). It is not a bad
>>>> idea to try to make sure that IETF technology, and URIs in particular,
>>>> are accessible to people with certain kinds of dyslexia. (There are
>>>> indeed people who have tremendous difficulties with distinguishing
>>>> upper- and lower-case letters, and this may or may not be connected with
>>>> other aspects of dyslexia.) It is however totally unclear to this
>>>> reviewer why this has to lead to two different URI schemes with other
>>>> gratuitous differences.
>>>>
>>>> Finding a solution is rather easy (of course, other solutions may also
>>>> be possible): Merge the schemes, so that authority, check digit, and
>>>> query part are all optional (an authority part and/or a query part may
>>>> very well be very useful in human communication, and a check digit won't
>>>> hurt when transmitted electronically) and the decimal presentation of
>>>> the algorithm is always allowed, and use base32
>>>> (http://tools.ietf.org/html/rfc4648) as the encoding. This leads to a
>>>> 16.6% less efficient encoding of the value part of the ni: URI, but
>>>> given that other URI-related encodings, e.g. the %-encoding resulting
>>>> when converting an IRI to an URI, are much less efficient, and that URI
>>>> infrastructure these days can handle URIs with more than 1000 bytes,
>>>> this should not be a serious problem. Also, there's a separate binary
>>>> format (section 6) that is more compact already.
>>>
>>> I strongly disagree with merging ni & nih. Though that clearly
>>> could be done, it would be an error.
>>>
>>> There was no such comment on the uri-review list and the designated
>>> expert was happy. That review was IMO the time for such comments
>>> and second-guessing the designated expert at this stage seems
>>> contrary to the registration requirements. So process-wise I
>>> think your main comment is late.
>>
>> First, if IETF Last Call is too late to make serious technical comments
>> on drafts, then I think we have to rename it to IETF Too-Late Call.
>>
>> Second, designated experts are there to check for minimum requirements
>> for a registration, and to give advice as they see fit (and have time).
>> I'm myself a designated expert on "Character Sets", and I have
>> definitely in the past approved, and would again in the future approve,
>> registrations for stuff on which I would complain strongly if the
>> question was "is this a good technical solution".
>>
>> Graham Klyne, the designated expert for URI scheme registrations, has
>> confirmed offline that he does not see his role as "expert reviewer" as
>> judging the technical merit of a URI scheme proposal.
>
> While that's fair enough, it's also fair to note that there was
> discussion of this document on the uri-review list but this
> aspect was not raised at all. That list is called "uri-review"
> and from its archives it does seem to frequently do more than
> just check the paperwork (including quite a few mails from you:-).

Oh well. I'm not sure how familiar you are with the IETF :-), but it's 
basically all a volunteer organization. So even more than in an 
organization where people are getting paid, some things happen to get a 
lot of attention and other things happen to get less attention than 
maybe they would deserve. And most people have their day jobs, and 
occasionally miss a mail or two, or even a few more. And many people are 
on many lists and just skim them until something catches their eye, and 
ignore many threads. If you don't do anything like this and survive, I'd 
like to know how :-).

So the uri-review list, like any other IETF (or other) list, can get 
excited about stuff, sometimes for good reasons, or can just ignore some 
aspects of some proposal. It's unfortunate, but it's something we have 
to live with. You can't ask everybody on that list to take the same 
amount of time to look at all the proposals that e.g. a Security 
Directorate or Apps Area Directorate reviewer is using.

>>> But in any case, I also think you're wrong technically in this case.
>>
>> Let's see. I hope we agree that we should come to a conclusion on this
>> issue on technical merits, rather than on process details.
>
> Sure.

Good.

>>> nih *is* intended for a corner case,
>
> Let me emphasise the above. nih is not intended to be used
> broadly, nor often. If you want a hash-based URI scheme for
> users to speak that is for broad frequent use then I think
> you are free to try design one. But nih is not that and is
> not intended to be that.

Then, at the barest minimum, don't sell it as that. The only thing the 
draft essentially says is "use this if you want to read/speak it". Now 
you are saying "don't use that for users to speak, unless in specific 
corner cases", but you still haven't told me or the readers of the draft 
what these corner cases are, and why they SHOULD/MUST not use it 
otherwise even if they might think it makes sense.

The second, more general advice is of course to not design URIs for 
corner cases, because URIs work best if they are used widely. (There may 
be a very tiny chance that your case is different, but even after 
reading all your mail, I haven't understood that.)

> (And I'm not sure such a beast
> could really be done well.)

Speaking 256 or so bits of random data aloud is in no way going to be 
terribly easy :-).
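
To put rough numbers on that, here is a small sketch (Python, standard 
library only) of how many characters somebody would have to read aloud 
for one 256-bit hash under the three RFC 4648 encodings that came up in 
this thread. The input string is arbitrary and only there to produce a 
digest; nothing below is taken from the draft.

```python
import base64
import hashlib

# Any 32-byte value will do; the input here is an arbitrary placeholder.
digest = hashlib.sha256(b"example input").digest()

# Encode the same 32 bytes three ways, dropping "=" padding where present.
b16 = base64.b16encode(digest).decode("ascii").lower()
b32 = base64.b32encode(digest).decode("ascii").rstrip("=")
b64url = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

for name, text, bits_per_char in (
    ("base16", b16, 4),
    ("base32", b32, 5),
    ("base64url", b64url, 6),
):
    # Only base64url mixes upper- and lower-case letters in its alphabet.
    mixed_case = any(c.islower() for c in text) and any(c.isupper() for c in text)
    print(f"{name:10s} {bits_per_char} bits/char, {len(text)} chars, "
          f"mixed case: {mixed_case}")
```

For 32 bytes this gives 64 characters in base16, 52 in base32 and 43 in 
base64url. Per character, base32 carries 5 bits against 6 for base64url, 
i.e. about 16.7% less, which is where the 16.6% figure quoted above comes 
from; the resulting string is about 21% longer, but uses a single case.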

>>> where humans need to speak these
>>> URIs and was added as a direct result of requirements from the core
>>> WG and not as an afterthought. ni URIs are not intended for that
>>> and so there really are IMO different requirements, (esp. e.g.
>>> checkdigit) that are best met with different schemes.
>>
>> I agree that the value of a checkdigit is very limited for communication
>
> s/very limited/useless/
>
>> among machines (and for communication among humans with the help of
>> machines, such as in the case of email).
>>
>> On the other hand, I can't understand why (even assuming we needed a
>> separate scheme) there is no authority and no query part on nih.
>
> The main intent of nih is to allow entry of something that
> confirms something else (e.g. a public key) that is already
> present.

At least a little bit of information about the "use case", but this just 
leads to more questions:

Do you need a URI for that? (Isn't that what people are supposed to do 
with fingerprints these days?)

How/where would a user input that URI e.g. in a browser?

Would copy/paste from an email or a Web page be fine?

Would entry be needed, or could it just be checked? (I understand that 
forcing a user to enter the data is way safer than believing her that a 
check was made, but then it's also way more tedious, and many users will 
just switch to different software.)

Can ni: URIs also be used for the same purpose? If not, why not (they 
essentially contain the same information)?

What about other URIs, starting e.g. with http:?

I won't claim that this list of questions is exhaustive, but I hope it's 
a good start.

> There is no need for an authority for that, for
> the use-cases we have. We could speculate about other potential
> use-cases but we'd rather not speculate like that when there's
> no need to.

There is a lot of stuff that we take for granted today that Tim 
Berners-Lee didn't speculate about when he designed the first URI schemes. But 
if he had designed these schemes with only corner cases in mind, he 
wouldn't have invented the Web :-). (sorry about an analogy again)

>> For the authority, I'd assume that it would be as useful when the URI is
>> transmitted e.g. over the phone as when it is transmitted e.g. over email.
>
> We don't have a use for that that I know about.

Do you mean the "over the phone" part or the authority part? (assuming 
the latter below)

> I agree
> it could be done, but then I think it'd also impact on
> usability, which will be pretty crap no matter what's
> done.

Reading "example.com" or some equivalent domain name in addition to a 
long hex string shouldn't really make matters much worse.

> But making usability worse also seems wrong.

My understanding is that there are some cases where things will be found 
with authority, but not without. If that's right, then I'm sure users 
will prefer to find what they are looking for, after aurally 
transmitting 256 or thereabout bits, even at the cost of an additional 
domain name (and a few slashes), which are the easiest parts to get 
across the line quickly and correctly.

> Not
> having an authority also seems to work fine for PGP keys

That the lack of authority works in some cases is perfectly okay. But 
it's not an argument for not allowing authority for cases (even those 
that maybe you're not seeing yet) where it is or will be useful.

> and the lack of an authority does get rid of some threats,
> if the nih URI is used for something security-sensitive.

Which would mean that ni: has these threats, yes? Where are they 
described? I don't remember seeing them in the security section.

>> For the query part, there are already various ideas and proposals
>> floating around,
>
> Where? If you mean draft-hallambaker-decade-params

Correct.

> then
> we (the authors of that) don't think those are useful for
> nih names.

Why not? And what if others think they would be useful?

>> and at least some of them would be of interest for when
>> the URI is transmitted e.g. over the phone. Also, even if we currently
>> didn't have any actual proposals for query parameters, I think it would
>> be a very bad idea to exclude them a priori for transmission e.g. over
>> the phone.
>
> I disagree that there is any "very bad idea" here.
>
>>> Merging ni/nih would also add more complexity for no benefit,
>>> which would be a bad idea.
>>
>> Can you please explain what kind of complexity would have to be added?
>
> I think it's obvious actually. In your table above you highlighted
> 5 ways in which ni and nih differ. Merging all those yields loads
> of combinations, which makes for complexity.

Well, if your implementation is a big switch with 32 cases, one for each 
combination, with no code sharing, then indeed you get complexity. But I 
hope we agree that that's very bad design, and totally unnecessary. It's 
very straightforward to go through the components one-by-one and deal 
with them (hint: I'd start with the check digit).
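
As a rough illustration of "one-by-one", here is a sketch in Python. 
Please note that the merged syntax it parses is purely hypothetical (it 
is neither the draft's ni: nor its nih: grammar): I'm assuming 
"ni:[//authority/]alg;value[;check][?query]", with the algorithm given 
either as a name or as a decimal suite id. The names MergedName and 
parse_merged are made up for this example, and the check digit is only 
split off, not verified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MergedName:
    authority: Optional[str]
    algorithm: str
    suite_id: Optional[int]   # set only if the algorithm was given as a decimal
    value: str
    check_digit: Optional[str]
    query: Optional[str]


def parse_merged(uri: str) -> MergedName:
    """Parse the hypothetical merged syntax, one optional component at a time."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "ni":
        raise ValueError("not an ni: URI")

    # Query part: optional, everything after the first "?".
    rest, qmark, query = rest.partition("?")

    # Authority: optional, introduced by "//" and ended by the next "/".
    authority = None
    if rest.startswith("//"):
        authority, slash, rest = rest[2:].partition("/")
        if not slash:
            raise ValueError("missing '/' after authority")
        authority = authority or None   # "ni:///..." means empty authority

    # Algorithm, value, and optional check digit, separated by ";".
    parts = rest.split(";")
    if len(parts) not in (2, 3) or not parts[0] or not parts[1]:
        raise ValueError("expected alg;value[;check]")
    algorithm, value = parts[0], parts[1]
    check_digit = parts[2] if len(parts) == 3 else None

    # Decimal presentation of the algorithm: optional.
    is_decimal = algorithm.isascii() and algorithm.isdigit()
    suite_id = int(algorithm) if is_decimal else None

    return MergedName(authority, algorithm, suite_id, value,
                      check_digit, query if qmark else None)
```

Each optional component costs one "if" (or one partition), not a 
doubling of the number of cases. For example, 
parse_merged("ni://example.com/3;ABCDEF;f?ct=text/plain") fills in all 
fields, while parse_merged("ni:///sha-256;ABCDEF") leaves authority, 
suite id, check digit and query empty.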

On the other hand, the complexity you get from query parameters is 
potentially huge, because, contrary to the basic syntax, we don't know 
for sure what combinations of parameters people will dream up, and how 
these will interact.

>> In terms of specification, merging the two schemes doesn't seem to be
>> difficult or complex at all. Also, in terms of implementation, the only
>> additions to the ni: scheme that become necessary are the check digit
>> and the expression of the "suite id" as a decimal. It's very difficult
>> for me to imagine that this would add significant complexity to an
>> implementation; if code for nih: exists, that can mostly just be moved
>> over.
>
> Feel free to look at our code. (With the caveat that I'm a crap
> programmer so close your eyes a bit when you look at the 'C' code:-)
>
>>> Your analogy about httpm/h may appear reasonable, but it is always
>>> unreasonable to draw conclusions from analogies. It is also unwise
>>> to reason from counterfactuals, which we'd also be doing if we
>>> accepted your argument. So I find that speculation utterly useless
>>> to be honest.
>>
>> It is definitely unreasonable to draw conclusions from analogies *only*.
>
> I only saw the analogy. What in your httpm/h argument is not
> counterfactual analogy?

Do you claim that it's not true that URIs are often transferred 
electronically, often transferred on paper, and often spoken, and that 
the sets of URIs for which these activities happen overlap? If you think 
that has to be different for ni:/nih:, you have to convince me, not the 
other way round.

>> But if you think that the httpm/h analogy is wrong, and that ni/nih is
>> different, could you please explain *what* is different?
>
> We have real use cases for ni and nih and we think they differ.
> I'd be repeating myself to say why again.

You're still very vague on these "real use cases". Confirmation of PGP 
keys by speaking the URI aloud seems to be your main/only use case for 
nih:. But I'm only guessing, because you haven't really explained this yet.

>>> In this case, we are dealing with different requirements so this
>>> should stay as-is.
>>
>> If "different requirements" is your main (or only) real argument,
>
> That would be a valid argument.

It would only be a valid argument if these are requirements that can't 
be consolidated.

>> could
>> you at least explain exactly how they are different?
>
> I did that above. Asking for an "exact" explanation seems
> like asking the same thing again.

Here are the pieces of text from above that I can at least in some way 
link to actual requirements:
"nih is not intended to be used broadly, nor often."
"The main intent of nih is to allow entry of something that
confirms something else (e.g. a public key) that is already
present."

Is that all there is? Or did I miss something? Maybe it would help this 
discussion if you wrote a few paragraphs of free-standing text 
explaining the use cases (and restrictions, and reasons for these) for 
both schemes. It may turn out to be a valid addition to the document at 
the very minimum. (I'm not asking for an independent "use cases and 
requirements" draft :-)

>> Just that one
>> requirement came from the core WG and others from other WGs or other
>> parties doesn't help me to understand how the actual requirements
>> differ. (Please note that even if the requirements differ, that doesn't
>> mean that we need different technology to address them.)
>
> Perhaps not. But that was the design choice we made and it's
> a valid one.

Given the more than 20 years of experience we have of how URIs are 
handled on the Web and around it, I have very good reasons to strongly 
doubt that this is a good design choice.

>> Why do you say that ni: URIs are not intended for humans to speak?
>
> So phone me up and say this:
>
>    ni:///sha-256;UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q

So apart from the upper/lower problem, would the equivalent nih: URI be 
any better?
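
For anybody who wants to try, here is a sketch (Python, standard library 
only) that takes the hash value from the ni: URI above and prints it in 
base16. This is only the hex rendering of the value, not a complete nih: 
URI: I'm not adding a check digit, and the grouping into blocks of four 
digits is just my assumption of what might be easier to read aloud.

```python
import base64

# Hash value copied from the ni: URI quoted above.
ni_value = "UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q"

# base64url in ni: URIs is written without padding; restore it for decoding.
padded = ni_value + "=" * (-len(ni_value) % 4)
raw = base64.urlsafe_b64decode(padded)

# Base16 rendering of the same bytes.
hex_value = raw.hex()

# Hypothetical grouping into blocks of four hex digits for reading aloud.
grouped = "-".join(hex_value[i:i + 4] for i in range(0, len(hex_value), 4))

print(len(ni_value), "characters in base64url")
print(len(hex_value), "characters in base16")
print(grouped)
```

The result is 64 hex digits in place of 43 base64url characters: no 
upper/lower problem anymore, but about half as many characters again to 
get across the line.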

>> What
>> am I supposed to do if I got an ni: URI in a mail message and call you
>> on the phone to tell you about that?
>
> Not my problem actually, but I guess most people might
> say "remember that mail you sent me with all that gobbledygook
> nonsense - what the hell was that about?" :-)

I think "tell you about it" should have been "speak it out to you".

>>   If I want to send somebody the
>> information in an ni: URI by mail, should I use only the ni: version or
>> only the nih: version, or both, if I can't exclude that the recipient
>> may want to relay this information via voice?
>
> You can try either and let me know what works. This seems like
> a very artificial use-case for ni.

Why artificial? Long URIs get spoken over the phone once in a while. And 
I really don't understand why this is "very artificial" for ni:, but 
"indispensable" for nih:.

>>> Finally, we have (some, early,) running code that matches the
>>> current draft and that ought also count for something
>>
>> How much?
>
> Feel free to go look and see. [1] I've not counted lines of
> code, but we have c, python, ruby and clojure library
> implementations and some apps and other bits and pieces.
>
>     [1] http://sourceforge.net/projects/netinf/
>
>> The boiler plate on every ID is pretty clear that they are not
>> set in stone. Also, the changes needed to merge the two schemes are not
>> rocket science, quite to the contrary. (I herewith volunteer to fix the
>> Ruby version, just to show)
>
> I didn't say our code is set in stone. I said that running code
> counts.
>
> I didn't say a merge would require rocket science.

It somehow sounded like this.

> I said it'd be
> a bad idea and would produce a worse result.

Can you substantiate that? The discussion on use cases is still open, 
and the argument with all the 32 combinations also didn't fly.

>>> when compared
>>> to a change that would be a gratuitous dis-improvement
>>
>> In what sense would merging the two schemes be a dis-improvement? Can
>> you please explain?
>
> I believe I did that above.

Very marginally, perhaps. More information would definitely help.

>>> based it
>>> seems upon dubious argument
>>
>> If you think that my arguments are dubious, please explain exactly why.
>
> I believe I did that above. (To be clear: not all your argument is
> dubious

Thanks for some encouragement :-).

> but the httph/m part is IMO.)

I'm sorry that I started out with that analogy, but I thought it would 
be easy to understand. You seem to be caught in the specific use/corner 
cases, but I'm trying to look at the long term big picture.

>>> that is also offered at the wrong
>>> point in the process.
>>
>> See above. If there's something wrong with IETF Last Call, or with the
>> fact that the Apps Area Directorate does reviews (which I don't think),
>> then that should be addressed separately. For this discussion, I hope we
>> can concentrate on technical issues.
>
> Right. But let's not ignore the fact that the uri-review
> list had sight of this at the end of April.

[If you keep insisting, why don't you check your MUA and look at all the 
messages I sent on the draft. You may be surprised, but I seriously hope 
you will stop bringing up "uri-review" again.]

> Bottom line - we have use-cases and a valid design and running
> code that as far as we know works and I see no reason to make
> the change you'd like, which would make things worse IMO.

It may look like some more work in the short term, but it will make 
things more general and create more potential for the future.

Regards,    Martin.
