Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)

Graham Klyne <> Fri, 15 June 2012 17:37 UTC

Date: Fri, 15 Jun 2012 11:48:57 +0100
From: Graham Klyne <>
To: Stephen Farrell <>
Subject: Re: APPSDIR review of draft-farrell-decade-ni-07, major design issue (one or two URI schemes)
Cc: IETF discussion list <>, "" <>


(Personal hat on)

I've followed elements of this exchange.  I must confess that when I read 
through the draft previously, I didn't really pay attention to the nih: parts.

I can see that there are distinct use-cases here, and I think you have 
reasonable grounds for not wanting to combine them.

What I can't see is why the speakable form (nih:) needs to be a URI scheme - 
what are the envisaged contexts of use where the information provided by an nih: 
URI actually needs to be a URI, as opposed to just, say, a simple string?

In all the uses I can think of for ni:, I don't see a corresponding use for a 
speakable form.  I think you said somewhere in this exchange, nih: is intended 
to be used to confirm some information that you already have.  As such, I'm not 
seeing how it can be said to identify a resource.
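
To make the comparison concrete, here is a minimal sketch (not taken from the draft or from the authors' code) that re-encodes the hash value of an ni: URI as base16, which is the main visible difference in the nih: form. The function name ni_to_nih is hypothetical, and the sketch assumes an empty authority; it deliberately leaves out the optional check digit and the decimal algorithm identifier that the draft allows for nih:. The input is the ni: example quoted further down in this thread.

```python
import base64


def ni_to_nih(ni_uri: str) -> str:
    """Re-encode the hash of an authority-less ni: URI as base16 (nih: style).

    Hypothetical helper for illustration only: no check digit is computed
    and the algorithm name is copied through unchanged.
    """
    prefix = "ni:///"
    if not ni_uri.startswith(prefix):
        raise ValueError("expected an ni: URI with an empty authority")
    alg, _, value = ni_uri[len(prefix):].partition(";")
    value = value.split("?", 1)[0]  # drop any query part
    # base64url in ni: is unpadded; restore padding before decoding.
    padded = value + "=" * (-len(value) % 4)
    digest = base64.urlsafe_b64decode(padded)
    return f"nih:{alg};{digest.hex()}"


example = "ni:///sha-256;UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q"
print(ni_to_nih(example))
```

Both strings carry the same 256 bits; the nih: form is longer (64 hex digits against 43 base64url characters) but avoids mixed case.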


On 12/06/2012 15:09, Stephen Farrell wrote:
> Martin,
> I honestly don't think this exchange is going
> anywhere new so I've not provided blow-by-blow
> answers below.
> We (the authors) think ni and nih are better
> kept separate because the use-cases and
> requirements differ. And that's what the
> limited amount of running code does.
> I think I've explained that, and I have no
> intention whatsoever of getting into the "write
> up the use-cases and requirements" game that
> too-often happens in the IETF when it's neither
> needed nor productive. (Regardless of whether
> the game is played with I-Ds or email.)
> Sometimes that is useful, but not here IMO.
> You disagree, which is fine. You think ni and nih
> should be merged because they don't differ
> sufficiently, (I think), and (I guess) because
> you think that new URI schemes are more expensive
> than what you see as the differences between
> these two proposed schemes warrant.
> I suggest we see if anyone else chimes in on
> this aspect (so far nobody has, despite a quite
> active IETF LC:-) and if not leave it to the
> sponsoring AD to figure out what, if anything,
> needs doing at the end of IETF LC.
> Cheers,
> S.
> PS: Many thanks for all the other good comments,
> though we disagree on this one, you've helped
> make the draft better for sure.
> On 06/12/2012 02:04 PM, "Martin J. Dürst" wrote:
>> Hello Stephen,
>> On 2012/06/12 18:59, Stephen Farrell wrote:
>>> Hi Martin,
>>> On 06/12/2012 10:13 AM, "Martin J. Dürst" wrote:
>>>> Hello Stephen,
>>>> This mail responds to your points on the main technical issue that I
>>>> have identified.
>>>> On 2012/06/05 20:11, Stephen Farrell wrote:
>>>>> On 06/05/2012 10:42 AM, "Martin J. Dürst" wrote:
>>>>>> Hello everybody,
>>>>> Major design issue:
>>>>>> The draft defines two schemes, which differ only slightly, and mostly
>>>>>> just gratuitously (see also editorial issues).
>>>>>> These are the ni: and the nih: scheme. As far as I understand, they
>>>>>> differ as follows:
>>>>>>                                        ni:                nih:
>>>>>> authority:                          optional           disallowed
>>>>>> ascii-compatible encoding:          base64url          base16
>>>>>> check digit:                        disallowed         optional
>>>>>> query part:                         optional           disallowed
>>>>>> decimal presentation of algorithm:  disallowed         possible
>>> I'll note in passing that the two schemes differ in all those
>>> respects. You may disagree with our design, but basically you're
>>> showing that the two differ in pretty much all possible ways
>>> other than that both include a hash value.
>>>>>> The usability of URIs is strongly influenced by the number of
>>>>>> different
>>>>>> schemes, with the smaller a number, the better. As a somewhat made-up
>>>>>> example, if the original URIs had been separated into httph: for HTML
>>>>>> pages and httpi: for images, or any other arbitrary subdivision
>>>>>> that one
>>>>>> can envision, that would have hurt the growth and extensibility of the
>>>>>> Web. Creating new URI schemes is occasionally necessary, and the ideas
>>>>>> that lead to this draft definitely seem to warrant a new scheme
>>>>>> (*), but
>>>>>> there's no reason for two schemes.
>>>>>> [(*) I know people who would claim the .well-known http/https
>>>>>> thing
>>>>>> is completely sufficient, no new scheme needed at all.]
>>>>>> More specifically, if the original URIs had been separated into httpm:
>>>>>> (for machines) and httph: (for humans), the Web for sure wouldn't have
>>>>>> grown at the speed it did (and does) grow. In practice, there are huge
>>>>>> differences in human 'speakability' for URIs (and IRIs, for that
>>>>>> matter); compare e.g. with
>>>>>> (which I have significantly shortened to hopefully eliminate potential
>>>>>> privacy issues), or compare the average mailto: URI with the average
>>>>>> data: URI. However, what's important is that there never has been a
>>>>>> strong dividing line between machine-only and human-only URIs or
>>>>>> schemes, the division has always been very gradual. Short and mainly
>>>>>> human-oriented URIs have of course been handled by machines, and on
>>>>>> the
>>>>>> other hand, very long URIs have been spoken when really necessary.
>>>>>> "Speakability" has been maintained to some extent by scheme designers,
>>>>>> and to some extent by "survival of the fittest" (URIs that weren't
>>>>>> very
>>>>>> speakable (or spellable/memorizable/guessable/...), and their Web
>>>>>> sites,
>>>>>> might just die out slowly).
>>>>>> It should also be noted that the resistance against multiple URI
>>>>>> schemes
>>>>>> may have been low because there are so many different ways to express
>>>>>> hashes in the draft anyway, and one more (the nih: section is the last
>>>>>> one before the examples section) didn't seem like much of a deal
>>>>>> anymore. But when it comes to URIs, one less is a lot better than one
>>>>>> more.
>>>>>> In the above ni:/nih: distinction, nih: seems to have been added as an
>>>>>> afterthought after realizing that reading an ni: URI aloud over the
>>>>>> phone may be somewhat suboptimal because there is a need for repeated
>>>>>> "upper case" - "lower case" (sure very quickly shortened to "upper" -
>>>>>> "lower" and then to "up" - "low" or something similar). It is not a
>>>>>> bad
>>>>>> idea to try to make sure that IETF technology, and URIs in particular,
>>>>>> are accessible to people with certain kinds of dyslexia. (There are
>>>>>> indeed people who have tremendous difficulties with distinguishing
>>>>>> upper- and lower-case letters, and this may or may not be connected
>>>>>> with
>>>>>> other aspects of dyslexia.) It is however totally unclear to this
>>>>>> reviewer why this has to lead to two different URI schemes with other
>>>>>> gratuitous differences.
>>>>>> Finding a solution is rather easy (of course, other solutions may also
>>>>>> be possible): Merge the schemes, so that authority, check digit, and
>>>>>> query part are all optional (an authority part and/or a query part may
>>>>>> very well be very useful in human communication, and a check digit
>>>>>> won't
>>>>>> hurt when transmitted electronically) and the decimal presentation of
>>>>>> the algorithm is always allowed, and use base32
>>>>>> ( as the encoding. This leads to a
>>>>>> 16.6% less efficient encoding of the value part of the ni: URI, but
>>>>>> given that other URI-related encodings, e.g. the %-encoding resulting
>>>>>> when converting an IRI to an URI, are much less efficient, and that
>>>>>> URI
>>>>>> infrastructure these days can handle URIs with more than 1000 bytes,
>>>>>> this should not be a serious problem. Also, there's a separate binary
>>>>>> format (section 6) that is more compact already.
>>>>> I strongly disagree with merging ni & nih. Though that clearly
>>>>> could be done, it would be an error.
>>>>> There was no such comment on the uri-review list and the designated
>>>>> expert was happy. That review was IMO the time for such comments
>>>>> and second-guessing the designated expert at this stage seems
>>>>> contrary to the registration requirements. So process-wise I
>>>>> think your main comment is late.
>>>> First, if IETF Last Call is too late to make serious technical comments
>>>> on drafts, then I think we have to rename it to IETF Too-Late Call.
>>>> Second, designated experts are there to check for minimum requirements
>>>> for a registration, and to give advice as they see fit (and have time).
>>>> I'm myself a designated expert on "Character Sets", and I have
>>>> definitely in the past approved, and would again in the future approve,
>>>> registrations for stuff on which I would complain strongly if the
>>>> question was "is this a good technical solution".
>>>> Graham Klyne, the designated expert for URI scheme registrations, has
>>>> confirmed offline that he does not see his role as "expert reviewer" as
>>>> judging the technical merit of a URI scheme proposal.
>>> While that's fair enough, it's also fair to note that there was
>>> discussion of this document on the uri-review list but this
>>> aspect was not raised at all. That list is called "uri-review"
>>> and from its archives it does seem to frequently do more than
>>> just check the paperwork (including quite a few mails from you:-).
>> Oh well. I'm not sure how familiar you are with the IETF :-), but it's
>> basically all a volunteer organization. So even more than in an
>> organization where people are getting paid, some things happen to get a
>> lot of attention and other things happen to get less attention than
>> maybe they would deserve. And most people have their day jobs, and
>> occasionally miss a mail or two, or even a few more. And many people are
>> on many lists and just skim them until something catches their eye, and
>> ignore many threads. If you don't do anything like this and survive, I'd
>> like to know how :-).
>> So the uri-review list, like any other IETF (or other) list, can get
>> excited about stuff, sometimes for good reasons, or can just ignore some
>> aspects of some proposal. It's unfortunate, but it's something we have
>> to live with. You can't ask everybody on that list to take the same
>> amount of time to look at all the proposals that e.g. a Security
>> Directorate or Apps Area Directorate reviewer is using.
>>>>> But in any case, I also think you're wrong technically in this case.
>>>> Let's see. I hope we agree that we should come to a conclusion on this
>>>> issue on technical merits, rather than on process details.
>>> Sure.
>> Good.
>>>>> nih *is* intended for a corner case,
>>> Let me emphasise the above. nih is not intended to be used
>>> broadly, nor often. If you want a hash-based URI scheme for
>>> users to speak that is for broad frequent use then I think
>>> you are free to try design one. But nih is not that and is
>>> not intended to be that.
>> Then, at the barest minimum, don't sell it as that. The only thing the
>> draft essentially says is "use this if you want to read/speak it". Now
>> you are saying "don't use that for users to speak, unless in specific
>> corner cases", but you still haven't told me or the readers of the draft
>> what these corner cases are, and why they SHOULD/MUST not use it
>> otherwise even if they might think it makes sense.
>> The second, more general advice is of course to not design URIs for
>> corner cases, because URIs work best if they are used widely. (There may
>> be a very tiny chance that your case is different, but even after
>> reading all your mail, I haven't understood that.)
>>> (And I'm not sure such a beast
>>> could really be done well.)
>> Speaking 256 or so bits of random data aloud is in no way going to be
>> terribly easy :-).
>>>>> where humans need to speak these
>>>>> URIs and was added as a direct result of requirements from the core
>>>>> WG and not as an afterthought. ni URIs are not intended for that
>>>>> and so there really are IMO different requirements, (esp. e.g.
>>>>> checkdigit) that are best met with different schemes.
>>>> I agree that the value of a checkdigit is very limited for communication
>>> s/very limited/useless/
>>>> among machines (and for communication among humans with the help of
>>>> machines, such as in the case of email).
>>>> On the other hand, I can't understand why (even assuming we needed a
>>>> separate scheme) there is no authority and no query part on nih.
>>> The main intent of nih is to allow entry of something that
>>> confirms something else (e.g. a public key) that is already
>>> present.
>> At least a little bit of information about the "use case", but this just
>> leads to more questions:
>> Do you need an URI for that? (Isn't that what people are supposed to do
>> with fingerprints these days?)
>> How/where would a user input that URI e.g. in a browser?
>> Would copy/paste from an email or a Web page be fine?
>> Would entry be needed, or could it just be checked? (I understand that
>> forcing a user to enter the data is way safer than believing her that a
>> check was made, but then it's also way more tedious, and many users will
>> just switch to different software.)
>> Can ni: URIs also be used for the same purpose? If not, why not (they
>> essentially contain the same information)?
>> What about other URIs, starting e.g. with http:?
>> I won't claim that this list of questions is exhaustive, but I hope it's
>> a good start.
>>> There is no need for an authority for that, for
>>> the use-cases we have. We could speculate about other potential
>>> use-cases but we'd rather not speculate like that when there's
>>> no need to.
>> There was a lot of stuff that we take for granted today that Tim
>> Berners-Lee didn't speculate when he designed the first URI schemes. But
>> if he had designed these schemes with only corner cases in mind, he
>> wouldn't have invented the Web :-). (sorry about an analogy again)
>>>> For the authority, I'd assume that it would be as useful when the URI is
>>>> transmitted e.g. over the phone as when it is transmitted e.g. over
>>>> email.
>>> We don't have a use for that that I know about.
>> Do you mean the "over the phone" part or the authority part? (assuming
>> the latter below)
>>> I agree
>>> it could be done, but then I think it'd also impact on
>>> usability, which will be pretty crap no matter what's
>>> done.
>> Reading "" or some equivalent domain name in addition to a
>> long hex string shouldn't really make matters much worse.
>>> But making usability worse also seems wrong.
>> My understanding is that there are some cases where things will be found
>> with authority, but not without. If that's right, then I'm sure users
>> will prefer to find what they are looking for, after aurally
>> transmitting 256 or thereabout bits, even at the cost of an additional
>> domain name (and a few slashes), which are the easiest parts to get
>> across the line quickly and correctly.
>>> Not
>>> having an authority also seems to work fine for PGP keys
>> That the lack of authority works in some cases is perfectly okay. But
>> it's not an argument for not allowing authority for cases (even those
>> that maybe you're not seeing yet) where it is or will be useful.
>>> and the lack of an authority does get rid of some threats,
>>> if the nih URI is used for something security-sensitive.
>> Which would mean that ni: has these threats, yes? Where are they
>> described? I don't remember seeing them in the security section.
>>>> For the query part, there are already various ideas and proposals
>>>> floating around,
>>> Where? If you mean draft-hallambaker-decade-params
>> Correct.
>>> then
>>> we (the authors of that) don't think those are useful for
>>> nih names.
>> Why not? And what if others think they would be useful?
>>>> and at least some of them would be of interest for when
>>>> the URI is transmitted e.g. over the phone. Also, even if we currently
>>>> didn't have any actual proposals for query parameters, I think it would
>>>> be a very bad idea to exclude them a priori for transmission e.g. over
>>>> the phone.
>>> I disagree that there is any "very bad idea" here.
>>>>> Merging ni/nih would also add more complexity for no benefit,
>>>>> which would be a bad idea.
>>>> Can you please explain what kind of complexity would have to be added?
>>> I think it's obvious actually. In your table above you highlighted
>>> 5 ways in which ni and nih differ. Merging all those yields loads
>>> of combinations, which makes for complexity.
>> Well, if your implementation is a big switch with 32 cases, one for each
>> combination, with no code sharing, then indeed you get complexity. But I
>> hope we agree that that's very bad design, and totally unnecessary. It's
>> very straightforward to go through the components one-by-one and deal
>> with them (hint: I'd start with the check digit).
>> On the other hand, the complexity you get from query parameters is
>> potentially huge, because, contrary to the basic syntax, we don't know
>> for sure what combinations of parameters people will dream up, and how
>> these will interact.
>>>> In terms of specification, merging the two schemes doesn't seem to be
>>>> difficult or complex at all. Also, in terms of implementation, the only
>>>> additions to the ni: scheme that become necessary are the check digit
>>>> and the expression of the "suite id" as a decimal. It's very difficult
>>>> for me to imagine that this would add significant complexity to an
>>>> implementation; if code for nih: exists, that can mostly just be moved
>>>> over.
>>> Feel free to look at our code. (With the caveat that I'm a crap
>>> programmer so close your eyes a bit when you look at the 'C' code:-)
>>>>> Your analogy about httpm/h may appear reasonable, but it is always
>>>>> unreasonable to draw conclusions from analogies. It is also unwise
>>>>> to reason from counterfactuals, which we'd also be doing if we
>>>>> accepted your argument. So I find that speculation utterly useless
>>>>> to be honest.
>>>> It is definitely unreasonable to draw conclusions from analogies *only*.
>>> I only saw the analogy. What in your httpm/h argument is not
>>> counterfactual analogy?
>> Do you claim that it's not true that URIs are often transferred
>> electronically, often transferred on paper, and often spoken, and that
>> these sets for which these activities happen overlap? If you think that
>> has to be different for ni:/nih:, you have to convince me, not the other
>> way round.
>>>> But if you think that the httpm/h analogy is wrong, and that ni/nih is
>>>> different, could you please explain *what* is different?
>>> We have real use cases for ni and nih and we think they differ.
>>> I'd be repeating myself to say why again.
>> You're still very vague on these "real use cases". Confirmation of PGP
>> keys by speaking the URI aloud seems to be your main/only use case for
>> nih:. But I'm only guessing, because you haven't really explained this yet.
>>>>> In this case, we are dealing with different requirements so this
>>>>> should stay as-is.
>>>> If "different requirements" is your main (or only) real argument,
>>> That would be a valid argument.
>> It would only be a valid argument if these are requirements that can't
>> be consolidated.
>>>> could
>>>> you at least explain exactly how they are different?
>>> I did that above. Asking for an "exact" explanation seems
>>> like asking the same thing again.
>> Here are the pieces of text from above that I can at least in some way
>> link to actual requirements:
>> "nih is not intended to be used broadly, nor often."
>> "The main intent of nih is to allow entry of something that
>> confirms something else (e.g. a public key) that is already
>> present."
>> Is that all there is? Or did I miss something? Maybe it would help this
>> discussion if you wrote a few paragraphs of free-standing text
>> explaining the use cases (and restrictions, and reasons for these) for
>> both schemes. It may turn out to be a valid addition to the document at
>> the very minimum. (I'm not asking for an independent "use cases and
>> requirements" draft :-)
>>>> Just that one
>>>> requirement came from the core WG and others from other WGs or other
>>>> parties doesn't help me to understand how the actual requirements
>>>> differ. (Please note that even if the requirements differ, that doesn't
>>>> mean that we need different technology to address them.)
>>> Perhaps not. But that was the design choice we made and it's
>>> a valid one.
>> Given the more than 20 years of experience we have of how URIs are
>> handled on the Web and around it, I have very good reasons to strongly
>> doubt that this is a good design choice.
>>>> Why do you say that ni: URIs are not intended for humans to speak?
>>> So phone me up and say this:
>>>     ni:///sha-256;UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q
>> So apart from the upper/lower problem, would the equivalent nih: URI be
>> any better?
>>>> What
>>>> am I supposed to do if I got an ni: URI in a mail message and call you
>>>> on the phone to tell you about that?
>>> Not my problem actually, but I guess most people might
>>> say "remember that mail you sent me with all that gobbledygook
>>> nonsense - what the hell was that about?" :-)
>> I think "tell you about it" should have been "speak it out to you".
>>>>    If I want to send somebody the
>>>> information in an ni: URI by mail, should I use only the ni: version or
>>>> only the nih: version, or both, if I can't exclude that the recipient
>>>> may want to relay this information via voice?
>>> You can try either and let me know what works. This seems like
>>> a very artificial use-case for ni.
>> Why artificial? Long URIs get spoken over the phone once in a while. And
>> I really don't understand why this is "very artificial" for ni:, but
>> "indispensable" for nih:.
>>>>> Finally, we have (some, early,) running code that matches the
>>>>> current draft and that ought also count for something
>>>> How much?
>>> Feel free to go look and see. [1] I've not counted lines of
>>> code, but we have c, python, ruby and clojure library
>>> implementations and some apps and other bits and pieces.
>>>      [1]
>>>> The boiler plate on every ID is pretty clear that they are not
>>>> set in stone. Also, the changes needed to merge the two schemes are not
>>>> rocket science, quite to the contrary. (I herewith volunteer to fix the
>>>> Ruby version, just to show)
>>> I didn't say our code is set in stone. I said that running code
>>> counts.
>>> I didn't say a merge would require rocket science.
>> It somehow sounded like this.
>>> I said it'd be
>>> a bad idea and would produce a worse result.
>> Can you substantiate that? The discussion on use cases is still open,
>> and the argument with all the 32 combinations also didn't fly.
>>>>> when compared
>>>>> to a change that would be a gratuitous dis-improvement
>>>> In what sense would merging the two schemes be a dis-improvement? Can
>>>> you please explain?
>>> I believe I did that above.
>> Very marginally, perhaps. More information would definitely help.
>>>>> based it
>>>>> seems upon dubious argument
>>>> If you think that my arguments are dubious, please explain exactly why.
>>> I believe I did that above. (To be clear: not all your argument is
>>> dubious
>> Thanks for some encouragement :-).
>>> but the httph/m part is IMO.)
>> I'm sorry that I started out with that analogy, but I thought it would
>> be easy to understand. You seem to be caught in the specific use/corner
>> cases, but I'm trying to look at the long term big picture.
>>>>> that is also offered at the wrong
>>>>> point in the process.
>>>> See above. If there's something wrong with IETF Last Call, or with the
>>>> fact that the Apps Area Directorate does reviews (which I don't think),
>>>> then that should be addressed separately. For this discussion, I hope we
>>>> can concentrate on technical issues.
>>> Right. But let's not ignore the fact that the uri-review
>>> list had sight of this at the end of April.
>> [If you keep insisting, why don't you check your MUA and look at all the
>> messages I sent on the draft. You may be surprised, but I seriously hope
>> you will stop bringing up "uri-review" again.]
>>> Bottom line - we have use-cases and a valid design and running
>>> code that as far as we know works and I see no reason to make
>>> the change you'd like, which would make things worse IMO.
>> It may look like some more work in the short term, but it will make
>> things more general and create more potential for the future.
>> Regards,    Martin.
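
On the encoding-efficiency point in the quoted review (base32 as a single merged encoding, against base64url for ni: and base16 for nih:), the arithmetic for a 256-bit hash can be checked with a small sketch. The input bytes are an arbitrary placeholder, and padding is stripped on the assumption that a URI would carry the unpadded value, as ni: does.

```python
import base64
import hashlib

# Any 32-byte digest will do; the input here is an arbitrary placeholder.
digest = hashlib.sha256(b"example input").digest()

# Character counts for the same 256 bits in each encoding, padding removed.
lengths = {
    "base16": len(base64.b16encode(digest)),
    "base32": len(base64.b32encode(digest).rstrip(b"=")),
    "base64url": len(base64.urlsafe_b64encode(digest).rstrip(b"=")),
}

for name, n in lengths.items():
    print(f"{name}: {n} characters")
```

base32 carries 5 bits per character against 6 for base64url, which is where the roughly one-sixth loss of efficiency comes from; in character counts that is 52 against 43 for a SHA-256 value, with base16 at 64.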