Re: [urn] Benjamin Kaduk's Discuss on draft-hakala-urn-nbn-rfc3188bis-01: (with DISCUSS and COMMENT)

John C Klensin <john-ietf@jck.com> Sat, 09 June 2018 20:20 UTC

Date: Sat, 09 Jun 2018 16:20:39 -0400
From: John C Klensin <john-ietf@jck.com>
To: Benjamin Kaduk <kaduk@mit.edu>, Peter Saint-Andre <stpeter@mozilla.com>
cc: urn@ietf.org, The IESG <iesg@ietf.org>, draft-hakala-urn-nbn-rfc3188bis@ietf.org


--On Friday, June 8, 2018 15:32 -0500 Benjamin Kaduk
<kaduk@mit.edu> wrote:

>> The semantics of r-components are yet to be defined. I would
>> venture that the IETF is probably not the right place to do
>> that work, given how little energy remained in the URN WG at
>> the end (and we probably didn't have the right people in the
>> room in the first place).
> 
> I won't argue with that.  Does it make sense to say something
> like "There are not currently any broadly accepted semantics
> for r-components at the time of this writing which may be
> grounds to be cautious with their use" in this document?

Perhaps.  But see below.

>...
>> >    If an NBN identifies a work, descriptive metadata about
>> >    the work SHOULD be supplied.  The metadata record MAY
>> >    contain links to Internet-accessible digital
>> >    manifestations of the work.
>> > 
>> > This left me confused.  Is it only intended to apply in the
>> > case described in the previous paragraph, where the
>> > resource identified by the NBN is not available in the
>> > Internet?  Or does it always apply, forcing the metadata to
>> > take precedence over delivering the actual work?  (Or maybe
>> > I'm just confused, and there's an easy way to deliver both
>> > metadata and the actual work alongside each other with no
>> > ambiguity.)
>> 
>> Juha can clarify this.

In the interest of saving time, I can probably get this one.
The answer is "yes", this is intended to be about works not
accessible on the Internet (although a very similar issue
applies in at least one case where the NBN describes a
conceptual work whose components also have NBNs, but not all of
the components are available on the Internet).  The extra
paragraph break may be my fault, from periodically serving as
copy editor on the I-D.

>> > Section 4.1
>> > 
>> >    National Bibliography Number (NBN) is a generic term
>> >    referring to a group of identifier systems administered
>> >    by the national libraries and institutions authorized by
>> >    them.
>> > 
>> > "the national libraries" implies a specific set -- which
>> > ones?  It may be better to hedge with "some national
>> > libraries".
>> 
>> Or remove "the" ... "by national libraries".

> That's probably better :)

That would be my preference, but Juha should decide on this.


>> > Section 4.2.2
>> > 
>> > Do we need to say anything about a URN-to-URI step before
>> > talking about URI-to-resource services?

Given what 3986 has to say, a URN-to-URI step would be an
oxymoron.  If you meant a URN-to-URL step, that is probably a
matter for 8141 and it may be worth pointing out that members of
the web community (a euphemism for a particular, mostly known,
set of individuals who claim to speak for that community, in case
you haven't figured that out) have been violently opposed to
such text, claiming that, if it is needed, then there is really
no need for URNs.  On the other hand, while the URNBIS WG could
not reach consensus on any particular proposal and did reach
consensus about not trying to proceed with definitions, that is
much of what r-components are expected to be about.

>> > I'm also wondering about any relationship between "component
>> > resource" NBNs and f-components of the containing work.  If
>> > there are NBNs assigned to both an image within a work
>> > and that containing work, and an NBN with f-resource is
>> > used to refer to the image within the containing work, is
>> > there any relationship between the f-resource and the
>> > image-specific NBN?

On a per-sub-namespace basis, possibly.   In the general case,
maybe.  This is not an NBN issue but an issue about how
namespaces are managed, organized, and used, i.e., probably an
8141 issue.

>> > Section 4.3
>> > 
>> >    Expressing NBNs as URNs is usually straightforward, as
>> >    only ASCII characters are allowed in NBN strings.  If
>> >    necessary, NBNs MUST be translated into canonical form
>> >    as specified in RFC 8141.
>> > 
>> > When is it necessary?
>> 
>> It seems that in theory an NBN itself could contain non-ASCII
>> characters, whereas an NBN URN and its nbn_string construct
>> can contain only ASCII characters. At least that is my
>> understanding.

That is correct.  But, more or less per 3986, _any_ URI can
contain non-ASCII characters in the tail by %-encoding them.
There were some moves in the URNBIS WG to restrict that for
URNs, but it met resistance from the usual suspects.  The bottom
line here, and I don't know how loudly to say it, is that using
non-ASCII characters in nbn_strings would probably be nothing
short of stupid, especially given that both IETF and W3C have
suggested that they be avoided in identifiers non-specialist end
users are not expected to see.  However, due to a problem that
goes back well before the early decision that ISO 8859-1 was
going to be an adequate encoding for HTML content (but of which
that decision is symptomatic), it would be unsurprising if one
or more national libraries whose local language uses Latin
script with a few lightly-decorated characters had not taken
that advice or had decided to incorporate existing (perhaps for
decades) identifier strings with a few Latin characters outside
the ASCII subset into their national NBNs.  One could imagine
rewording the text mentioned above for more clarity (a job I
will happily leave to the experts who make up the RFC Editor
function) but the bottom line is that all we do is to say "don't
do that, but if you decide to do it anyway, this is what you
must do to prevent even worse problems".
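
To make the "this is what you must do" part concrete, here is a
minimal sketch of %-encoding a non-ASCII NBN so that the resulting
URN contains only ASCII.  The helper name nbn_to_urn, the sample
identifier, and the "prefix, hyphen, NBN string" layout are
assumptions made for illustration; they are not taken from the
draft text quoted in this thread.

```python
from urllib.parse import quote


def nbn_to_urn(prefix: str, nbn: str) -> str:
    """Hypothetical helper: build an ASCII-only URN from an NBN.

    Assumes a "urn:nbn:<prefix>-<nbn>" layout (an illustrative
    assumption, not quoted from the draft).
    """
    # quote() encodes the string as UTF-8 and %-encodes every octet
    # that is not an unreserved character (letters, digits, "_.-~").
    encoded = quote(nbn, safe="")
    # The prefix is case-insensitive, so lower-case it for comparison.
    return f"urn:nbn:{prefix.lower()}-{encoded}"


# An ASCII-only NBN passes through unchanged.
print(nbn_to_urn("fi", "fe201003181510"))
# A made-up NBN with one non-ASCII Latin character gets %-encoded.
print(nbn_to_urn("FI", "åbo2018"))
```

An NBN that is already ASCII-only needs no translation, which is
the "usually straightforward" case.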

>> >    Being part of the prefix, sub-namespace identifier
>> >    strings are case- insensitive.  They MUST NOT contain
>> >    any hyphens.
>> > 
>> > This MUST seems to just duplicate a syntactic requirement
>> > from the ABNF; is RFC 2119 language really necessary?
 
>> /me shrugs

Probably not, but, while Juha should confirm, I assume that part
of the origin of this text is that several other International
Standard identifiers, e.g., ISBNs, allow hyphens and treat them
as optional.  It might be wise to reinforce the message that the
URN:NBN solution to the problems that causes is to clearly say
"no" and say that clearly enough that even those whose eyes
glaze over at ABNF will get the message.  Whether it is better
done by something like the sentences above or by saying "Hyphens
are prohibited by the ABNF, see Section XXXX" is, IMO, a matter
of editorial style and preference.
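
For those whose eyes do glaze over at ABNF, a rough sketch of why
the prohibition matters to a parser.  It assumes, and this is my
assumption rather than text quoted in this thread, that the
namespace-specific string is laid out as a prefix, a hyphen, and
then the NBN string; split_nbn_nss is a hypothetical name.

```python
def split_nbn_nss(nss: str) -> tuple[str, str]:
    """Hypothetical helper: split "<prefix>-<nbn string>".

    Assumed layout: prefix, then a hyphen, then the NBN string.
    Because the prefix (country code plus any sub-namespace
    identifiers) may not contain hyphens, the first hyphen is
    always the delimiter.
    """
    prefix, sep, nbn_string = nss.partition("-")
    if not sep or not prefix or not nbn_string:
        raise ValueError(f"no prefix/NBN delimiter in {nss!r}")
    # The prefix is case-insensitive; normalize it to lower case.
    # The NBN string is returned untouched.
    return prefix.lower(), nbn_string


print(split_nbn_nss("FI:TKK-004781"))
print(split_nbn_nss("fi-abc-def"))
```

If hyphens were optional in the prefix, as they are in a printed
ISBN, the second call could not tell where the prefix ends.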

>> > Section 8
>> > 
>> >    John Klensin provided significant editorial and advisory
>> >    support for late versions of the draft.
>> > 
>> > Presumably that's "later versions"?
>> 
>> Yes.

I really don't care.  If one thinks this is an editorial
problem, leave it to the RFC Editor.  If one thinks it is
substantive, remember that, while this is a -00 draft, the I-D
itself has been through many iterations under other names, so it
depends on how you count because I had nothing to do with early
versions of the I-D and at most only a reviewer/participant role
in RFC 3188.  If this were a different sort of document and I
cared, I could make a strong case that I've been involved enough
and have written enough text to be listed as a Contributor, but
I think the nature of this document is that it is better if Juha
is sole author without contributors other than Alfred. 

FWIW, I can't see why attribution at this level should be an
IESG problem unless you have reason to believe that IPR rules
are being violated.

Finally, to avoid writing a separate note even though it will
make this a paragraph longer, I think several of the comments
you (Benjamin) and Adam have made make a strong case for a
clarifying update to RFC 8141.  In principle, I agree with that.
It is little surprise to me that new URN namespace proposals are
exposing issues that, if we had more ability to predict them,
would have been reflected in 8141 itself.  The difficulty with
such an update is that, at the time 8141 and 8254 were
completed, the URNBIS WG had run out of energy and was
developing a level of acrimony that made further progress
unlikely.  If we were to try to open 8141 to do a clarification,
I can just about guarantee that some of those who were the
sources of frustration that led to that acrimony would insist
that no document move forward until their pet issues and easy
solutions were addressed.  That, in turn, would result in a
situation like the i18n one, only with less downside if the
issues are not addressed and more issues that would require
resolving fundamental philosophical disagreements in the IETF
community.  I can't recommend going there, but it doesn't seem to
me that trying to clarify 8141 by text in a single namespace
definition is the solution either.

best,
     john