Re: [urn] Comments on WG Items -- #2 : related to rfc2141bis -00 (fwd)

Alfred Hönes <ah@TR-Sys.de> Thu, 23 December 2010 22:53 UTC

From: Alfred Hönes <ah@TR-Sys.de>
Message-Id: <201012232254.XAA14978@TR-Sys.de>
To: urn@ietf.org
Date: Thu, 23 Dec 2010 23:54:35 +0100
Mime-Version: 1.0
Content-Type: text/plain; charset="hp-roman8"
Content-Transfer-Encoding: 7bit
Subject: Re: [urn] Comments on WG Items -- #2 : related to rfc2141bis -00 (fwd)
Precedence: list

Resending -- sorry, inadvertently missed copying the list!


Mykyta,
here are my initial responses to your thoughts about
  draft-ietf-urnbis-rfc2141bis-urn-00 .

I'll let the discussion settle before any changes are actually
made to the draft and a new version gets posted.


> ==URN Syntax draft==
> 1. Abstract
> I think it should be more brief. I propose the following:
> "This document specifies generic syntax for Uniform Resource Names
> (URNs), which stand as persistent, location-independent resource
> identifiers. It obsoletes RFC 2141"

The "Discussion" Note is intended to disappear in future draft
versions and will not be shipped to the IESG.  The majority of the
text is taken from RFC 2141 and should not be changed without need,
as I have tried to explain in my general response.

The short paragraph on Namespace specifications has been added as
additional guidance to the document set, which will thereby be already
present in the future RFC metadata, so that interested parties will
be led as easily as possible to the sibling document.
Once the rfc3406bis draft progresses, this short paragraph will
naturally shrink further.
For now, I'd prefer to leave the text "as is".

(Btw:
We are well below the "should not be much longer than 20 text lines"
rule from the RFC style guide.)


> You may add additional info in Introduction. Proposed abstract
> reflects what the document is about clearly and briefly. What do you
> think about this?

See above.

The RFC authoring guidelines encourage the duplication of information
from the Abstract in the Introduction, likely adding more details and
citations, which are not possible in the Abstract.

Note that the Abstract becomes RFC metadata and hence most relevant
to people looking for relevant documents, but it is skipped routinely
by many readers of RFC documents.
The Abstract therefore needs to provide enough information for
readers of RFC metadata to allow them to make an educated guess
of relevance for them.

Further, the Abstract is a favorite focal point of wordsmithing for
reviewers -- including the IESG --, and hence needs to be clear and
informative enough for persons that usually focus on very different
matter.  For replacement documents, the IESG usually requests
evidence for the necessity of changes in central points, and the
Abstract is one such central point.


> 2. Introduction Section.
> -I propose to replace the paragraph sequence so that it is: para2,
> para3, para1. The most significant ones should go first, but now they
> don't.

Significance and importance vary for different parties.

The text has been formed under the impression of the discussion
during the pre-BoF efforts, where the testimonial to the content
of the first paragraph was seen as a fundamental cornerstone for
entering into a WG formation process.

If these concerns do not apply any more, I'll not insist on the
present order of paragraphs.  I hope for comments and guidance
from "old URN folks" on this detail.

The first para is a preview and motivation for what follows.
It seems logical to start with that material.


> -Rename the Section 1.1 to 'Historical Overview of URNs'.  ...

Isn't it more an overview of past processes and documents?
"Overview of URNs" seems to raise expectations for technical and
usage elaborations.
The text does not focus on properties of, and use cases for, URNs,
which would IMHO qualify the text for the suggested headline.

>    ...    And remove paragraphs 1, 9, 10.

Cf. my general comments.

The changes related to the shift to ABNF have been one of the focal
points in the motivation for the entire effort.  Giving here the
background information on why this shift is needed seems to be useful.
Note that the lack of a formal specification of the notation used in
RFC 2141 would disqualify such a document these days even if it
sought only publication as an Informational RFC.
We want a document that can be advanced on the Standards Track,
and therefore the need for a change is elementary.

>          ...  We can put the ABNF convention into section 1.4.

Sect. 1.4 introduces *what* is going to be used in the sequel.
It would IMO be confusing to discuss there what this document
is _not_ using.
Also, [A]BNF is not "requirements language" (something at the meta
level), but a formal language used in the syntax description.

> Putting info about used BNF in previous specifications is odd, I
> think.

See above.  The arguments that led to our charter and are presented
there in the definition of the current work item IMO should be
reflected in the document.   The IESG will compare the description
of what we are chartered to do with the outcome of the WG, and this
is an element that maintains this linkage.


> -Remove paragraphs 2, 3, 5 in section 1.3. They, IMO, also give
> odd information.

These paragraphs, again, are directly related to the discussion that
led to the WG charter and to the charter itself.  They add to the
linkage between charter and deliverable of the WG.  Further, para #2
refers to the "running code" principle that is regarded as a basic
element of work in the IETF.  If there were not proven needs from
existing and evolving implementations, it would have been difficult
to convince the IESG and the IAB to charter this WG.


> -Rename section 1.4 to Convention and add there ABNF convention.

This is almost boilerplate text.

Note that RFC style, as practiced for several years, insists on
listing all RFC 2119 keywords -- even those that do not appear in the
body of the memo.

As already noted, requirements language supplies a "checklist"
for compliance verification/testing; ABNF is one possible means
(a toolkit) to build implementations on.  So these are at two
different abstraction levels and should perhaps not be merged
into a standard section with established boilerplate text.

There's no need for a parser implementation to actually use the ABNF
description in the document, nor can the use of ABNF be verified from
external behavior; thus it cannot be subject to compliance testing.

Changes to that boilerplate would likely cause discussions in the
approval process and need justification.  I'd prefer to avoid such
complications and stay with the standard text and headline.


> 3. Syntax itself
> -Remove ABNF convention from para 1 of section 2 (put in 1.4).

No.  Section 1.4 is at the meta level, dealing with language that
specifies externally observable conformance requirements, whereas
Section 2 is the normative specification kernel of the memo.


> -ABNF nit: <NID> and <NSS> are capitalized, but only core rules
>  are capitalized.

This is legacy and falls under "don't change unless strongly needed"
policy.  Use of capital letters is not forbidden there, and capital
NID and NSS are in widespread use in existing RFCs and other documents.
So while a new "clean table" design would indeed perhaps use lowercase,
we should emphasize continuity.

Also note that, per RFC 5234, ABNF rule names are case insensitive.
There is anecdotal evidence of conflicts caused by ignoring that.
Nevertheless, for consistency it seems valuable to stick with
a single form of capitalization that already is in widespread use.


> -Generic URI syntax - shouldn't we mention that only one segment is
> allowed? We do not use more than 1 segment in URNs.

There's the top-down view from the RFC 3986 perspective.
A few lines later, the draft says:

   ... the following additional syntax rule is superimposed on
   <path-rootless> to establish a level of hierarchy called "Namespace":

      urn-path   = NID ":" NSS

and NSS is further described in Section 2.2.
The text there already contains the "preview":

     NSS = segment-nz

that becomes valid if the suggested modifications to RFC 2141 are
adopted, which will allow URNs to be handled by the same parsing
code as general URIs.

So this already is in the draft, and once we have hashed out the
open points, the text will become more compact and this consequence
will hopefully become more evident at first glance.
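
The superimposed rule urn-path = NID ":" NSS can be illustrated with a
small Python sketch.  This is an illustration only, not text from the
draft: the function name split_urn is hypothetical, and the sketch does
not validate the characters of the NID or the NSS; it only separates
the two parts at the first colon after the scheme.

```python
from typing import Tuple


def split_urn(urn: str) -> Tuple[str, str]:
    """Split a URN into (NID, NSS) following urn-path = NID ":" NSS.

    Hypothetical helper for illustration; no character-level validation.
    """
    # The scheme name is matched case-insensitively.
    scheme, sep, path = urn.partition(":")
    if not sep or scheme.lower() != "urn":
        raise ValueError("not a URN: missing 'urn:' prefix")

    # The first colon in the path separates the NID from the NSS;
    # any later colons belong to the NSS.
    nid, sep, nss = path.partition(":")
    if not sep or not nid or not nss:
        raise ValueError("expected NID ':' NSS after the scheme")
    return nid, nss
```

For instance, split_urn("urn:ietf:rfc:2141") yields the NID "ietf" and
the NSS "rfc:2141".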


> -Why do we allow only 32 chars in NID? Do not we need more? (That is
> a question for discussion.)

Yes, as in RFC 2141, the draft allows up to 32 characters.
The longest NID registered so far (within a decade or so) uses
9 characters, and nobody has ever called for more.
So why change the rule?

We are expected to act conservatively, and without real need
and/or benefit, no changes to existing specs will be suggested.
(Note that Service Names (previously: port names) will remain
limited to 14 characters under the emerging IANA registry rules,
whereas older specifications admitted up to 40 characters.)
IMHO, 32 characters should suffice, with *much* headroom.

However, I observe that an unintentional change did make it into the
draft: "0*31" should be "1*31" for compatibility with RFC 2141.
This will be corrected for the next draft update.

> -As for form of NID - I consider that to be OK (letters, digits and
> hyphens only). If URNs are 'human-friendly' identifiers, they need to
> be simple.

There are plausible arguments, however, to further restrict the possible
combinations of characters, as noted in the draft, in order to ensure
that a NID cannot be confused with a (decimal) number representation.
We have a single NID registered so far that contains a leading digit,
but further restrictions like the syntax rule below would be compatible
with all present NID registrations.

For example, we could adopt/adapt the upcoming ABNF rule for Service
Names (typically tied to transport protocol ports)
[ cf. draft-gudmundsson-dnsext-srv-clarify-02.txt ]:

     NID    = 2*32 (ALPHA / DIGIT / HYPHEN)
            ; also conforming to the <token> rule below
     token  = *(1*DIGIT [HYPHEN]) ALPHA *([HYPHEN] (ALPHA / DIGIT))
     HYPHEN = %x2d      ; "-"
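
As a rough check of what such a rule would accept, the two constraints
(overall length 2 to 32, plus the <token> shape) can be sketched in
Python.  This is only a sketch under stated assumptions: the function
name is_candidate_nid is hypothetical, and the regular expression
writes 1*DIGIT [HYPHEN] as repeated DIGIT [HYPHEN], which matches the
same strings but avoids ambiguous nested repetition.

```python
import re

# token = *(1*DIGIT [HYPHEN]) ALPHA *([HYPHEN] (ALPHA / DIGIT))
# *(1*DIGIT [HYPHEN]) is written here as *(DIGIT [HYPHEN]); both forms
# accept exactly the same strings.
_TOKEN = re.compile(r"(?:[0-9]-?)*[A-Za-z](?:-?[A-Za-z0-9])*")


def is_candidate_nid(nid: str) -> bool:
    """Check a string against the sketched NID rule (hypothetical helper).

    Requires 2 to 32 characters and the <token> shape: at least one
    letter, no leading or trailing hyphen, no adjacent hyphens.
    """
    if not 2 <= len(nid) <= 32:
        return False
    return _TOKEN.fullmatch(nid) is not None
```

Under this sketch, "isbn" and "3gpp" pass, while an all-digit string
such as "123" fails because <token> requires at least one letter.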


> -Should we put @ in the allowed chars. Some software would interpret
> that as e-mail address.

As I understand you, that's an argument in favor of "no change".

> -If we restrict %00, we should put it into 'excluded' rule. ...

%x00 *is* in CTL (Appendix B.1 of RFC 5234).
So you're knocking on a wide open door.
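
For reference, the CTL core rule in RFC 5234 is %x00-1F / %x7F.  A
minimal Python sketch of that membership test (the helper name is_ctl
is hypothetical, used only for illustration):

```python
def is_ctl(ch: str) -> bool:
    """True if the single character ch falls in CTL = %x00-1F / %x7F."""
    code = ord(ch)
    return code <= 0x1F or code == 0x7F
```

The NUL octet (%x00) is the lowest value in that range, so it is
already covered by any rule that excludes CTL.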

> Moreover, let's put MUST requirement for 'excluded' rule as
> 'MUST not be used in URNs'

The <excluded> rule is added for clarity and completeness only.
It is simply a corollary of what is specified in 2.2.
An additional "MUST" thus does not make sense, because it would be
redundant and could not be tested independently of the rules in 2.2.


> 4. Minor questions.
> -Should we discuss the document at uri_review and uri@w3.org mailing
> lists?

An announcement of the URNbis effort with an invitation for
participation was sent to both lists (and the iri list, btw)
long before the BoF in Maastricht last summer, when the first
individual rfc2141bis draft was submitted.

We will follow normal procedures and announce WGLC for this draft
on uri-review as well, once we have arrived at this stage.


> -If we reserve 'urn' NS, we should provide the template for it.

That doesn't make much sense to me.
Due to the form of the string, we would have to go for a Formal NID
registration, but the related requirements could not be satisfied;
for example, such registration would need to "demonstrate utility
in the Internet" -- whereas we claim there would be no utility
at all in the usage of this hypothetical NID, but confusion.

Let's avoid overhead and let IANA simply add a footnote to the
registry that records this permanent prohibition of assignment.
(This is actually requested in the rfc3406bis draft.)


> -Do we really need Appendix A?

See my (independently sent) general notes.


Kind regards,
  Alfred.
