Comments on draft-seantek-rdf-urn-00
worley@ariadne.com (Dale R. Worley) Thu, 13 November 2014 02:37 UTC
Date: Wed, 12 Nov 2014 21:37:40 -0500
Message-Id: <201411130237.sAD2behU001729@hobgoblin.ariadne.com>
From: worley@ariadne.com
Sender: worley@ariadne.com
To: Sean Leonard <dev+ietf@seantek.com>
In-reply-to: <05E89947-5180-40BB-A14A-9D97E92DDAB1@seantek.com> (dev+ietf@seantek.com)
Subject: Comments on draft-seantek-rdf-urn-00
References: <05E89947-5180-40BB-A14A-9D97E92DDAB1@seantek.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/urn-nid/56oLfVh36MyxO7VpJFYsLUFk8xE
Cc: urn-nid@ietf.org
(Many of these comments apply to draft-seantek-xmlns-urn-00 as well.)

   1.  Introduction

   The Resource Description Framework [RDF] is a framework for
   representing information in the web.  RDF contains nodes that are
   identified by URI references.  The URI reference is basically an
   opaque string with semantics applied onto it by the RDF standard;
   RDF applications are not required or expected to dereference the
   URI.

You almost certainly mean "URI" not "URI reference" ("RDF contains
nodes that are identified by URIs.") -- a "URI reference" is an
appearance of a URI in a particular place, in the same way a footnote
in a book is a reference.  There can be many URI references to the
same URI.  (You can see this by expanding the acronym "URI": "RDF
contains nodes that are identified by Uniform Resource Identifiers."
not "RDF contains nodes that are identified by Uniform Resource
Identifier references.")

   This document defines a URN specifically for identifying RDF URI
   references.

This should read "This document defines a URN namespace identifier [or
alternatively, NID] specifically for use in constructing RDF URIs."

   The abstract resource does not have any particular concrete
   representation (such as a type of content identified by Internet
   media type), although concrete representations may be associated
   with it.

This is true, but you say later that one or more URIs may be
associated with the URN via the IANA registration.  I believe that
these are an important feature of the NID, and you should probably
discuss them here, as they act somewhat like a representation of the
URN.

   Abstract parts of the abstract resource can be identified with
   fragment identifiers.

How does this statement interact with the fact that RFC 2141 does not
admit fragment identifiers for URNs, and that there is work afoot to
allow fragment identifiers for URNs?
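
To make the fragment question concrete, here is a minimal sketch.  It
assumes a hypothetical URN with "rdf" as the NID (the NID and the
example name are illustrations, not taken from the draft), and the
regular expression is an informal transcription of the RFC 2141
syntax, not a normative one.  Generic URI parsing happily splits off a
fragment, while the RFC 2141 syntax has no place for the "#" at all:

```python
import re
from urllib.parse import urldefrag

# Informal transcription of the RFC 2141 syntax (not normative):
# "urn:" NID ":" NSS, where "#" is not among the permitted NSS characters.
RFC2141_URN = re.compile(
    r"urn:[A-Za-z0-9][A-Za-z0-9-]{0,31}:"
    r"(?:[A-Za-z0-9()+,\-.:=@;$_!*']|%[0-9A-Fa-f]{2})+",
    re.IGNORECASE,
)

# Hypothetical example; "rdf" as the NID is an assumption.
urn = "urn:rdf:example:thing#part"

# Generic URI handling separates the fragment from the rest.
base, fragment = urldefrag(urn)

# The string with the fragment is not a URN under this syntax;
# the string without it is.
with_fragment_ok = RFC2141_URN.fullmatch(urn) is not None
without_fragment_ok = RFC2141_URN.fullmatch(base) is not None
print(base, fragment, with_fragment_ok, without_fragment_ok)
```

Under these assumptions, a consumer that treats the string as a
generic URI and one that treats it strictly as an RFC 2141 URN will
disagree about whether it is well-formed.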

   Declaration of syntactic structures:

   The structure of the Namespace Specific String is any valid XML
   name corresponding to the "Name" production in Section 2.3 of
   [XML] (production 5), with the following restrictions:

   1.  The name MUST be at least four characters.
   2.  Colons MAY be used as arbitrary intra-name dividers.
   3.  Colons MUST NOT appear at the beginning or end of the name.
   4.  Consecutive colons are PROHIBITED.

   and the following relaxation:

   5.  The first part of the name preceding the first colon MAY be a
       whole decimal number as discussed in "Process of identifier
       assignment".

"PROHIBITED" is not an RFC 2119 word, so it should probably appear in
lower-case.

While it's a convenient mnemonic to compare the NSS syntax to the Name
syntax, it's easier for implementers if the NSS syntax is given
directly and in one place.

It's not clear what item 2 means, as there is no definition of
"divider", and the syntax of Name allows a colon in any position.

The phrasing of 5 is peculiar, since Name allows digits in all
positions except the first -- it would seem that the natural
expression would be "The first character may be a digit".  But if that
is the intended meaning, why was it not stated in that simpler way?
The text suggests that the use of leading digits is only permitted 'as
discussed in "Process of identifier assignment"', suggesting that the
use of leading digits is further restricted in some manner which is
not specified in this section.  Whatever restrictions are placed on
the use of leading digits should be expressed unambiguously in this
section.

The stated syntax does not allow fragment identifiers, but the rest of
the document presumes that a fragment identifier may be present.
Probably this section only intends to define the RDF URN NSS, leaving
implicit the syntax of the full RDF URN.  That should be clarified by
providing an explicit production for <rdf-urn>.

   When encoded in a URN, Unicode code points beyond U+007F are
   encoded as percent-encoded UTF-8.
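
The restrictions and the encoding rule quoted above can be sketched as
a direct check, which also shows how much an implementer has to
assemble by hand.  This is one possible reading, not the draft's
definition: the character ranges are those of the XML 1.0 (Fifth
Edition) Name production with ":" handled separately, item 5 is
assumed to allow an all-digit first part only when a colon follows,
and the function names are hypothetical.

```python
import re
from urllib.parse import quote

# NameStartChar / NameChar ranges from XML 1.0 (Fifth Edition), Section 2.3,
# with ":" left out so that colons can be checked separately.
_START = (
    r"A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
_CHAR = _START + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

_FIRST = re.compile(rf"[{_START}][{_CHAR}]*")  # first part: must start a Name
_REST = re.compile(rf"[{_CHAR}]+")             # later parts: any name characters
_DIGITS = re.compile(r"[0-9]+")                # item 5: whole decimal number


def is_valid_nss(name: str) -> bool:
    """Check an unencoded name against items 1-5 (one interpretation)."""
    if len(name) < 4:                      # item 1
        return False
    parts = name.split(":")
    if any(part == "" for part in parts):  # items 3 and 4
        return False
    first, rest = parts[0], parts[1:]
    if _DIGITS.fullmatch(first):
        if not rest:                       # assumption: digits need a colon after
            return False
    elif not _FIRST.fullmatch(first):
        return False
    return all(_REST.fullmatch(part) for part in rest)


def encode_nss(name: str) -> str:
    """Percent-encode code points beyond U+007F as UTF-8, keeping colons."""
    return quote(name, safe=":")
```

With this reading, "au:flag", "en:us:ca:lax" and "20:10-250" are
accepted, "ab::cd" and "abc" are rejected, and encode_nss("café:menu")
gives "caf%C3%A9:menu".  A different reading of item 5 would change
the digit branch, which is the kind of ambiguity a single ABNF would
remove.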
   Conveniently, all XML name characters in the US-ASCII range are in
   the [RFC3986] unreserved set.

Describing the syntax of the NSS by specifying a set of Unicode
strings and then an encoding to be applied to that set of strings to
produce the URNs is formally correct but puts a burden on an
implementer.  It would be better if that aspect of the syntax was also
described as a combination of an informal description of the intention
and a complete and correct ABNF.

   Identifier uniqueness considerations:
   Once a name is registered in the IANA registry, it is unique.

   Identifier persistence considerations:
   Once a name is registered in the IANA registry, it is permanent.

These are awkwardly phrased.  It would be better phrased as "The
meaning of an identifier is registered in the registry, and thus is
unique." and "Once an identifier is registered, its meaning cannot be
changed."  (However, if a registration can be withdrawn as is
mentioned in passing later in the draft, can a URN whose previous
registration was withdrawn later be registered again -- with a
different description/registrant/associated URIs?)

   Process of identifier assignment:
   Identifiers are registered with IANA on a First-Come, First-Served
   basis.  One-character names and prefixes are RESERVED for further
   use.  Two- and three-character names and prefixes are RESERVED for
   language tags and regional codes; however, those names have no such
   semantic content when used in an RDF URN.  Whole number prefixes
   are RESERVED for IANA Private Enterprise Numbers.  Registrants are
   free to register names with reserved two-character and
   three-character prefixes, such as "au:flag" or "en:us:ca:lax".
   Registrants are also free to register names with reserved whole
   number prefixes, such as "20:10-250".

There are a number of difficulties here.

The word RESERVED is not an RFC 2119 word, and so should probably be
in lower-case.

The second and third sentences describe one-, two-, and
three-character "names".
It's not clear what "name" means here.  By default, I expect it to be
the same as "URN", but of course all URNs have at least 7 characters.
So perhaps "name" means "NSS".  But the syntax definition restricts
NSSs to have at least 4 characters.

I don't understand what "however, those names have no such semantic
content when used in an RDF URN" means.  -- You've just said that such
(syntactically excluded!) "names" are reserved, which presumably means
that they can't be used as NSSs to form URNs that are registered.  But
if they can't be used to form URNs, how could they be "used in an RDF
URN"?

Or is the important consequence based on one-character "prefixes"?
But every string has a one-character prefix.

When you say, "Whole number prefixes are RESERVED for IANA Private
Enterprise Numbers.", what do you mean by "reserved"?  There is no
obvious semantic correspondence between private enterprise numbers and
URN NSSs.  And by "prefix" do you mean the default meaning of "an
initial substring of characters in a string", or (as I suspect you
mean) "an initial substring of characters in an NSS which is followed
by a colon"?  I suspect what you mean is that any NSS of the form
<digits>:<something> is implicitly associated in some way with the
registrant of the enterprise number <digits>, but you don't say that
explicitly.

And at the end you say "registrants are free to register names with
reserved ... prefixes".  In what manner are prefixes "reserved" if
registrants are free to register them?  (And in particular, if a
registrant can register a URN starting with a private enterprise
number even if it is not the registrant of the private enterprise
number.)

   Process for identifier resolution:

The fact that one or more URIs may be attached to an RDF URN's
registration to provide a resolution for the URN is a very important
feature of your definition, and it deserves to be described more
explicitly than just as part of the "Process for identifier
resolution".
Indeed, in practice, it's a fundamental part of the semantics of an
RDF URN as you have described it.

Conversely, since there is (as far as I know) no defined way for a URN
resolver to look up the information associated with *any* IANA
registration, it's not clear how this information can be used in
implemented resolution software.  More discussion is needed on how a
queryable database of these associations is to be accessed.

Similar considerations apply to "Validation mechanism".

   Fragments (delimited by the # character) are not considered part of
   the namespace-specific string, so a fragment would not affect
   lexical equivalence.

This sentence seems to be part of the rules for lexical equivalence,
but is not in that section of the template.  Assuming that URNs with
fragment identifiers are compared in the ways described in RFC 3986
for generic URIs, the fragment identifiers of two URNs *are* compared
when testing the URNs for equivalence.

   3.  IANA Considerations

These considerations place various burdens on IANA.  Has anyone
checked that IANA is in a position to undertake them?  In particular:

   The registration template SHALL be encoded in UTF-8.

Can IANA process registrations containing arbitrary Unicode
characters?

   If a registrant attempts to register a name that is confusingly
   similar to other registered names (such as only differing by case,
   or differing by code points but generating the same or confusingly
   similar visual representations), the registrants of the prior names
   are to receive a warning notification of the impending
   registration.  However, there is no protest mechanism; the
   registration will still succeed unless withdrawn by the registrant.
   IANA SHOULD implement a modern algorithm to detect such confusingly
   similar names.

What is the definition of "confusing"?  Can an RFC apply a SHOULD to
IANA?
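
To illustrate why a definition is needed, here is a minimal sketch of
one crude notion of "confusingly similar": two names collide if they
become equal after Unicode compatibility normalization (NFKC) and case
folding.  The helper names are hypothetical, and this is not proposed
as the algorithm the draft has in mind.

```python
import unicodedata


def skeleton(name: str) -> str:
    """Fold compatibility forms and case (hypothetical helper)."""
    return unicodedata.normalize("NFKC", name).casefold()


def look_alike(a: str, b: str) -> bool:
    """True if two distinct names collapse to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)


case_only = look_alike("au:Flag", "au:flag")        # differs only by case
ligature = look_alike("au:\uFB02ag", "au:flag")     # U+FB02 is the "fl" ligature
cyrillic = look_alike("au:fl\u0430g", "au:flag")    # U+0430 is Cyrillic "a"
print(case_only, ligature, cyrillic)
```

This check catches the case-only and ligature variants but not the
Cyrillic look-alike, since NFKC does not map across scripts.  Whether
that third pair counts as "confusing" is exactly what the draft leaves
undefined.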

   If a registrant attempts to register a name that contains a whole
   number prefix, the registrant of the corresponding IANA Private
   Enterprise Number is to receive a warning notification of the
   impending registration.

Is IANA prepared to undertake this?

Given that there is no protest mechanism, what is the purpose of these
notifications?  As far as I know, there is no non-IETF mechanism that
any registrant could use to oppose such a registration.

There is no defined mechanism for withdrawing any registration that I
know of.  Is there intended to be one for this registry?

Dale