Re: Publication request for draft-spinosa-urn-lex

Barry Leiba <barryleiba@computer.org> Mon, 05 May 2014 14:37 UTC

MIME-Version: 1.0
Sender: barryleiba@gmail.com
In-Reply-To: <CALaySJJk5YiCQZqt6WoWkqfAzi2A04HEAH=vG0pVAy8e45N5aQ@mail.gmail.com>
References: <CALaySJJk5YiCQZqt6WoWkqfAzi2A04HEAH=vG0pVAy8e45N5aQ@mail.gmail.com>
Date: Mon, 05 May 2014 10:37:04 -0400
Message-ID: <CALaySJLiqXBrP_6yCCzTjK9hWaNooLJM5H0w_MVpAmmKjCEBOg@mail.gmail.com>
Subject: Re: Publication request for draft-spinosa-urn-lex
From: Barry Leiba <barryleiba@computer.org>
To: draft-spinosa-urn-lex.all@tools.ietf.org, Ted Hardie <ted.ietf@gmail.com>, Patrik Fältström <paf@frobbit.se>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: http://mailarchive.ietf.org/arch/msg/urn-nid/wqKYA_QLOkF6XoG6eTtxtthN918
Cc: urn-nid@ietf.org, Andrew Newton <andy@hxr.us>
Precedence: list

This is a follow-up to my review far below, giving some concrete suggestions.

I'm looking at what I consider to be the major issues with the LEX NID
proposal, and I'd like to put forth my own proposal for how to resolve
them and move forward.  I'm not saying that this is the only way
forward, but I am saying that these are issues that need to be dealt
with, and that what I'm proposing will deal with them to my own
satisfaction.  Please consider adopting these into the draft and
posting an update.  And please feel free to discuss any of this with
me.

The major issues I see are these:

1. While the two-character jurisdiction IDs are not likely to change,
they might do so, and have done so already.  The desire for URN
persistence makes it extremely unwise to expect wholesale renaming,
should some IDs change in the future.  Ted has suggested using
three-character jurisdiction IDs to avoid this problem, and I urge you
to switch to that mechanism.

2. You're requiring that LEX URNs conform to some naming convention
that appears to be built around the documents' titles.  That
requirement is very much dependent upon languages and character sets,
and isn't advisable on that ground; it's also likely to run counter to
the desire by some jurisdictions to use catalogue numbers and other
schemes that have nothing to do with document titles or keywords.

3. You're setting up direct dereferencing of LEX URNs as though they
were HTTP URIs, and that's contrary to how URNs are used and puts a
burden on client software to know how to do the dereferencing.  It
also seems unnecessary in general: if this NID catches on, documents
and their URNs will be indexed and will be locatable through search
engines.  Further, organizations that make legal documents available
are likely to provide their own repositories for the documents, and,
hence, their own mappings, which might lead one to a copy that has
notes and commentary alongside it, for example.  A better approach is
to suggest a way of mapping URNs to HTTP URIs, and to offer a service
and registration that jurisdictions could opt into.

What I'm suggesting here pulls out the actual normative requirements
for LEX URNs, and leaves the other things as suggestions for
implementation and deployment, making clear what behaviour is desired
without requiring things to be done a certain way.

I'll note that if your advice and suggestions are good, they'll likely
be widely adopted even if they're not put forth as requirements... but
that LEX URNs can still be useful even for jurisdictions that choose
not to use that advice.  So I think moving to this setup not only
resolves the specification issues, but also results in a more flexible
NID.

-- Section 1.1 --

OLD
   The identifier is conceived so that its construction depends only on
   the characteristics of the document itself and is, therefore,
   independent from the document's on-line availability, its physical
   location, and access mode.
NEW
   The identifier itself is assigned by the jurisdiction that owns the
   identified document, and it is independent from the document's on-line
   availability, its physical location, and its access mode.  Even a
   document that is not available online at all may still have a LEX URN
   that identifies it.
END

OLD
   In an on-line environment with resources distributed among
   different Web publishers, uniform resource names allow simplified
   global interconnection of legal documents by means of automated
   hypertext linking.
NEW
   In an on-line environment with resources distributed among
   different Web publishers, uniform resource names allow simplified
   global interconnection of legal documents by means of automated
   hypertext linking.  LEX URNs are therefore particularly useful
   when they can be mapped into locators such as HTTP URIs.
END

-- Section 1.2 --

It's nice to have this list for now, but the hope, of course, is that
many entities will pick this up, and that this list will become
obsolete very quickly.  Maybe the answer here is to just say, "The
following entities support this proposal at the time of publication:"
(which kind of seems self-evident, but which clarifies that this isn't
just for a few organizations only).

-- Section 1.3 --

OLD
   The LEX naming convention has interpreted all these recommendations,
   proposing an original solution for sources of law identification.
NEW
   Section (??) supplements the required name syntax with a suggested
   naming convention that interprets all these recommendations
   into an original solution for sources of law identification.
END

-- Sections 1.4 thru 1.6 --

These sections should be reworked so that they clearly specify
desirable characteristics, but not normative requirements.  Some of
the content is likely to be better moved to a later section.

-- Section 2.3 --

There's a good reference for ABNF: RFC 5234.  You should use it (as a
normative reference), and not repeat syntax definitions here.

-- Section 3.3 --

Yeh, one has to love how legal folks talk.  But, really, this should
be much, much simpler and more straightforward: 'Names belonging to
the "lex" namespace are case-insensitive.  They MUST be created in
lower case, but names that differ only in case MUST be considered to
be equivalent.'

(If you then feel that you have to explain why, go ahead and do it in
another paragraph.)
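
A minimal sketch of that equivalence rule in Python, for illustration
only.  The function names are hypothetical, and plain lower-casing is an
assumption that suits ASCII names; the draft would still have to say how
non-ASCII or percent-encoded characters are compared.

```python
def normalize_lex_urn(urn: str) -> str:
    """Return the lower-case form used to compare LEX URNs."""
    lowered = urn.lower()
    if not lowered.startswith("urn:lex:"):
        raise ValueError(f"not a LEX URN: {urn!r}")
    return lowered


def lex_urns_equivalent(a: str, b: str) -> bool:
    """True when two LEX URNs differ at most in letter case."""
    return normalize_lex_urn(a) == normalize_lex_urn(b)


# Names that differ only in case compare as equivalent.
print(lex_urns_equivalent("URN:LEX:BR:2013-003401-00003",
                          "urn:lex:br:2013-003401-00003"))  # True
```

Comparing normalized forms keeps the "created in lower case" rule and
the "equivalent regardless of case" rule in one place.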

-- Sections 3.4 thru 3.9 --

These need to be moved into the suggested implementation section.

-- Section 4 and its subsections --

This is the suggested implementation section.  The top-level text in
Section 4 should make it clear that none of this is required for
constructing the URN, but is advice for making the URNs more useful.
(Personally, I'm not convinced that it does make them more useful, but
if you're convinced of it I have no objection to your saying so.)

-- Section 5.1 --

This needs to be fixed to address the "jurisdiction IDs shouldn't
change" problem.

I'll note that even three-character IDs might change, in the event
that a country name changes, but the former three-character ID is not
likely to be reassigned.  If we want to be *really* robust about it,
we could create a registry of codes that are never changed or reused,
where the registry maps those codes into the jurisdictions as we know
them.

But that's probably more than we need here.
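
For what it's worth, the "never changed or reused" property is simple to
state in code.  This is only a sketch under my own assumptions: the
class name, the method names, and the sample entries are hypothetical
and come from no existing registry.

```python
class JurisdictionRegistry:
    """Append-only map from jurisdiction code to jurisdiction."""

    def __init__(self) -> None:
        self._entries = {}

    def register(self, code: str, jurisdiction: str) -> None:
        key = code.lower()
        if key in self._entries:
            # An assigned code is never changed and never handed out again.
            raise ValueError(f"code {key!r} is already assigned")
        self._entries[key] = jurisdiction

    def lookup(self, code: str) -> str:
        return self._entries[code.lower()]


registry = JurisdictionRegistry()
registry.register("ita", "Italy")   # illustrative entries only
registry.register("bra", "Brazil")
```

There is deliberately no update or delete operation: a renamed
jurisdiction would get a new code, and the old code would keep
identifying the documents already named under it.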

-- Section 6 and its subsections --

I very strongly advise pulling back on this and working out some text
along these lines:

- As with other URNs, LEX URNs cannot be directly used as "clickable
links" -- they can't be directly dereferenced.

- Nevertheless, they are most useful when documents can actually be
located by using them.

- That means that we expect search engines to use them as keys for
locating documents.

- Owners of document repositories should provide easy mapping from
URNs to the locators for the documents in their repositories.

- ITTIG-CNR will maintain a mapping service that will be made
available to jurisdictions that choose to use it.  The service will
locate documents through URIs of this form:

   http[s]://lex.ittig-cnr.gov.it/urn/lex/<jurisdiction>/<local-name>

(with characters suitably encoded, if necessary, of course).

- Documents within the "it" jurisdiction will be directly served by
such URIs.  That is, "lex.ittig-cnr.gov.it/urn/lex" is the mapping
prefix for the "it" jurisdiction.

- Other jurisdictions may register their own mapping prefixes with
ITTIG-CNR.  ITTIG-CNR will redirect accordingly.  For example, if "br"
registers a mapping prefix of "lex.senado.gov.br/lex-map", then the
URN

   urn:lex:br:2013-003401-00003

...could be converted to the URI

   https://lex.ittig-cnr.gov.it/urn/lex/br/2013-003401-00003

...which would then be redirected to

   https://lex.senado.gov.br/lex-map/br/2013-003401-00003

- Naturally, each jurisdiction's mapping prefix could be used
directly, as well.  It's just that ITTIG-CNR is nicely providing a
referral service.
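
The mapping in the list above can be sketched in Python as follows.
This is an illustration under stated assumptions, not a specification:
the function names and the in-memory registry are hypothetical, the
prefixes are the ones from the example above, and percent-encoding each
path segment is just one reading of "suitably encoded".

```python
from typing import Optional, Tuple
from urllib.parse import quote

# Referral prefix taken from the example in this message.
REFERRAL_PREFIX = "lex.ittig-cnr.gov.it/urn/lex"

# Hypothetical registry contents: jurisdiction -> registered mapping prefix.
MAPPING_PREFIXES = {
    "it": "lex.ittig-cnr.gov.it/urn/lex",
    "br": "lex.senado.gov.br/lex-map",
}


def split_lex_urn(urn: str) -> Tuple[str, str]:
    """Split 'urn:lex:<jurisdiction>:<local-name>' into its last two parts."""
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[0].lower() != "urn" or parts[1].lower() != "lex":
        raise ValueError(f"not a LEX URN: {urn!r}")
    jurisdiction, local_name = parts[2].lower(), parts[3]
    if not jurisdiction or not local_name:
        raise ValueError(f"incomplete LEX URN: {urn!r}")
    return jurisdiction, local_name


def build_uri(prefix: str, jurisdiction: str, local_name: str) -> str:
    """Append the percent-encoded jurisdiction and local name to a prefix."""
    return (f"https://{prefix}/{quote(jurisdiction, safe='')}"
            f"/{quote(local_name, safe='')}")


def referral_uri(urn: str) -> str:
    """URI at the referral service for any LEX URN."""
    return build_uri(REFERRAL_PREFIX, *split_lex_urn(urn))


def redirect_target(urn: str) -> Optional[str]:
    """URI under the jurisdiction's own prefix, or None if unregistered."""
    jurisdiction, local_name = split_lex_urn(urn)
    prefix = MAPPING_PREFIXES.get(jurisdiction)
    if prefix is None:
        return None
    return build_uri(prefix, jurisdiction, local_name)


urn = "urn:lex:br:2013-003401-00003"
print(referral_uri(urn))
print(redirect_target(urn))
```

For the "br" example this yields the two URIs shown above.  A
jurisdiction that has registered nothing gets None back, and what the
referral service should do in that case is left open here.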

Now, it's possible that ITTIG-CNR doesn't want to provide the referral
service, and you might just want to set up the registry; that's fine
as well.  And it would be reasonable to have it be an IANA registry,
rather than something maintained by ITTIG-CNR.  I suggest a
registration policy of "First Come First Served" (with a reference to
RFC 5226), but you might want to use "Expert Review", with explicit
instructions that the designated expert is *only* meant to check that
people are not registering for jurisdictions they have no connection
to (malicious or accidental).

-- Section 7.6 --

The IANA Considerations would now replace the urn.arpa stuff with the
setup of the registry that I'm proposing above.

-- Section 7.7 --

OLD
   This document introduces no additional security considerations beyond
   those associated with the use and resolution of URNs in general.
END

This is patently false.  The system you're proposing has some huge
exposures to name collisions, faulty dereferencing and resolution, and
so on.  The changes I'm proposing here make it somewhat better, but
you still need to talk about issues involved with name
creation/mapping/resolution for cases where your suggested mechanisms
are used.

I can guarantee that Stephen Farrell will shred your document when it
comes to the IESG if you don't flesh out the Security Considerations
and actually think about how what you're proposing can accidentally go
astray... or can be misused, manipulated, or attacked to go astray.

-- Appendices --

Finally, I'm not sure what might have to be reworked in the appendices
to go along with the other changes... but it's likely that some
changes will be needed.

-----------

OK, so... I know this is a lot.  I think it's conceptually *not* a
lot, and is conceptually quite contained.  But it's a massive
reorganization and rethinking of the document.  As I said, please
consider this and please discuss it with me.  I think it's important
to get the "lex" NID assigned and registered, and I want to help you
get that done (and I know that this has dragged on for a long time).
Please work with me on this, and let's get something that works for
you and that the IETF regulars who have a specific URN architecture in
mind can also accept.

Barry, Applications AD

On Tue, Apr 29, 2014 at 9:06 PM, Barry Leiba <barryleiba@computer.org> wrote:
> This document was discussed on the urn-nid list between late March and
> mid-April 2013.  At the time, Ted Hardie and Patrik Fältström
> commented, and the authors addressed some of their comments.  In the
> end, I think that not all of their comments were adequately addressed,
> so I'm explicitly including Ted and Patrik in the "to" field here, to
> bring their attention to the publication request and to my AD review.
> Ted, Patrik, please check whether the current version
>
> https://datatracker.ietf.org/doc/draft-spinosa-urn-lex/
>
> ...addresses your issues to your satisfaction, and also let me know if
> any of my comments below are totally off base.
>
> ---
> I am very uncomfortable with this document, for a number of reasons:
>
> 1. The use of two-character jurisdiction identifiers was brought up by
> Ted, and I'm not happy with the resolution, which seems to recommend
> massive renaming of existing -- possibly quite old -- documents in
> cases where the identifiers change.  This is simply not practical, and
> goes against the concept that URNs be persistent.  Obviously, no one
> can guarantee permanence, and persistence doesn't mean permanence.
> Nevertheless, when we can anticipate issues (and particularly when we
> have a demonstration that an issue has occurred, as is the case with
> "ai"), we should deal with them in a way that doesn't upset the entire
> apple cart.  Ted suggested three-letter jurisdictions, which seems a
> reasonable approach.  There might be others.  Massive renaming of
> perhaps many thousands of years-old documents isn't a reasonable
> approach.
>
> 2. The whole "national characters" discussion in 3.4 seems odd.  It
> was mentioned in the reviews, and corrections have been made.  I can
> live with what's there now, but it still seems wrong to specify things
> this way -- see item 3 for more.
>
> 3. Section 3.5 is grossly language-dependent, and was so obviously
> written by people whose language supports what's demanded there.  As
> it happens, turning "Ministry of Finance" into "ministry.finance"
> works fine in Italian, as well as in Spanish, French, and English.  We
> have words that mean "the" and "of", and such, which we can eliminate.
> Languages such as Russian do not; the noun forms themselves change
> with the case, to give the meaning of the word "of" in "of finance" in
> the noun itself.  "Ministry of Finance" in Russian is "министерство
> финансов" (Ministerstvo Finansov), but the nominative word for
> "Finance" would be "финансы" (Finansy).  Should the Section 3.5
> process have it normalized to "министерство.финансы", to eliminate the
> "of" that's implicit in the case of "финансов"?  Does it matter?  (And
> I don't even want to try to think about how this gets done in Chinese
> languages.)
>
> All this stuff in Sections 3.4 and 3.5 (and the later sections as
> well) would be fine if it were in a non-normative example of how one
> jurisdiction has chosen to create the names.  But apart from the
> protocol requirements that have to do with required encoding of
> non-US-ASCII characters, having this stuff be normative just strikes
> me as wrong.  And pointless: does it really matter *how* the names are
> assigned?  It should simply be up to the jurisdiction to create the
> names, and if my jurisdiction chooses to name its documents as
> "urn:lex:barryland:000000000001", then
> "urn:lex:barryland:000000000002", and so on... or perhaps I want to
> use SHA-1 hashes, as in RFC 6920.  Is that really a problem?
>
> If you really are trying to set up a system wherein the URN can be
> computed correctly from the document's title and other metadata, I
> submit that such an effort may work for some situations, but is doomed
> to fail in general.
>
> 4. You appear to be requiring that "urn:lex:" names work as locators
> in HTML hrefs.  From Section 1.5:
>
>    LEX names will be used on a large scale in references as a HREF
>    attribute value of the hypertext link to the referred document.
>
> ...along with the whole "lex.urn.arpa" setup described in Section 6,
> and the registration of the NAPTR records.
>
> This is unprecedented, and, again, strikes me as wrong.  URNs can be
> mapped into corresponding URLs, and being able to do so is what makes
> them useful, but it seems that such mappings will be very much
> jurisdiction-dependent, and it seems *very* unlikely that
>
>    <a href="urn:lex:barryland:000000000001">
>
> ...could be dereferenced as it stands.  That would require that every
> browser (and every other application that processes these things) know
> how to dereference every jurisdiction's LEX names -- and you're
> depending upon the entire world building their resolution around your
> structure and the registry you aim to set up.  Unless I'm seriously
> misunderstanding something, this just seems like it won't work.  In
> any case, what you're describing in Section 6 is not a URN namespace,
> but a URI locator scheme.
>
> ---
> I have a couple of other comments, at a much lower severity than the others:
>
> - What's with the reference to W3C in the abstract?
>
> - Section 3.9 specifies what to do with ordinal numbers.  Why are
> those called out specifically, while cardinal numbers aren't?  Why
> should I handle "law relating to a second home" one way (ordinal
> number), and "law relating to two homes" (cardinal number) in a
> different way?  But, again, this is related to my comment above about
> why any of this is normative, and that comment applies to all sections
> from 3.4 to 3.9.
>
> There seems to be a lot of stuff in Section 4's subsections that also
> specifies how these names should behave in practice, but is doing it
> with normative requirements on the construction of the name itself.  As
> with the things in Section 3, these seem nice as explanations of
> desirable (even required) characteristics, but wrong as normative
> mechanisms with respect to the name structure.
>
> Barry, Applications AD
