[urn] new urn PWID draft (7) with corrections

Eld Zierau <elzi@kb.dk> Thu, 02 May 2019 12:56 UTC

From: Eld Zierau <elzi@kb.dk>
To: Peter Saint-Andre <stpeter@stpeter.im>, "urn@ietf.org" <urn@ietf.org>
Date: Thu, 02 May 2019 12:56:18 +0000
Message-ID: <dafc04bfeeed4cd4ba874beb597deeaa@kb.dk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/KGmj--Y9mRssJ-jFlOd61jQ8crI>
Subject: [urn] new urn PWID draft (7) with corrections

Thanks again for your comments.
I have uploaded draft version 7 and described, inline in Peter's mail below, how I have addressed each comment.
Does this cover what is needed?

Best regards, Eld

-----Original Message-----
From: Peter Saint-Andre <stpeter@stpeter.im> 
Sent: Tuesday, April 30, 2019 5:14 AM
To: Eld Zierau <elzi@kb.dk>
Cc: urn@ietf.org
Subject: Re: [urn] Comments on PWID -05 - now PWID -06

Hello Eld,

Your proposed syntax (with "~") looks fine to me.
> Eld: :)

The ABNF definition of your proposed syntax does not conform to RFC 5234. You can check the ABNF using this tool:

https://tools.ietf.org/tools/bap/abnf.cgi

> Eld: It conforms now. Thank you so much for providing the link to the syntax checker; that was very helpful.

In particular, it's not clear to me what a rule like this is intended to
mean:

   registered-archive-id = +( unreserved )

Do you mean that a registered-archive-id can include one or more instances of characters from the `unreserved` rule? If so, change "+" to "*".

> Eld: I meant one or more characters, but I found out that this should be written 1*unreserved, and likewise for the other occurrences.
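
For readers less familiar with ABNF repetition, the difference between "*unreserved" (zero or more) and "1*unreserved" (one or more) can be illustrated with regular expressions. This is only a sketch: it assumes `unreserved` is the RFC 3986 rule (ALPHA / DIGIT / "-" / "." / "_" / "~"), and the names UNRESERVED, ONE_OR_MORE, ZERO_OR_MORE and matches are made up for the illustration, not taken from the draft.

```python
import re

# Assumption: "unreserved" as defined in RFC 3986:
#   ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = r"[A-Za-z0-9._~-]"

# ABNF "1*unreserved": at least one unreserved character.
ONE_OR_MORE = re.compile(rf"{UNRESERVED}+")

# ABNF "*unreserved": zero or more, so the empty string is accepted too.
ZERO_OR_MORE = re.compile(rf"{UNRESERVED}*")


def matches(pattern, text):
    """Return True if the whole of text is matched by pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions an empty string is accepted by ZERO_OR_MORE but rejected by ONE_OR_MORE, which is why "1*" rather than a bare "*" expresses "one or more characters".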


To simplify the ABNF, you could use the datetime rules from RFC 3339.

> Eld: I used to in an earlier version, but Dale noticed that there was a difference (in a mail of 28 February 2019): "But comparing that to W3CDTF, I see no single nonterminal which corresponds to the set of formats allowed in W3CDTF.  I suggest you make a more rigid specification as to what is allowed for archival-time." So I think I had better stick to the rigid version in order to be sure.

Please don't use `URI` as the name of an ABNF rule because that's already defined in RFC 3986 and could cause confusion. Perhaps call it `uri-string`.

>Eld: Done

Personally I found the `precision-spec` categories difficult to understand and sometimes ambiguous. For instance:

* A precision level of "part" seems to be an HTML file only (at least in the case when "it refers to an html web element"), however a URI can point to many file types other than HTML files. Perhaps "single" (as in a single file) would be clearer; it would also be good to specify how this is handled in the case of file types other than HTML.

* Does a precision level of "page" apply only to HTML pages with all "referenced web parts"? (By the latter term I think you mean what the HTML 5.2 specification defines as "embedded content"; in general it would be good to align terminology.)

>Eld: I have rephrased this to make it clearer. It was previously explained in two steps, so I have also restructured it a bit (pages 11-13).

As to the registration, instead of version 6 it should be version 1 because this is the initial registration (i.e., whenever we are finished with this process it will be the initial version, whereas if you update the entire registration in the future that would be version 2).

>Eld: Got it. I changed it and left the details to the change log comment.
>Eld: I have also changed the version at the top of the template, since I guess that is the same thing. Is that correct?

The security considerations strike me as underspecified. An archived web page or part could be just as dangerous as a "live" page or part; for instance, it could include insecure scripts, malware, trackers, etc.
Furthermore, an archived page could in fact be more dangerous, because it could include outdated scripts with known vulnerabilities that can never be patched because the script is archived for all time in a vulnerable state (an attack of this sort was recently discovered in the wild).

>Eld: You are quite right. I have taken the liberty of rephrasing your comment and adding it to the section; I hope that is OK.

Best Regards,

Peter

On 4/29/19 6:10 AM, Eld Zierau wrote:
> Did any of you have comments to my previous mail?
> Is there any action you want me to take in order to get it accepted?
> Best Regards, Eld
> 
> -----Original Message-----
> From: Eld Zierau
> Sent: Friday, March 1, 2019 1:29 PM
> To: 'Martin J. Dürst' <duerst@it.aoyama.ac.jp>; 'Dale R. Worley' 
> <worley@ariadne.com>
> Cc: 'urn@ietf.org' <urn@ietf.org>; 'L.Svensson@dnb.de' 
> <L.Svensson@dnb.de>
> Subject: [urn] Comments on PWID -05 - now PWID -06
> 
> I have now uploaded a new version: draft-pwid-urn-specification-06
>  - and thanks again for comments and suggestions
> 
> Regarding the suggestion from Martin (included below), I can as a computer scientist certainly see the reasoning as quite obvious. However, my experience from presenting the PWID is that users find syntax based on computational reasoning illogical, e.g. that the archived-item-id (usually a URI) is placed at the end of the PWID. I believe that adding a "~" for identifiers that are registered separately is acceptable for such users, but I am also convinced that a "+" before a domain will confuse (non-computer science) users a lot. 
> Also, as said in my previous mail, it is highly unlikely that there will ever be a case where "~" is the first character in a domain for a web archive. Therefore, the "+" prefix should not be necessary. 
> A further minor point is that all existing PWIDs (and tools providing and resolving PWIDs) would no longer comply, whereas they otherwise would (none of these use registered identifiers yet, only domains and URIs).
> In other words: I will be very sorry to add a "+" to domains, and I believe it is not necessary.
> 
> The uploaded version does not add a "+" to domains. If 
> required, I will of course add it (although sorry to do so).
> 
> Please let me know if it is acceptable, and I will act accordingly.
> 
> Best regards, Eld
> 
> 
> On 2019/03/01 11:31, Dale R. Worley wrote:
>> Martin J. Duerst <duerst@it.aoyama.ac.jp> writes:
>>>> [...]  E.g., one could require that any archive-id that is not 
>>>> intended to be interpreted as a DNS name to start with one of "-", 
>>>> ".", "_", "~".
>>>
>>> I haven't looked into the details, but in general, I think this is a 
>>> bad idea. It is much better to have an explicit distinction than to 
>>> rely on some syntax restrictions. Such syntax restrictions may or 
>>> may not actually hold in practice. It's very easy to create a DNS 
>>> name starting with '-' or '_', for example, even though officially, that's not allowed.
>>
>> I may agree with you ... But what do you mean by "an explicit 
>> distinction"?  E.g., I would tend to consider "archive-ids starting 
>> with '~' are registered archive names, and archive-ids that do not 
>> are considered DNS names" to be an "explicit" distinction, but you 
>> mean something else.
> 
> Well, the explicit distinction would be "if it starts with '~', what follows is a registered archive name, and if it starts with '+', what follows is a DNS name" or some such. This would not exclude any leading characters in either archive names or DNS names.
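
The two conventions discussed in this thread can be compared with a small sketch. It is an illustration only, not text or code from the draft: the function name classify_archive_id, the explicit_prefix flag and the sample identifier "~myarchive" are hypothetical. With explicit_prefix=False it follows the "~"-only convention (anything without a leading "~" is taken as a DNS name); with explicit_prefix=True it follows the explicit scheme described above, where a DNS name must start with "+".

```python
from typing import Tuple


def classify_archive_id(archive_id: str, explicit_prefix: bool = False) -> Tuple[str, str]:
    """Classify an archive-id as a registered name or a DNS name.

    Hypothetical helper for illustration; not part of the PWID draft.
    Returns a (kind, value) pair where kind is "registered" or "dns".
    """
    # In both conventions a leading "~" marks a registered archive name.
    if archive_id.startswith("~"):
        return ("registered", archive_id[1:])

    if explicit_prefix:
        # Explicit scheme: a DNS name must carry a leading "+".
        if archive_id.startswith("+"):
            return ("dns", archive_id[1:])
        raise ValueError(f"archive-id needs a '~' or '+' prefix: {archive_id!r}")

    # "~"-only convention: everything else is read as a DNS name, unchanged.
    return ("dns", archive_id)
```

For example, classify_archive_id("archive.org") is accepted as a DNS name under the "~"-only convention, but raises ValueError when explicit_prefix=True, which mirrors the compatibility concern raised about existing PWIDs.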
> 
> Regards,   Martin.
> 
>> Or maybe the right question is, What do you propose as an alternative?