Re: [urn] PWID URN namespace registration with latest draft version 12

Peter Saint-Andre <stpeter@stpeter.im> Fri, 18 November 2022 19:26 UTC

Message-ID: <b3a76d7f-7869-fff5-9e07-26e562b7bf61@stpeter.im>
Date: Fri, 18 Nov 2022 12:26:35 -0700
From: Peter Saint-Andre <stpeter@stpeter.im>
To: Eld Zierau <elzi@kb.dk>, "urn@ietf.org" <urn@ietf.org>
References: <b5ade01ffcfc42a5b428e0027780e724@kb.dk> <0f58f6fe-4abb-470e-5d8c-d865f806fb11@stpeter.im>
In-Reply-To: <0f58f6fe-4abb-470e-5d8c-d865f806fb11@stpeter.im>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/tJ8CIUQC8biEot8OVxHFJpzcrOA>
Subject: Re: [urn] PWID URN namespace registration with latest draft version 12

This namespace is now registered:

https://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml

On 11/15/22 2:08 PM, Peter Saint-Andre wrote:
> After much discussion, the expert review team has asked IANA to register 
> the `pwid` URN namespace. Multiple revisions have resulted in a 
> registration that conforms to RFC 8141. Because it is not the expert 
> review team's role to determine the usefulness of URN namespaces (as 
> would be done in a standardization venue such as an IETF Working Group), 
> we have found no basis for denying the registration of this namespace. 
> Therefore, despite our own personal doubts about some aspects of the 
> namespace, we have decided to allow its registration.
> 
> Peter
> (on behalf of the expert review team)
> 
> On 4/22/21 4:07 AM, Eld Zierau wrote:
>> Dear all
>> I hope you are all still well in spite of the pandemic.
>> I have not asked about the progress for some time, for various 
>> reasons related to this difficult time.
>>
>> As two of you have already approved this PWID URN, I would be happy if 
>> we could find a way forward for it.
>>
>> It is very important for us to have the PWID URN. In Denmark, PWID is 
>> the recommended way to reference web archive material. We use it in 
>> scientific papers, as well as for specification of elements in 
>> collections of web material. As an example, we have just 
>> produced collections of PWIDs for the "Probing a Nation's Web Domain" 
>> project 
>> (http://www.netlab.dk/research/projects/probing-a-nations-web-domain-the-historical-development-of-the-danish-web/).
>>
>> I have answered all comments and questions that have been raised, 
>> either by explanation or by slimming the suggested PWID, e.g. by 
>> leaving only page and part in the precision specification.
>> The only "weak" point I see in the current version is the reference to 
>> the web archive. It would be best if there existed a registry, but 
>> this is not the case. However, I am convinced that the suggested way 
>> is absolutely workable, until a registry is in place - and I will of 
>> course work hard to get such a registry established as soon as the 
>> PWID is accepted - e.g. in cooperation with the International Internet 
>> Preservation Consortium (IIPC) https://netpreserve.org/.
>>
>> The current web archive identification is workable, since it is 
>> findable as long as the domain exists. In case of change, it will be 
>> possible to track the change through archived harvests of the web 
>> archive's domain. We actually have an example in Denmark, where the 
>> merger of the two libraries (which jointly ran the Danish web archive) 
>> has resulted in https://netarkivet.dk being moved to 
>> https://www.kb.dk/find-materiale/samlinger/netarkivet, where 
>> netarkivet.dk redirects to the new place. There are several examples 
>> of web archives changing domains or paths to their web archive 
>> material. However, it has so far been possible to track the change, 
>> usually because the old URL is redirected to the new one. These 
>> redirects will probably be there for some years, but at some stage 
>> they are also likely to be removed. Whether the change is signalled by 
>> redirects or by announcements, that information will be harvested and 
>> kept in web archives. That means the change will be traceable by 
>> looking in web archives that have harvested this data. So the approach 
>> is workable for now, but the identification is of course best placed 
>> in a coming registry, which is likely to be in place before this 
>> becomes an issue.
>>
>> In my involvement in WARCnet (https://cc.au.dk/en/warcnet/), it is my 
>> experience that there is also a need for the PWID internationally, 
>> especially when web archives change domains or paths to data. There 
>> are examples of references with archived URLs in research datasets 
>> that have become almost useless overnight for this reason. I am 
>> bringing in the PWID here, as a way to solve this as part of the 
>> research data management work related to web archives.
>>
>> At the International Internet Preservation Consortium (IIPC) 2021 
>> conference, I will present the work we have done on the representation 
>> of PWID collections (as mentioned above). The collections from the 
>> "Probing a Nation's Web Domain" project contain elements harvested in 
>> a specific year, where each web element only appears once (although 
>> the web element was harvested many times during the year). The 
>> collections were originally produced by an extraction program, which 
>> ran on the Danish web archive (Netarkivet). The collections have now 
>> been migrated to collections of PWIDs, which are much more sustainable 
>> as targets for preservation, enabling future checks of results and the 
>> establishment of comparable results. One non-sustainable alternative 
>> would be to save the extraction program. However, the problem here is 
>> that we cannot be sure that the extraction program will function in 
>> the future, and even if it does, whether it will produce the same 
>> result (the archive may have been enriched with new data). Another 
>> alternative would be to preserve the outcome of the extraction 
>> program. However, this is not a standardized format, and it refers to 
>> metadata (crawl logs) rather than registered archive metadata; thus, 
>> even with thorough documentation, it will be hard to retrace and 
>> understand this output even within a 10-year time horizon. Therefore, 
>> the PWID collections are in a more sustainable format as well.
>>
>> Please consider final approval of the PWID URN or tell me what is 
>> needed for it to be approved and published as a URN namespace.
>>
>> Best regards, Eld
>>
>> PS: I have attached the latest draft, which accompanied the mail 
>> included below
>>
>> -----
>> Eld Zierau
>> Digital Preservation Specialist PhD
>> The Royal Danish Library
>> Digital Cultural Heritage
>> P.O. Box 2149, 1016 Copenhagen K
>> Ph. +45 9132 4690
>> Email: elzi@kb.dk
>>
>> -----Original Message-----
>> From: Eld Zierau
>> Sent: Tuesday, May 26, 2020 10:01 AM
>> To: 'Dale R. Worley' <worley@ariadne.com>; urn@ietf.org
>> Subject: RE: PWID URN namespace registration version 10
>>
>> Thank you Dale
>>
>> My comments are given below, and an updated version is attached with 
>> the following corrections:
>> - added draft information in filename/header
>> - description of archive-domain
>> - syntax for utc-time
>> - date of document
>> Best regards, Eld
>>
>> ---------------------------------------------------------------------------------------
>>
>> My apologies for not giving this attention sooner.
>>
>> I've read version 10, and I think we should approve it.  I have the 
>> following observations, which include one editorial suggestion.
>>
>>
>> I assume that the attachment to message
>> https://mailarchive.ietf.org/arch/msg/urn/x_JVtfKpANKZz6Qr8iOqpsXJ8SU/
>> "PWID URN (shortened title)" is draft version 10, even though neither 
>> the attachment's name nor its contents state that.
>>> Eld: At some stage, I was told that it is version 1, maybe I
>>> misunderstood, but I thought the document date then indicated the 
>>> version.
>>> It was actually draft version 11. In the attached draft version 12, I
>>> have added a note stating that it is draft version 12.
>>
>>
>> I particularly support the PWID proposal for the reasons I described 
>> in "PWID as citation"
>> https://mailarchive.ietf.org/arch/msg/urn/s-CM7hcWtUeAz7ZVBF94rCHMtsQ/
>> -- namely that what a PWID references is transparent enough that one 
>> could algorithmically transform a PWID pointing to one archive into a 
>> query into another archive.  This is a genuinely new capability for 
>> URNs (as far as I know) and only by deploying it in practice can we 
>> see what benefits might be obtained.
>>> Eld: Agree
>>
>> I still dislike that there's no well-defined way to catalog allowed 
>> values of archive-domain.  But the number of values that are used will 
>> likely remain small and there are unlikely to be "ownership conflicts"
>> about them, so this is unlikely to be a problem in practice.
>>> Eld: I agree, and I will certainly work on other fronts to make more 
>>> precise references possible, but this is what exists at the moment.
>>
>> A minor editorial point:
>>        *  'archive-domain' is defined as in (section 3.5) [RFC1034].
>> "archive-domain" is not defined in RFC 1034.  You need to say
>>        * 'archive-domain' is <subdomain> as defined in (section 3.5) 
>> [RFC1034].
>> (Oddly, you do want to use <subdomain> rather than <domain> defined in 
>> that section.)
>>> Eld: You are right - it is subdomain because <domain> ::= <subdomain> 
>>> | " " and we want to avoid the " " possibility.
>>> I have corrected this in the attached draft version 12.
>>
>>
>> I see that if utc-time is part of archival-time, then it must contain 
>> both utc-hour and utc-minute, whereas utc-second and secfrac can be 
>> added independently.  This is a bit of an inconsistency, but I assume 
>> you intend it.
>>> Eld: That is actually an error. I have corrected it, so that it is 
>>> possible to specify just the hour without minutes.
>>
>> Given that precision-spec is currently limited to one of two values, 
>> later extensions can be indicated by additional values defined for 
>> this field.
>>> Eld: Agree
>>
>> Dale
>>
>> -----Original Message-----
>> From: Dale R. Worley <worley@ariadne.com>
>> Sent: Sunday, May 24, 2020 5:19 AM
>> To: Eld Zierau <elzi@kb.dk>; urn@ietf.org
>> Subject: Re: PWID URN namespace registration version 10
>>
>> My apologies for not giving this attention sooner.
>>
>> I've read version 10, and I think we should approve it.  I have the 
>> following observations, which include one editorial suggestion.
>>
>> I assume that the attachment to message
>> https://mailarchive.ietf.org/arch/msg/urn/x_JVtfKpANKZz6Qr8iOqpsXJ8SU/
>> "PWID URN (shortened title)" is draft version 10, even though neither 
>> the attachment's name nor its contents state that.
>>
>> I particularly support the PWID proposal for the reasons I described 
>> in "PWID as citation"
>> https://mailarchive.ietf.org/arch/msg/urn/s-CM7hcWtUeAz7ZVBF94rCHMtsQ/
>> -- namely that what a PWID references is transparent enough that one 
>> could algorithmically transform a PWID pointing to one archive into a 
>> query into another archive.  This is a genuinely new capability for 
>> URNs (as far as I know) and only by deploying it in practice can we 
>> see what benefits might be obtained.
>>
>> I still dislike that there's no well-defined way to catalog allowed 
>> values of archive-domain.  But the number of values that are used will 
>> likely remain small and there are unlikely to be "ownership conflicts"
>> about them, so this is unlikely to be a problem in practice.
>>
>> A minor editorial point:
>>        *  'archive-domain' is defined as in (section 3.5) [RFC1034].
>> "archive-domain" is not defined in RFC 1034.  You need to say
>>        * 'archive-domain' is <subdomain> as defined in (section 3.5) 
>> [RFC1034].
>> (Oddly, you do want to use <subdomain> rather than <domain> defined in 
>> that section.)
>>
>> I see that if utc-time is part of archival-time, then it must contain 
>> both utc-hour and utc-minute, whereas utc-second and secfrac can be 
>> added independently.  This is a bit of an inconsistency, but I assume 
>> you intend it.
>>
>> Given that precision-spec is currently limited to one of two values, 
>> later extensions can be indicated by additional values defined for 
>> this field.
>>
>> Dale
>>
>
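
Two technical points from the quoted review can be sketched in code: the rule that archive-domain must be a <subdomain> as defined in RFC 1034, section 3.5 (so the single-space <domain> is rejected), and the observation that a PWID is transparent enough to be turned into a query against another archive. The following is only a rough illustration, not the registered grammar. It assumes the field order urn:pwid:<archive-domain>:<archival-time>:<precision-spec>:<archived-item>, treats archival-time as an opaque string of digits and separators rather than validating it, and takes the target archive's URL template as a caller-supplied parameter. The names Pwid, is_subdomain, parse_pwid and to_query_url are hypothetical.

```python
import re
from typing import NamedTuple

# <label> and <subdomain> from RFC 1034, section 3.5: a label starts with a
# letter, ends with a letter or digit, and is at most 63 characters long.
_LABEL = r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_SUBDOMAIN = re.compile(rf"{_LABEL}(?:\.{_LABEL})*")

# ASSUMED field order (not stated in this thread):
#   urn:pwid:<archive-domain>:<archival-time>:<precision-spec>:<archived-item>
# The time field is matched loosely (digits, '-', ':', '.', 'T', 'Z') and is
# not checked against the real utc-time grammar.
_PWID = re.compile(
    r"urn:pwid:(?P<domain>[^:]+):(?P<time>[0-9T:.Z-]+?)"
    r":(?P<precision>page|part):(?P<item>.+)",
    re.IGNORECASE,
)


class Pwid(NamedTuple):
    archive_domain: str
    archival_time: str
    precision: str
    archived_item: str


def is_subdomain(name: str) -> bool:
    """True if name matches <subdomain>; the lone-space <domain> does not."""
    return _SUBDOMAIN.fullmatch(name) is not None


def parse_pwid(urn: str) -> Pwid:
    """Split a PWID into its four fields under the assumed syntax."""
    m = _PWID.fullmatch(urn)
    if m is None or not is_subdomain(m["domain"]):
        raise ValueError(f"not a PWID under the assumed syntax: {urn!r}")
    return Pwid(m["domain"].lower(), m["time"], m["precision"].lower(), m["item"])


def to_query_url(pwid: Pwid, template: str) -> str:
    """Fill a caller-supplied URL template for some other archive.

    The template may use {timestamp} (up to 14 digits, YYYYMMDDhhmmss) and
    {uri}; which template a given archive accepts is an assumption that the
    caller has to supply.
    """
    stamp = re.sub(r"\D", "", pwid.archival_time)[:14]
    return template.format(timestamp=stamp, uri=pwid.archived_item)
```

For instance, parsing the illustrative value urn:pwid:archive.org:2016-01-22T11:20:29Z:page:http://www.dr.dk and passing the result to to_query_url with a Wayback-style template such as https://web.archive.org/web/{timestamp}/{uri} would yield a lookup for the same page and time in a different archive. Both the PWID value and the template are examples chosen for this sketch, not taken from the thread.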