Re: [urn] Comments on PWID -05

Martin J. Dürst <duerst@it.aoyama.ac.jp> Thu, 28 February 2019 06:02 UTC

From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: "Dale R. Worley" <worley@ariadne.com>, Eld Zierau <elzi@kb.dk>
CC: "urn@ietf.org" <urn@ietf.org>, "L.Svensson@dnb.de" <L.Svensson@dnb.de>
Date: Thu, 28 Feb 2019 06:02:10 +0000
Message-ID: <521105f9-9fdc-d88f-e5fb-3efaae7153cd@it.aoyama.ac.jp>
References: <87d0ncha65.fsf@hobgoblin.ariadne.com>
In-Reply-To: <87d0ncha65.fsf@hobgoblin.ariadne.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/Yq2oW077MLgx0ckKGhzK7HAlSRg>
Subject: Re: [urn] Comments on PWID -05

Hello Dale, others,

On 2019/02/28 12:19, Dale R. Worley wrote:
> The discussion of archive-id is considerably clearer than before but it
> seems to me that there should be some discussion of how to distinguish
> domain name archive-ids (e.g., netarkivet.dk) and archive-ids that have
> to be looked up in the (future) registry.  Being able to reliably make
> this decision seems to be the first step in resolving a PWID.
> 
> I think arranging this is straightforward, since the syntax of
> archive-id is "+( unreserved )", and many of those strings are not
> allowed as DNS names for hosts.  E.g., one could require any
> archive-id that is not intended to be interpreted as a DNS name to start
> with one of "-", ".", "_", "~".

I haven't looked into the details, but in general, I think this is a bad 
idea. It is much better to have an explicit distinction than to rely on 
some syntax restrictions. Such syntax restrictions may or may not 
actually hold in practice. It's very easy to create a DNS name starting 
with '-' or '_', for example, even though officially, that's not allowed.
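
To make the concern concrete, here is a minimal Python sketch (not part of the draft) of the kind of purely syntactic test the proposal would rely on. The function name looks_like_hostname and the choice of the RFC 952 / RFC 1123 letter-digit-hyphen rule are assumptions made for illustration only.

```python
import re

# One host-name label under the letter-digit-hyphen rule (RFC 952 / RFC 1123):
# 1 to 63 characters, letters, digits and "-", no leading or trailing "-".
# Assumption: this is the rule a resolver would use to spot "DNS name" archive-ids.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def looks_like_hostname(archive_id: str) -> bool:
    """Return True if archive_id satisfies the official host-name syntax.

    This says nothing about whether the name exists in the DNS, and it
    rejects names that do exist in practice (for example ones with "_").
    """
    if not archive_id or len(archive_id) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in archive_id.split("."))
```

The check rejects anything starting with "-", ".", "_" or "~", so on paper it separates the two cases. But names such as "_dmarc.example.com" are in everyday use in the DNS, so a resolver relying on this test alone could misclassify an archive-id; an explicit marker would not have that problem.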

Regards,   Martin.

> Similar considerations apply to archived-item-id and distinguishing it
> from URI.  (All URIs must start with a letter.)
> 
>             precision-spec = "part" / "page" / "subsite" / "site"
>                      / "collection" / "recording" / "snapshot"
>                      / "other"
> 
> Is the inclusion of "other" the best way to handle this?  Usually a
> component like this would allow "extension values" (that conform to the
> same syntax as the defined values, e.g., "+letter").  As written,
> everything that cannot be classified as "part", "page", ..., "snapshot"
> would have to be labeled "other", even if a particular archive had
> several different additional precision values that it operated with
> internally.
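
A minimal sketch of what accepting "extension values" could look like, assuming the "+letter" syntax mentioned above means one or more ASCII letters. The names DEFINED_VALUES and classify_precision are hypothetical and do not come from the draft.

```python
import re

# Values listed in the draft's precision-spec rule (including "other").
DEFINED_VALUES = frozenset(
    {"part", "page", "subsite", "site", "collection",
     "recording", "snapshot", "other"}
)

# Assumption: an extension value has the same shape as the defined ones,
# i.e. one or more ASCII letters.
_LETTERS = re.compile(r"[A-Za-z]+")


def classify_precision(value: str) -> str:
    """Return "defined" or "extension"; raise ValueError on bad syntax."""
    if not _LETTERS.fullmatch(value):
        raise ValueError(f"not a valid precision value: {value!r}")
    # Quoted strings in ABNF are case-insensitive (RFC 5234), so compare
    # in lower case.
    return "defined" if value.lower() in DEFINED_VALUES else "extension"
```

With this shape, an archive with its own internal precision levels could use them directly, and a consumer that does not recognize a value can still parse the PWID and treat the value as opaque.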
> 
>        *  'URI' is defined as in [RFC3986] but where occurrences of "[",
>           "]", "?" and "#" are %-encoded in order not to clash with URN
>           reserved characters [RFC8141].
> 
> This gets complicated.  For example "http://example.com/foo#bar" is a
> different URL than "http://example.com/foo%23bar", and might have
> different contents.  You can't use "http://example.com/foo%23bar" as the
> archived-item part of PWIDs for the saved contents of both of these
> URLs.
> 
> One possibility is to set the archived-item string to be URI with [, ],
> ?, #, and % all %-encoded, so that the two URLs have these archived-item
> values:
>      http://example.com/foo%23bar
>      http://example.com/foo%2523bar
> That would be laborious, though, if many URLs contain %-escapes and
> humans have to copy PWID URNs by hand.
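
The encoding described above can be sketched in a few lines of Python. The function names encode_archived_item and decode_archived_item are hypothetical; the only rule taken from the text is that "[", "]", "?", "#" and "%" are all %-encoded.

```python
import re

# The five characters to escape. Including "%" itself is what keeps
# "foo#bar" and "foo%23bar" distinct after encoding.
_ESCAPES = {"%": "%25", "[": "%5B", "]": "%5D", "?": "%3F", "#": "%23"}


def encode_archived_item(uri: str) -> str:
    """Escape the five characters in one pass, so nothing is encoded twice."""
    return "".join(_ESCAPES.get(ch, ch) for ch in uri)


def decode_archived_item(item: str) -> str:
    """Undo encode_archived_item; other %-escapes are left untouched."""
    return re.sub(
        r"%(25|5B|5D|3F|23)",
        lambda m: chr(int(m.group(1), 16)),
        item,
        flags=re.IGNORECASE,
    )


print(encode_archived_item("http://example.com/foo#bar"))
print(encode_archived_item("http://example.com/foo%23bar"))
```

The two print calls produce the two archived-item values listed above, and decoding either one gives back the original URL, so the mapping is reversible.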
> 
>        *  'archival-time' is a UTC timestamp as described in the W3C
>           profile of [ISO8601] [W3CDTF] (also defined in [RFC3339]), for
>           example YYYY-MM-DDThh:mm:ssZ.
> 
> Looking at RFC 3339, I see:
> 
>     date-fullyear   = 4DIGIT
>     date-month      = 2DIGIT  ; 01-12
>     date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
>                               ; month/year
>     time-hour       = 2DIGIT  ; 00-23
>     time-minute     = 2DIGIT  ; 00-59
>     time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap second
>                               ; rules
>     time-secfrac    = "." 1*DIGIT
>     time-numoffset  = ("+" / "-") time-hour ":" time-minute
>     time-offset     = "Z" / time-numoffset
> 
>     partial-time    = time-hour ":" time-minute ":" time-second
>                       [time-secfrac]
>     full-date       = date-fullyear "-" date-month "-" date-mday
>     full-time       = partial-time time-offset
> 
>     date-time       = full-date "T" full-time
> 
> But comparing that to W3CDTF, I see no single nonterminal which
> corresponds to the set of formats allowed in W3CDTF.  I suggest you make
> a more rigid specification as to what is allowed for archival-time.
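
As one illustration of a "more rigid specification", the following Python sketch accepts only the single form YYYY-MM-DDThh:mm:ssZ given as the example in the draft. Restricting archival-time to exactly this form is an assumption made here, not something the draft states, and the name parse_archival_time is hypothetical.

```python
import re
from datetime import datetime, timezone

# Fixed-width shape only: four-digit year, two-digit fields, literal "T"
# and "Z". No fractional seconds and no numeric offsets (assumption).
_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def parse_archival_time(text: str) -> datetime:
    """Parse YYYY-MM-DDThh:mm:ssZ into an aware UTC datetime.

    Raises ValueError for any other format or for impossible dates.
    Leap seconds (a seconds value of 60) get no special treatment here.
    """
    if not _SHAPE.fullmatch(text):
        raise ValueError(f"not in YYYY-MM-DDThh:mm:ssZ form: {text!r}")
    # strptime checks the field ranges, e.g. it rejects February 30.
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)
```

The regular expression is applied first because strptime on its own also accepts fields without zero padding. A specification that wants to allow the shorter W3CDTF forms (year only, year and month, and so on) would need to list each permitted form explicitly in the same way.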
> 
>     [W3CDTF]   W3C, "Date and Time Formats: note submitted to the W3C. 15
>                September 1997", 1997,
>                <http://www.w3.org/TR/NOTE-datetime>.
> 
>                W3C profile of ISO 8601 urn:pwid:archive.org:2017-04-
>                03T03:37:42Z:page:http://www.w3.org/TR/NOTE-datetime
> 
> The final two lines of this block look like a mis-formatted
> bibliographic reference.
> 
> Dale