Re: [urn] CTS and CITE2 URNs

Martin J. Dürst <duerst@it.aoyama.ac.jp> Fri, 22 March 2019 05:38 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/aTWymsJUCP6ZTkqmYyJjuXETXDI>

Hello Peter, others,

Sorry to reply to an older email.

On 2019/03/07 13:02, Peter Saint-Andre wrote:
> On 3/4/19 11:58 PM, Hakala, Juha E wrote:
> 
>> I am in favour of registering these namespaces. The only reservation I have is the usage of "@" to separate subreferences. Would it be possible to use hash character for this purpose?
>>
>> At https://cite-architecture.github.io/ohco2/quick/
>>
>> the researchers themselves give the following example:
>>
>> urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#μῆνις

> A follow-on observation. As the genesis of the CTS and CITE2 namespaces
> within the Homer Multitext project indicates, it might be desirable for
> literal strings in the NSS to include characters outside the ASCII range
> (e.g., `μῆνις`). I suspect that such strings would be limited to the
> PASSAGE and SUBREFERENCE constructs, but perhaps the registrants could
> clarify that for us. In any case, here it seems that "the nature of the
> particular URN namespace makes such characters necessary" (RFC 8141, §2.2).
> 
> However, the same section of RFC 8141 also clearly states:
> 
>     In particular, with regard to characters outside the ASCII range,
>     URNs that appear in protocols or that are passed between systems MUST
>     use only Unicode characters encoded in UTF-8 and further encoded as
>     required by RFC 3986.
> 
> As far as I can see, this implies that the foregoing PASSAGE value needs
> to be encoded as follows:
> 
> `1.2.lemma#%CE%BC%E1%BF%86%CE%BD%CE%B9%CF%82`
> 
> Unfortunately, that's not as user-friendly for people who know the
> language of the source text from which the literal string is taken.
> 
> (Yes, the URNBIS Working Group had lengthy discussions about whether to
> allow non-percent-encoded Unicode characters outside the ASCII range in
> the URN syntax, and ultimately decided against it.)

Well, all URNs are URIs (RFC 3986), and for URIs there are IRIs (RFC
3987). So urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#μῆνις would work in
any protocol slot that accepts IRIs, and would automatically be
converted to the ugly
urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#%CE%BC%E1%BF%86%CE%BD%CE%B9%CF%82
whenever necessary.
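
To make the conversion concrete, here is a minimal sketch in Python of the
core step: encode each non-ASCII character as UTF-8 and percent-encode the
resulting bytes. The function name iri_to_uri and the constant _KEEP are my
own, purely for illustration, and the sketch assumes the IRI is already a
Unicode string, so it skips any normalization step and any special
handling of host names.

```python
from urllib.parse import quote, unquote

# ASCII characters left untouched: the RFC 3986 reserved set plus "%", so
# that existing percent-escapes are not encoded a second time. Letters,
# digits and "-._~" are never escaped by quote().
_KEEP = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 bytes of every non-ASCII character.

    Hypothetical helper, not a complete IRI-to-URI mapping: ASCII
    characters that are not legal in an IRI anyway (a space, for example)
    also get escaped as a side effect of using quote().
    """
    return quote(iri, safe=_KEEP, encoding="utf-8")


# The example from this thread, written with explicit code points so the
# precomposed eta with perispomeni (U+1FC6) is unambiguous.
iri = "urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#\u03bc\u1fc6\u03bd\u03b9\u03c2"
uri = iri_to_uri(iri)
print(uri)

# Decoding the escapes as UTF-8 gives back the original IRI.
print(unquote(uri) == iri)
```

The point is that users only ever need to see and type the readable form;
the escaped form is produced mechanically at the boundary where a
URI-only slot requires it.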

Regards,   Martin.