Re: [urn] CTS and CITE2 URNs

Martin J. Dürst, Fri, 22 March 2019 05:38 UTC

Hello Peter, others,

Sorry to reply to an older email.

On 2019/03/07 13:02, Peter Saint-Andre wrote:
> On 3/4/19 11:58 PM, Hakala, Juha E wrote:
>> I am in favour of registering these namespaces. The only reservation I have is the usage of "@" to separate subreferences. Would it be possible to use hash character for this purpose?
>> At
>> the researchers themselves give the following example:
>> urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#μῆνις

> A follow-on observation. As the genesis of the CTS and CITE2 namespaces
> within the Homer Multitext project indicates, it might be desirable for
> literal strings in the NSS to include characters outside the ASCII range
> (e.g., `μῆνις`). I suspect that such strings would be limited to the
> PASSAGE and SUBREFERENCE constructs, but perhaps the registrants could
> clarify that for us. In any case, here it seems that "the nature of the
> particular URN namespace makes such characters necessary" (RFC 8141, §2.2).
> However, the same section of RFC 8141 also clearly states:
>     In particular, with regard to characters outside the ASCII range,
>     URNs that appear in protocols or that are passed between systems MUST
>     use only Unicode characters encoded in UTF-8 and further encoded as
>     required by RFC 3986.
> As far as I can see, this implies that the foregoing PASSAGE value needs
> to be encoded as follows:
> `1.2.lemma#%CE%BC%E1%BF%86%CE%BD%CE%B9%CF%82`
> Unfortunately, that's not as user-friendly for people who know the
> language of the source text from which the literal string is taken.
> (Yes, the URNBIS Working Group had lengthy discussions about whether to
> allow non-percent-encoded Unicode characters outside the ASCII range in
> the URN syntax, and ultimately decided against it.)

Well, all URNs are URIs (RFC 3986), and for URIs there are IRIs (RFC
3987), so urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#μῆνις would work in
any protocol slot that accepted IRIs, and would automatically be
converted to the ugly percent-encoded form whenever necessary.
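
To make that conversion concrete, here is a minimal sketch in Python of
the core of the IRI-to-URI mapping: every character outside the ASCII
range is encoded as UTF-8 and each resulting byte is percent-encoded,
while ASCII characters are left alone. The function name `iri_to_uri` is
made up for illustration, and the sketch assumes the input is already a
Unicode string that needs no normalization; it does not validate the
URN syntax itself.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 bytes of each non-ASCII character.

    Hypothetical helper for illustration only: ASCII characters
    (including ':', '.', '#' and existing '%' escapes) pass through
    unchanged, and no Unicode normalization is applied.
    """
    parts = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII stays as it is.
            parts.append(ch)
        else:
            # Non-ASCII: UTF-8 encode, then %HH for every byte.
            parts.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(parts)


# The example from this thread, written with escapes so the exact
# code points (precomposed U+1FC6 for the second letter) are unambiguous.
iri = "urn:cts:greekLit:tlg5026.msA.hmt:1.2.lemma#\u03bc\u1fc6\u03bd\u03b9\u03c2"
print(iri_to_uri(iri))
```

The subreference part of the result matches the percent-encoded value
Peter quoted above.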

Regards,   Martin.