[I18nrp] draft-ietf-iri-comparison-02

Larry Masinter <LMM@acm.org> Thu, 28 June 2018 21:28 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/XRRXiAQ-5JlLsGfEeBKpncIzZrI>

https://tools.ietf.org/html/draft-ietf-iri-comparison-02
"Comparison, Equivalence and Canonicalization of Internationalized Resource
Identifiers"
It went through a few revisions as draft-masinter-iri-comparison too.

This draft was abandoned when the working group closed (for lack of
participation), but should be revived.

The IRI comparison/canonicalization problem subsumes the domain name and
email address problems, pretty much. That is, if you could define a
meaningful way of determining equivalence of two IRIs, you could trivially
use that to define equivalence for domain names (prepend "http://") and for
email addresses (prepend "mailto:"). So why not take on the bigger problem?
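
A rough sketch of that reduction, in Python. The iri_equivalent function
here is only a placeholder I made up for illustration (NFC normalization
plus case-folding of the scheme and authority); it is not the algorithm
from the draft, and the names domain_equivalent and email_equivalent are
hypothetical too. The point is just that the two wrappers need nothing
beyond the IRI comparison itself.

```python
import unicodedata
from urllib.parse import urlsplit


def iri_equivalent(a: str, b: str) -> bool:
    """Placeholder IRI comparison (an assumption, not the draft's rules).

    NFC-normalizes both strings, then compares them component by
    component, ignoring case in the scheme and authority only.
    """
    def key(iri: str):
        parts = urlsplit(unicodedata.normalize("NFC", iri))
        return (
            parts.scheme.lower(),
            parts.netloc.lower(),
            parts.path,
            parts.query,
            parts.fragment,
        )

    return key(a) == key(b)


def domain_equivalent(a: str, b: str) -> bool:
    # Domain names: prepend "http://" and defer to the IRI comparison.
    return iri_equivalent("http://" + a, "http://" + b)


def email_equivalent(a: str, b: str) -> bool:
    # Email addresses: prepend "mailto:" and defer to the IRI comparison.
    return iri_equivalent("mailto:" + a, "mailto:" + b)
```

Whatever weaknesses the placeholder has (it does nothing about IDNA or
confusable characters, for instance) are inherited by both wrappers, which
is the argument for solving the IRI case first.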

Basically, any process or application that involves a human viewing a
rendering of a string of characters and comparing it to a rendering of
another string, re-entering it, or remembering it depends on the user so
heavily that it is not possible to define an equivalence or
canonicalization algorithm that would be uniformly acceptable.

My simple idea for a CanIUse.name site would be to let a user input a
string, render the input in a couple of simple type styles, and run OCR
with language auto-detection on the result; if you got back the same
string, it would be a name that's pretty safe. All of the known problems
(normalization, confusion of I l 1 |, 0 O) seem to be handled pretty well.
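
The round-trip check could be sketched roughly as below, in Python. This is
not an implementation of the site: the renderer and the OCR step are passed
in as callables, and the toy_render / toy_recognize stand-ins are
assumptions invented so the sketch runs without any imaging or OCR library.
The toy recognizer just mimics an OCR engine that returns composed (NFC)
characters and cannot tell I, 1 and | from l, or 0 from O.

```python
import unicodedata
from typing import Callable, Iterable


def is_round_trip_safe(
    name: str,
    render: Callable[[str, str], object],
    recognize: Callable[[object], str],
    styles: Iterable[str],
) -> bool:
    """True if every rendering of `name` is read back as exactly `name`."""
    return all(recognize(render(name, style)) == name for style in styles)


# Toy stand-ins (assumptions for illustration only). A real check would
# render to an image and run an actual OCR engine on it.
_CONFUSABLE = str.maketrans({"I": "l", "1": "l", "|": "l", "0": "O"})


def toy_render(text: str, style: str) -> str:
    # The "image" is just the text itself; the style is ignored.
    return text


def toy_recognize(image: str) -> str:
    # Pretend OCR: composed characters out, look-alikes collapsed.
    return unicodedata.normalize("NFC", image).translate(_CONFUSABLE)


STYLES = ("serif", "sans")  # hypothetical style names

print(is_round_trip_safe("example", toy_render, toy_recognize, STYLES))
print(is_round_trip_safe("paypa1", toy_render, toy_recognize, STYLES))
```

With these stand-ins, "example" comes back unchanged, while "paypa1" is
read back as "paypal" and an unnormalized "cafe" + U+0301 comes back in
composed form, so both of those fail the exact-match test.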