Re: [precis] I-D Action: draft-ietf-precis-7700bis-01.txt

Erin Millard <> Mon, 05 September 2016 00:50 UTC

From: Erin Millard <>
Date: Mon, 5 Sep 2016 10:50:12 +1000
To: Peter Saint-Andre <>
List-Id: Preparation and Comparison of Internationalized Strings <>

You're probably right. I was not referring to over-the-wire encoding, just
the preparation, enforcement, and comparison algorithms themselves.

The use-case I have in mind is just to give users some immediate feedback
about invalid characters in the browser, before any over-the-wire
transmission takes place.
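
A minimal sketch of that use-case, assuming a TypeScript/browser setting. It
shows that going from a JavaScript string (UTF-16 code units) to code points
needs no UTF-8 step at all. The names toCodePoints, findDisallowed, and
placeholderIsAllowed are made up for illustration, and the placeholder rule
is NOT a PRECIS string class or profile; a real check would plug in the
actual PRECIS rules in its place.

```typescript
// Decode a JavaScript string (UTF-16 code units) into Unicode code points.
// No UTF-8 encoding or decoding is involved at any point.
function toCodePoints(input: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < input.length; i++) {
    const hi = input.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < input.length) {
      const lo = input.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        // Combine a valid surrogate pair into one supplementary code point.
        out.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
        i++;
        continue;
      }
    }
    // BMP code unit, or an unpaired surrogate passed through unchanged.
    out.push(hi);
  }
  return out;
}

interface Rejection {
  index: number;     // position in the code point sequence
  codePoint: number; // the offending code point
}

// Report every code point that the supplied predicate rejects, so a form
// can highlight the problem before anything is sent over the wire.
function findDisallowed(
  input: string,
  isAllowed: (cp: number) => boolean
): Rejection[] {
  const rejected: Rejection[] = [];
  const cps = toCodePoints(input);
  for (let i = 0; i < cps.length; i++) {
    if (!isAllowed(cps[i])) {
      rejected.push({ index: i, codePoint: cps[i] });
    }
  }
  return rejected;
}

// Placeholder rule for illustration only (an assumption, not PRECIS):
// rejects C0 controls, DEL, and unpaired surrogates.
function placeholderIsAllowed(cp: number): boolean {
  if (cp < 0x20 || cp === 0x7f) return false;
  if (cp >= 0xd800 && cp <= 0xdfff) return false;
  return true;
}
```

In a form handler, something like findDisallowed(field.value,
placeholderIsAllowed) could run on each input event; the rest of the
algorithm only ever sees code points, whatever the original encoding was.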

On 5 September 2016 at 10:34, Peter Saint-Andre <> wrote:

> On 9/4/16 5:30 PM, Erin Millard wrote:
>>     >>> * §2.2 Specifies that UTF-8 MUST be used as the encoding; do we
>>     >>> really want to limit this to UTF-8 only? Is this for comparison
>>     >>> purposes? Then again, 99.99% of the time UTF-8 is what you should
>>     >>> be using anyways, so I'm not sure that it matters.
>>     >>
>>     >> UTF-8 is your friend, and everything in PRECIS is UTF-8.
>>     >
>>     > PRECIS is mostly encoding agnostic; implementations might favor a
>>     > specific encoding, but I don't think anything in the spec
>>     > specifically *needs* UTF-8. That being said, there are so few
>>     > reasons to use anything other than UTF-8 that I don't think it
>>     > really matters, it was just curious to me that some of the PRECIS
>>     > related specs called out UTF-8 and some didn't.
>>     I thought they all did, but will double-check.
>>
>> This actually became a bigger issue when attempting to implement PRECIS
>> prepare in JavaScript for the browser. JavaScript doesn't have native
>> UTF-8 support, so this meant the extra bloat of bringing in a UTF-8
>> library.
>>
>> It didn't make a lot of sense to me either, since all the encoding
>> affects is how you go from string to code points, and vice versa. It had
>> no effect on the rest of my implementation. I could absolutely be
>> missing something, but compared to how focused the rest of the spec is,
>> the UTF-8 requirement seemed like an afterthought.
>>
>> Can anyone explain which parts of PRECIS are actually predicated on the
>> original string being encoded in UTF-8?
>
> Are we perhaps getting confused between the encoding that is sent over the
> wire and the encoding that is used within the processing application?
>
> In general, we in the IETF prefer to send UTF-8 over the wire. However,
> it's true that this is a matter for the "using protocol" (e.g., I
> distinctly recall an extremely long thread in the XMPP WG years ago about
> whether to support only UTF-8 or to give clients and servers the ability to
> also use UTF-16 - and "UTF-8 only" won that debate). Given that some
> protocols or other technologies that use PRECIS might use UTF-16 or give
> applications the ability to choose an encoding, you're probably right that
> it makes sense to relax the rule for PRECIS itself.
>
> I'll think about this some more and propose some text.
>
> Peter