Re: Delta Compression and UTF-8 Header Values

Frédéric Kayser <f.kayser@free.fr> Sun, 10 February 2013 06:58 UTC

From: Frédéric Kayser <f.kayser@free.fr>
Date: Sun, 10 Feb 2013 07:56:24 +0100
In-Reply-To: <A4C04DB9-2524-49EC-8774-AF2EBF3EA350@free.fr>
To: ietf-http-wg@w3.org
References: <CABP7RbfRLXPpL4=wip=FvqD3DM7BM8PXi7uRswHAusXUmPO_xw@mail.gmail.com> <CE65E38D-A482-4EA9-BAF4-F6498F643A78@mnot.net> <511642E9.9010607@it.aoyama.ac.jp> <20130209133341.GA8712@1wt.eu> <op.wr8se6rpiw9drz@uranium.westinmy-starwoodgp.com> <A4C04DB9-2524-49EC-8774-AF2EBF3EA350@free.fr>
Message-Id: <65067E47-13CB-4909-87D8-46C529284755@free.fr>
Subject: Re: Delta Compression and UTF-8 Header Values
Archived-At: <http://www.w3.org/mid/65067E47-13CB-4909-87D8-46C529284755@free.fr>

OK, my mail client normalized the text before sending it...

Text snippet in ZIP archive this time.
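
For readers without the attachment, here is a minimal sketch of the mismatch described in the quoted message below. It is an illustration only, not the contents of the ZIP archive: the variable names and the same_text helper are made up for this example, and the choice of NFC as the normalization form is an assumption. The escapes are spelled out so that no mail client can normalize them away.

```python
import unicodedata

# "é" as the single precomposed code point U+00E9
precomposed = "Fr\u00e9d\u00e9ric"
# "e" followed by U+0301 COMBINING ACUTE ACCENT
combining = "Fre\u0301de\u0301ric"


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (assumed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Code point by code point, the two spellings differ.
print(precomposed == combining)  # False

# Their UTF-8 encodings differ too, so a byte-wise comparison also fails.
print(precomposed.encode("utf-8") == combining.encode("utf-8"))  # False

# After normalization they compare equal.
print(same_text(precomposed, combining))  # True

# Lengths in code points: 8 versus 10.
print(len(precomposed), len(combining))
```

Normalizing both sides before comparing is one way to get the match a reader would expect; whether a protocol should do that at all is a separate question.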



On 10 Feb 2013, at 07:44, Frédéric Kayser wrote:

> Comparing Unicode strings without prior normalisation can lead to surprising results: "Frédéric" and "Frédéric" will probably look the same in your email client, but try to paste them in the search field and you'll probably see that they don't match since the first one uses precomposed diacritics and the second one combining ones.
> 
> On 9 Feb 2013, at 15:12, Martin Nilsson wrote:
> 
>> You don't need to know the length in characters to compare strings. Just comparing byte by byte works fine. Null is encoded the same, and a zero byte only appears as null in UTF-8, so strlen works fine. So far strings are Hollerith-encoded (length-prefixed) in HTTP/2, so it should be a moot point anyway.
> 
>
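
The byte-level claims in the oldest quoted message can be checked with a short sketch. This is an illustration under stated assumptions: has_zero_byte, frame and unframe are hypothetical helpers, and the 2-byte big-endian length prefix is an arbitrary choice for the example, not the actual HTTP/2 wire format.

```python
from typing import Tuple


def has_zero_byte(text: str) -> bool:
    """True if the UTF-8 encoding of text contains a 0x00 byte."""
    return 0 in text.encode("utf-8")


def frame(value: bytes) -> bytes:
    """Length-prefix a value (hypothetical 2-byte big-endian length)."""
    return len(value).to_bytes(2, "big") + value


def unframe(data: bytes) -> Tuple[bytes, bytes]:
    """Split one length-prefixed value off the front of data."""
    size = int.from_bytes(data[:2], "big")
    return data[2:2 + size], data[2 + size:]


# In UTF-8, a zero byte only ever encodes U+0000, so text without that
# character is safe for NUL-terminated handling such as strlen.
print(has_zero_byte("Fr\u00e9d\u00e9ric"))  # False
print(has_zero_byte("a\x00b"))              # True

# With a length prefix the receiver never scans for a terminator,
# so even an embedded zero byte survives the round trip.
payload = "a\x00b".encode("utf-8")
value, rest = unframe(frame(payload) + b"trailing")
print(value == payload, rest)
```

Byte-wise comparison is exact for identical code point sequences; it says nothing about strings that merely look the same, which is the normalization issue raised in the reply above.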