Return-Path: <ietf-http-wg-request@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix)
 with ESMTP id 21F9B21F9305 for
 <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>;
 Mon, 15 Apr 2013 19:24:53 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -8.052
X-Spam-Level: 
X-Spam-Status: No, score=-8.052 tagged_above=-999 required=5
 tests=[BAYES_00=-2.599, HELO_EQ_FR=0.35, HTML_MESSAGE=0.001,
 MIME_8BIT_HEADER=0.3, MIME_QP_LONG_LINE=1.396, RCVD_IN_DNSWL_HI=-8,
 URI_NOVOWEL=0.5]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com
 [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id W-VUhqKaPRtC for
 <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>;
 Mon, 15 Apr 2013 19:24:47 -0700 (PDT)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) by ietfa.amsl.com
 (Postfix) with ESMTP id 921E421F925A for
 <httpbisa-archive-bis2Juki@lists.ietf.org>;
 Mon, 15 Apr 2013 19:24:47 -0700 (PDT)
Received: from lists by frink.w3.org with local (Exim 4.72) (envelope-from
 <ietf-http-wg-request@listhub.w3.org>) id 1URvZ7-0004hu-IT for
 ietf-http-wg-dist@listhub.w3.org; Tue, 16 Apr 2013 02:24:13 +0000
Resent-Date: Tue, 16 Apr 2013 02:24:13 +0000
Resent-Message-Id: <E1URvZ7-0004hu-IT@frink.w3.org>
Received: from maggie.w3.org ([128.30.52.39]) by frink.w3.org with esmtp (Exim
 4.72) (envelope-from <f.kayser@free.fr>) id 1URvZ4-0004hF-DK for
 ietf-http-wg@listhub.w3.org; Tue, 16 Apr 2013 02:24:10 +0000
Received: from smtp5-g21.free.fr ([212.27.42.5]) by maggie.w3.org with esmtp
 (Exim 4.72) (envelope-from <f.kayser@free.fr>) id 1URvZ2-0002tv-Om for
 ietf-http-wg@w3.org; Tue, 16 Apr 2013 02:24:10 +0000
Received: from [192.168.0.1] (unknown [81.56.127.176]) by smtp5-g21.free.fr
 (Postfix) with ESMTP id 64D1CD4808E for <ietf-http-wg@w3.org>;
 Tue, 16 Apr 2013 04:23:42 +0200 (CEST)
From: =?iso-8859-1?Q?Fr=E9d=E9ric_Kayser?= <f.kayser@free.fr>
Mime-Version: 1.0 (Apple Message framework v1085)
Content-Type: multipart/alternative; boundary=Apple-Mail-1--922718268
Date: Tue, 16 Apr 2013 04:23:41 +0200
In-Reply-To: <CABP7RbfUH=U0hjcmEXKO1jJzy7pPffqFDE4TmAs-ahBX04qwJw@mail.gmail.com>
To: ietf-http-wg@w3.org
References: <CABP7RbfUH=U0hjcmEXKO1jJzy7pPffqFDE4TmAs-ahBX04qwJw@mail.gmail.com>
Message-Id: <24370F45-C4B4-41A2-8515-5B239766A943@free.fr>
X-Mailer: Apple Mail (2.1085)
Received-SPF: none client-ip=212.27.42.5; envelope-from=f.kayser@free.fr;
 helo=smtp5-g21.free.fr
X-W3C-Hub-Spam-Status: No, score=-4.1
X-W3C-Hub-Spam-Report: AWL=-2.718, BAYES_00=-1.9, FREEMAIL_FROM=0.001,
 HTML_MESSAGE=0.001, MIME_QP_LONG_LINE=0.001, RCVD_IN_DNSWL_NONE=-0.0001,
 URI_NOVOWEL=0.5
X-W3C-Scan-Sig: maggie.w3.org 1URvZ2-0002tv-Om 14962bec68a1f0720d23eb07cc58b3d9
X-Original-To: ietf-http-wg@w3.org
Subject: Re: Header Serialization Discussion
Archived-At: <http://www.w3.org/mid/24370F45-C4B4-41A2-8515-5B239766A943@free.fr>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/17246
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>

--Apple-Mail-1--922718268
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8

Hello,
If text fields can effectively be encoded as UTF-8, would it be wise to =
use it to send IRIs (RFC 3987)?
Without Punycode:
http://xn--acadmie-franaise-npb1a.fr/ vs. =
http://acad=C3=A9mie-fran=C3=A7aise.fr/
http://www.xn--cigacz-2ib.pl/ vs. http://www.=C5=9Bcigacz.pl/
http://xn--rlcuo9h.xn--wkc4axeaevb3oqbg.xn--xkc2al3hye2a/ vs. =
http://=E0=AE=A4=E0=AE=B3=E0=AE=AE=E0=AF=8D.=E0=AE=86=E0=AE=B3=E0=AF=8D=E0=
=AE=95=E0=AE=B3=E0=AE=AE=E0=AF=88=E0=AE=AF=E0=AE=AE=E0=AF=8D.=E0=AE=87=E0=AE=
=B2=E0=AE=99=E0=AF=8D=E0=AE=95=E0=AF=88/
http://xn--mgbggrfi2ikdb7d.xn--mgberp4a5d4ar/ vs. =
http://=D9=85=D8=B1=D9=83=D8=B2=D8=A7=D9=84=D8=AA=D8=B3=D8=AC=D9=8A=D9=84.=
=D8=A7=D9=84=D8=B3=D8=B9=D9=88=D8=AF=D9=8A=D8=A9/
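
To make the comparison concrete, here is a minimal sketch that derives
the ASCII forms of the first two hosts above. It assumes Python's
built-in "idna" codec, which follows the older IDNA 2003 rules, so a
given client may behave differently; the helper name to_ascii_host is
hypothetical.

```python
def to_ascii_host(host):
    """Return the ASCII (Punycode) form of an internationalized host."""
    # The built-in "idna" codec applies nameprep and Punycode per label.
    return host.encode("idna").decode("ascii")


# Hosts are written with escapes so the snippet stays pure ASCII.
hosts = ["acad\u00e9mie-fran\u00e7aise.fr", "www.\u015bcigacz.pl"]
for host in hosts:
    ascii_host = to_ascii_host(host)
    # Print the ASCII form, its length, and the UTF-8 length in octets.
    print(ascii_host, len(ascii_host), len(host.encode("utf-8")))
```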

And without percent-encoding:
zdj%C4%99cia vs. zdj=C4=99cia
g%C3%B6r%C3%BCnt%C3%BC vs. g=C3=B6r=C3=BCnt=C3=BC
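
The size difference can be checked with a short sketch. It uses
urllib.parse from the Python standard library; the variable names are
only illustrative.

```python
from urllib.parse import quote, unquote

# Path segments from the examples above, written with escapes.
segments = ["zdj\u0119cia", "g\u00f6r\u00fcnt\u00fc"]
for segment in segments:
    escaped = quote(segment)  # percent-encodes the UTF-8 octets
    assert unquote(escaped) == segment  # the round trip is lossless
    # Octets on the wire: percent-encoded vs. raw UTF-8.
    print(escaped, len(escaped), len(segment.encode("utf-8")))
```

Each non-ASCII octet costs three octets once percent-encoded, which is
the overhead that sending raw UTF-8 would avoid.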

I wouldn't mind if HTTP/2 clearly took the bull by the horns regarding =
I18N.

The easiest way to (re)encode UTF-8 with a variable-length code would be =
to collect/define statistics only for the leading octets and store the =
continuation octets as fixed 6-bit values (they are restricted to the =
0x80-0xBF range, i.e. 64 values).
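
A rough sketch of that idea follows, under stated assumptions: the
prefix code is an ordinary Huffman code built with heapq, and the
statistics are taken from the sample string itself purely for
illustration (a real design would presumably define a static table in
advance). All function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def continuation_count(lead):
    """Number of continuation octets announced by a leading octet."""
    if lead < 0x80:
        return 0
    for first, last, extra in ((0xC2, 0xDF, 1), (0xE0, 0xEF, 2),
                               (0xF0, 0xF4, 3)):
        if first <= lead <= last:
            return extra
    raise ValueError("not a valid UTF-8 leading octet")


def huffman_code(freqs):
    """Build a prefix code {symbol: bit string} from symbol counts."""
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a lone symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        n0, _, zero = heapq.heappop(heap)
        n1, _, one = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in zero.items()}
        merged.update({s: "1" + c for s, c in one.items()})
        heapq.heappush(heap, (n0 + n1, next(tie), merged))
    return heap[0][2]


def encode(text, code):
    """Leading octets use the prefix code, continuation octets 6 bits."""
    bits = []
    for octet in text.encode("utf-8"):
        if (octet & 0xC0) == 0x80:  # continuation octet, 0x80-0xBF
            bits.append(format(octet & 0x3F, "06b"))
        else:
            bits.append(code[octet])
    return "".join(bits)


def decode(bits, code):
    """Inverse of encode(); returns the original text."""
    leads = {c: sym for sym, c in code.items()}
    out, pos = bytearray(), 0
    while pos < len(bits):
        end = pos + 1
        while bits[pos:end] not in leads:  # read one prefix codeword
            if end > len(bits):
                raise ValueError("truncated or invalid bit string")
            end += 1
        lead = leads[bits[pos:end]]
        out.append(lead)
        pos = end
        # The leading octet says how many 6-bit fields follow.
        for _ in range(continuation_count(lead)):
            out.append(0x80 | int(bits[pos:pos + 6], 2))
            pos += 6
    return out.decode("utf-8")


sample = "zdj\u0119cia g\u00f6r\u00fcnt\u00fc"
leading = [b for b in sample.encode("utf-8") if (b & 0xC0) != 0x80]
code = huffman_code(Counter(leading))
packed = encode(sample, code)
assert decode(packed, code) == sample
print(len(packed), "bits vs.", 8 * len(sample.encode("utf-8")), "bits")
```

The decoder needs no extra framing for the 6-bit fields, because the
leading octet already announces how many continuation octets follow.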

--=20
Fr=C3=A9d=C3=A9ric Kayser

James M Snell wrote:

> Text can be either UTF-8 or ISO-8859-1, indicated by a single bit flag
> following the type code. All text strings are prefixed by its length
> given as an unsigned variant length integer
>=20
[snip]
>=20
> For ISO-8859-1 Text, the Static Huffman Code used by Delta would be
> used for the value. If we can develop an approach to effectively
> handling Huffman coding for arbitrary UTF-8, then we can apply Huffman
> coding to that as well.


--Apple-Mail-1--922718268--

