Re: Delta Compression and UTF-8 Header Values

"Adrien W. de Croy" <adrien@qbik.com> Sat, 09 February 2013 21:36 UTC

From: "Adrien W. de Croy" <adrien@qbik.com>
To: Willy Tarreau <w@1wt.eu>, Martin Nilsson <nilsson@opera.com>
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Date: Sat, 09 Feb 2013 21:33:13 +0000
Subject: Re: Delta Compression and UTF-8 Header Values
Archived-At: <http://www.w3.org/mid/emf3ec6632-2c7a-491d-9db4-ad235e66f19c@bombed>

------ Original Message ------
From: "Willy Tarreau" <w@1wt.eu>
>On Sat, Feb 09, 2013 at 03:12:32PM +0100, Martin Nilsson wrote:
>>  On Sat, 09 Feb 2013 14:33:41 +0100, Willy Tarreau <w@1wt.eu> wrote:
>>
>>  >Also, processing it is
>>  >particularly inefficient as you have to parse each and every byte to find
>>  >a length, making string comparisons quite slow.
>>
>>  You don't need to know the length in characters to compare strings. Just
>>  comparing byte on byte works fine.
>
>This is exactly what you want to avoid when comparing with lots of strings.
>It's generally more efficient to first compare lengths, then byte per byte
>only if lengths match.

Only if you don't have to count bytes to get the length. If you need to
do that, then a byte-by-byte comparison is more efficient, since finding
the length already costs a full pass over the data.

I guess we're talking about length-prefixed data in this context, though,
so comparing lengths first would be an available optimisation.
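
To illustrate the trade-off, here is a minimal Python sketch. It assumes a
hypothetical wire format (a 2-byte big-endian length followed by the bytes)
that is not taken from any draft; the function names are likewise made up
for illustration. With a length prefix, a mismatch in length is rejected
without looking at the payload. Without one, a single pass that compares
and looks for the terminator at the same time avoids scanning twice.

```python
import struct


def encode(value: bytes) -> bytes:
    """Hypothetical format: 2-byte big-endian length, then the raw bytes."""
    return struct.pack("!H", len(value)) + value


def equal_prefixed(a: bytes, b: bytes) -> bool:
    """Compare two length-prefixed strings, checking the lengths first."""
    (len_a,) = struct.unpack_from("!H", a, 0)
    (len_b,) = struct.unpack_from("!H", b, 0)
    if len_a != len_b:
        return False  # rejected without examining any payload bytes
    return a[2:2 + len_a] == b[2:2 + len_b]


def equal_terminated(a: bytes, b: bytes) -> bool:
    """Compare two NUL-terminated strings in one byte-by-byte pass.

    Both inputs are assumed to contain a terminating zero byte.
    """
    i = 0
    while a[i] == b[i]:
        if a[i] == 0:
            return True  # both strings ended at the same position
        i += 1
    return False  # first differing byte found
```

For example, equal_prefixed(encode(b"Accept"), encode(b"Accept-Encoding"))
returns False after reading only the two length fields, whereas
equal_terminated(b"Accept\x00", b"Accept-Encoding\x00") has to walk the
common prefix before it finds the difference.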

Adrien
