Re: [hybi] Performance of Vector XOR

Tobias Oberstein <tobias.oberstein@tavendo.de> Wed, 07 September 2011 20:47 UTC

Bob,

My comment wasn't meant to question masking; XOR is probably as lightweight
as it gets.

Nor was it about how great a tool Python is for bit banging - it isn't ;)

And yes, I'd probably use something like GCC SIMD intrinsics to make the XOR
really fly ... in C/C++.
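
A minimal sketch of what such an intrinsics-based XOR could look like. It
assumes the masking scheme from the hybi drafts (a 4-byte key applied
cyclically, buf[i] ^= key[i % 4]) and an x86 target with SSE2; the function
name ws_mask_simd is made up for illustration, and on other targets the code
simply falls back to the byte loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_xor_si128, _mm_storeu_si128 */
#endif

/* Hypothetical helper (not from any library): XOR `len` bytes in place
 * with a 4-byte key applied cyclically, i.e. buf[i] ^= key[i % 4]. */
void ws_mask_simd(uint8_t *buf, size_t len, const uint8_t key[4])
{
    assert(buf != NULL || len == 0);
    size_t i = 0;

#if defined(__SSE2__)
    /* Repeat the 4-byte key four times to fill one 128-bit register. */
    uint8_t pattern[16];
    for (size_t k = 0; k < 16; k++)
        pattern[k] = key[k % 4];
    const __m128i vkey = _mm_loadu_si128((const __m128i *)pattern);

    /* 16 is a multiple of 4, so every block starts at key offset 0.
     * Unaligned loads/stores keep the sketch free of alignment rules. */
    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        _mm_storeu_si128((__m128i *)(buf + i), _mm_xor_si128(v, vkey));
    }
#endif

    /* Scalar tail, or the whole buffer when SSE2 is not available. */
    for (; i < len; i++)
        buf[i] ^= key[i % 4];
}
```

Since XOR is its own inverse, calling the function twice with the same key
restores the original payload, which makes for an easy sanity check.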

Personally, I'm surprised how good Google V8 is at JITting.

I would be interested in two more data points:

* LuaJIT2: this could trump V8. It's insanely fast and has special,
  hand-rolled bit ops.
* GCC intrinsics as a baseline

and

* Java/C#

... the latter because those might also not be the greatest for bit banging,
and there might be no cheap escape route to C.

With Python, if I had more time, I'd probably write a minimal
WS accelerator native module that does just

* XOR
* incremental UTF-8 validation

Those two are the pain points.
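
For the second item, a rough sketch of how incremental validation might
work in C: the validator keeps a small state between calls, so a multi-byte
sequence split across chunks (or frames) is still checked correctly. The
names utf8_state, utf8_init, utf8_feed and utf8_complete are invented for
this example; the byte ranges follow the well-formed UTF-8 table in the
Unicode Standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical validator state carried from one chunk to the next. */
typedef struct {
    unsigned needed;  /* continuation bytes still expected */
    uint8_t  lower;   /* allowed range for the next continuation byte */
    uint8_t  upper;
} utf8_state;

void utf8_init(utf8_state *s)
{
    s->needed = 0;
    s->lower = 0x80;
    s->upper = 0xBF;
}

/* Feed one chunk. Returns false at the first invalid byte; after a
 * failure the state must be re-initialised before reuse. */
bool utf8_feed(utf8_state *s, const uint8_t *data, size_t len)
{
    assert(s != NULL && (data != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        uint8_t b = data[i];
        if (s->needed == 0) {
            if (b <= 0x7F) {
                continue;                        /* ASCII */
            } else if (b >= 0xC2 && b <= 0xDF) {
                s->needed = 1;                   /* 0xC0/0xC1 are overlong */
            } else if (b >= 0xE0 && b <= 0xEF) {
                s->needed = 2;
                if (b == 0xE0) s->lower = 0xA0;  /* reject overlong forms */
                if (b == 0xED) s->upper = 0x9F;  /* reject surrogates */
            } else if (b >= 0xF0 && b <= 0xF4) {
                s->needed = 3;
                if (b == 0xF0) s->lower = 0x90;  /* reject overlong forms */
                if (b == 0xF4) s->upper = 0x8F;  /* reject > U+10FFFF */
            } else {
                return false;                    /* invalid lead byte */
            }
        } else {
            if (b < s->lower || b > s->upper)
                return false;
            /* Only the first continuation byte has a narrowed range. */
            s->lower = 0x80;
            s->upper = 0xBF;
            s->needed--;
        }
    }
    return true;
}

/* True when the data seen so far does not end inside a sequence. */
bool utf8_complete(const utf8_state *s)
{
    return s->needed == 0;
}
```

A text message would be accepted only if every utf8_feed call returned true
and utf8_complete is true once the final chunk has been fed.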

Cheers,
Tobias

> -----Original Message-----
> From: Bob Gezelter [mailto:gezelter@rlgsc.com]
> Sent: Wednesday, 7 September 2011 22:31
> To: len.holgate@gmail.com; Tobias Oberstein; rbarnes@bbn.com
> Cc: HYBI@ietf.org
> Subject: Performance of Vector XOR
> 
> Len, Tobias, Richard,
> 
> I would recommend extreme caution on using rough JavaScript or even
> Python benchmarks as a test for the execution efficiency of the masking
> operation (Note: I say this as one who is unconvinced of the benefits of
> masking).
> 
> A well-written, straight vector XOR of a message should not be an extremely
> expensive operation. I would be loath to believe that modern processors
> cannot effectively perform this operation. With all due respect to the
> implementers, it is far more likely that JavaScript (and quite possibly
> Python) uses data structures and representations that were not designed for
> this type of vector operation.
> 
> In a real situation, the XOR would be done in code that is likely written in
> C/C++ or a similar language. If there is a real concern about this issue, then the
> analysis should be done at the level of precisely what machine instructions
> are actually being executed. As a former code generator writer, I can attest
> that it is quite easy to gain or lose efficiency at this level (for example, I would
> be unsurprised to discover that JavaScript and Python are doing the XOR one
> byte at a time, rather than in 64-bit chunks; that alone would account for
> nearly an order of magnitude performance hit).
> 
> - Bob Gezelter, http://www.rlgsc.com
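
To make the byte-at-a-time versus 64-bit-chunk distinction from the quoted
message concrete, here is a small C sketch of both variants. It again
assumes the 4-byte cyclic masking key of the hybi drafts; the function names
are hypothetical, and no speed-up figure is claimed here since that depends
on compiler and hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference version: one XOR per payload byte. */
void ws_mask_bytewise(uint8_t *buf, size_t len, const uint8_t key[4])
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key[i % 4];
}

/* Same result, but one XOR per 8 payload bytes. */
void ws_mask_wordwise(uint8_t *buf, size_t len, const uint8_t key[4])
{
    assert(buf != NULL || len == 0);

    /* Build a 64-bit key by repeating the 4 key bytes twice. Copying the
     * bytes with memcpy keeps their memory order, so the result does not
     * depend on the machine's endianness. */
    uint8_t pattern[8];
    for (size_t k = 0; k < 8; k++)
        pattern[k] = key[k % 4];
    uint64_t key64;
    memcpy(&key64, pattern, sizeof key64);

    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, sizeof w);   /* safe for unaligned buffers */
        w ^= key64;
        memcpy(buf + i, &w, sizeof w);
    }

    /* Remaining 0..7 bytes; i is a multiple of 8, so i % 4 is in phase. */
    for (; i < len; i++)
        buf[i] ^= key[i % 4];
}
```

Optimising compilers typically turn the fixed-size memcpy calls into plain
loads and stores, so the word loop usually compiles to one 64-bit XOR per
iteration.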