Re: [sip-ops] [dispatch] SIP-CLF: Results on ASCII vs. binary representation

Dean Willis <dean.willis@softarmor.com> Tue, 05 May 2009 01:43 UTC

From: Dean Willis <dean.willis@softarmor.com>
To: Adam Roach <adam@nostrum.com>
Cc: "sip-ops@ietf.org" <sip-ops@ietf.org>, Tom Taylor <tom.taylor@rogers.com>, Hadriel Kaplan <HKaplan@acmepacket.com>

On May 4, 2009, at 2:58 PM, Adam Roach wrote:

> Dean Willis wrote:
>> Right. In my experience, binary offers only marginal performance  
>> over structured ASCII.
>
> I'll correct this slightly: binary offers only marginal performance  
> over *properly* structured ASCII. I argue that the Apache log format  
> doesn't qualify for this description. (I say this after having to  
> deal with processing relatively large Apache logs, and suffering  
> from the vast quantities of time such processing takes).
>
> I'm okay with Tom's suggestion to add ASCII length (and tag?) fields  
> to the text format. If people want to take it that direction, I  
> could re-cast my proposed format so it contains fixed-length ASCII  
> number representations for pointers, lengths, and tags; and so that  
> it terminates records with something like an ASCII 0x0D.


One could design things such that each record begins with an initial
"structure" field holding a fixed-format series of indices (total
length, number of fields, and a length for each field), followed by
variable-length data fields delimited by tabs (or some other reserved
character), followed by a CRLF terminator (included in the total-length
index). This gives us records that are fast to parse but still
human-readable.
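
To make the layout concrete, here is a minimal sketch in Python. The
field widths (6 digits for total length, 2 for the field count, 4 per
field length) and the function names are assumptions chosen only for
illustration; nothing here is an agreed format.

```python
# Hypothetical fixed widths, in ASCII decimal digits (assumptions, not a spec).
LEN_W, CNT_W, FLD_W = 6, 2, 4


def encode_record(fields):
    """Build one record: fixed-width index, tab-delimited fields, CRLF."""
    if len(fields) >= 10 ** CNT_W:
        raise ValueError("too many fields for the count index")
    for f in fields:
        f.encode("ascii")  # raises UnicodeEncodeError on non-ASCII data
        if any(c in f for c in "\t\r\n"):
            raise ValueError("field contains a reserved character")
        if len(f) >= 10 ** FLD_W:
            raise ValueError("field too long for the length index")
    index = "".join(f"{len(f):0{FLD_W}d}" for f in fields)
    body = "\t".join(fields) + "\r\n"
    # Total length covers the whole record, including the CRLF terminator.
    total = LEN_W + CNT_W + len(index) + len(body)
    if total >= 10 ** LEN_W:
        raise ValueError("record too long for the total-length index")
    return f"{total:0{LEN_W}d}{len(fields):0{CNT_W}d}{index}{body}"


def parse_record(buf, pos=0):
    """Parse the record starting at pos; return (fields, next_pos)."""
    total = int(buf[pos:pos + LEN_W])
    count = int(buf[pos + LEN_W:pos + LEN_W + CNT_W])
    cur = pos + LEN_W + CNT_W
    lengths = [int(buf[cur + i * FLD_W:cur + (i + 1) * FLD_W])
               for i in range(count)]
    cur += count * FLD_W
    fields = []
    for n in lengths:
        # Slice by length instead of scanning for the delimiter.
        fields.append(buf[cur:cur + n])
        cur += n + 1  # step over the tab (or the CR after the last field)
    end = pos + total
    if buf[end - 2:end] != "\r\n":
        raise ValueError("record is not CRLF-terminated at its stated length")
    return fields, end
```

A reader can jump from one record to the next using only the
total-length index, and can slice out any field from the length indices
without scanning for delimiters. The tabs and the CRLF are there so the
log still reads sensibly in a pager or text editor.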

--
Dean