Re: [sip-clf] [sip-ops] [dispatch] SIP-CLF: Results on ASCII vs. binary representation

Adam Roach <adam@nostrum.com> Tue, 05 May 2009 05:36 UTC

Message-ID: <49FFD0C1.7090603@nostrum.com>
Date: Tue, 05 May 2009 00:38:09 -0500
From: Adam Roach <adam@nostrum.com>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b3pre) Gecko/20090223 Lightning/1.0pre Thunderbird/3.0b2
MIME-Version: 1.0
To: Dean Willis <dean.willis@softarmor.com>
References: <WNQHYTPgGvwQ.rzn0RkdD@nylon.softarmor.com> <49FF48EE.8070705@nostrum.com> <7B9669B6-AAE1-4C4E-AA8E-CEA2895A9B4B@softarmor.com>
In-Reply-To: <7B9669B6-AAE1-4C4E-AA8E-CEA2895A9B4B@softarmor.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Received-SPF: pass (nostrum.com: 70.249.149.101 is authenticated by a trusted mechanism)
Cc: "sip-clf@ietf.org" <sip-clf@ietf.org>, Tom Taylor <tom.taylor@rogers.com>, Hadriel Kaplan <HKaplan@acmepacket.com>
Subject: Re: [sip-clf] [sip-ops] [dispatch] SIP-CLF: Results on ASCII vs. binary representation

On 05/04/2009 08:44 PM, Dean Willis wrote:
>
> On May 4, 2009, at 2:58 PM, Adam Roach wrote:
>
>> Dean Willis wrote:
>>> Right. In my experience, binary offers only a marginal performance 
>>> advantage over structured ASCII.
>>
>> I'll correct this slightly: binary offers only a marginal performance 
>> advantage over *properly* structured ASCII. I argue that the Apache 
>> log format doesn't qualify for this description. (I say this after 
>> having to deal with processing relatively large Apache logs, and 
>> suffering from the vast quantities of time such processing takes.)
>>
>> I'm okay with Tom's suggestion to add ASCII length (and tag?) fields 
>> to the text format. If people want to take it that direction, I could 
>> re-cast my proposed format so it contains fixed-length ASCII number 
>> representations for pointers, lengths, and tags; and so that it 
>> terminates records with something like an ASCII 0x0D.
>
>
> One could design things such that each record contains an initial 
> "structure" field with a fixed-format series of indices (total 
> length, number of fields, lengths for each field), followed by 
> variable-length data fields, tab (or other reserved char) delimited, 
> followed by a CRLF terminator (included in the total-length index). 
> This gives us records that are fast-parseable, but still human-readable.
>

Yes. That is what I was proposing.
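
For concreteness, here is a rough sketch of the record layout described in the quoted text, not a proposed wire format. The index widths (6 digits for total length, 2 for field count, 4 per field length), the function names, and the choice to put no separator between the index and the first data field are all assumptions made for illustration; nothing in this thread fixes them.

```python
# Hypothetical index widths; the thread does not specify any of these.
TOTAL_W, COUNT_W, LEN_W = 6, 2, 4


def encode_record(fields):
    """Build one record: fixed-width ASCII index, tab-delimited data, CRLF."""
    data = [f.encode("ascii") for f in fields]
    if any(b"\t" in d or b"\r" in d or b"\n" in d for d in data):
        raise ValueError("fields must not contain the reserved characters")
    index_len = TOTAL_W + COUNT_W + LEN_W * len(data)
    body = b"\t".join(data)
    total = index_len + len(body) + 2  # the CRLF counts toward total length
    if (total >= 10 ** TOTAL_W or len(data) >= 10 ** COUNT_W
            or any(len(d) >= 10 ** LEN_W for d in data)):
        raise ValueError("record does not fit the fixed-width index")
    index = f"{total:0{TOTAL_W}d}{len(data):0{COUNT_W}d}"
    index += "".join(f"{len(d):0{LEN_W}d}" for d in data)
    return index.encode("ascii") + body + b"\r\n"


def decode_record(buf, offset=0):
    """Parse one record at offset; return (fields, offset of next record)."""
    total = int(buf[offset:offset + TOTAL_W])
    pos = offset + TOTAL_W
    count = int(buf[pos:pos + COUNT_W])
    pos += COUNT_W
    lengths = []
    for _ in range(count):
        lengths.append(int(buf[pos:pos + LEN_W]))
        pos += LEN_W
    fields = []
    for n in lengths:
        # Slice by length; no scanning for delimiters is needed.
        fields.append(buf[pos:pos + n].decode("ascii"))
        pos += n + 1  # step over the tab (or the CR after the last field)
    end = offset + total
    if buf[end - 2:end] != b"\r\n":
        raise ValueError("record is not terminated by CRLF")
    return fields, end


record = encode_record(["INVITE", "sip:bob@example.com", "200"])
fields, next_offset = decode_record(record)
print(record)
print(fields, next_offset)
```

Because the total length comes first, a reader can skip a whole record with one addition, and because every field length is in the index, individual fields can be sliced out without scanning. The tabs and CRLF are redundant for the parser and are there only to keep the file readable in a text editor.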

Followups to the sip-clf list (not sip-ops!).

/a