[Sipping] Alternate CLF syntax proposal
Adam Roach <adam@nostrum.com> Thu, 26 March 2009 02:01 UTC
Message-ID: <49CAE21C.5060309@nostrum.com>
Date: Wed, 25 Mar 2009 19:02:04 -0700
From: Adam Roach <adam@nostrum.com>
To: sipping WG <sipping@ietf.org>
Cc: draft-gurbani-sipping-clf@tools.ietf.org
Subject: [Sipping] Alternate CLF syntax proposal
In the spirit of "send text," I've put together a straw-man proposal for an easy-to-generate and fast-to-process extensible format for saving SIP log messages:

http://www.ietf.org/internet-drafts/draft-roach-sipping-clf-syntax-00.txt

As an example of the processing that can be performed on this format, consider a large file (on the order of 1 GB of data) with 1,232,896 records in it (to choose a nice, round number). I'd like to extract all the information about messages with a particular "To" value.

With a text-based format, I'll be reading and parsing 1,262,485,504 bytes (every byte in the file) in order to find delimiters.

With the format proposed in this document, I can open the file and then do the following about 1,232,896 times:

- Read 4 bytes (total record length)
- Fseek 32 bytes to reach the "To Value" pointer and length
- Read 4 bytes
- Fseek according to those 4 bytes to the literal value of the To header field
- Read the To header field (let's imagine it's 20 bytes)
- Fseek to the next record (according to the total record length)

In total, I'm reading 28 bytes per record 1,232,896 times, for a grand total of 34,521,088 bytes -- or about 2.7% as much data as I do with a text file.

When you're dealing with terabytes of log data, this can make the difference between taking one minute to sift data and taking 37 minutes to do the same operation.

And, of course, it has the advantage that you can add more (tagged) data to each record without causing any additional processing load.

/a
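
The per-record steps in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the layout defined in the draft: the byte order (big-endian), the split of the 4 index bytes into a 2-byte offset and a 2-byte length, the convention that the offset is relative to the start of the record, and the names build_record and scan_to_values are all hypothetical.

```python
import io
import struct

# Hypothetical layout, assumed for illustration only (not taken from the draft):
#   offset 0:  4-byte big-endian total record length
#   offset 36: 2-byte offset of the To value (from record start),
#              followed by its 2-byte length
TO_INDEX_OFFSET = 36  # 4-byte length field + 32 skipped bytes
HEADER_LEN = TO_INDEX_OFFSET + 4


def build_record(to_value: bytes, padding: int = 0) -> bytes:
    """Build one record in the assumed layout (helper for experiments)."""
    total = HEADER_LEN + len(to_value) + padding
    header = (
        struct.pack(">I", total)
        + b"\x00" * 32
        + struct.pack(">HH", HEADER_LEN, len(to_value))
    )
    return header + to_value + b"\x00" * padding


def scan_to_values(f):
    """Yield (record_start, to_value, bytes_read) for each record in f."""
    while True:
        start = f.tell()
        raw = f.read(4)                      # total record length
        if len(raw) < 4:
            return                           # end of file
        (total,) = struct.unpack(">I", raw)
        f.seek(32, io.SEEK_CUR)              # skip to the To index entry
        ptr, length = struct.unpack(">HH", f.read(4))
        f.seek(start + ptr)                  # jump to the literal To value
        value = f.read(length)
        f.seek(start + total)                # jump to the next record
        yield start, value, 4 + 4 + len(value)
```

With a 20-byte To value, each iteration reads 4 + 4 + 20 = 28 bytes regardless of how large the record is, which is where the per-record figure in the message comes from. A caller would filter on the yielded value, for example `[s for s, v, _ in scan_to_values(f) if v == wanted]`.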