Re: [sip-ops] [dispatch] SIP-CLF: Results on ASCII vs. binary representation

"Vijay K. Gurbani" <> Wed, 29 April 2009 20:27 UTC

Date: Wed, 29 Apr 2009 15:27:19 -0500
From: "Vijay K. Gurbani" <>
Organization: Bell Labs Security Technology Research Group
To: Theo Zourzouvillys <>

Theo Zourzouvillys wrote:
>> 1) On some systems, the value of IOV_MAX is set to a low number.
> (1) is irrelevant.  scatter/gather is not the only optimal method of
> implementing it.  even on crappy old OSes, a simple memcpy() of the
> data is similar in performance, cache coherency caveat emptor:
> Binary: 0m7.400s
> ASCII: 0m7.038s

So after tweaking all kinds of optimizations for binary CLF, the
best we can do is approach ASCII CLF without any optimizations.
This much no one will dispute, I hope.

>> (1) is a real concern because as you can well imagine that URIs,
>> once parsed, can be composed of many different objects (or
>> structs in C.)  As such, the representation of a composed URI
>> in an iov structure will require multiple indexes.
> i'd argue your concerns are moot - a URI composed of many different
> objects will need to be built into a string if it's ASCII anyway.

Precisely my point -- since the URIs arrive in ASCII and leave
in ASCII, let's just write the darn thing out in straight
ASCII without too much computation and be done with it!
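
For readers who want to see the two approaches side by side, here is a
minimal sketch in Python.  It is not the code behind the timings quoted
above and makes no performance claim.  The function names (iov_max,
write_gathered, write_joined) and the sample URI pieces are assumptions
made up for illustration.  It only shows that a gathered write has to be
split into batches when a record has more pieces than IOV_MAX, whereas
the copy-then-write approach needs a single call.

```python
import os


def iov_max() -> int:
    """Return the per-call buffer limit for writev() on this system."""
    try:
        limit = os.sysconf("SC_IOV_MAX")
    except (ValueError, OSError, AttributeError):
        limit = -1
    # POSIX guarantees a minimum of 16 (_XOPEN_IOV_MAX); use it as a fallback.
    return limit if limit > 0 else 16


def write_gathered(fd, parts, limit=None):
    """Scatter/gather: hand the pieces to writev(), `limit` buffers at a time.

    Short writes are not retried here; a real logger would have to handle them.
    """
    limit = limit or iov_max()
    total = 0
    for start in range(0, len(parts), limit):
        total += os.writev(fd, parts[start:start + limit])
    return total


def write_joined(fd, parts):
    """Copy the pieces into one contiguous buffer, then issue one write()."""
    return os.write(fd, b"".join(parts))


if __name__ == "__main__":
    # Hypothetical pieces of a parsed URI, each held in its own object.
    uri_parts = [b"sip:", b"alice", b"@", b"example.com", b";transport=", b"tcp", b"\n"]
    write_gathered(1, uri_parts, limit=4)  # two writev() calls
    write_joined(1, uri_parts)             # one write() call
```

Both functions put the same bytes on the descriptor; the difference is
only in how many system calls and how much copying each one needs.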

> (ps: i'm not too concerned about binary/ascii - although i lean toward
> the binary side - please just don't use performance as an argument
> unless the figures are fair. pretty please!)

In the final analysis, I am not too concerned about binary vs.
ASCII, either; although I lean towards the ASCII side.  That
said, I must admit that I was afraid the only mandated CLF form
would turn out to be binary CLF, and I wanted to at least
raise a point that it is not a panacea.  It has its strong
points -- no doubt -- but it should not be the only format.

Furthermore, consider that a major constituency for the
upkeep of SIP servers will be the IT department.  Which format
do you think the IT folks will prefer?  I have a strong
suspicion that it will be ASCII because that is what they
are used to -- they can see it, read it, are used to it,
and can write tools in perl/ruby/python easily to transform
it according to their needs, etc.  If SIP is to find
more purchase and expand mind share in these departments, then
limiting a log format to binary is hardly a wise choice.
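
As an illustration of the kind of throwaway tool meant here, the sketch
below counts log records by response code.  The record layout
(whitespace-separated timestamp, method, status code, From URI, To URI)
is purely hypothetical and is not the SIP CLF format, which has not been
settled; the function name count_by_status is likewise made up.

```python
from collections import Counter


def count_by_status(lines):
    """Count records per status code in a hypothetical ASCII log.

    Assumed layout per line: timestamp method status from-uri to-uri
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "1241036839 INVITE 200 sip:alice@example.com sip:bob@example.net",
        "1241036840 INVITE 486 sip:alice@example.com sip:carol@example.org",
        "1241036841 REGISTER 200 sip:bob@example.net sip:bob@example.net",
    ]
    for status, n in sorted(count_by_status(sample).items()):
        print(status, n)
```

With a text format, a dozen lines like these are enough; no decoder for
the record layout has to be written first.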

My proposal is that we document both formats but make the
ASCII format mandatory to implement.  A well
documented binary format (contained in the same draft) will
keep the telecommunication providers whose NOCs are used
to dealing with binary happy.  Not to mention that searches
are faster with a binary format -- that much is not in dispute.


- vijay
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg@{,,}