Re: [sip-clf] AD review: draft-ietf-sipclf-format-05

"Vijay K. Gurbani" <vkg@bell-labs.com> Thu, 09 February 2012 17:58 UTC

Message-ID: <4F340A33.7030606@bell-labs.com>
Date: Thu, 09 Feb 2012 12:02:27 -0600
From: "Vijay K. Gurbani" <vkg@bell-labs.com>
Organization: Bell Laboratories, Alcatel-Lucent
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:9.0) Gecko/20111222 Thunderbird/9.0
MIME-Version: 1.0
To: Robert Sparks <rjsparks@nostrum.com>
References: <4F21DD3E.7000002@nostrum.com> <31A5C897-B767-4527-9346-905A80977F35@cisco.com> <4F33E3D0.7000605@nostrum.com>
In-Reply-To: <4F33E3D0.7000605@nostrum.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.57 on 135.245.2.37
X-Scanned-By: MIMEDefang 2.64 on 135.3.39.11
Cc: "sip-clf@ietf.org" <sip-clf@ietf.org>
Subject: Re: [sip-clf] AD review: draft-ietf-sipclf-format-05

On 02/09/2012 09:18 AM, Robert Sparks wrote:
> Inline, trimming to points with responses

Robert: Some feedback on your responses inline.

With respect to extending the Transport flag, you wrote that:

> That's a choice the group should make - the document just needs to be
> explicit about what's extensible (for all the fields) and if a field
> is extensible, what the mechanism looks like. For the transport
> field, as an individual contributor, I prefer the IANA registry
> approach.

I agree that IANA is cleaner.  It would be nice to hear more views
from the WG so we can proceed accordingly.

With respect to replacing tabs with spaces, you wrote that:

> This leaves out optional fields. It's also not clear why you use a SHOULD
> for the requirement over bodies.

I believe that the SIP CLF record will be used as a medium to do a quick
spot check as well as a medium for detailed analysis.

When used as a medium for a quick spot check, I believe that the SIP
CLF record should be amenable to being manipulated by normal Unix text
processing tools.  Thus, those who use the standard Unix command line
text tools to look for something in the SIP CLF log file will be mostly
interested in the mandatory header fields (e.g., grep "2 INVITE" logfile
or grep "S2388-188" logfile to find all records with server transaction
ID "S2388-188").  The result could be piped to tr(1), awk(1), cut(1) or
perl for further processing.  Quick example:

   $ grep "2 INVITE" logfile | cut -f13

will print the server transaction ID (field 13) of all records
where the CSeq field is "2 INVITE".

That is why I think that tabs in the mandatory fields MUST be changed
to spaces.
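
To make the field-counting argument concrete, here is a minimal
sketch in Python.  The three-field record and the helper names are
hypothetical, not taken from the draft; the point is only that one
embedded tab shifts every later field for a tool like cut(1), and
that replacing it with a space keeps the positions stable.

```python
def sanitize_field(value: str) -> str:
    """Replace every tab inside a field value with a single space."""
    return value.replace("\t", " ")


def join_mandatory(fields):
    """Join sanitized field values with tabs, one record per line."""
    return "\t".join(sanitize_field(f) for f in fields)


# Hypothetical record: the middle field carries an embedded tab.
fields = ["2 INVITE", "sip:alice@example.com;x=a\tb", "S2388-188"]

raw = "\t".join(fields)         # written without sanitizing
clean = join_mandatory(fields)  # written with tabs changed to spaces

# Splitting on tabs mimics what cut -f does with its default delimiter.
print(len(raw.split("\t")))     # the raw line splits into one field too many
print(raw.split("\t")[2])       # third field is now a fragment of the URI
print(clean.split("\t")[2])     # third field is the transaction ID again
```

With the unsanitized line, a "cut -f3" would return part of the URI
rather than the transaction ID.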

Exploring the body and optional fields is part of detailed analysis
of the SIP CLF record.  To do this, one would need an actual
SIP CLF reader, which will presumably be more intelligent and will not
need an LWS delimiter to figure out boundaries (it already has the
length of the body or the optional field).

In fact, the more I think of it the more it seems that the SIP CLF
record should really consist of three distinct stanzas, each
separated by a LF:

   Index-pointers 0x0A  Mandatory-fields 0x0A Optional-fields 0x0A

Right now, we separate the SIP CLF record into two stanzas:

   Index-pointers 0x0A Mandatory-Fields-and-Optional-fields 0x0A

The problem with separating the SIP CLF record into two stanzas is
that a "grep S2388-188 logfile" will return the mandatory field
line PLUS any optional fields logged, since they share a line.  These
optional fields are therefore forced to change tabs to spaces.  If we
have 3 stanzas, then the optional fields can retain tabs.
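
A rough sketch of the difference, again in Python.  The index
string, the field values, and the use of a tab to join the mandatory
and optional parts in the two-stanza case are all placeholders and
assumptions, not the draft's actual layout; grep_lines() merely
imitates a line-oriented grep.

```python
def grep_lines(pattern: str, text: str):
    """Return every line of text that contains pattern, like grep(1)."""
    return [line for line in text.split("\n") if pattern in line]


index = "IDX"  # placeholder for the index-pointer stanza
mandatory = "\t".join(["2 INVITE", "S2388-188"])
# Optional fields; the second value keeps an embedded tab on purpose.
optional = "\t".join(["Contact: <sip:a@example.com>", "X-Note: has\ttab"])

# Two stanzas: mandatory and optional fields share one line.
two_stanza = index + "\n" + mandatory + "\t" + optional + "\n"
# Three stanzas: optional fields get a line of their own.
three_stanza = index + "\n" + mandatory + "\n" + optional + "\n"

hit_two = grep_lines("S2388-188", two_stanza)[0]
hit_three = grep_lines("S2388-188", three_stanza)[0]

print(len(hit_two.split("\t")))    # mandatory plus optional pieces
print(len(hit_three.split("\t")))  # mandatory fields only
```

In the three-stanza layout the matching line holds only the mandatory
fields, so tabs inside the optional stanza never reach cut(1).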

Regarding the SIP CLF record being syntactically amenable to a parse
by a normal SIP parser, you wrote that:

> I think it _was_ the intent to be able to reuse parser code. Several
> early implementers were handing individual fields to the part of
> their parser that would handle a framed header field.

The distinction being made here is between extracting individual
*fields* from the SIP CLF record and handing them to a SIP parser that
would normally parse that sequence of characters, versus having an
existing SIP parser successfully parse a SIP CLF record in its
*entirety*.

Clearly, the latter is not possible.  The former is a possibility.  To
be safe, the document should indicate that escaping '-' and '?'
may produce a field that cannot be parsed by a SIP parser that would
normally be able to parse that field if it appeared in a SIP header
(as opposed to a log file).

Thanks,

- vijay
-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60563 (USA)
Email: vkg@{bell-labs.com,acm.org} / vijay.gurbani@alcatel-lucent.com
Web:   http://ect.bell-labs.com/who/vkg/