[sip-clf] WGLC: SIPCLF Problem Statement (draft-gurbani-sipclf-problem-statement-01)

Cullen Jennings <fluffy@cisco.com> Thu, 04 February 2010 04:20 UTC

Some random comments - all sent in my individual contributor role.

I like this draft, think it is fine work, but it needs a few more things before it is done. These are all things that I think can help us speed up the actual clf work if we can agree on them.


Section 8 should be renamed to "Information Model"

remotehost does not seem like enough info when talking about multi-interface machines like an SBC with insides and outsides. I would prefer to replace this with a source and a destination address, each containing the IP and port, taken from whatever was in the layer 3 transport protocol.

I'd update the method and status to make it clear that there must be exactly one of a request method name or a response code.

I would separate the to and to-tag into two separate elements. You want to be able to match or search the to rapidly without including the to-tag. I don't feel strongly about this and can live with them mixed together in one field if that's what folks want.
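
A minimal sketch of that split, in Python. The function name split_to_tag and the example addresses are hypothetical, and the parsing is deliberately simplified (it does not handle a quoted display name containing '>' or ';'), so treat it as an illustration rather than a full RFC 3261 parser:

```python
from typing import Optional, Tuple


def split_to_tag(to_value: str) -> Tuple[str, Optional[str]]:
    """Split a To header field value into (value minus the to-tag, to-tag)."""
    # Header parameters follow the closing '>' when the address is bracketed;
    # without brackets, everything after the first ';' is a header parameter.
    close = to_value.rfind(">")
    if close != -1:
        split_at = close + 1
    elif ";" in to_value:
        split_at = to_value.find(";")
    else:
        split_at = len(to_value)
    addr, params = to_value[:split_at], to_value[split_at:]

    tag = None
    kept = []
    for param in filter(None, (p.strip() for p in params.split(";"))):
        name, _, value = param.partition("=")
        if name.strip().lower() == "tag":
            tag = value.strip()
        else:
            kept.append(param)  # keep any other header parameters in order

    return addr.strip() + "".join(";" + p for p in kept), tag
```

With the two parts stored separately, a search on the to value never has to strip the tag at query time.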

I'd remove the contact list. It's just a generic header like the others. This is debatable, so worth discussion, but I'm in the "less not more" camp.

server and client transaction both need to be optional. For example, a stateless proxy will not have this.

Explicitly state that all other headers go in an ordered list of pairs of header field names and header field values.

Add a field to hold one optional body.

I don't think this information model needs to be extensible. In fact, the one thing I am going to continue to strongly argue is that it should not be extensible. Note this does not mean we cannot have new sip headers or new body types; we can, they just go in the list of sip headers or the body.


So to summarize, I think the following need to be the elements of the Information Model:

Layer 3 source IP/port
Layer 3 dest IP/port
Timestamp
server transaction
client transaction
From header field value
To header field value minus the to-tag
to-tag
callID header field value
ordered list of header field name value pairs
status or request line 
message body 
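
As a rough illustration, the element list above could be written down as a Python dataclass. This is only a sketch: the class name SipClfRecord, the field names, the field types, and the choice to make to_tag optional are all assumptions made for the example, not anything the draft defines.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SipClfRecord:
    # Layer 3 source and destination, each an (IP, port) pair
    src: Tuple[str, int]
    dst: Tuple[str, int]
    timestamp: str
    from_value: str
    to_value: str                  # To header field value minus the to-tag
    call_id: str
    start_line: str                # a status line or a request line, never both
    to_tag: Optional[str] = None   # assumed absent when the message has none
    # Optional: a stateless proxy has no transactions to report
    server_txn: Optional[str] = None
    client_txn: Optional[str] = None
    # All other headers, in order, as (name, value) pairs
    headers: List[Tuple[str, str]] = field(default_factory=list)
    body: Optional[bytes] = None   # at most one body

    def is_response(self) -> bool:
        # A SIP status line begins with the SIP version ("SIP/2.0 200 OK");
        # a request line begins with the method name.
        return self.start_line.startswith("SIP/")
```

Nothing here is extensible: a new sip header is just one more pair appended to headers.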

I could be convinced there was value in:

method
sip status code
contact header field value for certain types of messages 
syslog style severity level 

I'm against needless complexity and arbitrary extensibility, and would want to see a good argument made for adding things.


I would add a data model considerations section where we talk about desirable properties of the data model. We don't pick the data model in this draft, that is the job of the actual spec draft, but we can put requirements or other points about the data model here.

Specifically, in this part I want to make sure we are compatible with the important parts of syslog, such that if someone is writing a program to read one of these SIPCLF log files and transmit it over syslog, it is easy and there are no semantic level mapping problems. To do that, I would add to this section "Compatible with the syslog definitions of severity."
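
For reference, the syslog severity levels are the eight numeric values 0 through 7, and RFC 5424 combines one with a facility as PRI = facility * 8 + severity. A small Python sketch follows; the names Severity and syslog_pri are made up for the example:

```python
from enum import IntEnum


class Severity(IntEnum):
    """Syslog severity levels, numbered as in RFC 5424."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


def syslog_pri(facility: int, severity: Severity) -> int:
    # PRI value that goes between '<' and '>' at the start of a syslog message
    return facility * 8 + int(severity)
```

If a SIPCLF record carried one of these values unchanged, a relay would only need to pick a facility to build the syslog PRI.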

I was also considering saying compatibility with the syslog definition of timestamp, but my recollection of that (which may be very wrong - it's been a while since I looked at it) left me more than a bit confused about syslog timestamps. It seems that every syslog packet I have ever seen in the wild uses the BSD syslog timestamp format in RFC 3164, but clearly that is more or less borked for any real correlation across time zones, so it seems like the RFC 5424 version of timestamp would be better. RFC 5424 allowed local offsets, which resulted in no real canonical format for time, which means some people implemented the full thing and some ignored the local offsets, resulting in interoperability problems. Then there is the "-00:00" offset being something much different than "+00:00". And then there is the leap seconds stuff, which is nearly untested as far as I can tell in most products. I really don't understand how one writes a 5424-compliant syslog message that occurs during the leap second, but I'm probably just confused. What exactly does one do if one wants to write something to a log file when an extra leap second is being added?

SIP is using the email style time formats and mandating GMT. That seems pretty trivially translatable into whatever one believes is the right way to do it in syslog, so I'd be tempted to just allow SIP implementations to use the timestamps they are using for SIP and define that as the data representation for the timestamps.

I went to look at stealing the time format from IPFIX but gave up at the point where RFC 5102 said leap seconds were excluded from dateTimeMicroseconds but in RFC 5101 it looked like dateTimeMicroseconds included the leap seconds. I like how 5101 uses the NTP stuff - that is very clear to everyone but does not map well to the human readable formats people want to use in the log files. All in all, the email style fixed to GMT in a single canonical format is looking pretty appealing.
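
To show how small the translation from the email style GMT format to an RFC 5424 style UTC timestamp can be, here is a hedged Python sketch. The function name sip_date_to_rfc5424 is hypothetical, and the sketch makes no attempt to handle a leap second (Python's datetime cannot represent second 60):

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def sip_date_to_rfc5424(sip_date: str) -> str:
    """Convert an email style date such as a SIP Date value to a UTC timestamp."""
    dt = parsedate_to_datetime(sip_date)
    if dt.tzinfo is None:
        # "-0000" parses as a naive datetime; refuse rather than guess the zone
        raise ValueError("timestamp has no usable time zone: %r" % sip_date)
    # Normalize to UTC and emit the 'Z' form, with no local offset
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(sip_date_to_rfc5424("Thu, 04 Feb 2010 04:20:45 GMT"))
# prints 2010-02-04T04:20:45Z
```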