[sip-clf] AD review: draft-ietf-sipclf-format-05

Robert Sparks <rjsparks@nostrum.com> Thu, 26 January 2012 23:09 UTC

Return-Path: <rjsparks@nostrum.com>
X-Original-To: sip-clf@ietfa.amsl.com
Delivered-To: sip-clf@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id DE33721F8646 for <sip-clf@ietfa.amsl.com>; Thu, 26 Jan 2012 15:09:59 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.6
X-Spam-Level:
X-Spam-Status: No, score=-102.6 tagged_above=-999 required=5 tests=[AWL=0.000, BAYES_00=-2.599, SPF_PASS=-0.001, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id GvOr5aFgjfD8 for <sip-clf@ietfa.amsl.com>; Thu, 26 Jan 2012 15:09:55 -0800 (PST)
Received: from nostrum.com (nostrum-pt.tunnel.tserv2.fmt.ipv6.he.net [IPv6:2001:470:1f03:267::2]) by ietfa.amsl.com (Postfix) with ESMTP id 17CA721F863D for <sip-clf@ietf.org>; Thu, 26 Jan 2012 15:09:55 -0800 (PST)
Received: from unexplicable.local (pool-173-57-95-103.dllstx.fios.verizon.net [173.57.95.103]) (authenticated bits=0) by nostrum.com (8.14.3/8.14.3) with ESMTP id q0QN9onu002816 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 26 Jan 2012 17:09:51 -0600 (CST) (envelope-from rjsparks@nostrum.com)
Message-ID: <4F21DD3E.7000002@nostrum.com>
Date: Thu, 26 Jan 2012 17:09:50 -0600
From: Robert Sparks <rjsparks@nostrum.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0) Gecko/20111222 Thunderbird/9.0.1
MIME-Version: 1.0
To: "sip-clf@ietf.org Mailing" <sip-clf@ietf.org>, sipclf-chairs@tools.ietf.org, draft-ietf-sipclf-format@tools.ietf.org, Gonzalo Camarillo <Gonzalo.Camarillo@ericsson.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Received-SPF: pass (nostrum.com: 173.57.95.103 is authenticated by a trusted mechanism)
Subject: [sip-clf] AD review: draft-ietf-sipclf-format-05
X-BeenThere: sip-clf@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: SIP Common Log File format discussion list <sip-clf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/sip-clf>, <mailto:sip-clf-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/sip-clf>
List-Post: <mailto:sip-clf@ietf.org>
List-Help: <mailto:sip-clf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/sip-clf>, <mailto:sip-clf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 26 Jan 2012 23:10:00 -0000

AD review: draft-ietf-sipclf-format-05

Summary: This document has open issues to address before progressing to 
IETF LC

Major (mostly in document order)

1) The version string claims it is hexadecimal encoded. The version for this
document is expressed by the character A. Does that mean this is version 10
(base 10)? Are we limited to versions 10 through 15? (we only have one byte
and section 6 points out that index lines always start with an alphabetical
character). This description needs to be clarified. While you're looking at
it, consider discussing whether a given log file can have records with
different versions in it.

2) The description of Timestamp and Fractional Seconds is very confusing.
What does it mean to be represented in big-endian fashion with a most
significant octet and to be decimal encoded? It's very far from clear that
you intend this to be an ascii string representing an integer, and it's
not clear that you intend that it to be prefix-padded with 0s if necessary.
Did the group consider using ABNF to describe these fields (or the whole
log record) to avoid this kind of ambiguity?

3) The document should be clear about whether the flags field values are
extensible, and if so, how extension is expected to be managed. How will
you handle SIP received over DCCP, Websockets, or something we haven't
heard of yet?

4) Section 4.2 speaks of placing "the actual values for the mandatory
fields" into the record. It's not clear what "actual values" means. There
are other sections of the document that talk about transforming (escaping)
values before putting them in the log record, including dealing with tabs
and edge conditions like field that has only a question mark in it. The
concept of encoding and escaping needs to be much clearer throughout the
document.

5) The document should more explicitly discuss the decision to replace
tabs with spaces. That decision was based on there being no known use
of tabs in SIP messages that means something semantically other than
whitespace. The discussion should point that out, and note that you
will not be able to use the logged information to reconstruct a
signature over the logged fields, and that any future (or proprietary)
SIP header fields that include tabs with semantic meaning (carrying a
blob perhaps) will lose this meaning while being logged. It should also
be made brutally explicit whether you want this substitution to happen
when you are logging the entire message, or a body (that might happen
to have tabs in it).

6) There are a couple of places where you talk about escaping '-' and '?'
characters. The document should note that this could produce a logged record
that is not syntactically parsible with a SIP parser (quoted-pair is only
used in certain places in SIP messages). There should also be some 
discussion
about whether a log reader unescapes anything.

7) Why does Tag -01 restrict message bodies to just the listed types, 
especially
with the open ended "miscellaneous text content" type category? If 
there's a real
restriction here, it needs to be much more clearly stated. Why are you 
allowing
any LWS for the separator between the content type header field value 
and the
message body itself? Why isn't this simply a space? It is not clear what
"byte encoded" means for representing binary bodies - do you mean expanding
it into a hex representation, a uuencode representation, or soemthing else?

8) There's no description of Tag-02. The text following the line that starts
the description of the tag speaks of more general things covering both 
tag-01
and tag-02.

9) When describing repeating tag-01 or tag-02 multiple times to deal with
large entities, the document should be clear that the fragments get logged
in order, and how to reassemble the actual entity. It should also say 
whether
it's ok to interleave (for example) tag-01 fragements with other 
optional fields.

10) The description of the Length field in section 4.3 is ambiguous. It says
"This length does not include the header shown in Figure 5." But figure 
5 has
nothing clearly identified as "the header".

11) Where does the document talk about how to log fields that are line 
folded
in the original message? Is it ok for the logged field to contain CRLFs, 
or is
the intent that the field be unfolded before logging?

12) It's not clear if the \r\n used in the examples of logging the SDP body
are there as text-conventions for showing the example in an RFC, or if you
intend that CRLFs be replaced by \r\n's when logging. If it's just a 
convention,
please make it clear that the actual log record will have CRLFs. If you were
expecting the CRLFs to be escaped in the log, there needs to be text that
says that normatively somewhere, and you need to consider escaping the 
escape
character.

Minor

There is a maintenance risk in having all the detail of each block 
duplicated
in Figure 1. Please consider taking that figure up a level, showing the 
three
blocks and how they're separated, and let the other figures show the 
internal
detail.

There are blocks of text in this document that are really laying out the 
problem.
Can they be moved into the problem statement/information model/framework 
document
or simply deleted?
For example: "Implementing a logging scheme for SIP is a considerable 
challenge.",
"Whilst one may question the value of the From URI [...]", etc.

In section 4.1, you talk of "a single, 52 character hexadecimal encoded 
string"
between bytes 8 and 55. There are only 48 octets between bytes 8 and 55
(inclusive). Please adjust the text to note that you are also talking about
the pointers to the optional fields or describe the shorter length.

In section 4.2's description of the To tag field you say "Value of the
tag parameter (if present)" but don't talk about what to do if the tag
parameter isn't present. In the parallel section on the From tag, you
don't talk about the potential for messages without From tags. If it's
the intent to not be able to log messages from legacy 2543 devices, that's
fine, but the document should call that out as a design decision and it 
wouldn't
hurt to provide some guidance on what to do if such a message shows up.

There is a lot of descriptive text in 4.2's sections on the server and 
client
transaction that is a copy of what's in the 
problem-statement/information model
document. It would be better to just point to the description in the 
other document.


Nits

How is this a "functionally equivalent event logging mechanism"? Suggest 
replacing
that sentence with the single word "analogous".

In the introduction, the document claims to define a format that is no 
more difficult
to generate, but it's not clear what it's being compared to. No more 
difficult than
what?

Section 4.1's description of Record Length can be misinterpreted. By listing
a subset of the fields that are included, it leaves open whether any fields
are excluded. It would be better to say "beginning with the Version octet
and ending with the terminating line-feed".

It would help to be more explicit that you are not logging the CRLF at 
the end
of a given header field value. (Or is it an option for a logger to 
choose to do
so?)

Please consider scrubbing the message used in example 5 back to the minimal
number of octets you need to make the example useful. In particular, 
consider
removing elements that have not already been standardized. I'm surprised 
to see
the literal strings "server-tx" and "client-tx" in the example log 
message. Was
that what you really intended?

The code snippet should be delineated as such, following the
guidance in the TLP (see section 4.b of 
<http://trustee.ietf.org/docs/IETF-Trust-License-Policy.pdf>