Re: [sip-clf] A syslog approach to sip logging

"Vijay K. Gurbani" <vkg@alcatel-lucent.com> Wed, 03 February 2010 23:44 UTC

Return-Path: <vkg@alcatel-lucent.com>
X-Original-To: sip-clf@core3.amsl.com
Delivered-To: sip-clf@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 3EF943A689D for <sip-clf@core3.amsl.com>; Wed, 3 Feb 2010 15:44:58 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.505
X-Spam-Level:
X-Spam-Status: No, score=-1.505 tagged_above=-999 required=5 tests=[AWL=-0.995, BAYES_05=-1.11, J_CHICKENPOX_31=0.6]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VmHHgDgSi2+d for <sip-clf@core3.amsl.com>; Wed, 3 Feb 2010 15:44:56 -0800 (PST)
Received: from ihemail4.lucent.com (ihemail4.lucent.com [135.245.0.39]) by core3.amsl.com (Postfix) with ESMTP id D22833A689A for <sip-clf@ietf.org>; Wed, 3 Feb 2010 15:44:56 -0800 (PST)
Received: from umail.lucent.com (h135-3-40-63.lucent.com [135.3.40.63]) by ihemail4.lucent.com (8.13.8/IER-o) with ESMTP id o13NidKx020778 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 3 Feb 2010 17:44:39 -0600 (CST)
Received: from [135.185.236.17] (il0015vkg1.ih.lucent.com [135.185.236.17]) by umail.lucent.com (8.13.8/TPES) with ESMTP id o13Nicoe027771; Wed, 3 Feb 2010 17:44:38 -0600 (CST)
Message-ID: <4B6A0A66.1050405@alcatel-lucent.com>
Date: Wed, 03 Feb 2010 17:44:38 -0600
From: "Vijay K. Gurbani" <vkg@alcatel-lucent.com>
Organization: Bell Labs Security Technology Research Group
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: rgerhards@hq.adiscon.com, David Harrington <ietfdbh@comcast.net>
References: <9B6E2A8877C38245BFB15CC491A11DA71037F7@GRFEXC.intern.adiscon.com>
In-Reply-To: <9B6E2A8877C38245BFB15CC491A11DA71037F7@GRFEXC.intern.adiscon.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.57 on 135.245.2.39
Cc: "sip-clf@ietf.org" <sip-clf@ietf.org>
Subject: Re: [sip-clf] A syslog approach to sip logging
X-BeenThere: sip-clf@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: SIP Common Log File format discussion list <sip-clf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/sip-clf>, <mailto:sip-clf-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/sip-clf>
List-Post: <mailto:sip-clf@ietf.org>
List-Help: <mailto:sip-clf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/sip-clf>, <mailto:sip-clf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 03 Feb 2010 23:44:58 -0000

Rainer Gerhards wrote:
 > I just subscribed to the sip-clf mailing list. I am the author of
 > rsyslog, one of the major open source syslogd's as well as the
 > designer for a number of Windows tools that are syslog-based. I have
 > also worked on the IETF syslog standardization effort.

Rainer, David: Thanks for the information on syslog.  Let me
state for the record that I do like syslog, having used it a lot
in the past ;-)

 > I am not sure what message rates are expected for sip and where the
 > actual problem for syslog was envisioned. If you have some more
 > information on that, it would definitely help me understand the
 > situation at large.

Regarding what message rates are expected for SIP, it is hard
to give an authoritative answer.  However, early work on SIP
performance [1] and more recent work on SIP overload control
[2,3,4] appear to indicate that steady state is obtained
around 200-300 calls per second under certain assumptions.
That said, there may be experimental SIP systems that run on
powerful hardware with all kinds of optimizations turned on and
more simplifying assumptions to give you a higher call rate
in steady state than the published literature.  In other words,
YMMV.

For the sake of argument, let's assume 1,000 calls per second
to be the high watermark and 300 calls per second to be the low
watermark in steady state.

Since a call in SIP is composed of many messages, let's analyze
the number of messages exchanged to set up a call.  Using some
simplifying assumptions such as the SIP server (a proxy)
does not Record-Route and that at most we fork to two
downstream user agents, a simple call consists of 8 messages
(see Figure 1) and a more complex call that involves forking
contains at least 16 messages (Figure 2; forking in SIP
implies amplification of the original request to one or
more down-stream endpoints, and dealing with the resulting
responses, etc.  Note that you can end up with more than 16
messages if multiple provisional responses are received or if
re-transmission of certain messages occur.  But for the sake of
simplicity, let's stick with 16 messages.)

   SIP server                         SIP server
   producing CLF                      producing CLF
INV    |                            INV    |
------>|                            ------>|
100    |INV                         100    |
<------|------>                     <------|INV
        |100                                |------>
        |<------                            |100
        |180                                |<------
        |<------                            |INV
    180 |                                   |------->
<------|                                   |100
        |200                                |<------
        |<------                            |
    200 |                                   |180
<------|                                   |<------
                                     180    |
Figure 1: Simple SIP call           <------|180
                                     180    |<------
                                     <------|
                                            |200 OK INV
                                            |<------
                                 200 OK INV |
                                     <------|
                                            |CANCEL
                                            |------>
                                            |200 OK CANCEL
                                            |<------
                                            |487 INV
                                            |<------
                                            |ACK
                                            |------>
                                     Figure 2: Complex SIP call

Each message here will generate a CLF record.  In steady
state, a server generating SIP CLF records will thus generate
at least 2,400 records per second and at most 16,000 records
per second with the simplifying assumptions I stated (i.e.,
no Record-Route and only fork to two endpoints.)  If the SIP server
Record-Routed or forked to more than two endpoints, the
message load will increase (on the average, to each endpoint
it forked, it will have to handle 6-7 extra messages if we
consider that n-1 endpoints will have a CANCEL generated
towards them.  So three forked endpoints will require the
SIP server to generate about 23,000 CLF records per second.)

Given your data on rsyslog of 50,000+ messages per second on
a single receiving machine, it seems that for rsyslog daemon,
these messages do not pose a problem since it is designed
to handle messages from many applications and hosts.

But is sending 23,000 messages per second a good use of
syslog for an application?  Especially because in your
email you also noted that:

 > As of my experience, it is quite common to transport Apache clf
 > "files" via syslog. There are two was to do this: one is to make
 > apache log in real-time to the syslogd, usually with the help of
 > logger or a similar system tool. This requires proper engineering and
 > can potentially cause notable performance degradation. As I know from
 > the rsyslog user base, these problems can be solved and this mode is
 > used in practice, even for high-performance sites.

I am afraid of forcing SIP vendors to transport SIP CLF via
syslog will just mean that they will choose not to do it if it
will cause notable performance degradations.

 > The other approach is to let apache write to text files and then
 > transfer these text files in near-realtime to a syslogd. That is, a
 > process grabs data as it is appended to the text log. In rsyslog, the
 > omfile module has specifically been written for that use case and, if
 > I remember correctly, the root cause for its implementation was
 > Apache clf transfer.

The above approach is something that is attractive more because
of the fact that -- correct me if I am wrong -- syslog records
have to be ASCII, right?  If so, then one of the requirements
we have been working with is that these records should be
amenable to fast searches (hence the indexed-ASCII [5] approach that
is being discussed as a possible candidate for SIP CLF).  So,
if we go the syslog route, then these records would have to be
changed from ASCII to some indexed representation for faster
searches somewhere after they have been processed by the syslog
daemon.

Alternatively, one could generate SIP CLF using indexed-ASCII
format and have loggers read it and inject syslog records but
only for the most important ones (instead for *all* SIP messages.)
I find the above option more attractive since the implementation
overhead for a specific SIP server is less -- just write the
CLF to a disk file.

And finally, since SIP runs on all kinds of devices and environments,
do you know how pervasive the syslog library is?  Would a developer
find it in BREW operating environment or Android?  You noted log4j
in your email as well, and some Googling seems to show that
Android supports log4j more than syslog (but I could be wrong.)
In any case one of the attractive reasons to stay with disk-based
CLF is precisely that it allows an SIP server to only depend on the
file I/O primitives, which are guaranteed to be provided on all
platforms.

[1] Gurbani, V.K., Jagadeesan, L.J., and Mendiratta, V.,
  "Characterizing Session Initiation Protocol (SIP) Network
  Performance and Reliability," Springer Verlag Lecture Notes
  in Computer Science (LNCS), Vol. 3694, pp. 196-211, October 2005.
[2] Hilt, V. and I. Widjaja, "Controlling Overload in Networks of
  SIP Servers", IEEE International Conference on Network Protocols
  (ICNP'08), Orlando, Florida, October 2008.
[3] Noel, E. and C. Johnson, "Initial Simulation Results That
  Analyze SIP Based VoIP Networks Under Overload",
  International Teletraffic Congress (ITC'07), Ottawa, Canada,
  June 2007.
[4] Shen, C., Schulzrinne, H., and E. Nahum, "Session Initiation
  Protocol (SIP) Server Overload Control: Design and Evaluation,
  Principles", Systems and Applications of IP Telecommunications
  (IPTComm'08), Heidelberg, Germany, July 2008.
[5] Roach, A., "Binary Syntax for SIP Common Log Format",
  IETF Internet-Draft, work in progress, 2009
  http://tools.ietf.org/html/draft-roach-sipping-clf-syntax-01

Thanks,

- vijay
-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg@{alcatel-lucent.com,bell-labs.com,acm.org}
Web:   http://ect.bell-labs.com/who/vkg/