Re: [sip-clf] A syslog approach to sip logging

"Rainer Gerhards" <rgerhards@hq.adiscon.com> Wed, 03 February 2010 16:34 UTC

Hi all,

I just subscribed to the sip-clf mailing list. I am the author of rsyslog,
one of the major open source syslog daemons, and the designer of a number
of syslog-based Windows tools. I have also worked on the IETF syslog
standardization effort.

David has made me aware of the current discussion and I am currently working
through the mailing list. Two things that I would like to comment on are
transmission of Apache logs via syslog and syslog performance.

In my experience, it is quite common to transport Apache clf "files" via
syslog. There are two ways to do this. One is to make Apache log in real time
to the syslogd, usually with the help of logger or a similar system tool.
This requires proper engineering and can potentially cause notable
performance degradation. As I know from the rsyslog user base, these problems
can be solved and this mode is used in practice, even for high-performance
sites.
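
To make the real-time mode a bit more concrete, here is a minimal sketch of
a helper that Apache could feed through its piped-log feature (a CustomLog
target starting with "|"). It is only an illustration under assumptions, not
what logger or rsyslog actually do: the function names, the "apache" tag, the
local6 facility and the UDP target 127.0.0.1:514 are all hypothetical choices,
and the frame is a bare "<PRI>tag: message" without timestamp or hostname,
which the receiving syslogd would be expected to fill in.

```python
import socket
import sys

FACILITY_LOCAL6 = 22  # assumption: any local facility would work
SEVERITY_INFO = 6


def to_syslog_packet(line, tag="apache",
                     facility=FACILITY_LOCAL6, severity=SEVERITY_INFO):
    """Wrap one access-log line in a minimal BSD-syslog style frame."""
    pri = facility * 8 + severity          # PRI value as used by syslog
    text = line.rstrip("\r\n")             # one log line per datagram
    return f"<{pri}>{tag}: {text}".encode("utf-8", "replace")


def forward(stream, address=("127.0.0.1", 514)):
    """Send every non-empty line of `stream` as one UDP syslog datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    try:
        for line in stream:
            if line.strip():
                sock.sendto(to_syslog_packet(line), address)
                sent += 1
    finally:
        sock.close()
    return sent


if __name__ == "__main__":
    # Apache writes each access-log line to this process's stdin.
    forward(sys.stdin)
```

A blocking helper like this sits directly in Apache's logging path, which is
one place where the performance degradation mentioned above can come from.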

The other approach is to let Apache write to text files and then transfer
these text files in near real time to a syslogd. That is, a process grabs data
as it is appended to the text log. In rsyslog, the imfile module has
specifically been written for that use case and, if I remember correctly, the
original motivation for its implementation was Apache clf transfer.
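
The "grab data as it is appended" idea can be sketched roughly as follows.
This is a simplified polling loop for illustration only and not how rsyslog
implements it; read_appended, follow, the polling interval and the handling
of truncated files are assumptions made for this example.

```python
import time


def read_appended(path, offset):
    """Return (complete new lines, new offset) for data appended past offset."""
    with open(path, "rb") as f:
        f.seek(0, 2)                 # jump to the end to learn the file size
        if f.tell() < offset:        # file shrank: assume truncation/rotation
            offset = 0
        f.seek(offset)
        data = f.read()
    # Only consume complete lines; a partial last line waits for the next poll.
    end = data.rfind(b"\n") + 1
    raw_lines = data[:end].split(b"\n")[:-1]
    lines = [raw.decode("utf-8", "replace") for raw in raw_lines]
    return lines, offset + end


def follow(path, handle_line, interval=0.5, start_offset=0):
    """Poll `path` forever and pass each newly appended line to handle_line."""
    offset = start_offset
    while True:
        lines, offset = read_appended(path, offset)
        for line in lines:
            handle_line(line)
        if not lines:
            time.sleep(interval)     # nothing new yet, wait before polling again
```

Here handle_line would be whatever hands the line to the syslogd. A real
implementation would also need to persist the offset across restarts and
cope with log rotation that replaces the file.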

It may also be worth noting that in the Apache scenario log4j syslog logging
seems to go together with clf, but I do not know whether this is true
for the majority of cases.

On syslog performance: I have read that the expected message volume was
considered problematic for the syslog use case. It may be worth noting that
high-volume sites log data via syslog. This may be clf, but the larger ISPs or
financial institutions (or other service providers) already have lots of log
data that is to be processed. For rsyslog, I know of deployments that average
50,000+ messages per second on a single receiving machine. In a lab setup, a
single instance of rsyslog can currently process up to 250,000 msgs per
second, with these rates going up. The Windows products I am responsible for
reach similar or higher message rates. Of course, these numbers depend much on
the length of the message, the parsing overhead and what the final destination
does with the messages (it makes a big difference whether they are written to
a flat ascii file or a database, and complex filtering also reduces the
throughput). Note that there is large demand for even faster syslog
implementations, which leads me to believe that transmission of mass data via
syslog is often desired.

I am not sure what message rates are expected for SIP and where the actual
problem with syslog was envisioned. If you have some more information on that,
it would definitely help me understand the situation at large.

Rainer Gerhards