[sip-clf] A very rough overview over modern syslog and syslog use cases

"Rainer Gerhards" <rgerhards@hq.adiscon.com> Wed, 03 February 2010 17:32 UTC

I thought it would be useful to provide a very rough overview of "modern
syslog", specifically how the pieces of the framework are meant to work
together. This is written as a quick overview, not for 100% technical
correctness.

The syslog framework has three layers: the bottom layer is transport
oriented; the middle layer describes the "user agents", authentication and
authorization, message forwarding, etc.; and the top layer describes
extensible mechanisms to represent semantic objects. By semantic object, I
mean the actual information to convey, e.g. a web request, a routing decision
and probably a SIP call setup (the SIP case being guesswork).
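
To make the top layer a bit more concrete, here is a minimal Python sketch
that renders one semantic object as a structured-data element inside a
syslog message. It assumes that "modern syslog" refers to the RFC 5424
message format. The SD-ID "sipcall@32473", its parameter names, and the host
and application names are hypothetical and chosen for illustration only
(32473 is the enterprise number reserved for documentation examples).

```python
from datetime import datetime, timezone


def escape_sd_value(value: str) -> str:
    """Escape the characters RFC 5424 requires escaping in a parameter value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")


def format_sd_element(sd_id: str, params: dict) -> str:
    """Render one structured-data element: [sd_id name="value" ...]."""
    rendered = "".join(
        f' {name}="{escape_sd_value(str(value))}"' for name, value in params.items()
    )
    return f"[{sd_id}{rendered}]"


def format_message(facility, severity, timestamp, hostname,
                   app_name, procid, msgid, sd_element, text):
    """Assemble header, structured data and free-text message into one line."""
    pri = facility * 8 + severity  # priority value as defined by syslog
    ts = timestamp.isoformat(timespec="microseconds")
    return (f"<{pri}>1 {ts} {hostname} {app_name} {procid} {msgid} "
            f"{sd_element} {text}")


# Hypothetical semantic object: a SIP call setup seen by one proxy.
call_setup = format_sd_element(
    "sipcall@32473", {"callId": "abc]1", "method": "INVITE"}
)
line = format_message(
    16, 6, datetime(2010, 2, 3, 17, 32, 48, 123, tzinfo=timezone.utc),
    "proxy1.example.com", "sipproxy", "-", "CALLSETUP", call_setup, "call setup",
)
print(line)
```

The transport layer would then carry this line unchanged; only the
structured-data element is specific to the (assumed) SIP semantic object.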

Existing syslog implementations already provide a rich infrastructure with
solutions for many use cases. Most importantly, the middle layer was designed
with correlation in mind - a problem that is ubiquitous in syslog. Syslog
tries to solve this by providing the ability to express sequence in various
ways. A very important tool is the high-precision timestamp, which can be
enhanced by providing information about the quality of the time source for a
given system. Also, syslog provides the sequence number facility, which can
be used to provide sequence as seen by a single node. While there is not yet
an RFC for this in syslog, these tools (plus maybe some I forgot to mention)
provide the ability to very precisely track sequence (not necessarily real
time!) inside a networked system, e.g. by implementing Lamport clocks on top
of them. In my experience, correlation of log records in a heterogeneous
environment is a very hard task. Syslog log analyzers have been trying hard
for years, but still have lots of shortcomings. Those that focus on a smaller
subset of the overall log data work best - at least this is my impression (I
am not so much involved in the analyzer part). Note that syslog also contains
some recommendations on intelligently dropping messages if the log volume
becomes overwhelming (there is little on this in the RFCs yet, but it is
becoming an increasingly important topic in practice, at least judging from
the requests I get).
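
As a rough illustration of the Lamport clock idea mentioned above, the
following Python sketch shows the textbook rule: a node increments its
counter for every local event, and on receiving a message it jumps past the
sender's stamp. This is not something the syslog RFCs define; the class and
node names are hypothetical, and how the stamp would actually travel between
nodes (for example next to a per-node sequence number) is left open.

```python
class LamportClock:
    """Logical clock giving an ordering consistent with causality."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Stamp a local event, e.g. a log record being written."""
        self.time += 1
        return self.time

    def send(self):
        """Stamp an outgoing message; the stamp travels with the message."""
        return self.tick()

    def receive(self, remote_time):
        """Merge the sender's stamp so the receive is ordered after the send."""
        self.time = max(self.time, remote_time) + 1
        return self.time


# Two hypothetical nodes exchanging one message.
proxy_a = LamportClock()
proxy_b = LamportClock()

first_log = proxy_a.tick()              # local event on A
sent_stamp = proxy_a.send()             # A forwards something to B
received = proxy_b.receive(sent_stamp)  # B's clock jumps past A's stamp
second_log = proxy_b.tick()             # local event on B

print(first_log, sent_stamp, received, second_log)  # prints: 1 2 3 4
```

The resulting stamps give an order that respects cause and effect across
nodes, which is why they express sequence but not necessarily real time.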

The typical logging problem, from the syslog perspective, is:

(1)  there exist events that need to be logged
(2)  a single "higher-level" event E may consist of a 
     number of fine-grained lower level events e_i
(3)  each of the e_i's may be on different
     systems / proxies
(4)  each e_i consists of a subset of properties
     p_j from a set of all possible common properties P
(5)  in order to gain higher-level knowledge, the
     high-level event E must be reconstructed from
     e_i's obtained from *various* sources
(6)  a transport mechanism must exist to move event
     e_i records from one system to another, e.g., to
     a central correlator
(7)  systems from many different suppliers may be involved,
     resulting in different syntax and semantics of
     the higher-level objects
(8)  there is potentially a massive number of events
(9)  events potentially need to be stored for
     an extended period of time
(10) quick review of at least the current event data
     (today, past week) is often desired
(11) there exists a lot of noise data
(12) the data needs to be fed into backend processes,
     like billing systems
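
Items (2) to (5) can be sketched in a few lines of Python. This is only a
minimal sketch under a strong assumption: every low-level event e_i carries
a shared correlation key (for instance a call identifier) that all sources
agree on, which is exactly what item (7) says cannot be taken for granted.
All names, fields and sample values here are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LowLevelEvent:
    """One e_i: where it came from, when, and its subset of properties p_j."""
    source: str           # system or proxy that emitted the record
    sequence: int         # per-source sequence number
    timestamp: float      # seconds, as reported by the source
    correlation_key: str  # assumed shared identifier of the high-level event
    properties: dict = field(default_factory=dict)


def reconstruct(events):
    """Group e_i records by key and order them to rebuild each event E."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.correlation_key].append(event)
    for members in grouped.values():
        # Timestamp first; source and sequence only break ties.
        members.sort(key=lambda e: (e.timestamp, e.source, e.sequence))
    return dict(grouped)


# Records as they might arrive at a central correlator, out of order.
received = [
    LowLevelEvent("proxy-b", 7, 10.20, "call-1", {"status": "forwarded"}),
    LowLevelEvent("proxy-a", 3, 10.05, "call-1", {"method": "INVITE"}),
    LowLevelEvent("proxy-a", 4, 11.00, "call-2", {"method": "INVITE"}),
]

high_level = reconstruct(received)
for key, members in high_level.items():
    print(key, [(e.source, e.sequence) for e in members])
```

In practice the ordering step is the hard part, since timestamps from
different systems are only comparable to the extent their time sources are.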

Of course, not all of this is present in all installations, but this is the
general picture that I find over and over again. Looking at the sip-clf
problem statement, and not knowing any details of SIP specifics (!), I had
the impression that this is another incarnation of the general logging
problem. So I lean towards the idea that SIP could actually use existing
infrastructure and simply add a semantic layer on top of it.

I have to admit I have no idea which infrastructure is already implemented
at a typical SIP operator. Also note that I am not saying syslog has solved
all of the issues involved. That is definitely not the case. But we have been
trying hard, for years, to overcome them. In my personal experience,
disagreement on the syntax and semantics of information representation, and
thus log consolidation, is the hardest problem to solve.

I hope this information is useful.

Rainer