Re: [sip-clf] A very rough overview over modern syslog and syslog usecases

"Spencer Dawkins" <spencer@wonderhamster.org> Wed, 03 February 2010 20:38 UTC

Hi, Rainer,

Thank you very much for this background. It's helpful.

Spencer


> I thought it would be useful to provide a very rough overview of "modern
> syslog", specifically how the pieces of the framework are meant to work
> together. This is written as a quick overview, not for 100% technical
> correctness.
>
> The syslog framework has three layers: the bottom layer is transport
> oriented; the middle layer describes the "user agents", authentication and
> authorization, message forwarding, etc.; and the top layer describes
> extensible mechanisms to represent semantic objects. By semantic object, I
> mean the actual information to convey, e.g. a web request, a routing
> decision, and probably a SIP call setup (the SIP case being guesswork).
>
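
To make the layering concrete, here is a minimal Python sketch of how a
semantic object might ride in the structured-data part of an RFC 5424
message. The SD-ID "sipcall@32473" and its parameter names are hypothetical
(32473 is the enterprise number reserved for documentation), as are the host
and application names; none of this is a sip-clf proposal.

```python
from datetime import datetime, timezone

def sd_escape(value: str) -> str:
    # RFC 5424 requires '\', '"' and ']' to be backslash-escaped
    # inside a structured-data parameter value.
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("]", "\\]"))

def sd_element(sd_id: str, params: dict) -> str:
    # One SD-ELEMENT: [SD-ID name="value" ...]
    body = "".join(f' {k}="{sd_escape(str(v))}"' for k, v in params.items())
    return f"[{sd_id}{body}]"

def format_message(facility, severity, when, host, app, procid, msgid,
                   elements, msg=""):
    pri = facility * 8 + severity          # PRI as defined by RFC 5424
    ts = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f") + "Z"
    header = f"<{pri}>1 {ts} {host} {app} {procid} {msgid}"
    sd = "".join(elements) or "-"          # "-" is the nil value
    return f"{header} {sd}" + (f" {msg}" if msg else "")

# Hypothetical semantic object: a SIP call setup seen by one proxy.
when = datetime(2010, 2, 3, 20, 38, 51, 123456, tzinfo=timezone.utc)
call = sd_element("sipcall@32473",
                  {"callId": "a84b4c76e66710", "method": "INVITE"})
line = format_message(16, 6, when, "proxy1.example.com", "sipproxy",
                      "-", "CALLSETUP", [call])
print(line)
```

The transport layer only has to move the resulting line; a receiver that
does not understand the hypothetical "sipcall@32473" element can still
store and forward it unchanged.
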
> Existing syslog implementations already provide a rich infrastructure with
> solutions for many use cases. Most importantly, the middle layer was
> designed with correlation in mind, a problem that is ubiquitous in syslog.
> Syslog tries to solve this by providing the ability to express sequence in
> various ways. A very important tool is the high-precision timestamp, which
> can be enhanced by providing information about the quality of the time
> source for a given system. Syslog also provides the sequence number
> facility, which can be used to express sequence as seen by a single node.
> While there is not yet an RFC for this in syslog, these tools (plus maybe
> some I forgot to mention) provide the ability to track sequence (not
> necessarily real time!) very precisely inside a networked system, e.g. by
> implementing Lamport clocks on top of them.
>
> In my experience, correlation of log records in a heterogeneous
> environment is a very hard task. (Syslog) log analyzers have been trying
> hard for years, but still have lots of shortcomings. Those that focus on a
> smaller subset of the overall log data work best; at least this is my
> impression (I am not so much involved in the analyzer part).
>
> Note that syslog also contains some recommendations on intelligently
> dropping messages if the log volume becomes overwhelming (there are few in
> the RFCs so far, but this is becoming an increasingly important topic in
> practice, at least judging from requests I get).
>
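
For readers who have not met Lamport clocks, a minimal Python sketch of the
idea follows. It assumes each log record can carry the sender's counter
(for instance as a structured-data parameter; how it is carried is an
assumption here, not something syslog defines), and it tracks causal order
only, not wall-clock time.

```python
class LamportClock:
    """Logical clock: orders events causally, independent of real time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event, e.g. emitting a log record.
        self.time += 1
        return self.time

    def receive(self, remote_time):
        # Merge the counter carried by an incoming message, then advance.
        self.time = max(self.time, remote_time) + 1
        return self.time

# Two hypothetical nodes, a and b.
a, b = LamportClock(), LamportClock()
t1 = a.tick()        # a logs an event and forwards a message stamped t1
t2 = b.receive(t1)   # b processes that message
t3 = b.tick()        # b logs its own follow-up event
```

Whatever the wall clocks on a and b say, the stamps satisfy t1 < t2 < t3, so
a correlator can order the three records correctly.
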
> The typical logging problem, from the syslog perspective, is:
>
> (1)  there exist events that need to be logged
> (2)  a single "higher-level" event E may consist of a
>     number of fine-grained lower level events e_i
> (3)  each of the e_i's may be on different
>     systems / proxies
> (4)  each e_i consists of a subset of properties
>     p_j from a set of all possible common properties P
> (5)  in order to gain higher-level knowledge, the
>     high-level event E must be reconstructed from
>     e_i's obtained from *various* sources
> (6)  a transport mechanism must exist to move event
>     e_i records from one system to another, e.g., to
>     a central correlator
> (7)  systems from many different suppliers may be involved,
>     resulting in different syntax and semantics of
>     the higher-level objects
> (8)  there is potentially a massive number of events
> (9)  events potentially need to be stored for
>     an extended period of time
> (10) quick review of at least the current event data
>     (today, past week) is often desired
> (11) there exists a lot of noise data
> (12) the data needs to be fed into backend processes,
>     like billing systems
>
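
Points (2) through (5) in the list above can be sketched in a few lines of
Python. This is only an illustration under strong assumptions: every
low-level record e_i is assumed to carry a shared correlation key (the
"key" field below is hypothetical, e.g. a call identifier), and host
timestamps are assumed to be comparable. Real deployments are messier on
both counts.

```python
from collections import defaultdict
from typing import NamedTuple

class Record(NamedTuple):
    host: str         # system that emitted the record
    seq: int          # per-host sequence number
    timestamp: float  # seconds, as reported by the host
    key: str          # hypothetical correlation key
    props: dict       # the subset of properties p_j in this record

def reconstruct(records):
    """Group low-level records e_i into high-level events E by key."""
    events = defaultdict(list)
    for r in records:
        events[r.key].append(r)
    for parts in events.values():
        # Order by time; host and per-host sequence break ties.
        parts.sort(key=lambda r: (r.timestamp, r.host, r.seq))
    return dict(events)

records = [
    Record("proxy2", 7, 10.002, "call-1", {"status": 180}),
    Record("proxy1", 41, 10.001, "call-1", {"method": "INVITE"}),
    Record("proxy1", 42, 10.001, "call-2", {"method": "INVITE"}),
]
events = reconstruct(records)
```

Here events["call-1"] holds the proxy1 record followed by the proxy2 record,
even though they arrived in the opposite order.
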
> Of course, not all of this is present in all installations, but this is the
> general picture that I find again and again. Looking at the sip-clf
> problem statement, and not knowing any details of SIP specifics (!), I had
> the impression that this is another incarnation of the general logging
> problem. So I lean towards the idea that SIP could actually use existing
> infrastructure and simply add a semantic layer on top of it.
>
> I have to admit I have no idea which infrastructure is already implemented
> for a typical SIP operator. Also note that I am not saying syslog has solved
> all of the issues involved. That is definitely not the case. But we have
> been trying hard, for years, to overcome them. In my personal experience,
> the disagreement on the syntax and semantics of information representation,
> and thus log consolidation, is the hardest problem to solve.
>
> I hope this information is useful.
>
> Rainer