Re: [Syslog] draft-cloud-log-00 / CEE - why not IPFIX?

"Rainer Gerhards" <rgerhards@hq.adiscon.com> Wed, 16 February 2011 11:33 UTC

From: "Rainer Gerhards" <rgerhards@hq.adiscon.com>
To: "Jeroen Massar" <jeroen@unfix.org>
Cc: Sam Johnston <sj@google.com>, cee@mitre.org, syslog@ietf.org
Subject: Re: [Syslog] draft-cloud-log-00 / CEE - why not IPFIX?

Well, I really don't want to restart that discussion in this context again,
but let me note that what you are doing with your converter is very useful.
It actually normalizes data into a canonical format. This is something that
CEE intends to do, but in a protocol-agnostic format (and I do similar things
in my projects as well). The general utility is unquestioned. The question is
whether such an effort must be restricted to a single protocol. My PoV is
that this is counter-productive.
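
As a rough illustration of normalizing a log line into a canonical,
protocol-independent record: the sketch below is not taken from CEE or from
the converter discussed here. The regular expression, the field names and the
sample line are all assumptions made for the example, using the Apache Common
Log Format as input.

```python
import json
import re

# Assumed pattern for an Apache Common Log Format line (illustrative only).
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def normalize_clf(line):
    """Parse one CLF line into a plain dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert numeric fields so every consumer sees the same types.
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical sample line, using a documentation-range IP address.
sample = ('192.0.2.1 - alice [16/Feb/2011:12:34:06 +0100] '
          '"GET /index.html HTTP/1.1" 200 2326')
record = normalize_clf(sample)

# The record itself is not tied to any transport; JSON is just one rendering.
print(json.dumps(record, sort_keys=True))
```

The dict is the canonical form here. Whether it is later carried as JSON, as
syslog structured data or as a binary encoding is a separate decision, which
is the protocol-agnostic point made above.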

You definitely have a point in that IPFIX may be superior to syslog in many
regards. I do not intend to argue against this. But often a simpler solution
is able to draw more attention, and thus deployments, than a (potentially or
actually) technically superior one (shouldn't we all be using the OSI stack
by now, just as one example...).

I don't think it is useful to include IPFIX in syslog. But it may be an
option that IPFIX makes syslog obsolete. I think you should take that latter
route.

But as I said -- I do not intend to spawn another iteration of this lengthy
discussion. It has occurred sooo often in the past years.

Rainer

PS: You are right about one more thing: "ASCII" is the wrong term. Most folks
(including me) seem to be sloppy and say ASCII when they actually mean
printable text data, which of course includes UTF-8.
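
A small sketch of why the distinction matters in practice: the helper name
and the sample message below are made up for illustration. Text that is
perfectly printable may still fail a strict ASCII check while encoding
cleanly as UTF-8.

```python
def is_ascii(text):
    """Return True if the text can be encoded as 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical log message containing two non-ASCII characters.
message = "Grüße from the collector"

print(is_ascii(message))                     # strict ASCII check
print(len(message))                          # length in characters
print(len(message.encode("utf-8")))          # length in bytes on the wire
```

A parser that assumes one byte per character would mishandle such a message,
which is why "printable text, UTF-8 encoded" is the more accurate description.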


> -----Original Message-----
> From: Jeroen Massar [mailto:jeroen@unfix.org]
> Sent: Wednesday, February 16, 2011 11:57 AM
> To: Rainer Gerhards
> Cc: Heinbockel, Bill; Sam Johnston; cee@mitre.org; syslog@ietf.org
> Subject: Re: [Syslog] draft-cloud-log-00 / CEE - why not IPFIX?
> 
> On 2011-02-16 11:39, Rainer Gerhards wrote:
> > The SIP CLF WG has just recently rejected IPFIX for it being binary and
> > chosen indexed ASCII instead for their format. Their reasoning (after a
> > long struggle) is probably educating:
> >
> > http://www.ietf.org/mail-archive/web/sip-clf/current/msg00364.html
> >
> > I don't think that IPFIX is a good solution *in the syslog context*. It
> > is very far from what people expect. Other than that, I'd probably need
> > to re-iterate the arguments made on the SIP CLF mailing list, so it
> > probably is better to refer to their archive ;)
> 
> Why would they expect anything about the *DATA* format of a protocol?
> 
> Note that the whole point that IPFIX (or any other structured data
> format for that matter) 'solves' is that one has to make a parser for
> every single log file format out there. Doing this at the meter tends to
> be cheaper due to the ability to distribute that than at the aggregated
> part. (then again sFlow as an example does it exactly the other way
> around, just pushing packets and letting the collector do the hard
> parsing part, but we are talking about sampled flows here thus you will
> miss out on events which is not a decision you can make at the meter if
> you are looking at say breaking attempts or failures ;)
> 
> I think the pro-ascii versus binary argument comes effectively primarily
> from organizations who process large amounts of variable-string ascii
> data already and who do not really care about a few extra bits or a bit
> more overhead in processing data as they have large global clusters of
> hosts already doing that work. Their programming languages tend to be of
> a scripted-style too which tend to make it harder / less efficient to
> work on binary data but work great with ascii-alike data.
> 
> Nevertheless, I've a generic logline parser which simply converts syslog
> and other log file formats into IPFIX. The problem with the whole ascii
> thing though is that one has to teach the parser what fields are what,
> and in the case of for instance the Apache CLF teach it the weird
> delimiters that are present. These are all special cases, something that
> one would really like to avoid if one wants to keep it speedy.
> 
> My model partially solves that as I only have to do the special casing
> at the edge, where the log file gets converted into IPFIX. As those are
> considered 'meters' I just deploy more and more of those, while I can
> keep the collector side generally either a single box and otherwise
> easily distribute the data amongst them.
> 
> And of course, the conversion goes the other way too, it can spit out
> reformatted 'ascii' again if needed.
> 
> Greets,
>  Jeroen
> 
>  (who finds it funny to see ASCII btw, as there is this thing called
>   UTF-8 that makes it possible to express things in all languages of
>   the world. I guess those people have to live with punycode etc...)