Re: [sip-clf] A syslog approach to sip logging

Cullen Jennings <> Wed, 03 February 2010 23:38 UTC

From: Cullen Jennings <>
Date: Wed, 3 Feb 2010 16:39:25 -0700
To: David B Harrington <>
Cc: 'SIP-CLF Mailing List' <>

Two requirements that I suspect you will find fairly universal for transporting SIP log-like information:

1) the transport is reliable

2) we can include complete SIP messages. These can get very large. (Magnus posted a 40k SDP to the mmusic list a while back; don't even ask how large MESSAGE messages get in the wild.)

Can you say a bit more about how to use syslog to achieve these?

On Feb 2, 2010, at 11:53 AM, David B Harrington wrote:

> Hi,
> Operators often use syslog to carry Apache CLF log data.  Syslog, in
> practice, is primarily used for tunneling Apache CLF format. This
> seems to be attractive for operators, because of already existing
> infrastructure and syslog knowledge.
> I suggest it makes sense to develop a sipclf format that can be
> carried within an IETF syslog message. In general, that means
> information in an ascii format dumped into a local file that can be
> parsed with tools like grep, and that can later be secured, filtered,
> transported, aggregated, and correlated using existing infrastructure.
> Multiple use cases raised by contributors in this WG lead to different
> requirements. Some want to pay attention to the data dumped by one
> server; some want to follow traffic flows through the network; some
> want to filter on standardized fields; some want to aggregate and
> correlate log information. 
> It is not enough to figure out how to dump data on a single system;
> that data will need to be compatible with infrastructure used to
> provide secure transport, filtering, correlation, etc. Operators
> already have existing infrastructures designed for long-term archiving
> of (potentially enormous) logging information, and the goal to
> correlate log records, including data-mining.
> Syslog is already widely deployed, is well understood by operators,
> and the IETF syslog WG has standardized many aspects of security and
> transport, such as (D)TLS-secured transport, support for large
> messages, optional digitally signed logging for law enforcement and
> for message stream integrity checking, etc. 
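
On the transport points above (reliability, large messages), here is a minimal sketch, not taken from this thread, of the octet-counted framing that the syslog-over-TLS transport mapping (RFC 5425) uses: each message is sent as its length in octets, a space, and then the message itself. The function names `frame` and `unframe` are made up for illustration. Because the receiver reads an explicit length, a 40k SIP message is delimited the same way as a short one, although a receiver may still impose its own size limit.

```python
def frame(syslog_msg: str) -> bytes:
    """Prefix a syslog message with its octet count and a space."""
    payload = syslog_msg.encode("utf-8")
    return str(len(payload)).encode("ascii") + b" " + payload


def unframe(buffer: bytes):
    """Split a received byte stream into complete messages.

    Returns (messages, leftover), where leftover holds the bytes of a
    message that has not fully arrived yet. A malformed length prefix
    raises ValueError (this sketch does no recovery).
    """
    messages = []
    while True:
        sep = buffer.find(b" ")
        if sep == -1:
            break  # length prefix not complete yet
        length = int(buffer[:sep])
        start = sep + 1
        end = start + length
        if len(buffer) < end:
            break  # message body not complete yet
        messages.append(buffer[start:end])
        buffer = buffer[end:]
    return messages, buffer
```

A sender would write `frame(line)` to the TLS socket; a receiver would append each read to a buffer, call `unframe`, and keep the leftover for the next read.
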
> IETF syslog standardizes a number of parameters useful for
> correlation, such as facility (specific applications), severity
> classification, timestamp,
> hostname, the name of the application sending the message (often
> syslogd), process ID, and message ID that are in the syslog header in
> ascii format. These were designed to be compatible with ITU logging
> standards, and the ALARM-MIB, to provide easier correlation of events
> across different event reporting mechanisms. Also to improve
> correlation, work has been done to translate syslog messages into SNMP
> traps, and SNMP traps into syslog messages.
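
To make the header fields listed above concrete, here is a rough sketch of assembling an RFC 5424 style line in Python. The helper names (`pri`, `syslog_line`) and the sample values in the usage note are assumptions for illustration only; the field order and the PRI arithmetic (facility * 8 + severity) follow RFC 5424.

```python
from datetime import datetime, timezone

NIL = "-"  # RFC 5424 NILVALUE, used when a field is absent


def pri(facility: int, severity: int) -> int:
    """PRI value as defined by RFC 5424: facility * 8 + severity."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0..23 and severity 0..7")
    return facility * 8 + severity


def syslog_line(facility, severity, when, hostname, app_name,
                procid, msgid, msg, structured_data=NIL):
    """Build '<PRI>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG'."""
    # Timestamp in UTC with microsecond precision, RFC 3339 style.
    timestamp = when.astimezone(timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%S.%f") + "Z"
    header = (f"<{pri(facility, severity)}>1 {timestamp} {hostname} "
              f"{app_name} {procid} {msgid}")
    return f"{header} {structured_data} {msg}"
```

For example, `syslog_line(16, 6, datetime(2010, 2, 3, 23, 38, 52, tzinfo=timezone.utc), "proxy1.example.com", "sipproxy", "4711", "INVITE", "forwarded request")` yields a line starting with `<134>1 2010-02-03T23:38:52.000000Z proxy1.example.com sipproxy 4711 INVITE`; the host, application name, and message ID are invented here.
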
> The IETF syslog standard also provides structured data elements
> (SDEs). SDEs are designed to supplement the human-readable text with
> application-parseable data fields (also encoded in 7-bit ascii), which
> makes it easier for applications, such as security management systems,
> to extract and correlate the data across vendor implementations, and
> across nodes in a network. 
> The IETF syslog standard already defines some SDEs that would likely
> be useful for the problems sip clf is trying to resolve, like
> precisely tracking sequence inside a networked system: a
> high-precision timestamp, the quality of the time source for a given
> system, time zone accuracy, whether a node is synched with a network
> time source, the origin of a log entry (useful after aggregation and
> relay), the ip address at time of logging, an enterprise identifier,
> the software that generated the message (i.e., the application that
> asked syslogd to send the message), the software version, a sequence
> number to provide sequence as seen by a single node, the sysUpTime of
> a co-resident SNMP system, and the language used within the
> human-readable MSG. The IETF syslog standard also contains some
> recommendations on intelligently dropping messages if the log volume
> becomes overwhelming.
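
As an illustration of the standard SDEs described above, this sketch formats structured data elements using the `timeQuality`, `origin`, and `meta` SD-IDs that RFC 5424 registers. The helper names are assumptions, and the parameter values in the usage note are invented. Per RFC 5424, the characters `"`, `\`, and `]` inside a parameter value are escaped with a backslash.

```python
def escape_param_value(value: str) -> str:
    """Backslash-escape the three characters RFC 5424 requires."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("]", "\\]"))


def sd_element(sd_id: str, params: dict) -> str:
    """Render one SD-ELEMENT, e.g. [origin ip="192.0.2.1"]."""
    parts = [sd_id]
    for name, value in params.items():
        parts.append(f'{name}="{escape_param_value(str(value))}"')
    return "[" + " ".join(parts) + "]"


def structured_data(*elements: str) -> str:
    """Concatenate SD-ELEMENTs; '-' (NILVALUE) when there are none."""
    return "".join(elements) if elements else "-"
```

A node that knows its time zone and is synced to a time source could emit `structured_data(sd_element("timeQuality", {"tzKnown": 1, "isSynced": 1}), sd_element("origin", {"ip": "192.0.2.1", "software": "sipproxy"}), sd_element("meta", {"sequenceId": 42}))`, where the address, software name, and sequence number are placeholders.
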
> The syslog WG deliberately did not standardize the content of the
> human-readable message field. The WG standardized the header, and has
> provided SDEs to standardize certain aspects of the information where
> consensus can be reached. Having both (potentially non-standardized)
> human-readable data, and standardized human-and-machine-readable
> structured data in the same message addresses a wide range of use
> cases, and gives the human more information to work with to interpret
> an event. 
> I propose that the WG reach consensus on specific fields of data that
> would be good to standardize, such as those defined in the problem
> statement doc, and define them as syslog SDEs (which, remember, are
> text fields so they would be greppable and printable and diff-able and
> human-readable). Structured data elements would better support
> application-parsing of the data, such as for training IDS/IPS anomaly
> engines.
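
To show what application parsing of such SDEs might look like, here is a sketch that pulls parameters out of a STRUCTURED-DATA field. The SD-ID `sipclf@32473` and the parameter names `method`, `status`, and `callId` are purely hypothetical (32473 is the enterprise number reserved for documentation examples); no such SD-ID has been defined. The function expects only the STRUCTURED-DATA field, not the whole line, since the free-form MSG may itself contain brackets.

```python
import re

# One SD-ELEMENT: [SD-ID name="value" name="value" ...]
SD_ELEMENT = re.compile(
    r'\[(?P<id>[^\s=\]"]+)'
    r'(?P<params>(?: [^\s=\]"]+="(?:\\.|[^"\\\]])*")*)\]')
SD_PARAM = re.compile(r' (?P<name>[^\s=\]"]+)="(?P<value>(?:\\.|[^"\\\]])*)"')
UNESCAPE = re.compile(r'\\(["\\\]])')


def parse_structured_data(sd_field: str) -> dict:
    """Map each SD-ID to a dict of its parameters, with escapes undone."""
    result = {}
    for element in SD_ELEMENT.finditer(sd_field):
        params = {}
        for param in SD_PARAM.finditer(element.group("params")):
            params[param.group("name")] = UNESCAPE.sub(
                r"\1", param.group("value"))
        result[element.group("id")] = params
    return result
```

Given `[sipclf@32473 method="INVITE" status="180"][origin ip="192.0.2.1"]`, a filter could then select records with `parsed["sipclf@32473"]["method"] == "INVITE"`, while the same text remains greppable as-is.
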
> There are only a few restrictions placed on the content of a MSG field
> in a syslog message. According to the problem-statement document,
> there already exist a number of proprietary sip clf formats. Well, if
> those are in a format that can fit within the MSG field within an IETF
> standard syslog message, then that proprietary data can also be
> carried in the syslog message. Any vendor-specific log-parsing tools
> would continue to work with the extracted MSG field, and they could be
> supplemented by tools that can parse the standardized SDE information.
> The IETF syslog standard also supports vendor-specific SDEs for
> extensibility of structured data.
> In a similar manner to the dual stack approach for IPv4/IPv6
> transition, implementers could choose to drop specific fields from
> their proprietary formats as consensus on useful SDEs is reached, and
> their tools are adapted to use the standardized header and SDE
> information.
> This approach would work with the WG goal to constrain its focus to
> the "useful information" and not need to reinvent solutions such as a
> data modeling language, character sets, delimiters, secure transport,
> log integrity checking, log filtering, log aggregation and correlation
> issues, and so on. 
> I do not see much benefit from designing a whole new ascii file format
> that no existing tools support (except generic text-handling tools
> like grep), and that operators would need to learn in addition to the
> semantics in the information model. 
> I recommend the WG focus on specific and actually existing problem
> cases, and build the semantic "information model" incrementally. Then
> use the existing standard syslog format to provide an example data
> model, which inherits the benefits of an existing widely-deployed
> infrastructure for logging.
> David Harrington