Re: [sip-clf] anomaly detectors

Hadriel Kaplan <> Sun, 26 July 2009 12:24 UTC

From: Hadriel Kaplan <>
To: Vijay Gurbani <>
Date: Sun, 26 Jul 2009 08:24:20 -0400
List-Id: SIP Common Log File format discussion list <>

> -----Original Message-----
> From: Vijay Gurbani []
> Sent: Sunday, July 26, 2009 5:08 AM
> Do SBCs use learning algorithms to do such anomaly detection
> or do policy based triggers form the foundation of such detection?
> I suspect it is the latter.

As far as I know, many of them still use policy-based triggers in practice - but "learning algorithms" for the parser-layer detection and for some load-based behavior limits have been either claimed or hinted at by some SBC vendors for a while, including the company I work for. (What each of us does in that space in practice is not public info, for competitive reasons.)
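
To make the distinction concrete, here is a minimal sketch in Python of a policy-based trigger next to a very simple "learned" load limit. It is purely illustrative and does not describe any vendor's product: the class names, the per-interval request count as input, and the parameters (alpha, factor, warmup) are all assumptions made up for this example.

```python
class StaticTrigger:
    """Policy-based: an operator configures a fixed limit (hypothetical)."""

    def __init__(self, max_per_interval: int):
        self.max_per_interval = max_per_interval

    def is_anomalous(self, count: int) -> bool:
        return count > self.max_per_interval


class LearnedTrigger:
    """Learns a baseline as an exponential moving average (hypothetical).

    An interval is flagged when its count exceeds `factor` times the
    learned mean. Flagged intervals are not fed back into the mean.
    """

    def __init__(self, alpha: float = 0.1, factor: float = 3.0, warmup: int = 5):
        self.alpha = alpha      # weight given to the newest interval
        self.factor = factor    # how far above the baseline counts as anomalous
        self.warmup = warmup    # intervals to observe before flagging anything
        self.mean = 0.0
        self.seen = 0

    def is_anomalous(self, count: int) -> bool:
        if self.seen >= self.warmup and count > self.factor * self.mean:
            return True
        self.seen += 1
        if self.seen == 1:
            self.mean = float(count)
        else:
            self.mean = (1 - self.alpha) * self.mean + self.alpha * count
        return False
```

The static trigger only knows what it was told; the learned one adapts to each source's normal load, at the cost of needing a warm-up period and being harder to reason about.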

> Can existing SBC-based anomaly detection system be trained on
> the infamous SIP PRACK state machine to detect anomalous behavior?
> I suspect not -- though I may be wrong.  I am aware of some
> preliminary work on training machines to recognize the temporal
> association of SIP messages in a dialog (i.e., BYE is preceded by
> an INVITE); but so far this work is just starting and the more
> complex cases like it is okay to send an UPDATE and CANCEL
> before an INVITE finishes, but one must not send a SUBSCRIBE
> in similar scenarios are not supported.  For an SBC to do such
> analysis, an awful lot of dialog state would be needed...

SBCs can and usually do maintain full dialog state for all dialogs (as a B2BUA), and enforce protocol compliance for messages and their state machines, with knobs for various controls.  That was one of the basic tenets from the beginning.  Of course whether the SBCs do it correctly, or whether they should be doing it to begin with, is a different question. ;)  It's always been a fine line, though, between enforcing protocol rules for security and letting sessions succeed when they can, or even fixing them so they do succeed.  One man's anomalous signaling is another man's paying customer.
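
As a rough illustration of per-dialog state with an enforcement knob, here is a minimal Python sketch. It is an assumption-laden toy, not a description of how any SBC works: the state names, the ALLOWED table, and the `strict` flag are hypothetical, and the table is deliberately incomplete. It only encodes the examples from the quoted message (UPDATE and CANCEL accepted while an INVITE is pending, SUBSCRIBE not) and is not a normative reading of the SIP specs.

```python
from dataclasses import dataclass, field

# Hypothetical, deliberately incomplete rule table:
# which requests are accepted in each dialog state.
ALLOWED = {
    "idle": {"INVITE"},
    "inviting": {"UPDATE", "CANCEL"},
    "confirmed": {"BYE", "UPDATE"},
    "terminated": set(),
}


@dataclass
class Dialog:
    state: str = "idle"
    anomalies: list = field(default_factory=list)

    def on_request(self, method: str, strict: bool = True) -> bool:
        """Return True if the request should be forwarded.

        With strict=False the anomaly is recorded but the request is
        still let through (the "paying customer" setting of the knob).
        """
        if method not in ALLOWED[self.state]:
            self.anomalies.append((self.state, method))
            return not strict
        if method == "INVITE":
            self.state = "inviting"
        elif method in ("CANCEL", "BYE"):
            # Simplification: ignores the CANCEL/200 race and the
            # final response that actually ends the transaction.
            self.state = "terminated"
        return True

    def on_final_response(self, status: int) -> None:
        """Apply the final response to a pending INVITE."""
        if self.state == "inviting":
            self.state = "confirmed" if 200 <= status < 300 else "terminated"
```

For example, a SUBSCRIBE arriving while the dialog is in "inviting" is rejected when strict is True, and recorded but forwarded when strict is False.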

> I suspect that what we have been discussing in the mailing list, i.e.,
> having a base set of headers and an extensibility model to account
> for other headers is generally a good thing.  


> Reverting to logging
> the whole SIP message seems to be throwing our hands up in the air
> and saying that we don't really know what we want so we'll save the
> whole kitchen sink just in case.  HTTP CLF has worked without
> saving the whole message, and with some judicious thought I think
> we can make SIP CLF work as well.

It's not that I don't think CLF has value - it's the central claim that it's for "anomaly detectors" that makes me wince.  Obviously it can be used for some anomaly-detection purposes - it's just that we shouldn't be saying that's its main purpose/use-case, because it's not good enough for it, imho.

It's sorta like the sip-ipfix draft, which has "billing" as one of its use-cases.  While it may be technically possible to do billing for very trivial cases using a CLF formatted as ipfix, it is missing so much of the metadata required for billing in the real world that the claim is distracting, because it's so clearly not usable for that.