Re: [sip-clf] anomaly detectors

Vijay Gurbani <vkg@alcatel-lucent.com> Sun, 26 July 2009 09:07 UTC

Message-ID: <4A6C1D08.9020301@alcatel-lucent.com>
Date: Sun, 26 Jul 2009 04:08:24 -0500
From: Vijay Gurbani <vkg@alcatel-lucent.com>
User-Agent: Thunderbird 2.0.0.19 (X11/20090105)
MIME-Version: 1.0
To: Hadriel Kaplan <HKaplan@acmepacket.com>
References: <4A69DFBB.3010307@alcatel-lucent.com> <E6C2E8958BA59A4FB960963D475F7AC31984654C6C@mail> <4A6A1A29.9010504@alcatel-lucent.com> <E6C2E8958BA59A4FB960963D475F7AC31984654FE0@mail> <4A6A285C.6050007@alcatel-lucent.com> <E6C2E8958BA59A4FB960963D475F7AC31984655059@mail>
In-Reply-To: <E6C2E8958BA59A4FB960963D475F7AC31984655059@mail>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: "sip-clf@ietf.org" <sip-clf@ietf.org>
Subject: Re: [sip-clf] anomaly detectors

Hadriel Kaplan wrote:
> SBC's do anomaly detection both at the parser layer _and_ at higher
> layers, and have been for years.  Some of us just don't call the
> higher layer things "anomaly detection" - because we market it more
> on the actions we take once we detect an anomaly (like reject or log
> or blacklist), or the benefit to the customer (like "Fraud
> Prevention", "Hijack prevention", "DDoS protection", "SPIT
> Protection", etc.).

Do SBCs use learning algorithms for such anomaly detection,
or do policy-based triggers form the foundation of such detection?
I suspect it is the latter.

Can existing SBC-based anomaly detection systems be trained on
the infamous SIP PRACK state machine to detect anomalous behavior?
I suspect not -- though I may be wrong.  I am aware of some
preliminary work on training machines to recognize the temporal
association of SIP messages in a dialog (e.g., BYE is preceded by
an INVITE), but this work is just starting, and it does not yet
support the more complex cases: for example, it is okay to send
an UPDATE or a CANCEL before an INVITE finishes, but one must not
send a SUBSCRIBE in similar scenarios.  For an SBC to do such
analysis, an awful lot of dialog state would be needed; OTOH,
doing such detection at the individual SIP actor level (i.e.,
a proxy receiving such requests from UAs and producing a CLF
record for analysis) may be more manageable.
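
To make the policy-based flavor of such a check concrete, here is a
minimal sketch that replays CLF-like records per Call-ID against a
hand-written rule table.  The record shape (call_id, kind, value),
the state names, and the rule table are all assumptions made up for
illustration; the table covers only the examples mentioned above and
is nowhere near the full SIP state machine (no ACK, no PRACK).

```python
from collections import defaultdict

# Toy rule table built only from the examples in this message.
# It is illustrative and incomplete (no ACK, PRACK, etc.).
ALLOWED = {
    "idle": {"INVITE"},
    "inviting": {"UPDATE", "CANCEL"},
    "established": {"INVITE", "UPDATE", "BYE"},
    "ended": set(),
}


def find_order_anomalies(records):
    """Return (call_id, method, state) for requests the toy rules reject.

    records: iterable of (call_id, kind, value) tuples (hypothetical shape),
    where kind is "req" (value is a method name) or "resp" (value is the
    integer status code of a response to the INVITE).
    """
    state = defaultdict(lambda: "idle")
    anomalies = []
    for call_id, kind, value in records:
        current = state[call_id]
        if kind == "resp":
            # Only a final response (>= 200) to a pending INVITE changes state.
            if current == "inviting" and value >= 200:
                state[call_id] = "established" if value < 300 else "ended"
            continue
        if value not in ALLOWED[current]:
            anomalies.append((call_id, value, current))
            continue
        if value == "INVITE" and current == "idle":
            state[call_id] = "inviting"
        elif value == "BYE":
            state[call_id] = "ended"
    return anomalies
```

A SUBSCRIBE seen while the INVITE is still pending, or a BYE with no
preceding INVITE, would be reported together with the state in which
it arrived.  A learning-based detector would instead have to infer
something like the ALLOWED table from traffic.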

> But anyway, there are centralized "anomaly detectors" that purport to
> analyze "anomalies" too - I don't know what all they mean by that,
> but they've asked us to provide them SIP message feeds (which right
> now is basically everything).  If I was them and looked at a CLF, I'd
> want to get quite a bit more than what we define: Via's, Route's,
> Record-Route's, Refer-To's, P-Asserted/P-Preferred/Remote-Party-ID,
> Path's, Max-Forwards, Content-Type's, History-Info, and Diversion...
> just off the top of my head.

I suspect that what we have been discussing on the mailing list, i.e.,
having a base set of headers and an extensibility model to account
for other headers, is generally a good thing.  Reverting to logging
the whole SIP message seems like throwing our hands up in the air
and saying that we don't really know what we want, so we'll save the
whole kitchen sink just in case.  HTTP CLF has worked without
saving the whole message, and with some judicious thought I think
we can make SIP CLF work as well.
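
As a rough sketch of what "base set plus extensibility" could look
like in practice, the snippet below builds a log record from an
already-parsed header dictionary.  BASE_HEADERS, ClfRecord, and
make_record are hypothetical names, and the four base headers are
placeholders, not the set the list has agreed on.  Header names are
matched exactly here, which is a simplification.

```python
from dataclasses import dataclass, field

# Placeholder base set; the actual mandatory fields are still under discussion.
BASE_HEADERS = ("From", "To", "Call-ID", "CSeq")


@dataclass
class ClfRecord:
    base: dict                                       # always-present fields
    extensions: dict = field(default_factory=dict)   # opt-in extra headers


def make_record(headers, extra_wanted=()):
    """Build a record from a dict mapping header name -> value.

    Missing base headers are logged as "-", the placeholder HTTP CLF
    uses for absent fields.  Headers outside the base set are kept only
    if the operator listed them in extra_wanted.
    """
    base = {name: headers.get(name, "-") for name in BASE_HEADERS}
    extensions = {name: headers[name] for name in extra_wanted if name in headers}
    return ClfRecord(base, extensions)
```

An analyzer that wants Via or History-Info asks for them through
extra_wanted; everything else in the message is simply not logged.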

> And really, as an "anomaly detector" I would want to filter which
> headers I *don't* care-about, not which I do.

Possibly; however, for training and subsequent use of an anomaly
detection system, you would need the headers that you care about.
For example, if two colluding UAs leak information by subscribing
to event package X but sending notifications for event package Y,
I'd want to log the appropriate headers that I care about.
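
The event package case can be sketched as follows, assuming each
log record carries the Call-ID, the method, and the Event header
value.  The dictionary keys and function names are hypothetical, and
matching dialogs on Call-ID alone is a simplification.

```python
def package_of(event_value):
    """Strip parameters such as ';id=7' from an Event header value."""
    return event_value.split(";")[0].strip()


def find_event_mismatches(records):
    """Return NOTIFY records whose event package was never subscribed to
    on the same Call-ID.

    records: iterable of dicts with 'call_id', 'method', and 'event' keys
    (a hypothetical log shape).  NOTIFYs on a Call-ID with no logged
    SUBSCRIBE are ignored rather than flagged.
    """
    subscribed = {}
    mismatches = []
    for rec in records:
        package = package_of(rec["event"])
        if rec["method"] == "SUBSCRIBE":
            subscribed.setdefault(rec["call_id"], set()).add(package)
        elif rec["method"] == "NOTIFY":
            known = subscribed.get(rec["call_id"])
            if known is not None and package not in known:
                mismatches.append(rec)
    return mismatches
```

This check only works if the Event header is among the logged
fields, which is the point: the detector dictates which headers the
log must carry.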

Thanks,

- vijay
-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg@{alcatel-lucent.com,bell-labs.com,acm.org}
WWW:   http://ect.bell-labs.com/who/vkg