Re: [Nmlrg] Machine Learning in network - solicitation for use cases
Sebastian Abt <sabt@sabt.net> Thu, 17 September 2015 19:01 UTC
From: Sebastian Abt <sabt@sabt.net>
In-Reply-To: <D2130D6D.26ABF%dacheng.zdc@alibaba-inc.com>
Date: Thu, 17 Sep 2015 21:01:11 +0200
To: Dacheng Zhang <dacheng.zdc@alibaba-inc.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/nmlrg/4Wjuf_Bo7olx47fsysPOzYAl31U>
Cc: "nmlrg@irtf.org" <nmlrg@irtf.org>, Sebastian Abt <sabt@sabt.net>,
Sheng Jiang <jiangsheng@huawei.com>
Subject: Re: [Nmlrg] Machine Learning in network - solicitation for use cases
Hi all.

I’m a bit late in the thread, but I’d like to share some insights and
experience I gained when researching ML-based systems and when deploying
them in operational networks...

> On 07.09.2015 at 03:52, Dacheng Zhang <dacheng.zdc@alibaba-inc.com> wrote:
>
> On 15-9-6, 5:01 PM, "Sheng Jiang" <jiangsheng@huawei.com> wrote:
>
>>> Some more detailed introduction.
>>>
>>> DDoS and APT are very active research topics. Application layer DDoS
>>> attacks are more difficult to detect than layer 4 DDoS attacks. In many
>>> cases, the application layer DDoS does not introduce large amount
>>> traffics. However, by using big data and data mining tech, it is possible
>>> to find out the clues of such attacks.
>>
>> Hi, Dacheng,
>>
>> Applying machine learning in DDoS protection is an interest use case. For
>> my understanding, the machine would learn the potential attack behaviors,
>> am I right?
>
> Yes, you are right.
>>
>> If yes, I have two questions: a) does the machine learning has the
>> possibility to learn/identify new attack behaviors, which was not
>> recognized before? If yes, what is the working principles?
>
> Normally we need to generate a normal behavior model and some “abnormal
> behavior models”, the machine will detect whether certain behavior of a
> client will be located in an ‘abnormal’ area.
> I need to check with my colleagues to see whether we could disclose more
> detailed information for the moment.

This depends on your anomaly detection system. In general, you can apply
two-class or one-class systems (multi-class systems are typically
deployed as many two-class comparisons). So, for the DDoS case you could
either:

1. Build a model that learns a normal class (benign traffic) and an
   attack class (DDoS traffic). For this, you typically need labelled
   traffic of both classes, roughly balanced in size. If one class
   dominates the other, models tend to generalise towards the dominant
   class, simply because guessing that class is correct more often and,
   hence, the classification error is lower. Using such a model, an
   ML-based ADS would tell you to which class a specific input sample
   belongs.

2. Build a model that learns just the normal class. In that case your
   system typically needs to be able to map input samples to points in
   a Euclidean space. For anomaly detection, you then typically compare
   the distance between a new data point and the center/representative
   of the previously learned normal class. If the distance is too
   large, an anomaly is assumed. For learning the normal class you
   "simply" need a clean sample of normal traffic.

From my experience and in my opinion, I would only call the second case
a true anomaly detection system, as the system does not make an
assumption about what an attack might look like. Instead, it only makes
an assumption (i.e., the learnt model) about what normality looks like.
The drawback of such systems is that normality might change, e.g. you
deploy a new application or - even more difficult - one of your
customers deploys a new application or uses a new protocol. In that case
you raise false alarms.

>> b) is it possible for autonomic reaction from the network operational
>> perspective after detect such DDoS attack? Give the machine learning may
>> not be accurate, my guess is human intervention is needed.
>
> In the current practice, machine learning procedure is normally offline.
> 1) machine learning may not very that accurate. 2) big data processing
> needs time and computing resources. Human involvement is required.

There are also online learning/continuous learning systems which update
their models according to learning decisions. This may also be enhanced
by operator feedback (i.e., manually accepting or declining a predicted
class after inspection). This is a very interesting approach, but
unfortunately it has some drawbacks: attackers can trick such systems
into shifting the decision boundary so that attacks go unnoticed.

Regards,
sebastian
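
To make the one-class idea (case 2 above) a bit more concrete, here is a
minimal sketch and not a production detector. It assumes each traffic
sample has already been mapped to a numeric feature vector; the two
features used here (requests per second, mean request size in KB), the
sample values, the function names, and the `slack` factor are all
hypothetical and chosen only for illustration. The "representative" of
the normal class is taken to be the centroid, and the threshold is
derived from the largest distance seen in the clean training sample.

```python
import math
from statistics import mean


def fit_normal_model(samples, slack=1.5):
    """Learn a centroid and a distance threshold from clean normal samples.

    `slack` is a hypothetical tuning knob: the threshold is the largest
    training distance multiplied by this factor.
    """
    dims = len(samples[0])
    centroid = [mean(s[i] for s in samples) for i in range(dims)]
    max_distance = max(math.dist(s, centroid) for s in samples)
    return centroid, slack * max_distance


def is_anomaly(sample, centroid, threshold):
    """Flag a sample whose Euclidean distance to the centroid is too large."""
    return math.dist(sample, centroid) > threshold


# Hypothetical clean sample of normal traffic:
# (requests per second, mean request size in KB)
normal_traffic = [
    (10.0, 1.0), (12.0, 1.2), (11.0, 0.9),
    (9.0, 1.1), (10.5, 1.0), (11.5, 1.05),
]

centroid, threshold = fit_normal_model(normal_traffic)

print(is_anomaly((10.8, 1.0), centroid, threshold))   # close to the normal class
print(is_anomaly((500.0, 0.2), centroid, threshold))  # flood-like sample
print(is_anomaly((30.0, 1.0), centroid, threshold))   # e.g. a newly deployed application
```

The last call illustrates the drawback mentioned above: a legitimate
change in normality is flagged just like an attack, because the model
only knows what the old normal looked like. Note also that plain
Euclidean distance lets features with large numeric ranges dominate, so
in practice the features would likely need scaling first.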