Re: [Nmlrg] Machine Learning in network - solicitation for use cases
Brian E Carpenter <brian.e.carpenter@gmail.com> Tue, 08 September 2015 23:05 UTC
To: "Liubing (Leo)" <leo.liubing@huawei.com>,
Sheng Jiang <jiangsheng@huawei.com>,
Dacheng Zhang <dacheng.zdc@alibaba-inc.com>, "nmlrg@irtf.org"
<nmlrg@irtf.org>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <55EF69B3.2070305@gmail.com>
Date: Wed, 9 Sep 2015 11:05:23 +1200
Archived-At: <http://mailarchive.ietf.org/arch/msg/nmlrg/lqeqyDptKQ3BnTpbkxG5CaCYE-M>

Bing,

We are at the limits of my knowledge here. However, as I said, I think
that automatically extracting event signatures from a real-time packet
stream is the hard part. I think a typical approach would be to run a
variety of statistical algorithms over a sliding window of incoming
packet headers.

This paper talks about one technique (but not used in real time):
http://www.cs.auckland.ac.nz/CDMTCS//researchreports/266eimann.pdf
and the corresponding PhD thesis:
https://researchspace.auckland.ac.nz/bitstream/handle/2292/3427/02whole.pdf

Regards
   Brian

On 08/09/2015 20:13, Liubing (Leo) wrote:
> Hi Brian,
>
> Thanks for your elaborating explanation.
> Please see inline.
>
>> -----Original Message-----
>> From: Brian E Carpenter [mailto:brian.e.carpenter@gmail.com]
>> Sent: Tuesday, September 08, 2015 12:39 PM
>> To: Liubing (Leo); Sheng Jiang; Dacheng Zhang; nmlrg@irtf.org
>> Subject: Re: [Nmlrg] Machine Learning in network - solicitation for use cases
>>
>> On 08/09/2015 16:00, Liubing (Leo) wrote:
>>
>> ...
>>> But I'm curious about what is the item that could be labeled as
>>> "This is not an attack" or "You missed an attack". E.g., the item is
>>> a packet, a stream, or any other kind of N-tuple thing.
>>
>> The two cases are rather different.
>>
>> 1. The system signals "attack in progress" to the NOC. The operators
>> have a look and decide that there is no attack, it is just some
>> unusual traffic.
>> (Example: you are live-streaming the Olympic Games. Two seconds after
>> the end of the 100 metres final, there is an enormous burst of traffic.
>> The machine learning system signals an attack, because it was not
>> trained on the data set from the previous Olympic Games.)
>>
>> In this case the NOC operators urgently tell the algorithm it is wrong.
>> It needs to learn that the signature of a sudden burst just after the
>> end of an event is less likely to be an attack than a sudden burst at
>> another time.
>>
>> 2. Someone invents a new kind of DDoS attack, which is therefore not
>> in the historical training data. The system doesn't identify it.
>> In this case, the NOC operators tell the algorithm "Attack started at <time>."
>
> [Bing] It feels like the learning objects are mostly traffic burst
> events? When a traffic burst happens, the machine judges whether it's
> a DDoS or not.
> Then the training data might be a bunch of traffic burst events marked
> as normal or abnormal. And the challenge should be how to pick a set
> of features out of a burst event for the machine learning program to
> discover the pattern of normal/abnormal classification.
>
> This is just my hypothetical case, could be all wrong.
>
>> This automatically becomes high quality training data for the
>> algorithm: the signature of the new traffic at that time is 100%
>> certain to be an attack.
>
>> I think the hard part is extracting useful signatures from the
>> traffic stream in real time; the learning/training part is fairly
>> standard.
>
> [Bing] When you said the signature of "100% certain", my perception is
> that it is something like the typical virus detection approach, which
> does an exact match of a particular piece of code that could identify
> a virus.
> If my perception was correct, I think the signatures are some special
> combinations of the features (as mentioned above) where the
> classification pattern has a 100% confidence. Then I think it doesn't
> need to extract the signatures in real time, because the features are
> all pre-defined.
>
> However, this is only from my view, I might have totally
> misunderstood you.
>
> Best regards,
> Bing
>
>> Brian
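
[Editorial illustration] The "statistical algorithms over a sliding window of incoming packet headers" idea in the reply above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the technique from the linked paper or thesis: the PacketHeader fields, the SlidingWindow class, and the choice of features (packet count, byte count, Shannon entropy of source addresses) are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, deque
from typing import NamedTuple


class PacketHeader(NamedTuple):
    """Hypothetical, simplified header record (not from the original thread)."""
    timestamp: float  # seconds
    src: str          # source address
    dst: str          # destination address
    length: int       # packet length in bytes


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


class SlidingWindow:
    """Keeps the headers seen in the last span_seconds and summarises them."""

    def __init__(self, span_seconds):
        self.span = span_seconds
        self.packets = deque()

    def add(self, pkt):
        # Append the new header, then drop headers older than the window span.
        self.packets.append(pkt)
        cutoff = pkt.timestamp - self.span
        while self.packets and self.packets[0].timestamp < cutoff:
            self.packets.popleft()

    def features(self):
        # Assumed feature set; a real system would likely use many more.
        return {
            "packets": len(self.packets),
            "bytes": sum(p.length for p in self.packets),
            "src_entropy": shannon_entropy(p.src for p in self.packets),
        }
```

A feature vector produced this way at the moment of a burst could then be paired with the operator's verdict ("this is not an attack" or "attack started at <time>") to form one labelled training example, which is roughly the feedback loop described in the quoted cases 1 and 2.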