Re: [Nmlrg] Machine Learning in network - solicitation for use cases

Brian E Carpenter <brian.e.carpenter@gmail.com> Tue, 08 September 2015 04:38 UTC

To: "Liubing (Leo)" <leo.liubing@huawei.com>, Sheng Jiang <jiangsheng@huawei.com>, Dacheng Zhang <dacheng.zdc@alibaba-inc.com>, "nmlrg@irtf.org" <nmlrg@irtf.org>
References: <D20A251E.25E52%dacheng.zdc@alibaba-inc.com> <5D36713D8A4E7348A7E10DF7437A4B927BB2B192@nkgeml512-mbx.china.huawei.com> <D20B2C03.25EC7%dacheng.zdc@alibaba-inc.com> <5D36713D8A4E7348A7E10DF7437A4B927BB2D062@nkgeml512-mbx.china.huawei.com> <D211D160.26495%dacheng.zdc@alibaba-inc.com> <D211D7F2.2651C%dacheng.zdc@alibaba-inc.com> <5D36713D8A4E7348A7E10DF7437A4B927BB2D300@nkgeml512-mbx.china.huawei.com> <55EC9987.9030002@gmail.com> <5D36713D8A4E7348A7E10DF7437A4B927BB2D65D@nkgeml512-mbx.china.huawei.com> <55ED09ED.3090406@gmail.com> <5D36713D8A4E7348A7E10DF7437A4B927BB2DD75@nkgeml512-mbx.china.huawei.com> <8AE0F17B87264D4CAC7DE0AA6C406F45C227BE52@nkgeml506-mbx.china.huawei.com>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <55EE6648.4040804@gmail.com>
Date: Tue, 8 Sep 2015 16:38:32 +1200
In-Reply-To: <8AE0F17B87264D4CAC7DE0AA6C406F45C227BE52@nkgeml506-mbx.china.huawei.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/nmlrg/guhI4kz5VQCNbkaPFXltvwpQS_M>

On 08/09/2015 16:00, Liubing (Leo) wrote:

...
> But I'm curious about what item could be labeled as "This is not an attack" or "You missed an attack". E.g., is the item a packet, a stream, or some other kind of N-tuple?

The two cases are rather different.

1. The system signals "attack in progress" to the NOC. The operators have a look
and decide that there is no attack, it is just some unusual traffic.
(Example: you are live-streaming the Olympic Games. Two seconds after the
end of the 100 metres final, there is an enormous burst of traffic.
The machine learning system signals an attack, because it was not trained
on the data set from the previous Olympic Games.)

In this case the NOC operators urgently tell the algorithm it is wrong.
It needs to learn that the signature of a sudden burst just after the
end of an event is less likely to be an attack than a sudden burst
at another time.

2. Someone invents a new kind of DDoS attack, which is therefore not
in the historical training data. The system doesn't identify it.
In this case, the NOC operators tell the algorithm "Attack started
at <time>." This automatically becomes high-quality training data
for the algorithm: the signature of the new traffic at that time
is 100% certain to be an attack.
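
As a rough illustration of the two kinds of operator feedback above, here
is a minimal Python sketch that turns each message into labelled training
examples. Everything in it is an assumption made for illustration: the
class name FeedbackLog, the choice of a time window as the labelled item,
the signature fields ("pps", "after_event_end"), and the idea that a
reported attack is still running in every window buffered since <time>.
It only collects the labels; the learner that consumes them is left out.

```python
from collections import deque


class FeedbackLog:
    """Collects operator feedback as (signature, is_attack) examples.

    Hypothetical helper: the labelled item is assumed to be one time
    window of traffic, described by a small signature dictionary.
    """

    def __init__(self, history=1000):
        self.recent = deque(maxlen=history)  # (timestamp, signature)
        self.training = []                   # (signature, is_attack)

    def observe(self, timestamp, signature):
        # Remember the signature of each recent window.
        self.recent.append((timestamp, signature))

    def false_alarm(self, alert_time):
        # Case 1: the alert raised at alert_time was not an attack.
        for ts, sig in self.recent:
            if ts == alert_time:
                self.training.append((sig, False))

    def attack_started_at(self, start_time):
        # Case 2: a missed attack began at start_time. Assumes it is
        # still in progress for every window buffered since then.
        for ts, sig in self.recent:
            if ts >= start_time:
                self.training.append((sig, True))


log = FeedbackLog()

# Case 1: a burst right after the end of an event raised a false alarm.
log.observe(100, {"pps": 90000, "after_event_end": True})
log.false_alarm(100)

# Case 2: a new kind of attack went unnoticed from t=200 onwards.
log.observe(200, {"pps": 1200, "after_event_end": False})
log.observe(210, {"pps": 1500, "after_event_end": False})
log.attack_started_at(200)

labels = [is_attack for _, is_attack in log.training]
```

Here the first example carries the "after_event_end" flag with a negative
label, which is the kind of evidence a learner would need to treat a
post-event burst differently from a burst at another time.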

I think the hard part is extracting useful signatures from the
traffic stream in real time; the learning/training part is fairly
standard.
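
To make the "extracting signatures in real time" part concrete, here is a
small Python sketch of a sliding-window extractor. It is not a proposal
for which features matter: the class name SlidingSignature, the 10-second
window, and the three features (packet rate, byte rate, number of
distinct sources) are all assumptions chosen to keep the example short.
The addresses come from the documentation range 192.0.2.0/24.

```python
from collections import deque


class SlidingSignature:
    """Keeps the last `window_seconds` of packets and summarises them."""

    def __init__(self, window_seconds=10.0):
        self.window = window_seconds
        self.packets = deque()  # (timestamp, src_ip, size_bytes)

    def observe(self, timestamp, src_ip, size_bytes):
        self.packets.append((timestamp, src_ip, size_bytes))
        # Evict packets that have fallen out of the window.
        while self.packets and self.packets[0][0] <= timestamp - self.window:
            self.packets.popleft()

    def signature(self):
        # Recomputed from scratch on each call for clarity; a real
        # system would more likely maintain running counters.
        return {
            "packets_per_sec": len(self.packets) / self.window,
            "bytes_per_sec": sum(p[2] for p in self.packets) / self.window,
            "distinct_sources": len({p[1] for p in self.packets}),
        }


sig = SlidingSignature(window_seconds=10.0)
sig.observe(0.0, "192.0.2.1", 500)
sig.observe(4.0, "192.0.2.2", 1500)
sig.observe(12.0, "192.0.2.2", 1000)  # the packet at t=0.0 is evicted
result = sig.signature()
```

A dictionary like `result` is the sort of per-window item that operators
could then label as "not an attack" or "attack started here".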

    Brian