Re: [Idnet] 答复: IDN dedicated session call for case

yanshen <yanshen@huawei.com> Mon, 14 August 2017 03:19 UTC

From: yanshen <yanshen@huawei.com>
To: 김민석 <mskim16@etri.re.kr>, "idnet@ietf.org" <idnet@ietf.org>
CC: "dingxiaojian (A)" <dingxiaojian1@huawei.com>, Jérôme François <jerome.francois@inria.fr>
Thread-Topic: [Idnet] 答复: IDN dedicated session call for case
Thread-Index: AQHTEoXVGwUBroS0jUmCiZR9dTKBOaKDLsFA
Date: Mon, 14 Aug 2017 03:19:01 +0000
Message-ID: <6AE399511121AB42A34ACEF7BF25B4D2982EEB@DGGEMM505-MBX.china.huawei.com>
References: <6AE399511121AB42A34ACEF7BF25B4D297A34A@DGGEMM505-MBS.china.huawei.com> <CAGE_QeztLKUF55OjKcsxqW=MUMAX60vR+6935-n+nnKPRVX2zg@mail.gmail.com> <7e6d507a-e8bf-b334-e394-6dc08b4dc3b1@inria.fr> <5BC916BD50F92F45870ABA46212CB29C019C7592@SMTP1.etri.info> <6AE399511121AB42A34ACEF7BF25B4D297B7A9@DGGEMM505-MBS.china.huawei.com> <5BC916BD50F92F45870ABA46212CB29C019C7857@SMTP1.etri.info> <28ab908b-304f-fa8b-1fcc-df05b0f3105c@inria.fr> <3B110B81B721B940871EC78F107D848CFB07EA@DGGEMM506-MBS.china.huawei.com>
In-Reply-To: <3B110B81B721B940871EC78F107D848CFB07EA@DGGEMM506-MBS.china.huawei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idnet/dv_mQvLAj_vvRbCu5qBoh8rMYy8>
Subject: Re: [Idnet] 答复: IDN dedicated session call for case

Hi Min-Suk Kim,

Personally, I agree with Jerome and Ding. As already explained, a use case should be independent of the method or algorithm. It should address the process, the functional entity, the problem, and the benefit of solving that problem. Please let me know if I have missed any key difference.

BTW, you said that you are currently focusing on the dataset. May I invite you to give some suggestions about data organization and data format? As you know, this part has been discussed on our mailing list for quite a long time. In particular, regarding the format and aspects that I have proposed so far: since your research involves practical data processing, I think you must have a lot of experience that could help improve them.

Many thanks,

Yansen

From: IDNET [mailto:idnet-bounces@ietf.org] On Behalf Of dingxiaojian (A)
Sent: Friday, August 11, 2017 5:38 PM
To: Jérôme François <jerome.francois@inria.fr>; idnet@ietf.org
Subject: [Idnet] 答复: IDN dedicated session call for case

Agreed, this is exactly what I thought.
If you use a deep learning method, the network problem (use case n+4) should fit that method very well.

From: IDNET [mailto:idnet-bounces@ietf.org] On Behalf Of Jérôme François
Sent: August 11, 2017 14:55
To: idnet@ietf.org
Subject: Re: [Idnet] IDN dedicated session call for case

Hi,

As you said, there might be several algorithms or techniques to be used in ML problems.

However, my understanding from the first use case descriptions is that a use case description should be as independent of the ML algorithm as possible.
Otherwise, we will multiply the number of use cases.


jerome
On 11/08/2017 at 04:32, 김민석 wrote:

Hi Yansen,

Thank you for checking my use case.

I know that the use case covers a topic similar to Jerome's.

However, I am focusing on creating datasets for ML-based models. We already discussed applying datasets to the learning process for a network architecture at the last IETF side meeting, but we missed the point that pre-processing data for an ML-based learning model takes much more effort. Especially for current deep learning models such as CNN and RNN, dataset creation is a significant factor in system performance. As many people know, traffic classification using classical ML algorithms such as anomaly detection or random decision forests was already discussed in the last NMLRG, so we need more current topics in network machine learning.

Actually, our team is developing a real-time deep learning model for traffic classification and is working on pre-processing to create ML datasets for a couple of deep models. For CNN, we collect features carrying application information from the payload, then convert them into an image-like [MxN] dataset. For RNN, we take another pre-processing approach: we collect specific patterns from the number of packets per application. We are also considering a few other ML-based pre-processing methods for deep learning models in a network architecture.
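
To make the two pre-processing ideas above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the team's actual pipeline: the function names, the zero-padding of short payloads, the scaling of bytes to [0, 1], and the fixed counting intervals are all hypothetical choices.

```python
from typing import List, Sequence


def payload_to_image(payload: bytes, rows: int, cols: int) -> List[List[float]]:
    """Map the first rows*cols payload bytes to a rows x cols grid of values in [0, 1]."""
    size = rows * cols
    # Assumption: truncate long payloads and zero-pad short ones so that
    # every sample has the same M x N shape.
    padded = payload[:size].ljust(size, b"\x00")
    return [[padded[r * cols + c] / 255.0 for c in range(cols)] for r in range(rows)]


def packets_per_interval(timestamps: Sequence[float], start: float,
                         period: float, steps: int) -> List[int]:
    """Count packets in each of `steps` consecutive intervals of length `period`."""
    counts = [0] * steps
    for ts in timestamps:
        index = int((ts - start) // period)
        if 0 <= index < steps:  # ignore packets outside the observed window
            counts[index] += 1
    return counts


# CNN-style input: one payload becomes a 4 x 8 "image".
image = payload_to_image(b"GET / HTTP/1.1\r\n", rows=4, cols=8)
# RNN-style input: packet counts per 1-second interval for one application.
sequence = packets_per_interval([0.1, 0.2, 1.5, 3.9], start=0.0, period=1.0, steps=4)
print(len(image), len(image[0]))  # 4 8
print(sequence)                   # [2, 1, 0, 1]
```

A fixed M x N shape is assumed here because a CNN usually expects inputs of identical size; how to pick M and N is left open.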



If possible, we should set up a new use case on how ML-based datasets for deep learning models are created by pre-processing in a network architecture.



Best,

Min-Suk Kim
Senior Researcher / Ph.D.

________________________________
From: "yanshen" <yanshen@huawei.com>
Sent: 2017-08-10 21:40:15 (+09:00)
To: 김민석 <mskim16@etri.re.kr>
Cc: idnet@ietf.org, Jérôme François <jerome.francois@inria.fr>
Subject: RE: [Idnet] IDN dedicated session call for case


Hi Kim,

Thanks for your use case.

BTW, have you checked the one that Jerome mentioned on Tuesday? It is also a traffic classification case.

Apologies, I do not have much insight in this area. What is the difference between the two?

In any case, this topic is receiving a lot of attention at the moment.

Yansen

From: 김민석 [mailto:mskim16@etri.re.kr]
Sent: Thursday, August 10, 2017 10:55 AM
To: Jérôme François <jerome.francois@inria.fr>; Albert Cabellos <albert.cabellos@gmail.com>; yanshen <yanshen@huawei.com>
Cc: idnet@ietf.org
Subject: RE: [Idnet] IDN dedicated session call for case


Hi,

We have a use case for this:

Use case n+4: Real-time traffic classification using deep learning

Description: Continuously collect packet data, then apply a learning process for traffic classification by generating application, using deep learning models such as CNN (convolutional neural network) and RNN (recurrent neural network). The datasets fed to the models are generated by preprocessing features extracted from the flows in the packet data.

Process: 1. Collect packet data in real time; 2. Preprocess the dataset for deep learning models; 3. Train the model using deep learning (CNN & RNN); 4. Online data learning and classification; 5. Monitor and analyze traffic on the web

Data Format: Time: [Start, End, Unit, Number of Value, Sampling Period]
             Position: [Device ID, Port ID]
             Direction: IN / OUT
             Flow level metric: packet & flow size, number of packets (RNN), payload parsing

Message:     Request: ask for the data
             Reply: Data
             Notice: For notification or others
             Policy: Control policy
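
As an illustration only, the Data Format and Message fields listed above could be written down as typed records. This is a sketch under assumptions: the class and field names (`TimeInfo`, `FlowSample`, `Message`, `packets_per_period`) are hypothetical, and nothing in the thread fixes an encoding or a consistency rule.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Direction(Enum):
    IN = "IN"
    OUT = "OUT"


class MessageType(Enum):
    REQUEST = "Request"  # ask for the data
    REPLY = "Reply"      # carries the data
    NOTICE = "Notice"    # notification or others
    POLICY = "Policy"    # control policy


@dataclass
class TimeInfo:
    start: float
    end: float
    unit: str
    number_of_values: int
    sampling_period: float


@dataclass
class FlowSample:
    time: TimeInfo
    device_id: str
    port_id: str
    direction: Direction
    packets_per_period: List[int]  # the "number of packets (RNN)" metric

    def is_consistent(self) -> bool:
        """Assumed rule: the metric carries exactly the announced number of values."""
        return len(self.packets_per_period) == self.time.number_of_values


@dataclass
class Message:
    kind: MessageType
    samples: List[FlowSample] = field(default_factory=list)


sample = FlowSample(
    time=TimeInfo(start=0.0, end=4.0, unit="s", number_of_values=4, sampling_period=1.0),
    device_id="device-1",
    port_id="port-7",
    direction=Direction.IN,
    packets_per_period=[2, 1, 0, 1],
)
request = Message(kind=MessageType.REQUEST)
reply = Message(kind=MessageType.REPLY, samples=[sample])
print(reply.kind.value, sample.is_consistent())  # Reply True
```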



Regards,

Min-Suk Kim
Senior Researcher / Ph.D.

________________________________
From: "Jérôme François" <jerome.francois@inria.fr>
Sent: 2017-08-08 23:49:47 (+09:00)
To: Albert Cabellos <albert.cabellos@gmail.com>, yanshen <yanshen@huawei.com>
Cc: idnet@ietf.org
Subject: Re: [Idnet] IDN dedicated session call for case


Hi all,

Here is another use case about traffic classification.

Use case N+3: (encrypted) traffic classification

    Description: Collect flow-level traffic metrics such as protocol information, but also meta metrics such as the distribution of packet sizes, inter-arrival times... Then use this information to label the traffic with the underlying application, assuming that the granularity of classification may vary (type of application, exact application name, version...)

    Process: 1. Collect packet information; 2. Flow reassembly (using a flow format such as IPFIX directly might be possible, but depends on the type of traffic; e.g. extracting the TLS application data is useful for encrypted traffic); 3. Collect application-specific information (useful when targeting a single type of application) = out-of-network information; 4. Train the model; 5. Online or offline testing; 6. Apply application-level policies.

    Data Format:    Time: [Start, End, Unit, Number of Value, Sampling Period]
                    Position: [Device ID, Port ID]
                    Direction: IN / OUT
                    Flow level metric: packet size distributions, number of packets, inter-arrival time distribution
                    (+ application specific knowledge: payload parsing)

    Message:        Request: ask for the data
                    Reply: Data
                    Notice: For notification or others
                    Policy: Control policy
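
The flow-level metrics named above (packet size distribution, number of packets, inter-arrival times) can be computed from a reassembled flow roughly as follows. This is a minimal sketch, assuming a flow is already available as a list of (timestamp, packet size) pairs; the function names and the 500-byte bin width are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def inter_arrival_times(timestamps: Sequence[float]) -> List[float]:
    """Gaps between consecutive packets of one flow, in timestamp units."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def size_distribution(sizes: Sequence[int], bin_width: int = 500) -> Dict[int, float]:
    """Fraction of packets per size bin; each key is the lower bound of its bin."""
    if not sizes:
        return {}
    bins = Counter((size // bin_width) * bin_width for size in sizes)
    return {low: count / len(sizes) for low, count in sorted(bins.items())}


def flow_features(packets: Sequence[Tuple[float, int]]) -> Dict[str, object]:
    """Collect the flow-level metrics for one flow into a single feature record."""
    timestamps = [ts for ts, _ in packets]
    sizes = [size for _, size in packets]
    return {
        "number_of_packets": len(packets),
        "size_distribution": size_distribution(sizes),
        "inter_arrival_times": inter_arrival_times(timestamps),
    }


flow = [(0.0, 60), (0.5, 1500), (1.5, 1400), (2.0, 80)]
features = flow_features(flow)
print(features["number_of_packets"])    # 4
print(features["size_distribution"])    # {0: 0.5, 1000: 0.25, 1500: 0.25}
print(features["inter_arrival_times"])  # [0.5, 1.0, 0.5]
```

None of these features needs the payload, which is why they remain usable for encrypted traffic.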






Best regards,

jerome

On 08/08/2017 at 06:52, Albert Cabellos wrote:


Hi all

Here's another use case:

Use case N+2: QoE

        Description: Collect low-level metrics (SNR, latency, jitter, losses, etc.) and measure QoE. Then use ML to understand the relation between satisfactory QoE and the low-level metrics. As an example, learn that when delay>N then QoE is degraded, but when M<delay<N then QoE is satisfactory for the customers (please note that QoE cannot be measured directly over your network). This is useful to understand how the network must be operated to provide satisfactory QoE.

        Process: 1. Low-level data collection and QoE measurement; 2. Training Model (input low-level metrics, output QoE); 3. Real-time data capture and input; 4. Predict QoE; 5. Operate network to meet target QoE requirement, go to 3.

        Data Format:    Time: [Start, End, Unit, Number of Value, Sampling Period]
                        Position: [Device ID, Port ID]
                        Direction: IN / OUT
                        Low-level metric: SNR, Delay, Jitter, queue-size, etc.

        Message:        Request: ask for the data
                        Reply: Data
                        Notice: For notification or others
                        Policy: Control policy
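
The "delay>N means degraded QoE" example above can be illustrated with a very small learner. This sketch is an assumption, not the method proposed in the use case: it uses a single-threshold decision stump on delay only, takes labelled (delay, satisfied) samples as given, and ignores the lower bound M and all other metrics. All names are hypothetical.

```python
from typing import Sequence, Tuple


def learn_delay_threshold(samples: Sequence[Tuple[float, bool]]) -> float:
    """Return the threshold N that best separates satisfactory QoE (delay <= N)
    from degraded QoE (delay > N) on the labelled training samples."""
    if not samples:
        raise ValueError("at least one (delay, satisfied) sample is required")
    delays = sorted({delay for delay, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct delays.
    candidates = [(low + high) / 2 for low, high in zip(delays, delays[1:])] or delays

    def errors(threshold: float) -> int:
        # Count samples whose label disagrees with the threshold rule.
        return sum((delay <= threshold) != satisfied for delay, satisfied in samples)

    return min(candidates, key=errors)


def predict_satisfactory(delay: float, threshold: float) -> bool:
    """Step 4 of the process: predict QoE from a freshly captured delay value."""
    return delay <= threshold


# Delay in milliseconds, paired with the measured QoE label.
training = [(20.0, True), (35.0, True), (50.0, True), (80.0, False), (120.0, False)]
threshold = learn_delay_threshold(training)
print(threshold)  # 65.0
print(predict_satisfactory(40.0, threshold), predict_satisfactory(90.0, threshold))  # True False
```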




Kind regards

Albert

On Wed, Aug 2, 2017 at 7:12 PM, yanshen <yanshen@huawei.com> wrote:


Dear all,

Since we plan to organize a dedicated session in NMRG at IETF100 on applying AI to network management (NM), I'd like to list some use cases and propose a roadmap and ToC before Nov.

These might be rough. You are welcome to refine them and propose your own use cases or ideas.

Use case 1: Traffic Prediction

        Description: Collect historical traffic data and external data that may influence the traffic. Predict the traffic over the short term, the long term, or a specific period. Avoid congestion or risk in advance.

        Process: 1. Data collection (e.g. traffic samples of a physical/logical port); 2. Training Model; 3. Real-time data capture and input; 4. Prediction output; 5. Fix error and go back to 3.

        Data Format:    Time: [Start, End, Unit, Number of Value, Sampling Period]
                        Position: [Device ID, Port ID]
                        Direction: IN / OUT
                        Route: [R1, R2, ..., RN]  (might be useful for some scenarios)
                        Service: [Service ID, Priority, ...]  (Not clear how to use it but seems useful)
                        Traffic: [T0, T1, T2, ..., TN]

        Message:        Request: ask for the data
                        Reply: Data
                        Notice: For notification or others
                        Policy: Control policy
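
Steps 3 to 5 of the process above (capture, predict, fix the error, repeat) can be sketched as a small online loop. As an assumption, simple exponential smoothing stands in for the trained model here; the function name and the `alpha` parameter are hypothetical and not part of the proposal.

```python
from typing import Iterable, List


def predict_online(traffic: Iterable[float], alpha: float = 0.5) -> List[float]:
    """After each captured sample, return the forecast for the next sample.

    forecasts[i] is the prediction made after seeing traffic[i].
    """
    forecasts: List[float] = []
    forecast = None
    for actual in traffic:               # step 3: real-time data capture and input
        if forecast is None:
            forecast = float(actual)     # first sample initialises the forecast
        else:
            error = actual - forecast    # step 5: measure the prediction error
            forecast += alpha * error    # correct the forecast, then go back to 3
        forecasts.append(forecast)       # step 4: prediction output
    return forecasts


# Traffic: [T0, T1, T2, T3] sampled on one port, in arbitrary units.
samples = [100, 120, 110, 130]
print(predict_online(samples))  # [100.0, 110.0, 110.0, 120.0]
```

A larger `alpha` makes the forecast follow recent samples more closely; a smaller one smooths out short bursts.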




Use case 2: QoS Management

        Description: Use multiple paths to distribute the traffic flows. Adjust the percentages. Avoid congestion and ensure QoS.

        Process: 1. Data capture (e.g. traffic samples of a physical/logical port); 2. Training Model; 3. Real-time data capture and input; 4. Output percentages; 5. Fix error and go back to 3.

        Data Format:    Time: [Timestamp, Value type (Delay/Packet Loss/...), Unit, Number of Value, Sampling Period]
                        Position: [Link ID, Device ID]
                        Value: [V0, V1, V2, ..., VN]

        Message:        Request: ask for the data
                        Reply: Data
                        Notice: For notification or others
                        Policy: Control policy
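
For step 4 ("Output percentages"), one possible way to turn measured values into per-path percentages is shown below. This is only a sketch under an assumption: a simple heuristic that weights each path by the inverse of its measured delay stands in for the trained model. The function name and path labels are hypothetical.

```python
from typing import Dict


def split_percentages(path_delays: Dict[str, float]) -> Dict[str, float]:
    """Share traffic across paths in inverse proportion to their measured delay."""
    if not path_delays:
        raise ValueError("at least one path is required")
    if any(delay <= 0 for delay in path_delays.values()):
        raise ValueError("delays must be positive")
    inverse = {path: 1.0 / delay for path, delay in path_delays.items()}
    total = sum(inverse.values())
    # Percentages sum to 100 (up to floating-point rounding).
    return {path: 100.0 * weight / total for path, weight in inverse.items()}


# Value: delay per path in milliseconds (Value type = Delay).
shares = split_percentages({"path-a": 10.0, "path-b": 20.0, "path-c": 20.0})
print({path: round(share, 1) for path, share in shares.items()})
# {'path-a': 50.0, 'path-b': 25.0, 'path-c': 25.0}
```

Re-running the function on each new measurement gives the "fix error and go back to 3" loop: a path whose delay rises gets a smaller share on the next round.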




Use case N: Waiting for your ideas

I also suggest a roadmap for the period before Nov, if possible.

### Roadmap ###
Aug.: Collect the use cases (related to NM). Rough thoughts and requirements
Sep.: Refine the cases and abstract the common elements
Oct.: In-depth analysis, especially of data format, control flow, and other key points
Nov.: F2F discussions at IETF100
### Roadmap End ###

A rough ToC is listed below. We may take it as our scope before Nov. I hope the content can become an early version of the draft.

### Table of Contents ###
1. Gap and Requirement Analysis
        1.1 Network Management requirement
        1.2 TBD
2. Use Cases
        2.1 Traffic Prediction
        2.2 QoS Management
        2.3 TBD
3. Data Focus
        3.1 Data attribute
        3.2 Data format
        3.3 TBD
4. Aims
        4.1 Benchmarking Framework
        4.2 TBD
### ToC End ###

Yansen




_______________________________________________
IDNET mailing list
IDNET@ietf.org
https://www.ietf.org/mailman/listinfo/idnet








