Re: [Idnet] A few ideas/suggestions to get us going

Adeel Rehman <adeelrehman85@gmail.com> Thu, 23 March 2017 18:00 UTC

In-Reply-To: <CAN5YCF2xERjvHtkv18XK3Wvmi6WwumHzk1iW0zktvZ9_3WTONQ@mail.gmail.com>
References: <CAHiKxWh26ciY-Pf78EH3CLO1+d3utikMr1N8GwKWJzkZQAAu9g@mail.gmail.com> <CAN5YCF2xERjvHtkv18XK3Wvmi6WwumHzk1iW0zktvZ9_3WTONQ@mail.gmail.com>
From: Adeel Rehman <adeelrehman85@gmail.com>
Date: Thu, 23 Mar 2017 14:00:38 -0400
Message-ID: <CAN5YCF0eeAS2N250+AS4Sm+xfZuzjVd9Xe6Dn5OrUBbhe3zLjQ@mail.gmail.com>
To: David Meyer <dmm@1-4-5.net>, idnet@ietf.org
Content-Type: multipart/alternative; boundary="001a114203d69e351f054b69a763"
Archived-At: <https://mailarchive.ietf.org/arch/msg/idnet/y3ynRG5MrZ4dJHFxy16V1IgqZqI>
Subject: Re: [Idnet] A few ideas/suggestions to get us going
List-Id: "The IDNet \(Intelligence-Defined Network\) " <idnet.ietf.org>

Thank you, David, for your feedback :)

Sorry, I forgot to include the distro initially.

On Thu, Mar 23, 2017 at 1:40 PM, Adeel Rehman <adeelrehman85@gmail.com>
wrote:

> Hi David
>
>
>
> You have pointed out important shortcomings of machine learning applications
> in the networking field. I have been chasing the same question for a while now.
>
>
> I think part of the reason is that network packet data is very well
> structured/designed compared to other data sources (image, text, etc.).
> Network data can be exploited very effectively using domain-based
> algorithms. Applying an ML algorithm to learn the rules of network traffic is
> somewhat costly compared to a domain-based algorithm. For example:
>
>
> a.       Learning the shortest path: we have Dijkstra's algorithm. Do we need
> ML for this?
>
>
> b.      TCP optimization. We have 3 or 4 optimization algorithms that are
> very cost-effective and run in the client and server stacks. Do we need ML
> for this?
>
>
> c.       There are ML papers that show HTTPS classification with good
> accuracy using SVM and decision tree algorithms. But do we need these
> algorithms? We can classify the traffic using the SNI sent during the SSL
> handshake, and that would be 100% accurate.
>
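
As a minimal sketch of item a above, assuming a hypothetical four-router topology with made-up link costs (the names `dijkstra` and `topology` are illustrative, not taken from any product or from the thread), the domain algorithm fits in a few lines and needs no training data:

```python
import heapq


def dijkstra(graph, source):
    """Return the lowest total link cost from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


# Hypothetical topology: node -> {neighbor: link cost}. Values are invented.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

print(dijkstra(topology, "A"))
```

The result is exact for non-negative link costs, which is the kind of guarantee a learned model would have to match before it could replace the domain algorithm.
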
> Having said that, I think there are areas where machine learning can be
> really helpful in detecting non-structured behavior in network traffic:
>
> a.       Anomaly detection and threat prevention. There are several
> algorithms out there, but I think ML algorithms can outperform them in this
> area. I know of one security vendor that effectively uses ML to prevent
> DDoS attacks.
>
> b.      Subscriber behavior. This is a hot topic for telco operators. I
> think an unsupervised topic modeling ML method can group subscribers based
> on their usage. I have not actually seen any vendor/operator doing it
> currently; maybe my knowledge is limited.
>
> c.       Self-orchestrated networks. This can be a big thing with NFV and
> cloud applications. ML algorithms can play a vital part here. I see Cognet
> 5GPPP is taking the initiative on this, but not much work from other vendors.
>
> and i apolo
>
> On Wed, Mar 22, 2017 at 1:29 PM, David Meyer <dmm@1-4-5.net> wrote:
>
>> Folks,
>>
>> I thought I'd try to get some discussion going by outlining some of my
>> views as to why networking is lagging other areas in the development and
>> application of Machine Learning (ML). In particular, networking is way
>> behind what we might call the "perceptual tasks" (vision, NLP, robotics,
>> etc) as well as other areas (medicine, finance, ...). The attached slide
>> from one of my decks tries to summarize the situation, but I'll give a bit
>> of an outline below.
>>
>> So why is networking lagging many other fields when it comes to the
>> application of machine learning? There are several reasons which I'll try
>> to outline here (I was fortunate enough to discuss this with the
>> packetpushers crew a few weeks ago, see [0]). These are in no particular
>> order.
>>
>> First, we don't have a "useful" theory of networking (UTON). One way to
>> think about what such a theory would look like is by analogy to what we see
>> with the success of convolutional neural networks (CNNs) not only for
>> vision but now for many other tasks. In that case there is a theory of how
>> vision works, built up from concepts like receptive fields, shared weights,
>> simple and complex cells, etc. For example, the input layer of a CNN isn't
>> fully connected; rather, its connections reflect the receptive field of the
>> input layer, in a way that is "inspired" by biological vision
>> (being very careful with "biological inspiration"). Same with the
>> alternation of convolutional and pooling layers; these loosely model the
>> alternation of simple and complex cells in the primary visual cortex (V1),
>> the secondary visual cortex (V2) and the Brodmann area (V3). BTW, such a
>> theory seems to be required for transfer learning [1], which we'll need if
>> we don't want every network to be analyzed in an ad-hoc, one-off style
>> (like we see today).
>>
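
A minimal sketch of the two CNN ideas mentioned above, local receptive fields with shared weights followed by pooling, in plain Python on a one-dimensional signal. The signal, the kernel, and the function names `conv1d` and `max_pool1d` are illustrative assumptions, not anything from the slides or the thread:

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over the signal (no padding, stride 1).

    Each output sees only len(kernel) neighboring inputs (its receptive
    field), and every position reuses the same weights. As in most deep
    learning libraries, this is cross-correlation (the kernel is not flipped).
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(values, size):
    """Keep the maximum of each non-overlapping window of the given size."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


signal = [0, 0, 1, 1, 1, 0, 0, 0]   # made-up input with one rising step
edge_kernel = [-1, 1]               # responds to a rise between neighbors

features = conv1d(signal, edge_kernel)
pooled = max_pool1d(features, 2)
print(features, pooled)
```

The kernel has two weights no matter how long the input is, whereas a fully connected layer would need one weight per input for every output.
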
>> The second thing that we need to think about is publicly available
>> standardized data sets. Examples here include MNIST, ImageNet, and many
>> others. The result of having these data sets has been the steady ratcheting
>> down of error rates on tasks such as object and scene recognition, NLP, and
>> others to super-human levels. Suffice it to say we have nothing like these
>> data sets for networking. Networking data sets today are largely
>> proprietary, and because there is no UTON, there is no real way to compare
>> results between them.
>>
>> Third, there is a large skill set gap. Network engineers (us!) typically
>> don't have the mathematical background required to build effective machine
>> learning at scale. See [2] for an outline of some of the mathematical
>> skills that are essential for effective ML. There is a lot more to this,
>> involving how progress is made in ML (open data, open source, open models,
>> in general open science and associated communities, see e.g., OpenAI [3],
>> Distill [4], and many others). In any event we need to build community and
>> gain new skills if we want to be able to develop and apply state of the art
>> machine learning algorithms to network data, at scale. The bottom line is
>> that it will be difficult if not impossible to be effective in the ML space
>> if we ourselves don't understand how it works and further, if we can't build
>> explainable systems (noting that explaining what the individual neurons in
>> a deep neural network are doing is notoriously difficult; that said much
>> progress is being made). So we want to build explainable, end-to-end
>> trained systems, and to accomplish this we ourselves need to understand how
>> these algorithms work, both in training and in inference.
>>
>> This email is already TL;DR but I'll add one more here: We need to learn
>> control, not just prediction. Since we live in an inherently adversarial
>> environment we need to take advantage of Reinforcement Learning as well as
>> the various attacks being formulated against ML; [5] gives one interesting
>> example of attacks against policy networks using adversarial examples. See
>> also slides 31 and 32 of [6] for some more on this topic.
>>
>> I hope some of this gets us thinking about the problems we need to solve
>> in order to be successful in the ML space. There's plenty more of this on
>> http://www.1-4-5.net/~dmm/ml and http://www.1-4-5.net/~dmm/vita.html.
>> I'm looking forward to the discussion.
>>
>> Thanks,
>>
>> --dmm
>>
>>
>>
>>
>> [0]  http://packetpushers.net/podcast/podcasts/pq-show-107-applicability-machine-learning-networking/
>>
>> [1]  http://sebastianruder.com/transfer-learning/index.html
>> [2]  http://datascience.ibm.com/blog/the-mathematics-of-machine-learning/
>> [3] https://openai.com/blog/
>> [4] http://distill.pub/
>> [5] http://rll.berkeley.edu/adversarial/arXiv2017_AdversarialAttacks.pdf
>> [6]  http://www.1-4-5.net/~dmm/ml/talks/2016/cor_ml4networking.pptx
>>
>>
>>
>>
>
>
> --
> Syed  Rehman
>
>


-- 
Adeel Rehman
978-551-5511