Re: [Idnet] A few ideas/suggestions to get us going

Henk Birkholz <henk.birkholz@sit.fraunhofer.de> Thu, 23 March 2017 09:56 UTC


Hello,

maybe an excerpt from my personal point of view can be a contribution to 
the discussion starting on this list.


In my experience, the gap between the workflows (for lack of a better
term) by which problem statements are created in the domain of
network/management/security and, in contrast, in the domain of machine
learning is (that is, "appears to me, subjectively") astonishingly vast.

One way to illustrate that gap (and there are multiple ways, I think), 
using a bit of hyperbole:

"network: least viable solution" -> "I only produce the information I
require"

meets

"machine learning: combination of heterogeneous most viable solutions" 
-> "Please provide me with everything you got, including the best 
qualified semantic annotation of characteristics and context so I can 
identify and select the features that are relevant to provide a 
contribution".


Also, as trivial and repetitive as it might sound, terminology is again
key. Please note the following quote, which I actually encountered in my
very early days of collaborating with machine learning architects: "The
maximum number of ports really is 2^16? Wow, how big can these routers be?".
While this is of course "one of these entertaining anecdotes" everybody
has heard at least once, it also highlights the existing gap quite
prominently from a different angle.
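
For readers coming from the machine learning side: the 2^16 in that quote
refers to TCP/UDP port numbers, which are 16-bit fields in the transport
header, and not to physical interfaces on a router. The following is a
minimal sketch in Python that packs and unpacks the first four bytes of
such a header (source port, destination port). The function names and the
example port values are hypothetical and chosen only for illustration.

```python
import struct
from typing import Tuple

# Number of distinct values a 16-bit port field can hold (0..65535).
PORT_COUNT = 2 ** 16

def pack_ports(src: int, dst: int) -> bytes:
    """Encode source and destination port as two unsigned 16-bit
    integers in network byte order, as in a TCP or UDP header."""
    return struct.pack("!HH", src, dst)

def unpack_ports(header: bytes) -> Tuple[int, int]:
    """Decode the two port numbers from the first four header bytes."""
    src, dst = struct.unpack("!HH", header[:4])
    return src, dst

if __name__ == "__main__":
    # Hypothetical example values: an ephemeral client port and HTTPS.
    raw = pack_ports(49152, 443)
    print(len(raw), unpack_ports(raw))
    try:
        pack_ports(PORT_COUNT, 443)  # one past the largest valid port
    except struct.error as exc:
        print("does not fit into 16 bits:", exc)
```

The limit therefore describes the address space of a header field, which
is independent of how many physical ports a device has.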


In consequence, guidance that enables an individual with a 
specialization in machine learning skills to just better understand the 
network domain itself - maybe by illustrating very simple, well-known 
and already solved problem statements - might already be a contribution 
of high value, just because it is provided specifically for that group
of individuals and uses terminology that is common and well understood
in that domain.


Viele Grüße,

Henk

On 03/22/2017 06:29 PM, David Meyer wrote:
> Folks,
>
> I thought I'd try to get some discussion going by outlining some of my
> views as to why networking is lagging other areas in the development and
> application of Machine Learning (ML). In particular, networking is way
> behind what we might call the "perceptual tasks" (vision, NLP, robotics,
> etc) as well as other areas (medicine, finance, ...). The attached slide
> from one of my decks tries to summarize the situation, but I'll give a
> bit of an outline below.
>
> So why is networking lagging many other fields when it comes to the
> application of machine learning? There are several reasons which I'll
> try to outline here (I was fortunate enough to discuss this with the
> packetpushers crew a few weeks ago, see [0]). These are in no particular
> order.
>
> First, we don't have a "useful" theory of networking (UTON). One way to
> think about what such a theory would look like is by analogy to what we
> see with the success of convolutional neural networks (CNNs) not only
> for vision but now for many other tasks. In that case there is a theory
> of how vision works, built up from concepts like receptive fields,
> shared weights, simple and complex cells, etc. For example, the input
> layer of a CNN isn't fully connected; rather, its connections reflect the
> receptive field of the input layer, in a way that is "inspired"
> by biological vision (being very careful with "biological inspiration").
> Same with the alternation of convolutional and pooling layers; these
> loosely model the alternation of simple and complex cells in the primary
> visual cortex (V1), the secondary visual cortex (V2) and the Brodmann
> area (V3). BTW, such a theory seems to be required for transfer learning
> [1], which we'll need if we don't want every network to be analyzed in
> an ad-hoc, one-off style (like we see today).
>
> The second thing that we need to think about is publicly available
> standardized data sets. Examples here include MNIST, ImageNet, and many
> others. The result of having these data sets has been the
> steady ratcheting down of error rates on tasks such as object and scene
> recognition, NLP, and others to super-human levels. Suffice it to say we
> have nothing like these data sets for networking. Networking data sets
> today are largely proprietary, and because there is no UTON, there is no
> real way to compare results between them.
>
> Third, there is a large skill set gap. Network engineers (us!) typically
> don't have the mathematical background required to build effective
> machine learning at scale. See [2] for an outline of some of the
> mathematical skills that are essential for effective ML. There is a lot
> more to this, involving how progress is made in ML (open data, open
> source, open models, in general open science and associated communities,
> see e.g., OpenAI [3], Distill [4], and many others). In any event we
> need to build community and gain new skills if we want to be able to
> develop and apply state of the art machine learning algorithms to
> network data, at scale. The bottom line is that it will be difficult if
> not impossible to be effective in the ML space if we ourselves don't
> understand how it works and further, if we can't build explainable systems
> (noting that explaining what the individual neurons in a deep neural
> network are doing is notoriously difficult; that said much progress is
> being made). So we want to build explainable, end-to-end trained
> systems, and to accomplish this we ourselves need to understand how
> these algorithms work, both in training and in inference.
>
> This email is already TL;DR but I'll add one more here: We need to learn
> control, not just prediction. Since we live in an inherently adversarial
> environment we need to take advantage of Reinforcement Learning as well
> as the various attacks being formulated against ML; [5] gives one
> interesting example of attacks against policy networks using adversarial
> examples. See also slides 31 and 32 of [6] for some more on this topic.
>
> I hope some of this gets us thinking about the problems we need to solve
> in order to be successful in the ML space. There's plenty more of this
> on http://www.1-4-5.net/~dmm/ml and http://www.1-4-5.net/~dmm/vita.html.
> I'm looking forward to the discussion.
>
> Thanks,
>
> --dmm
>
>
>
>
> [0] http://packetpushers.net/podcast/podcasts/pq-show-107-applicability-machine-learning-networking/
> [1] http://sebastianruder.com/transfer-learning/index.html
> [2] http://datascience.ibm.com/blog/the-mathematics-of-machine-learning/
> [3] https://openai.com/blog/
> [4] http://distill.pub/
> [5] http://rll.berkeley.edu/adversarial/arXiv2017_AdversarialAttacks.pdf
> [6] http://www.1-4-5.net/~dmm/ml/talks/2016/cor_ml4networking.pptx