[Idnet] A few ideas/suggestions to get us going
David Meyer <dmm@1-4-5.net> Wed, 22 March 2017 17:29 UTC
From: David Meyer <dmm@1-4-5.net>
Date: Wed, 22 Mar 2017 10:29:11 -0700
Message-ID: <CAHiKxWh26ciY-Pf78EH3CLO1+d3utikMr1N8GwKWJzkZQAAu9g@mail.gmail.com>
To: idnet@ietf.org
Content-Type: multipart/mixed; boundary="001a114ac54e4b8be0054b551931"
Archived-At: <https://mailarchive.ietf.org/arch/msg/idnet/bi0jnNvs3htnkwnXVS14Hjkj0ng>
Subject: [Idnet] A few ideas/suggestions to get us going
Folks,

I thought I'd try to get some discussion going by outlining some of my views as to why networking is lagging other areas in the development and application of Machine Learning (ML). In particular, networking is way behind what we might call the "perceptual tasks" (vision, NLP, robotics, etc.) as well as other areas (medicine, finance, ...). The attached slide from one of my decks tries to summarize the situation, but I'll give a bit of an outline below.

So why is networking lagging many other fields when it comes to the application of machine learning? There are several reasons, which I'll try to outline here (I was fortunate enough to discuss this with the packetpushers crew a few weeks ago, see [0]). These are in no particular order.

First, we don't have a "useful" theory of networking (UTON). One way to think about what such a theory would look like is by analogy to what we see with the success of convolutional neural networks (CNNs), not only for vision but now for many other tasks. In that case there is a theory of how vision works, built up from concepts like receptive fields, shared weights, simple and complex cells, etc. For example, the first layer of a CNN isn't fully connected to the input; rather, each unit is connected only to a local receptive field, in a way that is "inspired" by biological vision (being very careful with "biological inspiration"). Same with the alternation of convolutional and pooling layers; these loosely model the alternation of simple and complex cells in the primary visual cortex (V1), the secondary visual cortex (V2), and the third visual area (V3). BTW, such a theory seems to be required for transfer learning [1], which we'll need if we don't want every network to be analyzed in an ad-hoc, one-off style (like we see today).

The second thing that we need to think about is publicly available standardized data sets. Examples here include MNIST, ImageNet, and many others.
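As an aside on the first point above, the CNN ideas it names (local receptive fields, shared weights, a convolutional layer followed by a pooling layer) can be sketched in a few lines. This is only an illustrative sketch in plain NumPy, not anything from the slides: the function names, the 28x28 input size, the 5x5 filter, and the random values are all assumptions chosen for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1).

    As in most deep learning libraries, this is technically cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output unit sees only a kh x kw patch: its receptive field.
            # The same kernel is reused at every position: shared weights.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Hypothetical input: an MNIST-sized image filled with random values.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
kernel = rng.standard_normal((5, 5))  # one shared 5x5 filter (assumed size)

features = np.maximum(conv2d_valid(image, kernel), 0.0)  # convolution + ReLU
pooled = max_pool2d(features)                            # pooling stage

conv_weights = kernel.size                  # weights in the shared filter
dense_weights = image.size * features.size  # fully connected layer, same output size
print(features.shape, pooled.shape, conv_weights, dense_weights)
```

The point of the comparison at the end is that the structural prior (locality plus weight sharing) shrinks the number of free parameters by several orders of magnitude relative to a fully connected layer with the same output size. A "useful" theory of networking would presumably have to supply priors of a comparable kind.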
The result of having these data sets has been the steady ratcheting down of error rates on tasks such as object and scene recognition, NLP, and others, on some benchmarks to super-human levels. Suffice it to say we have nothing like these data sets for networking. Networking data sets today are largely proprietary, and because there is no UTON, there is no real way to compare results between them.

Third, there is a large skill set gap. Network engineers (us!) typically don't have the mathematical background required to build effective machine learning systems at scale. See [2] for an outline of some of the mathematical skills that are essential for effective ML. There is a lot more to this, involving how progress is made in ML (open data, open source, open models, in general open science and associated communities; see e.g., OpenAI [3], Distill [4], and many others). In any event we need to build community and gain new skills if we want to be able to develop and apply state of the art machine learning algorithms to network data, at scale.

The bottom line is that it will be difficult if not impossible to be effective in the ML space if we ourselves don't understand how it works and, further, if we can't build explainable systems (noting that explaining what the individual neurons in a deep neural network are doing is notoriously difficult; that said, much progress is being made). So we want to build explainable, end-to-end trained systems, and to accomplish this we ourselves need to understand how these algorithms work, both in training and in inference.

This email is already TL;DR but I'll add one more here: We need to learn control, not just prediction. Since we live in an inherently adversarial environment, we need to take advantage of Reinforcement Learning and to understand the various attacks being formulated against ML; [5] gives one interesting example of attacks against policy networks using adversarial examples. See also slides 31 and 32 of [6] for some more on this topic.
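To give a feel for what an adversarial example against a policy looks like, here is a minimal sketch of the fast gradient sign method (FGSM), a standard way to craft such perturbations. Everything concrete in it is an assumption made for illustration: the policy is a toy linear-softmax model rather than a deep network, and the weights, observation, and step size eps are random or arbitrary values, not taken from [5] or [6].

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fgsm_perturb(W, b, obs, eps):
    """Perturb obs to make the policy's current action less likely.

    The policy is pi(a | obs) = softmax(W @ obs + b). FGSM takes one step of
    size eps along the sign of the gradient of the cross-entropy loss
    -log pi(a* | obs), where a* is the action chosen on the clean input.
    """
    probs = softmax(W @ obs + b)
    action = int(np.argmax(probs))
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    # d(-log probs[action]) / d(obs) for a linear-softmax model
    grad = W.T @ (probs - one_hot)
    return obs + eps * np.sign(grad)

# Hypothetical toy policy: 3 actions, 8-dimensional observation.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 8))
b = np.zeros(3)
obs = rng.standard_normal(8)
eps = 0.5  # maximum change allowed per observation feature (assumed)

adv_obs = fgsm_perturb(W, b, obs, eps)
clean_probs = softmax(W @ obs + b)
adv_probs = softmax(W @ adv_obs + b)
clean_action = int(np.argmax(clean_probs))

print("clean action:", clean_action, "prob:", clean_probs[clean_action])
print("action after attack:", int(np.argmax(adv_probs)),
      "prob of clean action:", adv_probs[clean_action])
```

Each feature of the observation moves by at most eps, yet the probability of the originally chosen action drops. Whether the chosen action actually flips depends on eps and on the model. For network control the analogous worry is an adversary who can nudge measured inputs (counters, telemetry) slightly and thereby steer a learned controller.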
I hope some of this gets us thinking about the problems we need to solve in order to be successful in the ML space. There's plenty more of this on http://www.1-4-5.net/~dmm/ml and http://www.1-4-5.net/~dmm/vita.html.

I'm looking forward to the discussion.

Thanks,

--dmm

[0] http://packetpushers.net/podcast/podcasts/pq-show-107-applicability-machine-learning-networking/
[1] http://sebastianruder.com/transfer-learning/index.html
[2] http://datascience.ibm.com/blog/the-mathematics-of-machine-learning/
[3] https://openai.com/blog/
[4] http://distill.pub/
[5] http://rll.berkeley.edu/adversarial/arXiv2017_AdversarialAttacks.pdf
[6] http://www.1-4-5.net/~dmm/ml/talks/2016/cor_ml4networking.pptx