Re: [Nmlrg] links to each slide of presentations//RE: slides of NMLRG #3, June 27th, Athens

David Meyer <dmm@1-4-5.net> Tue, 28 June 2016 15:30 UTC

Return-Path: <dmm@1-4-5.net>
X-Original-To: nmlrg@ietfa.amsl.com
Delivered-To: nmlrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id ECE9012D52F for <nmlrg@ietfa.amsl.com>; Tue, 28 Jun 2016 08:30:44 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level:
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=1-4-5-net.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id uUUNmPafL7Ri for <nmlrg@ietfa.amsl.com>; Tue, 28 Jun 2016 08:30:40 -0700 (PDT)
Received: from mail-it0-x22c.google.com (mail-it0-x22c.google.com [IPv6:2607:f8b0:4001:c0b::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 9C8C512D541 for <nmlrg@irtf.org>; Tue, 28 Jun 2016 08:30:40 -0700 (PDT)
Received: by mail-it0-x22c.google.com with SMTP id f6so18031594ith.0 for <nmlrg@irtf.org>; Tue, 28 Jun 2016 08:30:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1-4-5-net.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=DdGGVOsQcvlGY8mhpIoNF0wkE6r9DWdDXMaTCBoIfSY=; b=FrgTB++f+qFynf7LXKxdCvJY/82y4SHGVPHumAezPPEKK0TIf8yQ28Om5pJsNrOGx0 u/KsU3TkChz4LCnuI9xL1RBsY8BDK1+ETJBHVfkrQcXZYNWnv1CBnQujZYiIKodhewCw ODQGXR0LQpNzKAU1RQawDc1QDW9rTs8412QJ8HwKFFP6XpavPYRGc602ZLG9TJOqn5AT MrrfjZwgST4B9UUHWCFZoS1RT57/WA3pbIXvw9Nv79R/aanVgvzKofq+6ztKb5c6E/nv QBSQTA175bY8XIn81L/9V+n8rThtThyE+T9966iMWMKYpRt9LroegGcuxLBONsMJ7eu0 4ncA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=DdGGVOsQcvlGY8mhpIoNF0wkE6r9DWdDXMaTCBoIfSY=; b=MWCHE6UPFj98ObPFQ64qA160LEXjt9lAqMNKkp8B56iKwith2uHMBbiXrk53EZZVmE 8Ux046fk6nRtBs2DUsLFdfKgujZXztFjtdFnnazAlyHAr3N0Bs03pat1zxlIueJbWmGZ yM5gF+aejjmcC/HFAI96IoZZJrW/0C3V2b3Efu3k4RpAdLY+Ch2MRrHLTaLjvsKsq6A7 HGlW3ptqVXmQODr2M3rz1rr9KXhzdFtfDxuhfqSjtxxBUBk6yO0hllpjcXTjWwWfSe7Q hi+Gav1MC8MGWvBgDHu2L8DZuO2VBzvmaujfv2dfCkrO8vPdtijS0MRR98y0VWHAfP+q AB7g==
X-Gm-Message-State: ALyK8tJox6lREEddsbaLjTiKf1nJgPI146QiUtGgvwuKLZGpCS4N8f69vCpLFrjEqhHJdXzSUK8v8Uv0Qu31VQ==
X-Received: by 10.36.61.204 with SMTP id n195mr15862115itn.92.1467127839758; Tue, 28 Jun 2016 08:30:39 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.64.148.133 with HTTP; Tue, 28 Jun 2016 08:30:38 -0700 (PDT)
X-Originating-IP: [128.223.156.253]
In-Reply-To: <5D36713D8A4E7348A7E10DF7437A4B927CA893E9@NKGEML515-MBX.china.huawei.com>
References: <5D36713D8A4E7348A7E10DF7437A4B927CA893E9@NKGEML515-MBX.china.huawei.com>
From: David Meyer <dmm@1-4-5.net>
Date: Tue, 28 Jun 2016 08:30:38 -0700
Message-ID: <CAHiKxWhfdukjRVnSKnbLwMwamiqZYoCD350BoNqO4njqkQoU1g@mail.gmail.com>
To: Sheng Jiang <jiangsheng@huawei.com>
Content-Type: multipart/alternative; boundary="001a1144404ebacac80536585173"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nmlrg/eG98vUN3EaAiyeVGIkcv1r9xDYw>
Cc: "nmlrg@irtf.org" <nmlrg@irtf.org>
Subject: Re: [Nmlrg] links to each slide of presentations//RE: slides of NMLRG #3, June 27th, Athens
X-BeenThere: nmlrg@irtf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Network Machine Learning Research Group <nmlrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/nmlrg>, <mailto:nmlrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nmlrg/>
List-Post: <mailto:nmlrg@irtf.org>
List-Help: <mailto:nmlrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/nmlrg>, <mailto:nmlrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Tue, 28 Jun 2016 15:30:45 -0000

Sheng,

Thanks for the pointers. One thing I notice is that we don't have much on
what the characteristics of network data might be, and as such, what kinds
of existing learning algorithms might be suitable and where additional
development might be required. For example, is the collected data IID? Is
it time series? Is the underlying distribution stationary? If not, how does
that constrain the algorithms we might use or develop? For example, how does
a given algorithm deal with concept drift or internal covariate shift (in the
case of DNNs; see, e.g., batch normalization [0])? There are many other such
questions, such as: is the data categorical (e.g., ports, IP addresses) or
continuous/discrete (e.g., counters)? And if the data in question is
categorical, what is the cardinality of the categories (this will inform
how such data can be encoded); in the case of IP addresses we can't really
one-hot encode addresses because their cardinality is too large (2^32 or
2^128); this has implications for how we build classifiers (in particular,
for softmax layers in DNNs of various kinds).
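As a minimal, illustrative sketch (names and bucket count are my own, not anything from the slides), one common way around the cardinality problem is the hashing trick: instead of a 2^32-wide one-hot vector, hash each address into a fixed number of buckets, accepting occasional collisions in exchange for a bounded feature dimension.

```python
import zlib

def hash_encode_ip(ip: str, n_buckets: int = 1024) -> list[int]:
    """Map an IP address string into a fixed-size indicator vector via
    the hashing trick.  A true one-hot encoding would need 2^32 (or
    2^128) dimensions; here distinct addresses may collide, but the
    vector stays a manageable, fixed width."""
    vec = [0] * n_buckets
    bucket = zlib.crc32(ip.encode("utf-8")) % n_buckets
    vec[bucket] = 1
    return vec

v = hash_encode_ip("192.0.2.1")
```

The same trick applies to ports or any high-cardinality categorical field; the bucket count trades collision rate against feature-vector width.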

Related to the above is the question of features. What are good features
for networking? Where do they come from? Are they domain specific? Can we
learn features in the network space the way a DNN does? Can we use
autoencoders to discover such features? Or can we use GANs to train DNNs
for network classification tasks in an unsupervised manner? Are there
other, non-ad-hoc (well-founded) methods we can use, or is every use case a
one-off (one would hope not)?
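To make the autoencoder idea concrete, here is a toy sketch (synthetic data and dimensions are my own assumptions, not any proposal from the slides): a one-hidden-layer linear autoencoder trained with plain gradient descent to compress 8 "flow features" down to a 3-dimensional learned representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "flow records": 200 samples, 8 raw features (e.g., scaled counters).
X = rng.normal(size=(200, 8))

d_in, d_hidden = 8, 3
W_enc = rng.normal(scale=0.1, size=(d_in, d_hidden))
W_dec = rng.normal(scale=0.1, size=(d_hidden, d_in))
lr = 0.01

def loss(W_enc, W_dec):
    """Mean squared reconstruction error."""
    X_hat = X @ W_enc @ W_dec
    return float(((X_hat - X) ** 2).mean())

loss_before = loss(W_enc, W_dec)

for _ in range(500):
    H = X @ W_enc                      # encode to the bottleneck
    err = H @ W_dec - X                # reconstruction error
    # Gradients of the (scaled) squared-error loss.
    g_dec = H.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

loss_after = loss(W_enc, W_dec)
codes = X @ W_enc  # learned low-dimensional features
```

A linear bottleneck like this recovers a PCA-like subspace; nonlinear activations and deeper stacks are what would let it capture more interesting structure in real traffic data.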

We can apply the same kinds of analyses to the algorithms themselves. For
example, while something like k-means is an effective way to get a feeling
for how continuous/discrete data hangs together, if our data is categorical,
statistical clustering approaches such as LDA might provide a more
well-founded approach (of course, as with most Bayesian techniques, the
question of approximate inference arises, since in most interesting cases
the integral we need to solve, namely the marginal probability of the data,
isn't tractable, so we need to resort to MCMC or, more likely, variational
inference). And what about the use of SGD, batch normalization, etc. with
DNNs? Perhaps more importantly, can we use network data to train DNN
policy networks for reinforcement learning, as we saw with deep Q-learning
and AlphaGo?
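To make the k-means side of that comparison concrete, here is a minimal Lloyd's-algorithm sketch on synthetic continuous, counter-like data (the data, the deterministic initialization, and all names are my own illustration). Note that the Euclidean means it computes are exactly what stops making sense for categorical fields like ports or addresses:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated synthetic traffic "modes" in a 2-D counter space.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(100, 2)),
])

def kmeans(X, k=2, iters=20):
    """Plain Lloyd's algorithm.  Fine for continuous features, but the
    cluster-mean step is meaningless for categorical data (the 'mean'
    of two port numbers is not a port)."""
    # Seed one center from each half of the data so this sketch is
    # deterministic; real code would use k-means++ or random restarts.
    centers = X[[0, len(X) // 2]].copy()
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

centers, labels = kmeans(X)
```

For categorical data, a generative model like LDA replaces these geometric means with per-cluster distributions over the category values, which is why it is the better-founded choice there.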

These comments are all by way of saying that we don't have a solid
theoretical understanding (yet) of how techniques that have been so
successful in other domains (e.g., DNNs for perceptual tasks) generalize to
networking use cases. We will need this understanding if our goal is to
provide accurate, repeatable, and explainable results.

In order to accomplish all of this we need, as I have been saying, not
only a good understanding of how these algorithms work but also
standardized data sets and associated benchmarks so we can tell if we are
making progress (or even if our techniques work). Analogies here include
MNIST and ImageNet and their associated benchmarks, among others. As
mentioned, standardized data sets are key to making progress in the ML for
networking space (otherwise how do you know your technique works and/or
improves on other techniques?). One might assume that these data sets
would need to be labeled (as supervised learning is where most of the
progress is being made these days), but not necessarily; Generative
Adversarial Networks (GANs) have emerged as a new way to train DNNs in an
unsupervised manner (this is moving very rapidly; see e.g.,
https://openai.com/blog/generative-models/).

The summary here is that the "distance" between theory and practice in ML
is effectively zero right now due to the incredible rate of progress in the
field; this means we need to understand both sides of the theory/practice
coin in order to be effective. None of the slide decks provide much
background on what the proposed algorithms are, how they work, or why they
should be expected to work on network data.

Finally, if you are interested in LDA or other algorithms there are a few
short explanatory pieces I have written for my team at
http://www.1-4-5.net/~dmm/ml (works in progress).

Thanks,

Dave

[0] https://arxiv.org/pdf/1502.03167.pdf

On Tue, Jun 28, 2016 at 7:03 AM, Sheng Jiang <jiangsheng@huawei.com> wrote:

> Oops... The proceedings page for interim meetings seems not to be as
> intelligent as the proceedings pages of IETF meetings. Our proceedings
> page does not automatically show the slides. I have sent an email to the
> IETF secretariat to ask them to fix it. Meanwhile, here in this email are
> the links to each presentation:
>
> Chair Slides
>
> https://www.ietf.org/proceedings/interim-2016-nmlrg-01/slides/slides-interim-2016-nmlrg-01-0.pdf
>
> Introduction to Network Machine Learning & NMLRG
>
> https://www.ietf.org/proceedings/interim-2016-nmlrg-01/slides/slides-interim-2016-nmlrg-01-1.pdf
>
> Data Collection and Analysis At High Security Lab
>
> https://www.ietf.org/proceedings/interim-2016-nmlrg-01/slides/slides-interim-2016-nmlrg-01-2.pdf
>
> Use Cases of Applying Machine Learning Mechanism with Network Traffic
>
> https://www.ietf.org/proceedings/interim-2016-nmlrg-01/slides/slides-interim-2016-nmlrg-01-3.pdf
>
> Mobile network state characterization and prediction
>
> https://www.ietf.org/proceedings/interim-2016-nmlrg-01/slides/slides-interim-2016-nmlrg-01-4.pdf
>
> Learning how to route
>
> https://www.ietf.org/proceedings/interim-2016-nmlrg-01/slides/slides-interim-2016-nmlrg-01-5.pdf
>
> Regards,
>
> Sheng
> ________________________________________
> From: nmlrg [nmlrg-bounces@irtf.org] on behalf of Sheng Jiang [
> jiangsheng@huawei.com]
> Sent: 28 June 2016 21:03
> To: nmlrg@irtf.org
> Subject: [Nmlrg] slides of NMLRG #3, June 27th, Athens
>
> Hi, nmlrg,
>
> All slides that have been presented in our NMLRG #3 meeting, June 27th,
> 2016, Athens, Greece, with EUCNC2016, have been uploaded. They can be
> accessed through below link
>
> https://www.ietf.org/proceedings/interim/2016/06/27/nmlrg/proceedings.html
>
> Best regards,
>
> Sheng
> _______________________________________________
> nmlrg mailing list
> nmlrg@irtf.org
> https://www.irtf.org/mailman/listinfo/nmlrg
>