Re: [Idnet] Intelligence-Defined Network Architecture and Call for Interests

Hi,

Equally important to having datasets is *proposing benchmarks with the
datasets*. Right now, everyone has different expectations, and in my
personal experience many reviewers at networking conferences won't accept
a "typical" machine learning paper with applications in networking that
consists of data+model+evaluation. Expectations range from a large-scale
practical application case, to addressing worries about adversarial
influence on the learned model, to systems that not only machine-learn
malicious/anomalous/interesting behavior but also come with guidelines on
how to apply the results, e.g. filtering rules and parameter/threshold
setting guidelines. Often, these values have to be set with domain
knowledge and adjusted to the specific application case (as networks can
be very different).

The lack of benchmarks makes it incredibly hard to compare different
solutions to each other, or even to just apply a new method to an old
problem; and indeed, reviewers have pointed out to me that the problem of
machine learning from network traffic has been solved already and there is
no point in further papers on this topic. A *shared benchmark* solves this
problem by making the argument for relevance ONCE, when the benchmark is
set up, rather than leaving that task to each individual author.

Cheers
Christian

On 28 March 2017 at 20:46, Jérôme François <jerome.francois@inria.fr> wrote:

> Hi
>
> On 28/03/2017 at 20:25, Pedro Martinez-Julia wrote:
> > On Tue, Mar 28, 2017 at 10:59:38AM -0700, David Meyer wrote:
> >> Hey Sheng,
> >>
> >> I just wanted to revive my key concern on [0] (same one I made at the
> >> NMRL): The hard part of getting Machine Learning intelligence into
> >> Networking is the Machine Learning part. In addition, successful
> >> deployment of ML requires knowledge of ML combined with domain
> >> knowledge. We definitely have the domain knowledge; the problem is that
> >> we don't have the ML knowledge, and this is one of the big factors
> >> holding us back; see e.g. Andrew's discussion of talent in [1]. Slides
> >> such as [0] seem to imply that *someone else* (in particular, not us)
> >> will handle the ML part of all of this. I'll just note that in general
> >> successful deployments of ML don't work this way; the domain experts
> >> will have to learn ML (and vice versa) for us to be successful (again,
> >> see [1] and many others).
> > Dear Dave,
> >
> > You are right that ML/domain knowledge is necessary; however, it is
> > worth taking into account that it is not strictly required and it will
> > even be counterproductive in some (or maybe most) situations. At the end
> > of the day, encouraging (or forcing) a network expert to learn ML is
> > quite difficult, the results will be delayed until the learning phase
> > ends, and (most probably) s/he will never get a better solution than a
> > person who has long been an expert in ML. Therefore, it is better to
> > have separate experts (in ML and in the domain itself) collaborate on a
> > common solution. So, and I think it has been mentioned before, we have
> > to (try to) enroll ML experts in the IDNET group and see what we can do
> > together...
> It is generally true that a single person cannot be an expert in
> everything. Actually, a good network expert may need good ML skills but
> also good software skills (including knowledge of formal software
> verification). So, in my opinion, the problem is larger.
>
> Enhancing collaboration between network and ML experts is a path that
> many companies and institutes are starting on, I think. When I talk with
> ML experts, they are usually open and happy to discover new "use cases",
> but their first question will be "do you have some labelled datasets
> that we can work with?", which relates to the problem raised in previous
> emails about open datasets.
>
> In my opinion, if we want to attract ML experts to our discussions, we
> should identify a few scenarios, define them precisely, and provide an
> open dataset. By defining them, I mean we have to give ML experts all the
> background they need to understand the scenarios (this can be built
> incrementally through discussion) in a well-documented format.
>
> jerome
>
> >> Perhaps a useful exercise would be to write an ID that makes your
> >> assumptions explicit?
> >>
> >> Thanks,
> >> Dave
> > Regards,
> > Pedro
> >