Re: [Nmlrg] Review for draft-jiang-nmlrg-traffic-machine-learning-00.txt
Jérôme François <jerome.francois@inria.fr> Sun, 17 July 2016 21:04 UTC
Message-ID: <578BF2BD.1060900@inria.fr>
Date: Sun, 17 Jul 2016 23:03:57 +0200
From: Jérôme François <jerome.francois@inria.fr>
To: Albert Cabellos <albert.cabellos@gmail.com>, draft-jiang-nmlrg-traffic-machine-learning@ietf.org, nmlrg@irtf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/nmlrg/IiRqdG7pfmB4qeMs32xWyo8fGzs>
Subject: Re: [Nmlrg] Review for draft-jiang-nmlrg-traffic-machine-learning-00.txt

Hi Albert,

>     4.1.  HTTPS Traffic Classification
>
>     [snip]
>
>     As a concrete example, Google, Facebook or Amazon are service
>     providers while maps, drive, gmail are services of Google.  To
>     identify them when they are accessed by a user, IP addresses and DNS
>     (Domain Name System) names based identification is not reliable as
>     the users can relies on intermediates to respectively serve as proxy
>     or resolve DNS requests.  The SNI (Server Name Indication) [RFC5246]
>     is an extension of HTTPS which is indicated by the user when
>     initiating the TLS handshake (Client Hello).  SNI actually contains
>     the hostname to which the request is addressed.  Such an hostname is
>     significative of the service and service provider name.  However, SNI
>     is an optional field and can be easily forged to circumvent HTTPS
>     filtering without impacting service use [bypasssni].  More advanced
>     mechanisms are hence necessary to improve the robustness of
>     identification even in the case of non collaborative users.
>
> I suggest being vendor-agnostic in the examples, the specific examples
> do not improve the draft by any means.

I think the examples help readers understand what we mean by service
provider and service, i.e. they illustrate that having two levels is
common nowadays.

>     [snip]
>
>       HTTPS Connection
>               +
>               |(1)
>       +-------v------+
>       |TLS Connection|
>       |Reconstruction|
>       +-------+------+
>               |(2)
>       +-------v------+     (3')                              (4')
>       |  Features    +-------------+----------------------------+
>       |  Extraction  |             |                            |
>       +-------+------+     +-------v---------+             +----v----+
>               |            |Service Provider +------------->Services |
>               |(3)         |L1 model         |    Load     |L2 model |
>               |            +-------^---------+   services  +----^----+
>       +-------v------+             |             model X        |
>       |SNI Labelling |             +----------------------------+
>       +-------+------+             |(5)
>               |            +-----------------------------------------+
>               +------------>              Training and               |
>              (4)           |             Models building             |
>                            +-----------------------------------------+
>
>                 Two-levels HTTPS traffic classification
>
>     In figure above, step(1) consists in reconstructing the HTTPS
>     connection and retrieving packets on top of which the following
>     metrics are observed (2):
>
>     o  Inter Arrival Time
>
>     o  Packet size
>
>     o  Encrypted data size: this feature has the advantage to be strongly
>        related to the service accessed instead of the packet size which
>        is biased by other lower layer headers
>
>     Based on these values, aggregated features are computed: average,
>     minimum, maximum, 25th percentile, median, 75th percentile.
>
> Does the authors see value on listing all the traffic features in an
> ANNEX?

They can all be found in the referenced paper, and the ones that
contribute to a good classification are those given in the draft. So, as
the author, I would say "no".

Best regards,
Jérôme
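
The aggregation step quoted above (average, minimum, maximum, 25th
percentile, median, 75th percentile of each observed metric) can be
sketched in Python as follows. This is only a minimal illustration, not
the implementation behind the draft: the function name `aggregate`, the
dictionary keys, the sample packet sizes and the choice of the
"inclusive" percentile interpolation are all assumptions, since the
draft does not specify them.

```python
from statistics import mean, quantiles


def aggregate(values):
    """Compute the six summary statistics named in the draft for one metric.

    `values` holds the per-packet observations of a single metric
    (e.g. packet size) for one reconstructed TLS connection.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # quantiles(n=4) returns the three quartile cut points; the
    # interpolation method is an assumption, not taken from the draft.
    p25, median, p75 = quantiles(values, n=4, method="inclusive")
    return {
        "avg": mean(values),
        "min": min(values),
        "max": max(values),
        "p25": p25,
        "median": median,
        "p75": p75,
    }


# Hypothetical packet sizes in bytes for one connection.
packet_sizes = [60, 120, 1500, 1500, 590]
features = aggregate(packet_sizes)
```

The same function would presumably be applied to each of the three
metrics (inter-arrival time, packet size, encrypted data size) to build
the feature vector fed to the L1 and L2 models.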