Re: [Nmlrg] Review for draft-jiang-nmlrg-traffic-machine-learning-00.txt
Albert Cabellos <albert.cabellos@gmail.com> Mon, 18 July 2016 11:33 UTC
In-Reply-To: <578BF2BD.1060900@inria.fr>
References: <CAGE_QewtGRL58K-XLrFOE9a-vMjJEV8v5sthMQ3OeHdzAOKK8A@mail.gmail.com> <578BF2BD.1060900@inria.fr>
From: Albert Cabellos <albert.cabellos@gmail.com>
Date: Mon, 18 Jul 2016 13:33:49 +0200
Message-ID: <CAGE_QewKEGcLqb1XD-h98_sqHxxFAFzt22A-jDG-bWNN0ATQXA@mail.gmail.com>
To: Jérôme François <jerome.francois@inria.fr>
Content-Type: multipart/alternative; boundary="001a114bde2c9289810537e757d9"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nmlrg/Yzcm2krjr8Zamv0PQdtVBmJfhWE>
Cc: nmlrg@irtf.org, draft-jiang-nmlrg-traffic-machine-learning@ietf.org
Subject: Re: [Nmlrg] Review for draft-jiang-nmlrg-traffic-machine-learning-00.txt
Hi Jérôme

Please see inline:

On Sun, Jul 17, 2016 at 11:03 PM, Jérôme François
<jerome.francois@inria.fr> wrote:

> Hi Albert,
>
> 4.1. HTTPS Traffic Classification
>
>
> [snip]
>
>> As a concrete example, Google, Facebook or Amazon are service
>> providers while maps, drive, gmail are services of Google. To
>> identify them when they are accessed by a user, IP addresses and DNS
>> (Domain Name System) names based identification is not reliable as
>> the users can relies on intermediates to respectively serve as proxy
>> or resolve DNS requests. The SNI (Server Name Indication) [RFC5246]
>> is an extension of HTTPS which is indicated by the user when
>> initiating the TLS handshake (Client Hello). SNI actually contains
>> the hostname to which the request is addressed. Such an hostname is
>> significative of the service and service provider name. However, SNI
>> is an optional field and can be easily forged to circumvent HTTPS
>> filtering without impacting service use [bypasssni]. More advanced
>> mechanisms are hence necessary to improve the robustness of
>> identification even in the case of non collaborative users.
>
>
> I suggest being vendor-agnostic in the examples, the specific examples do
> not improve the draft by any means.
>
> I guess that the examples helps to understand what we mean by service
> provider and service, i.e. to illustrate that having two levels is
> something common nowadays.
>
>

I think that everyone understands this; I suggest keeping the document
vendor-neutral.

>
> [snip]
>
>>
>>
>>   HTTPS Connection
>>           +
>>           |(1)
>>   +-------v------+
>>   |TLS Connection|
>>   |Reconstruction|
>>   +-------+------+
>>           |(2)
>>   +-------v------+  (3')                         (4')
>>   |  Features    +-------------+----------------------------+
>>   |  Extraction  |             |                            |
>>   +-------+------+     +-------v---------+             +----v----+
>>           |            |Service Provider +------------->Services |
>>           |(3)         |L1 model         | Load        |L2 model |
>>           |            +-------^---------+ services    +----^----+
>>   +-------v------+             |           model X          |
>>   |SNI Labelling |             +----------------------------+
>>   +-------+------+             |(5)
>>           |            +-----------------------------------------+
>>           +------------>      Training and                       |
>>            (4)         |      Models building                    |
>>                        +-----------------------------------------+
>>
>>   Two-levels HTTPS traffic classification
>>
>> In figure above, step(1) consists in reconstructing the HTTPS
>> connection and retrieving packets on top of which the following
>> metrics are observed (2):
>>
>> o Inter Arrival Time
>>
>> o Packet size
>>
>> o Encrypted data size: this feature has the advantage to be strongly
>> related to the service accessed instead of the packet size which
>> is biased by other lower layer headers
>>
>> Based on these values, aggregated features are computed: average,
>> minimum, maximum, 25th percentile, median, 75th percentile.
>>
>>
> Does the authors see value on listing all the traffic features in an ANNEX?
>
> All can be found in the referenced paper and those that contribute to a
> good classification
> are the ones given in the draft. So, as the author, I would say "no".
>
>

Thanks! In any case, can the authors try to put together the features
that they are using and see if there are any common ones? The document
should be, to a reasonable extent, self-contained.

Cheers,
Albert

> Best regards,
> Jérôme
>
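
For readers who want the feature discussion above made concrete, here is a minimal sketch of how the six aggregates named in the quoted draft text (average, minimum, maximum, 25th percentile, median, 75th percentile) could be computed over the three per-packet metrics of one reconstructed TLS connection. It is only an illustration and not the authors' implementation: the function names, the feature-name scheme and the choice of the "inclusive" quantile method are assumptions, since the draft does not specify them.

```python
from statistics import mean, quantiles


def aggregate(values):
    """Return the six aggregates listed in the draft for one metric.

    The quantile interpolation method is an assumption; the draft
    does not say which one is used.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    q25, q50, q75 = quantiles(values, n=4, method="inclusive")
    return {
        "avg": mean(values),
        "min": min(values),
        "max": max(values),
        "p25": q25,
        "median": q50,
        "p75": q75,
    }


def connection_features(timestamps, packet_sizes, encrypted_sizes):
    """Build a flat feature dict for one connection (hypothetical helper).

    timestamps: packet arrival times, in order
    packet_sizes: full packet sizes
    encrypted_sizes: encrypted payload sizes
    """
    # Inter-arrival time is the gap between consecutive packets.
    iat = [b - a for a, b in zip(timestamps, timestamps[1:])]
    metrics = {
        "iat": iat,
        "pkt_size": packet_sizes,
        "enc_size": encrypted_sizes,
    }
    return {
        f"{name}_{stat}": value
        for name, series in metrics.items()
        for stat, value in aggregate(series).items()
    }
```

With three metrics and six aggregates each, this yields 18 features per connection. A list this short would fit in an annex without much effort.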