Re: [lmap] more comments on draft-akhter-lmap-framework

"Aamer Akhter (aakhter)" <aakhter@cisco.com> Fri, 26 July 2013 22:58 UTC

From: "Aamer Akhter (aakhter)" <aakhter@cisco.com>
To: "philip.eardley@bt.com" <philip.eardley@bt.com>, "lmap@ietf.org" <lmap@ietf.org>
Cc: "Paul Aitken (paitken)" <paitken@cisco.com>

Hi Phil,

Thanks for the comments.

Inline with AA>

From: lmap-bounces@ietf.org [mailto:lmap-bounces@ietf.org] On Behalf Of philip.eardley@bt.com
Sent: Friday, July 26, 2013 10:46 AM
To: lmap@ietf.org
Subject: [lmap] more comments on draft-akhter-lmap-framework

My final batch of comments

First, I think some of the material in Section 3 could be helpful expansion (on what was rather briefly mentioned in draft-eardley-lmap-framework). Sections 4 & 5 have lots of interesting stuff - question is whether this is a general framework for large-scale measurements (where we need to talk in detail about the tests) or whether it's a framework for LMAP WG (which only needs a brief mention of areas outside the scope of lmap wg, and so wouldn't have much about the tests, as this is ippm).

AA> Thanks, and I do not disagree completely. However, when it comes to the methods used for controlling MA to MA, we do need to be mindful of what the options are as well as the tradeoffs. I do like the earlier suggestion (I think from Dan) to separate out the properties of the active test control protocol that we (in the LMAP case) would need to worry about. This might even tell us whether there is a direct MA-to-MA test control channel at all, or whether it all needs to go via the controller.


Secondly, there were several comments about multiple measurement agents (& peers?) acting together (Sections 3.2 & 6.2.1). You mention several different cases; I'm not sure I understood them properly or got them all.

AA> The multiple measurement agent case arises when there is no anchor measurement agent (or the anchor measurement agent does not have enough capacity to saturate the path properly, in the case of a path capacity measurement). An anchor measurement agent may be lacking when the regulator wants to test an access link (within the ISP) but, as a non-ISP entity, has no test resources that are well connected, bandwidth-wise, within the ISP. Also, the regulator does not want to stress the ISP's upstream interface. If we were to pool the resources of multiple ISP-connected MAs (sitting in other broadband sites), we would be able to create the saturation without stressing the other broadband sites' links. I put a diagram in the presentation that Paul will be presenting that might be more helpful.
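
The pooling idea above can be sketched roughly as follows. This is only an illustration, not something taken from the draft: the function name, the `max_share` cap and all the numbers are hypothetical. It splits a target load across helper MAs so that no helper uses more than a fixed fraction of its own uplink.

```python
def plan_pooled_load(target_mbps, helper_uplinks_mbps, max_share=0.5):
    """Split a target load across helper MAs (hypothetical helper).

    Each helper may use at most `max_share` of its own uplink, so that
    its own broadband link is not stressed. Returns the per-helper send
    rates in Mbit/s, or None if the pool cannot reach the target.
    """
    budgets = [uplink * max_share for uplink in helper_uplinks_mbps]
    total_budget = sum(budgets)
    if total_budget < target_mbps:
        return None  # not enough helpers to saturate the target link
    # Scale every helper's budget down by the same factor.
    scale = target_mbps / total_budget
    return [budget * scale for budget in budgets]


# Twenty helpers with 20 Mbit/s uplinks can load a 100 Mbit/s link
# while each sends at only a quarter of its own uplink rate.
rates = plan_pooled_load(100, [20] * 20)
```

A real scheme would also have to account for where the helpers sit relative to the link under test, which this sketch ignores.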

(1) I'm sure there's a case where one test triggers another (say, the dns result followed by request to website) - this is ok.
(2) I think you have a 'latency under load' test (S3.2), where the latency is measured for a pkt/file being sent between MA & Measurement Peer, but another Peer is generating other traffic so that a bottleneck queue is loaded. I haven't thought about this much - do you think there are other tests involving a second Peer? or even 3rd, 4th...? Or is it just 'latency under load'?  I'm wondering whether it would be best to think about the specifics of how to do the 'latency under load' test, rather than a general capability for coordinating multiple Peers in one test.

AA> It was meant to be just an aggregate path capacity measurement. Sorry it was not clear; I think this section needs more verbiage.

(3) you mention (S6.2.1) a case where multiple Measurement Agents submit results all with the same Measurement Task ID. I guess this is a test where lots of coordinated measurements are made. You also mention a shared key between multiple MAs, which I think must be related. I don't understand this case, please explain more.
Am interested to understand this whole area better, regardless of where the WG decides to draw the line between in scope & out.

AA> The measurement task ID is simply the index used to collate the individual results submitted by the MAs post-collection. This would be the case when the MAs are all given the same instruction. I searched the document for 'shared key' and it appears only in S4.2.2. That shared key is completely different: it is a security device used to secure test control protocol (e.g. OWAMP) communications between the MAs.
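
As a rough illustration of the collation role described above, a collector might group incoming reports by task ID along these lines. The report fields (`task_id`, `ma_id`, `value`) are assumptions made for the sketch and are not defined in the draft.

```python
from collections import defaultdict


def collate_by_task(reports):
    """Group (ma_id, value) pairs under their measurement task ID.

    `reports` is an iterable of dicts with hypothetical keys
    "task_id", "ma_id" and "value".
    """
    grouped = defaultdict(list)
    for report in reports:
        grouped[report["task_id"]].append((report["ma_id"], report["value"]))
    return dict(grouped)


reports = [
    {"task_id": "t-17", "ma_id": "ma-a", "value": 41.2},
    {"task_id": "t-17", "ma_id": "ma-b", "value": 39.8},
    {"task_id": "t-18", "ma_id": "ma-a", "value": 12.0},
]
by_task = collate_by_task(reports)
```

Here the task ID only acts as a grouping key; it carries no security meaning, unlike the shared key in S4.2.2.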


Third, you have some stuff about task scheduling where it seems important to have highly synchronised time between the controller, MAs and collector (and measurement peer?). Why would this be needed? You mention NTP and the issue that NTP doesn't give very accurate time sync to a device on the end of an access line (because you can't triangulate to multiple time servers). I suppose the error might be a few millisecs; for the sorts of cases I had in mind it would easily be accurate enough. Incidentally, I don't see why your solution (NTP clock sent from controller) helps, since you still have the lack of triangulation on the end of an access line.

AA> The context of the time synchronization was when (for example) a one-way-latency test is run between two MAs. For a round-trip-time test such time synchronization would not be needed, but of course we can't simply take RTT/2 as the one-way latency. The time concerns are really just bubbling up from the OWAMP spec, but I believe them to be valid.
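
The difference between the two cases can be shown with a small worked example. The numbers are made up (times in milliseconds) and the function names are hypothetical: a clock offset between the two MAs lands directly in the one-way figure but cancels out of the round-trip figure, while RTT/2 is wrong whenever the two directions differ.

```python
def measured_one_way_delay(t1_initiator, t2_responder):
    """One-way delay as observed: the two timestamps come from different clocks."""
    return t2_responder - t1_initiator


def measured_rtt(t1_initiator, t2_responder, t3_responder, t4_initiator):
    """Round-trip time: each difference uses a single clock, so offsets cancel."""
    return (t4_initiator - t1_initiator) - (t3_responder - t2_responder)


# Assumed scenario: 30 ms forward, 10 ms reverse, responder clock 5 ms ahead,
# 1 ms of processing at the responder.
forward, reverse, offset, processing = 30, 10, 5, 1
t1 = 0                           # sent, initiator clock
t2 = t1 + forward + offset       # received, responder clock
t3 = t2 + processing             # reply sent, responder clock
t4 = t1 + forward + processing + reverse  # reply received, initiator clock

owd = measured_one_way_delay(t1, t2)   # 35: true delay 30 plus the 5 ms offset
rtt = measured_rtt(t1, t2, t3, t4)     # 40: unaffected by the offset
half_rtt = rtt / 2                     # 20.0: not the 30 ms forward delay
```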

AA> Regarding the NTP source as the controller: this was simply an attempt to get the MAs onto the same time hierarchy rather than differing unknown ones. It doesn't really have to be the controller itself, just a known time source that everybody agrees on. Will clarify.

AA> That said (I don't believe this is in draft-akhter-lmap-framework), I do see some value in having the collector loosely (vs tightly) synchronised with the MAs. This could be used to detect an MA that is completely out of time sync when it submits its reports. Perhaps useful...
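
One way such a loose check might look is sketched below. It assumes each report carries the MA's own submission timestamp, and the 300-second tolerance is an arbitrary placeholder; neither is specified in the draft.

```python
def is_out_of_sync(report_ts, received_ts, tolerance_s=300.0):
    """Flag a report whose MA-side timestamp is far from the collector's clock.

    `report_ts` is the time the MA says it submitted the report and
    `received_ts` is the collector's receive time, both in seconds.
    The tolerance is deliberately loose: it only needs to catch an MA
    whose clock is badly wrong, not to measure skew precisely.
    """
    return abs(received_ts - report_ts) > tolerance_s


ok = is_out_of_sync(1000.0, 1002.0)       # False: 2 s apart
suspect = is_out_of_sync(1000.0, 5000.0)  # True: more than an hour apart
```

If reports are batched or delayed in transit, the tolerance would need to be widened accordingly.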

Thanks again,
aa


Thanks
Best wishes
phil