Re: [ippm] [spring] Monitoring metric to detect and locate congestion

<Ruediger.Geib@telekom.de> Wed, 26 February 2020 11:01 UTC

From: Ruediger.Geib@telekom.de
To: robert@raszuk.net
CC: ippm-chairs@ietf.org, spring@ietf.org, ippm@ietf.org
Thread-Topic: [spring] Monitoring metric to detect and locate congestion
Thread-Index: AdXsd9MS3uEh95p5QyK2WBDxA74xnQAEqfmAAAB4ZeA=
Date: Wed, 26 Feb 2020 11:01:12 +0000
Message-ID: <FRXPR01MB039210375C9806E4D47DD14C9CEA0@FRXPR01MB0392.DEUPRD01.PROD.OUTLOOK.DE>
References: <FRXPR01MB03926E7F0A28B837D69C3C819CEA0@FRXPR01MB0392.DEUPRD01.PROD.OUTLOOK.DE> <CAOj+MMHZWoVaP8-h17n86cMB_ZY0CjyO9GNShDjKM_NDTxOXxA@mail.gmail.com>
In-Reply-To: <CAOj+MMHZWoVaP8-h17n86cMB_ZY0CjyO9GNShDjKM_NDTxOXxA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/NQQNJWXIeN-2Xo2fG78tif9mQcA>
Subject: Re: [ippm] [spring] Monitoring metric to detect and locate congestion

Hi Robert,

Thanks, my replies are inline, marked [RG1].

I have read your draft and presentation with interest, as I am a big supporter of, and in some lab trials a user of, end-to-end network path probing.

A few comments, observations, questions:

You are essentially measuring and comparing delay across N paths traversing a known network topology (I love the name "network tomography"!)

[RG1] It’s network tomography with a constrained set-up, but the term doesn’t appear in the draft yet. That can be changed.
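
The idea of comparing delay across paths over a known topology can be illustrated with a minimal sketch. It is not the procedure specified in the draft: the function name, the interface labels, the delay values and the 500 us threshold are all hypothetical, and the sketch assumes that a single congested interface explains all delayed paths.

```python
from typing import Dict, FrozenSet, Set


def locate_congested_interfaces(
    paths: Dict[str, FrozenSet[str]],
    baseline_delay_us: Dict[str, float],
    measured_delay_us: Dict[str, float],
    threshold_us: float = 500.0,  # hypothetical threshold for "delay increased"
) -> Set[str]:
    """Return interfaces that lie on every delayed path and on no undelayed path.

    Assumes a known topology (each path is the set of interfaces it traverses)
    and a single congested interface.
    """
    delayed = {
        name
        for name in paths
        if measured_delay_us[name] - baseline_delay_us[name] > threshold_us
    }
    if not delayed:
        return set()

    # Interfaces shared by all paths that show increased delay.
    candidates = set.intersection(*(set(paths[name]) for name in delayed))

    # Interfaces on a path with normal delay cannot be the congested one.
    for name, interfaces in paths.items():
        if name not in delayed:
            candidates -= interfaces
    return candidates


# Hypothetical topology: three measurement paths over five interfaces.
paths = {
    "P1": frozenset({"A-B", "B-C"}),
    "P2": frozenset({"A-B", "B-D"}),
    "P3": frozenset({"A-E", "E-C"}),
}
baseline = {"P1": 1000.0, "P2": 1200.0, "P3": 1500.0}
measured = {"P1": 2100.0, "P2": 2350.0, "P3": 1510.0}

print(locate_congested_interfaces(paths, baseline, measured))  # {'A-B'}
```

P1 and P2 both show increased delay and share only interface A-B, while P3, which avoids A-B, is unaffected, so A-B is the only remaining candidate.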

------
* First question - this will likely run on the RE/RP, and on some platforms the path between LC and RE/RP is completely non-deterministic and can take 10s or 100s of ms locally in the router. So right here the proposal to compare anything may not really work - unless the spec mandates that timestamping is actually done in hardware on the receiving LC. Then the CPU can process it when it has cycles.

[RG1] The measurement packets pass the routers on the forwarding plane. High-end routers add variable processing latencies. On Deutsche Telekom backbone routers these are in the double-digit or low triple-digit microsecond range. If a dedicated sender/receiver system is used, timestamping may be optimized for the purpose.
------

* Second question is that congestion usually has a very transient character ... You would need to be super lucky to find any congestion in a normal network using test probes of any sort. If you have interfaces that are always congested, then just the queue depth time delta may not be visible in end-to-end measurements.

[RG1] The probing frequency depends on the characteristics of the congestion the operator wants to be aware of. Unplanned events (congestion or hardware issues) may cause changes in measured delays lasting for minutes or longer. The duration of a reliably detectable “event” corresponds to the spacing between measurement packets (I don’t intend to replace hello exchanges or BFD with the metric).
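
The relation between probe spacing and the shortest reliably detectable event can be sketched as follows. This is an illustration under stated assumptions, not part of the draft: probes are assumed to be strictly periodic, and the requirement that several consecutive probes must fall into an event before it is reported (`probes_required`) is a hypothetical parameter, as are the function names.

```python
def min_detectable_event_s(probe_interval_s: float, probes_required: int = 3) -> float:
    """Shortest event duration that is always hit by `probes_required` probes.

    With strictly periodic probes spaced T seconds apart, an event lasting D
    seconds contains at least floor(D / T) probes regardless of phase, so
    D = probes_required * T is the shortest duration with that guarantee.
    """
    if probe_interval_s <= 0 or probes_required < 1:
        raise ValueError("interval must be positive and probes_required >= 1")
    return probes_required * probe_interval_s


def max_probe_interval_s(event_duration_s: float, probes_required: int = 3) -> float:
    """Largest probe spacing that still guarantees `probes_required` hits."""
    if event_duration_s <= 0 or probes_required < 1:
        raise ValueError("duration must be positive and probes_required >= 1")
    return event_duration_s / probes_required


# Hypothetical numbers: one probe every 10 s, three hits needed to report.
print(min_detectable_event_s(10.0))   # 30.0
# To catch events lasting one minute with three hits, probe every 20 s or faster.
print(max_probe_interval_s(60.0))     # 20.0
```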
------

* Third - why not simply look at the queue counters at each node? Queue depth, queue history, min, avg, max on a per-interface basis offer tons of information that is readily available. Why would anyone need to inject loops of probe packets into a known network to detect this? And in black-box, unknown networks this is not going to work, as you would not know the network topology in the first place. Likewise, link down/up is already reflected in your syslog via BFD and IGP alarms. I really do not think you need an end-to-end protocol to tell you that.

[RG1] Up to now, the conditions in Deutsche Telekom’s backbone network have required a careful design of the router probing interval and of the counters to be read. The proposed metric makes it possible to capture persistent issues impacting forwarding. It points out where these likely occur. An operator may then have a closer look at an interface/router to analyse what’s going on, using the full arsenal of accessible information and tools. As unusual events happen rarely, it may still be a fair question for which purpose line card and central processing cycles of routers are consumed.
-------
+ Thanks for catching the nit below..

Regards, Ruediger

s/nodes L100 and L200 one one/nodes L100 and L200 on one/

:)

Many thx,
R.

On Wed, Feb 26, 2020 at 8:55 AM <Ruediger.Geib@telekom.de> wrote:
Dear IPPM (and SPRING) participants,

I’m soliciting interest in a new network monitoring metric which makes it possible to detect and locate congested interfaces. Important properties are:

  *   Same scalability as ICMP ping, in the sense that one measurement relation is required per monitored connection
  *   Adds detection and location of congested interfaces as compared to ICMP ping (otherwise the measured metrics are compatible with ICMP ping)
  *   Requires Segment Routing (which means measurement on the forwarding layer and no other interaction with the routers passed, in contrast to ICMP ping)
  *   Active measurement (may be deployed using a single sender and receiver or a separate sender and receiver; Segment Routing allows for both options)

I’d be happy to present the draft in Vancouver if there’s community interest. Please read and comment.

You’ll find slides at

https://datatracker.ietf.org/meeting/105/materials/slides-105-ippm-14-draft-geib-ippm-connectivity-monitoring-00

Draft url:

https://datatracker.ietf.org/doc/draft-geib-ippm-connectivity-monitoring/

Regards,

Ruediger