Re: [spring] Monitoring metric to detect and locate congestion

Tianran Zhou <zhoutianran@huawei.com> Fri, 28 February 2020 01:37 UTC

From: Tianran Zhou <zhoutianran@huawei.com>
To: Haoyu Song <haoyu.song@futurewei.com>, Robert Raszuk <robert@raszuk.net>
CC: "ippm-chairs@ietf.org" <ippm-chairs@ietf.org>, "spring@ietf.org" <spring@ietf.org>, "ippm@ietf.org" <ippm@ietf.org>
Date: Fri, 28 Feb 2020 01:37:05 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/uP7Jiz106k2y1qQxsc6lhKRxdsM>

I accept the second point, but it also has a shortcoming: a probe cannot carry much data in one packet.
I do not understand the first point.
You said “combination ingress+egress queuing information on each transit node”. I suppose you mean the information for all the queues on the node. In my opinion, this can only be collected in the control plane.

Best,
Tianran

From: Haoyu Song [mailto:haoyu.song@futurewei.com]
Sent: Friday, February 28, 2020 9:23 AM
To: Tianran Zhou <zhoutianran@huawei.com>; Robert Raszuk <robert@raszuk.net>
Cc: ippm-chairs@ietf.org; spring@ietf.org; ippm@ietf.org
Subject: RE: [spring] Monitoring metric to detect and locate congestion

There are two key differences: (1) you want to do this purely in the data-plane fast path, and (2) for scalability reasons, you don’t want to keep a streaming session from each node.
The path-probing approach can meet both requirements.

Haoyu

From: Tianran Zhou <zhoutianran@huawei.com>
Sent: Thursday, February 27, 2020 5:14 PM
To: Haoyu Song <haoyu.song@futurewei.com>; Robert Raszuk <robert@raszuk.net>
Cc: ippm-chairs@ietf.org; spring@ietf.org; ippm@ietf.org
Subject: RE: [spring] Monitoring metric to detect and locate congestion

This is a good point. In this way, the probe is used generically to collect node data.
At each node, the probe is sent to the slow path to fetch the node data. The data could be carried in the payload, which has more space.
Then what is the difference from the existing approach that Robert mentioned, that is, using streaming telemetry to export from each node?
As far as I know the state of the art, many devices that support streaming telemetry can export at intervals of 1-10 seconds. For customized data, the interval can even be reduced to 10-100 ms.

Best,
Tianran


From: ippm [mailto:ippm-bounces@ietf.org] On Behalf Of Haoyu Song
Sent: Friday, February 28, 2020 4:44 AM
To: Robert Raszuk <robert@raszuk.net>
Cc: ippm-chairs@ietf.org; spring@ietf.org; ippm@ietf.org
Subject: Re: [ippm] [spring] Monitoring metric to detect and locate congestion

Can the combined ingress+egress queuing information on each transit node be collected by simply visiting the node? If so, one or more probing paths that cover all the nodes are sufficient.

Best,
Haoyu

From: Robert Raszuk <robert@raszuk.net>
Sent: Thursday, February 27, 2020 12:18 PM
To: Haoyu Song <haoyu.song@futurewei.com>
Cc: Ruediger.Geib@telekom.de; ippm-chairs@ietf.org; spring@ietf.org; ippm@ietf.org
Subject: Re: [spring] Monitoring metric to detect and locate congestion

The point is not to assure all egress ports are inspected.

The point of any really useful end-to-end path probing is to find out whether the combination of ingress+egress queuing on each transit node my packet may traverse is free of congestion.

Looking at the Eulerian path algorithm, I do not see such guarantees.

Best,
R.

PS. Path probing is A1 to B1. It is not A1 to B1, B2 ... Bn, where An are the ingress ports to the network and Bn are the egress ports.

On Thu, Feb 27, 2020 at 9:14 PM Haoyu Song <haoyu.song@futurewei.com> wrote:
Hi Robert,

The Eulerian path algorithm is guaranteed to visit all the edges of a graph. In the SR context, we can consider the sub-path between two segments as an edge.
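
To make the edge-coverage claim concrete, here is a minimal sketch of Hierholzer's algorithm, one standard way to build an Eulerian trail. It is not taken from the thread or from any draft: the node names and the `eulerian_trail` helper are hypothetical, and each tuple stands for one SR sub-path treated as an undirected edge. It assumes the graph is connected and has zero or two odd-degree nodes, and returns None otherwise.

```python
from collections import defaultdict

def eulerian_trail(edges):
    """Return a node sequence that uses every undirected edge exactly once,
    or None if no such trail exists (hypothetical helper for illustration)."""
    if not edges:
        return []
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    # An Eulerian trail needs exactly 0 or 2 nodes of odd degree.
    odd = [n for n in adj if len(adj[n]) % 2 == 1]
    if len(odd) not in (0, 2):
        return None
    start = odd[0] if odd else edges[0][0]

    used = [False] * len(edges)
    stack, trail = [start], []
    while stack:
        node = stack[-1]
        # Drop edges that were already traversed from their other endpoint.
        while adj[node] and used[adj[node][-1][1]]:
            adj[node].pop()
        if adj[node]:
            nxt, idx = adj[node].pop()
            used[idx] = True
            stack.append(nxt)
        else:
            trail.append(stack.pop())
    trail.reverse()
    # In a disconnected graph some edges are never reached.
    return trail if len(trail) == len(edges) + 1 else None

# Hypothetical topology: a triangle A-B-C plus a stub link C-D.
links = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(eulerian_trail(links))
```

The returned sequence traverses each listed edge once, which is the coverage property being discussed. It says nothing about which ingress/egress queue pairs inside a node are exercised.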

Haoyu

From: Robert Raszuk <robert@raszuk.net>
Sent: Thursday, February 27, 2020 11:50 AM
To: Haoyu Song <haoyu.song@futurewei.com>
Cc: Ruediger.Geib@telekom.de; ippm-chairs@ietf.org; spring@ietf.org; ippm@ietf.org
Subject: Re: [spring] Monitoring metric to detect and locate congestion

Hi Haoyu,

> which applies Eulerian Path algorithm to find the minimum set of paths with network-wide coverage.

In practice, networks use ECMP. An ECMP decision may happen at each hop, so your end-to-end flows get spread over all ECMP paths. Limiting the number of probed paths is therefore at odds with the fundamental objective of the exercise.

That is in fact the main challenge with any end-to-end path probing today: if you do not cover all the possible paths your packets may take between ingress and egress, you just do not get the full picture of the network.
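
As a rough illustration of how quickly the set of paths grows under ECMP, the sketch below lists every minimum-hop path between one ingress and one egress. It is only a sketch under stated assumptions: all links have unit cost, the topology and node names (A1, S1, T1, B1, and so on) are hypothetical, and `ecmp_paths` is an illustrative helper, not an API of any router or tool.

```python
from collections import deque

def ecmp_paths(links, src, dst):
    """List every minimum-hop path from src to dst (unit link cost assumed)."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    # Breadth-first search from dst gives each node's hop distance to the egress.
    dist = {dst: 0}
    queue = deque([dst])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, []):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    if src not in dist:
        return []

    # From src, follow every neighbor that is exactly one hop closer to dst.
    paths = []

    def walk(node, path):
        if node == dst:
            paths.append(path)
            return
        for nbr in adj[node]:
            if dist.get(nbr) == dist[node] - 1:
                walk(nbr, path + [nbr])

    walk(src, [src])
    return paths

# Hypothetical two-tier topology with a two-way ECMP choice at two hops.
links = [("A1", "S1"), ("A1", "S2"),
         ("S1", "T1"), ("S1", "T2"), ("S2", "T1"), ("S2", "T2"),
         ("T1", "B1"), ("T2", "B1")]
for path in ecmp_paths(links, "A1", "B1"):
    print(" -> ".join(path))
```

With two equal-cost choices at each of two hops, this toy topology already has four paths between A1 and B1. A probe plan that covers only one of them sees a quarter of the forwarding options for that ingress/egress pair.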

Thx a lot,
R.


On Thu, Feb 27, 2020 at 8:44 PM Haoyu Song <haoyu.song@futurewei.com> wrote:
Hi Ruediger,

I like the general idea of using pre-determined paths in SR to collect performance metrics. I think this approach provides some unique benefits compared with other approaches. It also coincides with some related research work I’m doing.
Here are some thoughts I have.

  1.  I think IOAM could be used as the standard approach for such probing packets. It can collect the performance metrics mentioned in the draft and more.
  2.  An interesting problem that the draft raises but does not fully address is how to plan the optimal paths. A work called INT-PATH (https://ieeexplore.ieee.org/document/8737529) applies an Eulerian path algorithm to find the minimum set of paths with network-wide coverage. However, the problem here seems different: you need path coverage redundancy. My question is: do we really need the redundancy to achieve the measurement goal? If so, what should the best planning algorithm be? In a real, large-scale network, there are constraints on where the probing device(s) can be placed, and we usually want to monitor the entire network, so an efficient algorithm is necessary.
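
For a sense of what "minimum set of paths" means without redundancy, a standard graph-theory result says that a connected undirected graph with 2k odd-degree nodes (k >= 1) can be covered by k edge-disjoint trails and no fewer, and by a single trail when there are none. The sketch below just computes that lower bound. It does not reproduce the INT-PATH planning method, it ignores probe-placement constraints and redundancy, and `min_covering_trails` and the topologies are hypothetical.

```python
from collections import Counter

def min_covering_trails(edges):
    """Minimum number of edge-disjoint trails that cover every edge.

    Assumes a connected undirected graph with at least one edge.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    odd_nodes = sum(1 for d in degree.values() if d % 2 == 1)
    # Each trail can absorb at most two odd-degree endpoints.
    return max(1, odd_nodes // 2)

# Hypothetical topologies: a ring, and a full mesh of four nodes.
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
mesh = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
print(min_covering_trails(ring), min_covering_trails(mesh))
```

Any requirement for coverage redundancy would push the number of probing paths above this bound, which is why the planning question is a different problem.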

Best regards,
Haoyu

From: ippm <ippm-bounces@ietf.org> On Behalf Of Ruediger.Geib@telekom.de
Sent: Tuesday, February 25, 2020 11:55 PM
To: ippm-chairs@ietf.org
Cc: spring@ietf.org; ippm@ietf.org
Subject: [ippm] Monitoring metric to detect and locate congestion

Dear IPPM (and SPRING) participants,

I’m soliciting interest in a new network monitoring metric that makes it possible to detect and locate congested interfaces. Its important properties are:

  *   Same scalability as ICMP ping, in the sense that one measurement relation is required per monitored connection
  *   Adds detection and location of congested interfaces compared to ICMP ping (otherwise the measured metrics are compatible with ICMP ping)
  *   Requires Segment Routing (which means measurement on the forwarding layer, with no other interaction with the routers passed, in contrast to ICMP ping)
  *   Active measurement (may be deployed using a single combined sender/receiver or a separate sender and receiver; Segment Routing allows for both options)

I’d be happy to present the draft in Vancouver if there’s community interest. Please read and comment.

You’ll find slides at

https://datatracker.ietf.org/meeting/105/materials/slides-105-ippm-14-draft-geib-ippm-connectivity-monitoring-00

Draft url:

https://datatracker.ietf.org/doc/draft-geib-ippm-connectivity-monitoring/

Regards,

Ruediger