Re: [Last-Call] Tsvart last call review of draft-ietf-opsawg-ntf-09

Haoyu Song <haoyu.song@futurewei.com> Mon, 01 November 2021 20:21 UTC

From: Haoyu Song <haoyu.song@futurewei.com>
To: Michael Scharf <michael.scharf@hs-esslingen.de>, "tsv-art@ietf.org" <tsv-art@ietf.org>
CC: "draft-ietf-opsawg-ntf.all@ietf.org" <draft-ietf-opsawg-ntf.all@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>, "opsawg@ietf.org" <opsawg@ietf.org>
Date: Mon, 01 Nov 2021 20:21:31 +0000
Message-ID: <BY3PR13MB47873525E293F371447364229A8A9@BY3PR13MB4787.namprd13.prod.outlook.com>
References: <163572266994.9090.12397686878265317058@ietfa.amsl.com>
In-Reply-To: <163572266994.9090.12397686878265317058@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/last-call/0i3wAkizZcIMCkUz16Fdauls7Xw>
Subject: Re: [Last-Call] Tsvart last call review of draft-ietf-opsawg-ntf-09

Hi Michael,

Thank you very much for the review! 
Following your suggestion, we now explicitly list congestion avoidance as a requirement for each plane and add RFC 8085 as a BCP reference.
We have also adopted your suggestions on the precise terms used in the table.
I have just one question: while IPFIX can run over TCP/UDP/SCTP, for the forwarding plane we recommend using it over UDP only, for simplicity. Is this acceptable?
I'll upload a new version of the document as soon as the submission website is reopened. Thanks!

Best regards,
Haoyu

-----Original Message-----
From: Michael Scharf via Datatracker <noreply@ietf.org> 
Sent: Sunday, October 31, 2021 4:24 PM
To: tsv-art@ietf.org
Cc: draft-ietf-opsawg-ntf.all@ietf.org; last-call@ietf.org; opsawg@ietf.org
Subject: Tsvart last call review of draft-ietf-opsawg-ntf-09

Reviewer: Michael Scharf
Review result: Ready with Issues

This document has been reviewed as part of the transport area review team's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors and WG to allow them to address any issues raised and also to the IETF discussion list for information.

When done at the time of IETF Last Call, the authors should consider this review as part of the last-call comments they receive. Please always CC tsv-art@ietf.org if you reply to or forward this review.

This informational document describes an architectural framework for network telemetry and the main components of corresponding systems.

It has two issues related to TSV topics:

First, the document lacks a discussion of the importance of congestion control for telemetry traffic as well as corresponding references, e.g., to RFC 8085.
High-volume telemetry traffic can overload a network unless proper counter-measures are in place (i.e., at minimum "circuit breakers"). It doesn't seem appropriate to entirely ignore that issue.

Second, language regarding the ambiguous term "transport" and the references to Internet transport protocols must be improved to be consistent with IETF standards.

Below are some examples of sections in which these issues are obvious.

Section 3.4

   It is worth noting that a network telemetry system should not be
   intrusive to normal network operations by avoiding the pitfall of the
   "observer effect".  That is, it should not change the network
   behavior and affect the forwarding performance.  Otherwise, the whole
   purpose of network telemetry is compromised.

=> This statement should be extended to be very explicit about the risk of causing network congestion by high-volume telemetry traffic unless proper isolation or traffic engineering techniques are in place, or congestion control mechanisms ensure that telemetry traffic backs off if it exceeds the network capacity. RFC 8085 is a relevant BCP in this space. As a side note, RFC 8085 discusses other relevant challenges as well, but the issues caused by potentially inelastic high-volume telemetry traffic seem particularly relevant for ensuring network stability when telemetry solutions get deployed.
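To make the "backs off" requirement concrete, here is a minimal sketch of an exporter-side rate cap that uses additive increase and multiplicative decrease. It is only an illustration, not something taken from the draft or from RFC 8085: the class name, the parameter defaults, and the assumption that the exporter receives a boolean congestion signal (for example, derived from collector-reported loss) are all hypothetical.

```python
class AimdExportRate:
    """Hypothetical AIMD cap on a telemetry exporter's send rate (packets/s)."""

    def __init__(self, min_pps=10.0, max_pps=10_000.0, step_pps=50.0, backoff=0.5):
        # All defaults are illustrative assumptions, not recommended values.
        self.min_pps = min_pps
        self.max_pps = max_pps
        self.step_pps = step_pps
        self.backoff = backoff
        self.rate_pps = min_pps  # start conservatively

    def on_feedback(self, congested: bool) -> float:
        """Update the allowed rate after one feedback interval."""
        if congested:
            # Multiplicative decrease: back off quickly when congestion is seen.
            self.rate_pps = max(self.min_pps, self.rate_pps * self.backoff)
        else:
            # Additive increase: probe slowly for spare capacity.
            self.rate_pps = min(self.max_pps, self.rate_pps + self.step_pps)
        return self.rate_pps
```

With these defaults, three intervals without congestion raise the cap from 10 to 160 packets/s, and a single congested interval halves it to 80.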

4.1.  Top Level Modules

   +---------+--------------+--------------+---------------+-----------+
   | Module  | Management   | Control      | Forwarding    | External  |
   |         | Plane        | Plane        | Plane         | Data      |
   +---------+--------------+--------------+---------------+-----------+
   |Object   | config. &    | control      | flow & packet | terminal, |
   |         | operation    | protocol &   | QoS, traffic  | social &  |
   |         | state        | signaling,   | stat., buffer | environ-  |
   |         |              | RIB          | & queue stat.,| mental    |
   |         |              |              | ACL, FIB      |           |
   +---------+--------------+--------------+---------------+-----------+
   |Export   | main control | main control | fwding chip   | various   |
   |Location | CPU          | CPU,         | or linecard   |           |
   |         |              | linecard CPU | CPU; main     |           |
   |         |              | or forwarding| control CPU   |           |
   |         |              | chip         | unlikely      |           |
   +---------+--------------+--------------+---------------+-----------+
   |Data     | YANG, MIB,   | YANG,        | template,     | YANG,     |
   |Model    | syslog       | custom       | YANG,         | custom    |
   |         |              |              | custom        |           |
   +---------+--------------+--------------+---------------+-----------+
   |Data     | GPB, JSON,   | GPB, JSON,   | plain         | GPB, JSON |
   |Encoding | XML          | XML, plain   |               | XML, plain|
   +---------+--------------+--------------+---------------+-----------+
   |Protocol | gRPC,NETCONF,| gRPC,NETCONF,| IPFIX, mirror,| gRPC      |
   |         |              | IPFIX, mirror| gRPC, NETFLOW |           |
   +---------+--------------+--------------+---------------+-----------+
   |Transport| HTTP, TCP    | HTTP, TCP,   | UDP           | HTTP,TCP  |
   |         |              | UDP          |               | UDP       |
   +---------+--------------+--------------+---------------+-----------+

=> This table needs to be corrected.

1/ At least the entry in the column "forwarding plane" for IPFIX seems incorrect, as the IETF has standardized IPFIX use over TCP, UDP and also SCTP.

HS>> Yes, IPFIX can run over TCP/UDP/SCTP, but for the forwarding plane we recommend using it over UDP only, for simplicity. Is that okay?

2/ The label "transport" in the last row should be replaced by another term (maybe "data transport"?). In the TCP/IP protocol stack, "HTTP" is not a transport protocol but an application protocol, unlike TCP and UDP. As a result, the row label should use a term that cannot be confused with the name of a layer in the TCP/IP protocol stack.


3/ The label "protocol" in the second-to-last row is also misleading. All entries in the "transport" row are protocols as well. The term "Application protocol" may be one option; others may exist as well.
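
The layering point in items 1/ to 3/ can be sketched as a small data structure that keeps the application protocol separate from the transport protocols it runs over. This is only an illustration: the type and function names are hypothetical, and the gRPC and NETCONF mappings reflect the usual deployments (gRPC over HTTP/2 over TCP, NETCONF over SSH or TLS over TCP) rather than anything stated in the draft.

```python
from dataclasses import dataclass

# Protocols that sit at the transport layer of the TCP/IP stack.
TRANSPORT_LAYER = {"TCP", "UDP", "SCTP"}


@dataclass(frozen=True)
class ExportStack:
    """Hypothetical record: one application protocol and its possible transports."""
    application_protocol: str
    transports: tuple


STACKS = [
    ExportStack("IPFIX", ("TCP", "UDP", "SCTP")),  # all three are standardized
    ExportStack("gRPC", ("TCP",)),                 # assumption: via HTTP/2
    ExportStack("NETCONF", ("TCP",)),              # assumption: via SSH or TLS
]


def misplaced_transport_entries(entries):
    """Return the entries of a 'Transport' row that are not transport protocols."""
    return sorted(e for e in entries if e not in TRANSPORT_LAYER)
```

Applied to the management plane row of the table, `misplaced_transport_entries(["HTTP", "TCP"])` flags `HTTP`, which is the mismatch item 2/ describes.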


4.1.1.  Management Plane Telemetry

   *  High Speed Data Transport: In order to keep up with the velocity
      of information, a server needs to be able to send large amounts of
      data at high frequency.  Compact encoding formats or data
      compression schemes are needed to reduce the quantity of data and
      improve the data transport efficiency.  The subscription mode, by
      replacing the query mode, reduces the interactions between clients
      and servers and helps to improve the server's efficiency.

=> The server is not the only bottleneck. This section needs to discuss the network as a potential bottleneck as well, and explain that a telemetry solution must protect the network from congestion by congestion control mechanisms or at least circuit breakers. RFC 8085 is a relevant BCP in this space.
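
As a rough illustration of the "at least circuit breakers" fallback, the sketch below stops telemetry export once the measured loss stays above a threshold for several consecutive reporting intervals. It is a simplified sketch under stated assumptions, not a mechanism defined by the draft or by RFC 8085: the class name, the thresholds, and the availability of per-interval sent/received counts from the collector are all hypothetical.

```python
class TelemetryCircuitBreaker:
    """Hypothetical circuit breaker for a telemetry export stream."""

    def __init__(self, loss_threshold=0.1, trip_after=3):
        # Illustrative defaults: trip after 3 consecutive intervals above 10% loss.
        self.loss_threshold = loss_threshold
        self.trip_after = trip_after
        self.bad_intervals = 0
        self.tripped = False

    def record_interval(self, sent: int, received: int) -> bool:
        """Feed one interval's counters; return True once the breaker has tripped."""
        if self.tripped or sent == 0:
            return self.tripped
        loss = 1.0 - received / sent
        if loss > self.loss_threshold:
            self.bad_intervals += 1
        else:
            self.bad_intervals = 0  # a healthy interval clears the streak
        if self.bad_intervals >= self.trip_after:
            self.tripped = True
        return self.tripped

    def allow_send(self) -> bool:
        """The exporter checks this before emitting telemetry packets."""
        return not self.tripped

    def reset(self) -> None:
        """Manual re-arm, e.g. after an operator has checked the path."""
        self.bad_intervals = 0
        self.tripped = False
```

Requiring several consecutive bad intervals is a design choice that keeps a single lossy interval from disabling telemetry; a deployment would need to tune both values to its own reporting period.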

4.1.2.  Control Plane Telemetry

=> Discussion of the risk of congestion by telemetry protocols without congestion control (e.g., using UDP possibly without circuit breakers) is missing in this section.

4.1.3.  Forwarding Plane Telemetry

   *  The data plane devices must provide timely data with the minimum
      possible delay.  Long processing, transport, storage, and analysis
      delay can impact the effectiveness of the control loop and even
      render the data useless.

=> As in the previous section, this wording entirely ignores the impact of potential network capacity shortage and congestion. A reference to RFC 8085 and a corresponding discussion of how to meet the requirements from RFC 8085 is missing.

4.1.4.  External Data Telemetry

=> As the communication with "external" entities outside the boundary of a provider network may be realized over the Internet, the risk of congestion as well as proper counter-measures are even more relevant in this section than in the previous sections.