Re: [Tsv-art] Tsvart last call review of draft-ietf-detnet-architecture-08

Lou Berger <lberger@labn.net> Mon, 19 November 2018 15:43 UTC

To: Michael Scharf <michael.scharf@hs-esslingen.de>
Cc: tsv-art@ietf.org, draft-ietf-detnet-architecture.all@ietf.org, detnet@ietf.org
References: <153817345967.27205.135001179751151278@ietfa.amsl.com>
From: Lou Berger <lberger@labn.net>
Message-ID: <fdf872d6-08a6-2c33-de21-9dd1506c1d21@labn.net>
Date: Mon, 19 Nov 2018 10:34:58 -0500
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/d85OCQ-7InceFtPIpJ0Lhkq_VEs>

Michael,

Speaking as doc shepherd, I wanted to take a step back from the multiple
discussions that were spawned by your review and see where we are. I know
that the authors have sent a -09 version that addresses some, but not all,
of the issues.

From the exchanges I've seen, I think the key remaining issues are
related to:

(a) The possible introduction of congestion in the general Internet if
packets were somehow to escape a DetNet domain. This congestion would be
caused either by inelastic traffic using DetNet or by congestion loss
that is masked by PREOF.

(b) The use of the term 'transport' in DetNet to refer to what is 
basically a Traffic Engineered sub-network layer, such as is provided 
with MPLS-TE or Optical Transport Networks.

Do you have any other issues that are critical to address before this
work moves forward? If so, which?

Thank you,

Lou

On 9/28/2018 6:24 PM, Michael Scharf wrote:
> Reviewer: Michael Scharf
> Review result: Ready with Issues
>
> The document "Deterministic Networking Architecture"
> (draft-ietf-detnet-architecture-08) defines an overall framework for
> Deterministic Networking.
>
> As TSV-ART reviewer, I believe that this document has issues as detailed below.
>
> Michael
>
> Major issues:
>
> * It seems that DetNet cannot easily be deployed in the Internet without
> additional means. Thus, for a baseline document, one could expect some
> explanation on the requirements of deploying DetNet in a network. DetNet
> basically requires support in (almost) all network devices transporting DetNet
> traffic. That assumption should be explicitly spelt out early in the document,
> e.g., in the introduction. There also needs to be an explicit discussion of the
> implications if not the whole network is aware of or supports DetNet. There is
> some text in Section 4.2.2 and Section 4.3.3, but I believe additional explicit
> discussion is needed at a prominent place. For instance, can use of DetNet do
> harm to parts of a network not supporting DetNet? As a side note, when TCPM
> published RFC 8257, the following disclaimer was added: "DCTCP, as described in
> this specification, is applicable to deployments in controlled environments
> like data centers, but it must not be deployed over the public Internet without
> additional measures." I wonder if a similar disclaimer is needed for DetNet. If
> there is an implicit assumption that DetNet will be used in homogeneous
> environments with mostly DetNet-aware devices within the same organization,
> such an assumption should be made explicit.
>
> * It is surprising that there is hardly any discussion on network robustness
> and safety; this probably also relates to security. For instance,
> misconfiguration or errors of functions performing packet replication could
> severely and permanently congest a network and cause harm. How does the DetNet
> architecture ensure that a network stays fully operational e.g. if the topology
> changes or there are equipment failures? Probably this can be solved by
> implementations (e.g., dynamic control plane), but why are corresponding
> requirements not spelt out? Section 3.3.2 speculates that filters and policers
> can help, and that may be true, but that probably still assumes consistently
> and correctly configured (and well-behaving) devices. And Section 3.3.2 is
> vague and mentions an "infinite variety of possible failures" without stating
> any requirements or recommendations. There may be further solutions, such as
> circuit breakers and the like. Why are such topics not discussed?
>
> * Somewhat related, the document only looks at impact of failures to the QoS of
> DetNet traffic. What is missing is a discussion how to protect non-DetNet parts
> of a network from any harm caused by DetNet mechanisms. Solutions to this
> probably exist. But why is the impact on non-DetNet traffic (e.g., in case of
> topology changes or failures of DetNet functions) not discussed at all in the
> document?
>
> * Regarding security, an architecture like DetNet probably requires that only
> authenticated and authorized end systems have access to the data plane. The
> security considerations only briefly mention the control aspect ("the
> authentication and authorization of the controlling systems").
>
> * For an architecture document, the lack of clarity and consistency regarding
> terminology is concerning. This specifically applies to the case of incomplete
> networks (as per Section 4.2.2 and 4.3.3) that include "DetNet-unaware nodes".
> The document introduces terms such as "DetNet intermediate nodes" but then
> repeatedly uses generic terms such as "node" or "hop" that may include
> DetNet-unaware nodes. For instance, for incomplete networks, a sentence such as
> "The primary means by which DetNet achieves its QoS assurances is to reduce, or
> even completely eliminate, congestion within a node as a cause of packet loss"
> seems to only apply to "DetNet transit nodes" but not "DetNet-unaware nodes".
> Similar ambiguity exists for other uses of the terms "hop" and "node", which may
> or may not include DetNet-unaware nodes. It is unclear why the document does
> not consistently use the terminology introduced in Section 2.1 in all sections
> and clearly distinguish cases with and without DetNet support.
>
> * Section 4.4 refers to RFC 7426, which is an informational RFC on IRTF stream,
> and the document uses the concepts introduced there (e.g., "planes"). This is
> very confusing. First, an IETF Proposed Standard should probably refer to
> documents having IETF consensus. An example would be RFC 7491, albeit there is
> other related work as well, e.g., in the TEAS WG. Second, Section 4.4 is by and
> large decoupled from the rest of the document and not specific to DetNet.
> Neither do other sections of the document refer to the concepts introduced in
> Section 4.4, nor does Section 4.4 use the DetNet terminology or discuss
> applicability to DetNet. Section 4.4 even mentions explicitly at the end that
> it discusses aspects that are orthogonal to the DetNet architecture. It is not
> at all clear why Section 4.4 is in this document. Section 4.4 could be removed
> from the document without impacting the rest of the document.
>
> Minor issues:
>
> * Terminology "DetNet transport layer"
>
>    The term "transport layer" has a well-defined meaning in the IETF, e.g.
>    originating from RFC 1122. While "transport" and e.g. "transport network" is
>    used in the IETF for different technologies in different areas, I think the
>    term "transport layer" is typically understood to refer to transport
>    protocols such as TCP and UDP. As such, I personally find the term "DetNet
>    transport layer" misleading and confusing. The confusion is easy to see e.g.
>    in Figure 4, where UDP (which is a transport protocol as per RFC 1122) sits
>    on top of "transport".
>
>    Based on the document it also may be solution/implementation specific whether
>    the "DetNet transport layer" is actually a separate protocol layer compared
>    to the "DetNet service layer". Thus it is not clear to me why the word
>    "layer" has to be used, specifically in combination "transport layer".
>
>    To me, the term "transport layer" (and "transport protocol") should be
>    used for protocols defined in TSV area, consistent with RFC 1122. But this is
>    probably a question to be sorted out by the IESG.
>
> * Page 9
>
>     A DetNet node may have other resources requiring allocation and/or
>     scheduling,
>
>    This is just one of several examples for inconsistent use of terminology.
>    What is a "DetNet node"? That term is not introduced in Section 2.1.
>
> * Page 14
>
>     A DetNet network supports the dedication of a high proportion (e.g.
>     75%) of the network bandwidth to DetNet flows.
>
>    The 75% value is not reasoned. What prevents using 99% of the bandwidth for
>    DetNet traffic?
>
> * Page 15: Figure 2
>
>    If the term "transport layer" cannot be avoided, the labels in this figure
>    should at least be expanded to "DetNet transport layer".
>
> * Page 18: Figure 4
>
>    As already mentioned earlier, Figure 4 is confusing. UDP is a transport
>    protocol. If the term "transport" cannot be avoided, the labels in this
>    figure should at least be expanded to "DetNet transport".
>
> * Page 23
>
>     If the source transmits less data than this limit
>     allows, the unused resource such as link bandwidth can be made
>     available by the system to non-DetNet packets.
>
>    Could there be additional requirements on the use of unused resources by
>    non-DetNet packets, e.g., regarding preemption? I am just wondering... If
>    that was possible, a statement like "... can be made available by the system
>    to non-DetNet packets as long as all guarantees are fulfilled" would be on
>    the safe side, no?
>
> * Page 27:
>
>     DetNet achieves congestion protection and bounded delivery latency by
>     reserving bandwidth and buffer resources at every hop along the path
>     of the DetNet flow.
>
>    Why does this sentence use the word "hop"? As far as I understand, in DetNet
>    bandwidth and buffer resources are reserved in each DetNet intermediate node.
>    If there were hops over IP routers not being DetNet intermediate nodes, no
>    resources would be reserved there. As per Section 4.3.3, it is possible to
>    deploy DetNet this way. And obviously there can be resource bottlenecks below
>    IP, on devices that are not routers... So does "hop" here refer to IP router
>    hops or also to devices not processing IP (or IP/MPLS)?
>
> * Page 27:
>
>     Standard queuing and transmission selection algorithms allow a
>     central controller to compute the latency contribution of each
>     transit node to the end-to-end latency, ...
>
>    The text does not explain why a _central_ controller is needed for this
>    computation. Why would a distributed control plane not be able to realize
>    this computation? Isn't this implementation-specific?
>
> * Page 32
>
>    To somebody who is not deeply familiar with DetNet, it is impossible to parse
>    the description of the examples in Section 4.7.3. For instance, "VID +
>    multicast MAC address" is not introduced. I think this example must be
>    expanded with additional context and explanation to be useful to readers.
>
> * Page 34
>
>     There are three classes of information that a central controller or
>     distributed control plane needs to know that can only be obtained
>     from the end systems and/or nodes in the network.
>
>    Wouldn't it be sufficient to state "Provisioning of DetNet requires knowledge
>    about ..."? Does it matter in this context whether the provisioning is done
>    by a central controller or a distributed control plane? For instance, could
>    the same paragraph also apply to a network that uses _multiple_ central
>    controllers, or hybrid combinations of central controllers and distributed
>    control planes? In general, an architecture document should be agnostic to
>    implementation aspects unless there is a specific need. In this specific
>    case, I fail to see a need to discuss the realization of the control plane of
>    a network.
>
> Editorial nits:
>
> * Page 9:
>
>     The low-level mechanisms described in Section 4.5 provide the
>     necessary regulation of transmissions by an end system or
>     intermediate node to provide congestion protection.  The allocation
>     of the bandwidth and buffers for a DetNet flow requires provisioning
>     A DetNet node may have other resources requiring allocation and/or
>     scheduling, that might otherwise be over-subscribed and trigger the
>     rejection of a reservation.
>
>    Probably a full stop is missing after "provisioning".
>
> * Page 11: "... along separate (disjoint non-SRLG) paths ..."
>
>    I find this confusing. I would understand e.g. "along separate
>    (SRLG-disjoint) paths".
>
> * Page 34:
>
>     When using a peer-
>     to-peer control plane, some of this information may be required by a
>     system's neighbors in the network.
>
>    Would "acquired" be a better term?
>
> * Page 34:
>
>     o  The identity of the system's neighbors, and the characteristics of
>        the link(s) between the systems, including the length (in
>        nanoseconds) of the link(s).
>
>    "Latency" or "delay" would probably be better terms if the value is
>    measured in nanoseconds.
>
> * Page 35:
>
>     DetNet is provides a Quality of Service (QoS), and as such, does not
>     directly raise any new privacy considerations.
>
>    Broken sentence
>
> * Please expand acronyms on first use (e.g., OTN)
>
>