Re: [trill] Adam Roach's Discuss on draft-ietf-trill-ecn-support-05: (with DISCUSS and COMMENT)

Adam Roach <adam@nostrum.com> Thu, 08 February 2018 19:16 UTC

To: Donald Eastlake <d3e3e3@gmail.com>
Cc: draft-ietf-trill-ecn-support@ietf.org, trill-chairs@ietf.org, The IESG <iesg@ietf.org>, Susan Hares <shares@ndzh.com>, trill@ietf.org
References: <151805235597.17192.8686136333532184356.idtracker@ietfa.amsl.com> <CAF4+nEGcZF4dNTgKQXbk2NWBFRkYo_E9m0uraaaKU6Nk+hRrQg@mail.gmail.com>
From: Adam Roach <adam@nostrum.com>
Message-ID: <fe824bd5-407b-3a65-7e56-94df827cf819@nostrum.com>
Date: Thu, 08 Feb 2018 13:15:51 -0600
Archived-At: <https://mailarchive.ietf.org/arch/msg/trill/NX0zb_geK3wblflzkCzaNaRTEiE>

On 2/8/18 11:53 AM, Donald Eastlake wrote:
> Hi Adam,
>
> On Wed, Feb 7, 2018 at 8:12 PM, Adam Roach <adam@nostrum.com> wrote:
> > Adam Roach has entered the following ballot position for
> > draft-ietf-trill-ecn-support-05: Discuss
> >
> > ...
> >
> > ----------------------------------------------------------------------
> > DISCUSS:
> > ----------------------------------------------------------------------
> >
> > Thanks to the authors, chairs, shepherd, and working group for the effort that
> > has been put into this document.
> >
> > I have concerns about some ambiguity and/or self-contradiction in this
> > specification, but I suspect these should be easy to fix. In particular, the
> > behavior defined in Table 3 doesn't seem to be consistent with the behavior
> > described in the prose.
> >
> > For easy reference, I've copied Table 3 here:
> >
> >>       +---------+----------------------------------------------+
> >>       | Inner   |  Arriving TRILL 3-bit ECN Codepoint Name     |
> >>       | Native  +---------+------------+------------+----------+
> >>       | Header  | Not-ECT | ECT(0)     | ECT(1)     |     CE   |
> >>       +---------+---------+------------+------------+----------+
> >>       | Not-ECT | Not-ECT | Not-ECT(*) | Not-ECT(*) |  <drop>  |
> >>       |  ECT(0) |  ECT(0) |  ECT(0)    |  ECT(1)    |     CE   |
> >>       |  ECT(1) |  ECT(1) |  ECT(1)(*) |  ECT(1)    |     CE   |
> >>       |    CE   |      CE |      CE    |      CE(*) |     CE   |
> >>       +---------+---------+------------+------------+----------+
> >>
> >>                      Table 3. Egress ECN Behavior
> >>
> >>  An asterisk in the above table indicates a currently unused
> >>  combination that SHOULD be logged. In contrast to [RFC6040], in TRILL
> >>  the drop condition is the result of a valid combination of events and
> >>  need not be logged.
> >
> > The prose in this document indicates:
> >
> >  1. Ingress gateway either copies the native header value to the TRILL ECN
> >     codepoint (resulting in any of the four values above) or doesn't insert
> >     any ECN information in the TRILL header.
> >
> >  2. Intermediate gateways can set the CCE flag, resulting in "CE" in the
> >     table above.
> >
> > Based on the above, a packet arriving at an egress gateway can only be in one of
> > the following states:
> >
> >  A. TRILL header is Not-ECT because no TRILL node inserted ECN information.
> >
> >  B. TRILL header value == Native header value because the ingress gateway
> >     copied it from native to TRILL.
> >
> >  C. TRILL header is "CE" because an intermediate node indicated congestion.
>
> Sort of... But note that the TRILL header ECN bits can indicate
> Not-ECT while the CCE bit is set, making the overall TRILL Header "CE".

Right. That's part of case C. My states above assume application of 
Table 2 has already taken place.
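
To make the Table 2 step concrete, here is a minimal Python sketch of how 
I'm treating it. It assumes (this is my reading of the exchange above, not 
text quoted from the draft) that a set CCE flag yields "CE" whatever the 
ECN bits say, and that otherwise the ECN bits are used unchanged. The 
function and parameter names are purely illustrative.

```python
VALID_CODEPOINTS = {"Not-ECT", "ECT(0)", "ECT(1)", "CE"}


def arriving_codepoint(trill_ecn: str, cce: bool) -> str:
    """Name of the arriving TRILL 3-bit ECN codepoint (sketch only).

    Assumption: CCE set means "CE" regardless of the ECN field;
    CCE clear means the ECN field value is passed through as-is.
    """
    if trill_ecn not in VALID_CODEPOINTS:
        raise ValueError(f"unknown ECN codepoint name: {trill_ecn!r}")
    return "CE" if cce else trill_ecn
```

Under that assumption, a header carrying Not-ECT with CCE set and a header 
carrying CE with CCE clear both land in my case C.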

>
> > If that's correct, I would think that any state other than those three needs
> > to be marked with an (*). In particular, these two states fall into that
> > classification, and seem to require an asterisk:
> >
> >   - Native==CE && TRILL==ECT(0)
> >
> >   - Native==ECT(0) && TRILL==ECT(1)
>
> I would defer to my co-author Bob Briscoe but it looks to me like you 
> have a good point.

Thanks; I'll wait to hear from Bob.
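
In case it helps the discussion, the check I did by hand can be sketched 
in a few lines of Python. The table below is transcribed from Table 3 as 
quoted above; the reachability rule encodes only my states A, B, and C, so 
it is an assumption about the prose rather than anything the draft states 
in this form. All names are illustrative.

```python
CODEPOINTS = ("Not-ECT", "ECT(0)", "ECT(1)", "CE")

# TABLE3[native][trill] = (outgoing codepoint, marked with an asterisk?)
TABLE3 = {
    "Not-ECT": {"Not-ECT": ("Not-ECT", False), "ECT(0)": ("Not-ECT", True),
                "ECT(1)": ("Not-ECT", True), "CE": ("<drop>", False)},
    "ECT(0)": {"Not-ECT": ("ECT(0)", False), "ECT(0)": ("ECT(0)", False),
               "ECT(1)": ("ECT(1)", False), "CE": ("CE", False)},
    "ECT(1)": {"Not-ECT": ("ECT(1)", False), "ECT(0)": ("ECT(1)", True),
               "ECT(1)": ("ECT(1)", False), "CE": ("CE", False)},
    "CE": {"Not-ECT": ("CE", False), "ECT(0)": ("CE", False),
           "ECT(1)": ("CE", True), "CE": ("CE", False)},
}


def reachable(native: str, trill: str) -> bool:
    """True if (native, trill) matches state A, B, or C from the prose."""
    return trill == "Not-ECT" or trill == native or trill == "CE"


def unmarked_unreachable():
    """Cells the prose cannot produce that Table 3 does not flag with (*)."""
    return [(n, t) for n in CODEPOINTS for t in CODEPOINTS
            if not reachable(n, t) and not TABLE3[n][t][1]]
```

Calling unmarked_unreachable() on this transcription returns exactly the 
two combinations listed above: Native==ECT(0) with TRILL==ECT(1), and 
Native==CE with TRILL==ECT(0).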

> > In addition, the behavior this table defines for Native==ECT(0) && TRILL==ECT(1)
> > is somewhat perplexing: for this case, the value in the TRILL header takes
> > precedence; however, when Native==ECT(1) && TRILL==ECT(0) the Native header
> > takes precedence. Or, put another way, this table defines ECT(1) to always
> > override ECT(0). I don't find any prose in here to indicate why this needs to be
> > treated differentially, so I'm left to conclude that this is a typographical
> > error. If that's not the case, please add motivating text to Table 3 explaining
> > why ECT(1) is treated differently than ECT(0) for baseline ECN behavior.
>
> As noted in Section 4.1, there is an ECN variation where ECT(0) just 
> indicates ECT while ECT(1) indicates congestion of a lesser severity 
> level has been encountered than that indicated by CE. I believe the 
> dominance of ECT(1) over ECT(0) is designed to not interfere with this 
> variation.

Yes, and section 4 opens with the explanation that "Section 3 specifies 
interworking between TRILL and the original standardized form of ECN in 
IP [RFC3168]." Beyond that, I wouldn't expect any of the text in 
non-normative section 4 or its subsections to have bearing on the 
normative table in Section 3.

My reading of RFC 8311 is that it contemplates a series of experiments 
beyond those currently under development. It may well be that the 
current experiments consider ECT(1) to have a higher severity than 
ECT(0), but that may not make sense for future experiments. Even if 
it does, I don't see guidance in RFC 8311 (or any other update to RFC 
3168) that suggests such a relationship between ECT(0) and ECT(1) exists.

On top of this (as implied by the existence of section 4), the TRILL 
handling for ECN will need to vary from experiment to experiment. It 
seems reasonable that part of this variability would include different 
mapping of ECN bits by the egress gateway.

Basically, I see two ways this can be resolved, although I'm happy to 
hear alternatives so long as they end up with the ECN and TRILL 
documents being in a consistent and future-proof state:

Approach 1: Revise Table 3 so that ECT(0) and ECT(1) are treated the 
same (i.e., pick one of "TRILL header wins" or "Native header wins" -- I 
would suggest "Native header wins," for maximal compatibility with older 
ECN clients, but I'm not dogmatic on this point), and then include a 
normative statement that allows RFC 8311 experiments to override this 
mapping as makes sense for their design.
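
As a rough illustration of Approach 1 with "Native header wins," here is a 
Python sketch. It keeps every other cell of Table 3 as quoted above and 
changes only how an ECT(0)/ECT(1) mismatch resolves. The "overrides" 
argument is a hypothetical stand-in for whatever an RFC 8311 experiment 
might specify; neither it nor the function name comes from the draft.

```python
from typing import Dict, Optional, Tuple


def egress_native_wins(
    native: str,
    trill: str,
    overrides: Optional[Dict[Tuple[str, str], str]] = None,
) -> str:
    """Outgoing native codepoint under Approach 1 (sketch, not normative)."""
    # Hypothetical hook: an experiment may replace individual cells.
    if overrides and (native, trill) in overrides:
        return overrides[(native, trill)]
    if trill == "CE":
        # Unchanged from Table 3: congestion cannot be signaled to a
        # Not-ECT flow by marking, so the packet is dropped.
        return "<drop>" if native == "Not-ECT" else "CE"
    # For every other arriving TRILL value the native header is kept,
    # so ECT(0) and ECT(1) are treated identically.
    return native
```

With no overrides, the only cell that differs from the current Table 3 is 
Native==ECT(0) with TRILL==ECT(1), which now yields ECT(0). An experiment 
wanting the current behavior could pass 
overrides={("ECT(0)", "ECT(1)"): "ECT(1)"}.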

Approach 2: Leave the table as is, but add an explanation of why ECT(1) 
is given preferential treatment over ECT(0). Add a normative dependency 
from this document to a new document that updates RFC 8311 to add a 
requirement that any experiments that treat ECT(0) differently than 
ECT(1) MUST be designed such that ECT(0) always indicates a lower severity of 
congestion than ECT(1). I presume that this new document would need to 
be done in coordination with TSVWG.

I think Approach 1 is more straightforward, but if there's a feeling in 
the working group that we need default egress behavior that is 
forwards-compatible with yet-undesigned experiments, then I think 
Approach 2 is what you need.

/a