Re: [trill] [Tsv-art] Tsvart early review of draft-ietf-trill-over-ip-10 - ECN & DSCP considerations

Donald Eastlake <d3e3e3@gmail.com> Sat, 01 July 2017 18:04 UTC

In-Reply-To: <CE03DB3D7B45C245BCA0D243277949362FB59E00@MX307CL04.corp.emc.com>
References: <CE03DB3D7B45C245BCA0D243277949362FB4DF88@MX307CL04.corp.emc.com> <CAF4+nEFUmB=eMO6rvK6-YiL+3Adp_t1QVqgMDBHZFn73_JxEfw@mail.gmail.com> <CE03DB3D7B45C245BCA0D243277949362FB59E00@MX307CL04.corp.emc.com>
From: Donald Eastlake <d3e3e3@gmail.com>
Date: Sat, 01 Jul 2017 14:03:41 -0400
Message-ID: <CAF4+nEEdv8XRSKmpffe1fU+4wRpikeixzX-49g1ocQSJBaqgSg@mail.gmail.com>
To: "Black, David" <David.Black@dell.com>
Cc: Magnus Westerlund <magnus.westerlund@ericsson.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>, "draft-ietf-trill-over-ip.all@ietf.org" <draft-ietf-trill-over-ip.all@ietf.org>, IETF Discussion <ietf@ietf.org>, "trill@ietf.org" <trill@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/trill/xu6pn3r5qvhOkSB-Oo1-kJhi0Cc>
Subject: Re: [trill] [Tsv-art] Tsvart early review of draft-ietf-trill-over-ip-10 - ECN & DSCP considerations

On Sat, Jul 1, 2017 at 10:53 AM, Black, David <David.Black@dell.com> wrote:
> Also, as noted earlier in this discussion, RFC 7657 explicitly discourages use of multiple DSCPs in a single TCP connection.  That needs to be reflected in the TCP encapsulation text in the trill-over-ip draft - in particular, the current text in Section 4.3 on mapping to DSCPs from TRILL priority and DEI does not appear to be consistent with RFC 7657 for TCP-based encapsulation.

I'm surprised it only "discourages" rather than "prohibits"...

Thanks,
Donald
===============================
 Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
 155 Beaver Street, Milford, MA 01757 USA
 d3e3e3@gmail.com

> Thanks, --David
>
>> -----Original Message-----
>> From: Donald Eastlake [mailto:d3e3e3@gmail.com]
>> Sent: Friday, June 30, 2017 9:43 PM
>> To: Black, David <david.black@emc.com>
>> Cc: Magnus Westerlund <magnus.westerlund@ericsson.com>; tsv-art@ietf.org;
>> draft-ietf-trill-over-ip.all@ietf.org; IETF Discussion <ietf@ietf.org>;
>> trill@ietf.org
>> Subject: Re: [Tsv-art] Tsvart early review of draft-ietf-trill-over-ip-10 - ECN &
>> DSCP considerations
>>
>> Hi David,
>>
>> On Mon, Jun 26, 2017 at 3:04 PM, Black, David <David.Black@dell.com>
>> wrote:
>> > Adding some comments on ECN and DSCP ...
>> >
>> >> > Section 4.3:
>> >> >
>> >> >    TRILL over IP implementations MUST support setting the DSCP value in
>> >> >    the outer IP Header of TRILL packets they send by mapping the TRILL
>> >> >    priority and DEI to the DSCP. They MAY support, for a TRILL Data
>> >> >    packet where the native frame payload is an IP packet, mapping the
>> >> >    DSCP in this inner IP packet to the outer IP Header with the default
>> >> >    for that mapping being to copy the DSCP without change.
>> >> >
>> >> > I think it is fine to require that implementations are capable of setting
>> >> > DSCP values on the outer IP header. However, I fail to see any discussion of
>> >> > the potential issues with actually setting the DSCP values. It is one thing to
>> >> > do this in an IP backbone use case where one can know and have control
>> >> > over the PHB that the DSCP values map to. But otherwise, over the general
>> >> > Internet, the behavior is not that predictable. One can easily be subject to
>> >> > policers or remapping. Also, as the actual DSCP code point usage is domain
>> >> > specific, this is difficult. Priority reversal is likely the least of the
>> >> > problems that this can run into over the general Internet.
>> >>
>> >> It sounds like appropriate discussion and warnings about these issues
>> >> would resolve the above comment.
>> >
>> > For ECN, see RFC 6040 and draft-ietf-tsvwg-rfc6040update-shim.  In particular,
>> > copying the inner ECN codepoint to the outer IP header encapsulation without
>> > requiring decapsulation processing as specified in RFC 6040 or the 6040update-shim
>> > draft can lose congestion indications from the network and hence is wrong
>> > (it's also wrong wrt RFC 3168, but RFC 6040 and the 6040update-shim drafts are
>> > better and more current references).
>>
>> That's a good point.
>>
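A minimal Python sketch of the decapsulation behavior referred to above,
as I understand the normal-mode table in RFC 6040. The function and
constant names are hypothetical, and the RFC itself remains the
normative reference. The point it illustrates is the one David raises:
the egress must combine the inner and outer ECN fields, because a
decapsulator that simply keeps the inner field discards any CE mark the
network set on the outer header.

```python
# ECN codepoints in the two low-order bits of the TOS / Traffic Class octet.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def decap_ecn(inner: int, outer: int):
    """Return the ECN field for the forwarded inner packet, or None to drop.

    Sketch of RFC 6040 style decapsulation (an assumption of this example,
    not text from the trill-over-ip draft).
    """
    if outer == CE:
        # Congestion was signalled in the tunnel. A Not-ECT inner packet
        # cannot carry the mark, so it is dropped instead.
        return None if inner == NOT_ECT else CE
    if inner in (NOT_ECT, CE):
        return inner
    if inner == ECT0 and outer == ECT1:
        return ECT1
    return inner

def naive_decap_ecn(inner: int, outer: int) -> int:
    """What happens if the outer field is ignored: outer CE marks are lost."""
    return inner
```

For an inner ECT(0) packet whose outer header was marked CE in transit,
decap_ecn returns CE while naive_decap_ecn returns ECT(0), which is the
lost congestion indication described above.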
>> > For DSCPs, start with RFC 2983 - thinking about the validity (or likely validity)
>> > of the outer DSCP at the decapsulator may help in choosing whether to
>> > recommend a uniform model (e.g., copy DSCP out at ingress, copy back in at
>> > egress) or a pipe model (e.g., do something reasonable for outer DSCP at
>> > ingress, ignore it on egress) as the implementation default.
>>
>> I believe the default behavior in the current draft is the best
>> default. That sets DSCP based on the same TRILL Header indicia that
>> controls default QoS on non-IP links.
>>
>> > -- DSCP mapping to/from TRILL/Ethernet priorities
>> >
>> >> The intent in the draft is to reflect the default relative priority of
>> >> the different priority code points in IEEE Std 802.1Q where priority 1
>> >> is lower than priority 0. At a quick look, it appears to me that RFC
>> >> 2474 requires that DSCP '001000' be handled as being of a priority not
>> >> lower than the priority with which DSCP '000000' is handled. Yet RFC
>> >> 3662, which you point to, seems to suggest using '001000' as a lower
>> >> priority code point than '000000'. Given that 3662 not only does not
>> >> update 2474 but is only Informational while 2474 is Standards Track, I
>> >> would say that 2474 dominates and that this draft makes the best
>> >> assumptions it can about default behavior...
>> >
>> > Well ... that's a discussion about text in RFCs that are well over a decade
>> > old, and in an area (less-than-best-effort service) where the aspirations
>> > of at least RFC 3662 weren't realized ... but that RFC is not safe to ignore,
>> > either.
>> >
>> > In practice, the specification of CS1 for less-than-best-effort service has
>> > been promulgated by RFC 4594 rather than RFC 3662, and RFC 4594 has
>> > had significant "running code" impact on network design and operation.
>> >
>> > As Magnus mentioned RFC7657, I strongly suggest starting from the
>> > RFC 7657 discussion of this topic in order to figure out what to do.  I'm
>> > not sure what to recommend, but I do think that starting from
>> > RFC 7657 (rather than RFC 2474 and RFC 3662) is the better approach.
>>
>> OK.
>>
>> > FWIW, the TSVWG WG is in the process of figuring out which DSCP
>> > to recommend for less-than-best-effort-service in place of CS1 - that's
>> > likely to be an active topic of discussion in Prague.
>>
>> I'll try to attend that session.
>>
>> Thanks,
>> Donald
>> ===============================
>>  Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>>  155 Beaver Street, Milford, MA 01757 USA
>>  d3e3e3@gmail.com
>>
>> > Thanks, --David
>> >
>> >> -----Original Message-----
>> >> From: Tsv-art [mailto:tsv-art-bounces@ietf.org] On Behalf Of Donald
>> >> Eastlake
>> >> Sent: Sunday, June 25, 2017 8:07 PM
>> >> To: Magnus Westerlund <magnus.westerlund@ericsson.com>
>> >> Cc: tsv-art@ietf.org; draft-ietf-trill-over-ip.all@ietf.org; IETF Discussion
>> >> <ietf@ietf.org>; trill@ietf.org
>> >> Subject: Re: [Tsv-art] Tsvart early review of draft-ietf-trill-over-ip-10
>> >>
>> >> Hi Magnus,
>> >>
>> >> Thanks for the extensive review. See my responses below.
>> >>
>> >> On Thu, Jun 15, 2017 at 1:32 PM, Magnus Westerlund
>> >> <magnus.westerlund@ericsson.com> wrote:
>> >> >
>> >> > Reviewer: Magnus Westerlund
>> >> > Review result: Not Ready
>> >> >
>> >> > Early review of draft-ietf-trill-over-ip-10
>> >> >
>> >> > TSV-ART review comments:
>> >> >
>> >> > I have set this to not ready as there are several issues, some significant,
>> >> > that could affect the protocol realization significantly. Some may be me
>> >> > missing things in TRILL; I was not that familiar with it before this review
>> >> > and I have only tried looking up things, not reading the whole earlier
>> >> > specifications. So don't hesitate to push back and provide pointers to
>> >> > things that can resolve issues. The authors and the WG clearly have thought
>> >> > about a lot of issues and dealt with much already.
>> >>
>> >> OK. Hopefully we can resolve these one way or the other.
>> >>
>> >> > Diffserv usage
>> >> > --------------
>> >> >
>> >> > Section 4.3:
>> >> >
>> >> >    TRILL over IP implementations MUST support setting the DSCP value in
>> >> >    the outer IP Header of TRILL packets they send by mapping the TRILL
>> >> >    priority and DEI to the DSCP. They MAY support, for a TRILL Data
>> >> >    packet where the native frame payload is an IP packet, mapping the
>> >> >    DSCP in this inner IP packet to the outer IP Header with the default
>> >> >    for that mapping being to copy the DSCP without change.
>> >> >
>> >> > I think it is fine to require that implementations are capable of setting
>> >> > DSCP values on the outer IP header. However, I fail to see any discussion
>> >> > of the potential issues with actually setting the DSCP values. It is one
>> >> > thing to do this in an IP backbone use case where one can know and have
>> >> > control over the PHB that the DSCP values map to. But otherwise, over the
>> >> > general Internet, the behavior is not that predictable. One can easily be
>> >> > subject to policers or remapping. Also, as the actual DSCP code point usage
>> >> > is domain specific, this is difficult. Priority reversal is likely the
>> >> > least of the problems that this can run into over the general Internet.
>> >>
>> >> It sounds like appropriate discussion and warnings about these issues
>> >> would resolve the above comment.
>> >>
>> >> > Section 4.3:
>> >> >
>> >> >    The default TRILL priority and DEI to DSCP mapping, which may be
>> >> >    configured per TRILL over IP port, is as follows. Note that the DEI
>> >> >    value does not affect the default mapping and, to provide a
>> >> >    potentially lower priority service than the default priority 0,
>> >> >    priority 1 is considered lower priority than 0. So the priority
>> >> >    sequence from lower to higher priority is 1, 0, 2, 3, 4, 5, 6, 7.
>> >> >
>> >> >       TRILL Priority  DEI  DSCP Field (Binary/decimal)
>> >> >       --------------  ---  -----------------------------
>> >> >                   0   0/1  001000 / 8
>> >> >                   1   0/1  000000 / 0
>> >> >                   2   0/1  010000 / 16
>> >> >                   3   0/1  011000 / 24
>> >> >                   4   0/1  100000 / 32
>> >> >                   5   0/1  101000 / 40
>> >> >                   6   0/1  110000 / 48
>> >> >                   7   0/1  111000 / 56
>> >> >
>> >> > This appears to be a problematic mapping, at least for priorities 0 and 1.
>> >> > As priority 0 appears to be intended to be higher than priority 1, it is
>> >> > interesting that it is mapped to CS1, which to quote
>> >> > https://datatracker.ietf.org/doc/rfc7657/:
>> >> >
>> >> > CS1 ('001000') was subsequently designated as the recommended
>> >> >       codepoint for the Lower Effort (LE) PHB [RFC3662].
>> >> >
>> >> > So what is proposed can, in a network using default mapping, result in
>> >> > priority 0 being lower priority than 1. Plus, in some networks this can
>> >> > also result in strange remapping that results in a different PHB for CS1
>> >> > than.
>> >>
>> >> The intent in the draft is to reflect the default relative priority of
>> >> the different priority code points in IEEE Std 802.1Q where priority 1
>> >> is lower than priority 0. At a quick look, it appears to me that RFC
>> >> 2474 requires that DSCP '001000' be handled as being of a priority not
>> >> lower than the priority with which DSCP '000000' is handled. Yet RFC
>> >> 3662, which you point to, seems to suggest using '001000' as a lower
>> >> priority code point than '000000'. Given that 3662 not only does not
>> >> update 2474 but is only Informational while 2474 is Standards Track, I
>> >> would say that 2474 dominates and that this draft makes the best
>> >> assumptions it can about default behavior...
>> >>
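For readers who want the quoted Section 4.3 default table in executable
form, here is a small Python sketch. The table values are copied from
the draft text quoted above; the helper names (outer_dscp, tos_octet)
are hypothetical and only illustrate how the mapping would be applied
when building the outer IP header.

```python
# Default TRILL priority -> DSCP mapping, as quoted from Section 4.3 of
# draft-ietf-trill-over-ip-10. DEI does not affect the default mapping.
DEFAULT_DSCP = {
    0: 0b001000,  # 8
    1: 0b000000,  # 0
    2: 0b010000,  # 16
    3: 0b011000,  # 24
    4: 0b100000,  # 32
    5: 0b101000,  # 40
    6: 0b110000,  # 48
    7: 0b111000,  # 56
}

def outer_dscp(priority: int, dei: int = 0, table=DEFAULT_DSCP) -> int:
    """Hypothetical helper: pick the outer-header DSCP for a TRILL packet.

    The table is a parameter because the draft says the mapping may be
    configured per TRILL over IP port.
    """
    if priority not in range(8):
        raise ValueError("TRILL priority must be 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    return table[priority]

def tos_octet(dscp: int, ecn: int = 0) -> int:
    """DSCP is the upper six bits of the TOS / Traffic Class octet, ECN the lower two."""
    return (dscp << 2) | ecn
```

Written this way, the disputed rows are easy to see: priority 0 lands
on '001000' (CS1) and priority 1 on '000000', which is the pairing the
review questions.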
>> >> > MTU and Fragmentation
>> >> > ---------------------
>> >> >
>> >> > I think there are two main issues here. The first one is MTU discovery
>> >> > of the actual IP path MTU between the ports. That will be needed to
>> >> > prevent a lot of traffic going into MTU black holes. Especially as TRILL
>> >> > requires 1470 byte support, which is likely above a lot of paths.
>> >>
>> >> Seems like it would depend on the environments where TRILL was used.
>> >> For example, I do not think 1470 would be a problem in most Data
>> >> Center or Internet Exchange point uses, for example. Data Centers
>> >> sometimes support 9K jumbo frames and the like.
>> >>
>> >> In fact, it is probably bad to focus too much on 1470 -- that is a
>> >> required minimum to be sure that reasonable size link state PDUs can
>> >> be successfully flooded through the TRILL campus so that routing will
>> >> work. However, it would commonly be the case that, for the TRILL
>> >> campus to be useful in a particular case, links need to be able to
>> >> carry the expected size TRILL Data packets. For example, if there were
>> >> two parts of a TRILL campus connected by one or a few TRILL over IP
>> >> links and the end stations in each part were assuming they could use
>> >> 1500 byte Ethernet packets, then the TRILL over IP links would need to
>> >> support an MTU based on 1500 + TRILL Header + IP and TRILL over IP
>> >> encapsulation. And more if security was being used or there were any
>> >> other reasons for additional headers/encapsulation...
>> >>
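The arithmetic in the paragraph above can be made concrete with a short
Python sketch. The header sizes are assumptions of this example rather
than figures from the thread: a 20-byte IPv4 or 40-byte IPv6 header
without options, an 8-byte UDP header, the 6-byte base TRILL Header
without options, and an 18-byte inner Ethernet header (Inner.MacDA,
Inner.MacSA, a 4-byte VLAN tag, and the Ethertype). It also assumes the
native UDP encapsulation with the TRILL Header directly after UDP.

```python
IPV4_HEADER = 20      # bytes, no IP options (assumed)
IPV6_HEADER = 40      # bytes, no extension headers (assumed)
UDP_HEADER = 8
TRILL_HEADER = 6      # base TRILL Header without options
INNER_ETHERNET = 6 + 6 + 4 + 2  # Inner.MacDA, Inner.MacSA, VLAN tag, Ethertype

def required_ip_mtu(end_station_payload: int = 1500,
                    ipv6: bool = False,
                    extra: int = 0) -> int:
    """Hypothetical helper: IP MTU a TRILL over IP link would need.

    `extra` covers security or any additional encapsulation headers.
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return (end_station_payload + INNER_ETHERNET + TRILL_HEADER
            + UDP_HEADER + ip_header + extra)
```

Under these assumptions a 1500 byte end-station payload needs an IP MTU
of 1552 over IPv4 and 1572 over IPv6, before any security headers.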
>> >> > Section 8.4:
>> >> >
>> >> >    Path MTU discovery [RFC4821] should be useful
>> >> >    in determining the IP MTU between a pair of RBridge ports with IP
>> >> >    connectivity.
>> >> >
>> >> > The issue with RFC4821 is that it has requirements on the packetization
>> >> > layer. TRILL appears to have several components that are useful. However,
>> >> > it will require a specification of the procedure to result in a useful tool.
>> >>
>> >> See below.
>> >>
>> >> > Section 8.4:
>> >> >
>> >> >    TRILL IS-IS MTU PDUs, as specified in Section 5 of [RFC6325] and in
>> >> >    [RFC7177], can be used to obtain added assurance of the MTU of a
>> >> >    link.
>> >> >
>> >> > Yes, that can confirm working MTUs that are at 1470 or above, but appears
>> >> > prevented from working below 1470?
>> >>
>> >> While there is a minimum size for TRILL IS-IS MTU PDUs, determined by
>> >> header size, it is well below 1470, probably (depending on whether
>> >> security is in use, etc.) below 150 bytes.
>> >>
>> >> > Thus, it appears that there is a lack of mechanism here to actually get a
>> >> > valid and functional MTU from TRILL in the cases where the Path MTU is
>> >> > below 1470. If I am wrong, good, but I think this is an important piece for
>> >> > how to handle the next main issue.
>> >>
>> >> How about referencing Section 3 of
>> >> https://tools.ietf.org/html/draft-ietf-trill-mtu-negotiation-05
>> >> which is currently in IETF Last Call? (The wording of that section is
>> >> probably going to be improved based on an OPS review by Brian
>> >> Carpenter.)
>> >>
>> >> > UDP encapsulation and IP fragments.
>> >> > ----------------------------------
>> >> > I see it as a big issue that UDP encapsulation is the native one, and that
>> >> > it relies on IP fragmentation despite the need for reliable fragmentation.
>> >> > With the setup of having to support 1470 MTU on TRILL level, some packets
>> >> > will be fragmented in many environments. That will lead to a lot of losses
>> >> > and, as discussed below, a very big problem with middleboxes. The main
>> >> > problem here is that if one tries to rely on IP fragments one will have
>> >> > issues with packets ending up in black holes. And different problems
>> >> > depending on IPv4 or IPv6. IPv6 is likely the lesser problem assuming that
>> >> > one has working PMTUD.
>> >> >
>> >> > There are several ways out of this.
>> >> >
>> >> > 1. Detect issues and use TCP encapsulation with correctly set MSS to not
>> >> >    get IP fragments.
>> >> > 2. Determine MTU and implement a fragmentation mechanism on top of UDP.
>> >>
>> >> So, I don't see that much problem with UDP being the general default
>> >> consistent with the TRILL philosophy of defaulting to need zero or
>> >> minimal configuration. The default should be to use multicast Hellos
>> >> for discovery of neighbors which sure points at UDP to me. Having to
>> >> traverse a NAT should be a rare case. Since, in the NAT case, you have
>> >> to configure things related to the static binding and the IP
>> >> address(es) of peer(s) anyway you can also configure to use a
>> >> different encapsulation than UDP, such as TCP, at the same time. I
>> >> don't see it as much of a problem if, by default, TRILL won't operate
>> >> through a NAT. If you are using UDP and it fragments and fragments are
>> >> dropped at a NAT, probably you can't exchange Hellos so you will not
>> >> form an adjacency and anything on the other side of the NAT will not
>> >> be visible.
>> >>
>> >> > Zero Checksum:
>> >> > --------------
>> >> >
>> >> > Section 5.4:
>> >> >
>> >> > UDP Checksum - as specified in [RFC0768]
>> >> >
>> >> > Considering the fast path encapsulation desire, I am surprised to not see
>> >> > any mention of use of zero checksum here. Raising the zero checksum and
>> >> > a forward reference would be good I think.
>> >> >
>> >> > And then Section 8.5:
>> >> >
>> >> >    The requirements for the usage of the zero UDP Checksum in a UDP
>> >> >    tunnel protocol are detailed in [RFC6936]. These requirements apply
>> >> >    to the UDP based TRILL over IP encapsulations specified herein
>> >> >    (native and VXLAN), which are applications of UDP tunnel.
>> >> >
>> >> > If you actually intended to allow zero checksum, then you actually should
>> >> > document that TRILL fulfills the requirements that the applicability
>> >> > statement raises. I have not analyzed how well it meets these requirements.
>> >> >
>> >> > Please review Section 6.2 of RFC 8086 for an example of how that can be done.
>> >>
>> >> OK. We'll look into it.
>> >>
>> >> > TCP Encapsulation issue
>> >> > -----------------------
>> >> >
>> >> > Section 5.6:
>> >> >
>> >> > The TCP encapsulation appears to be missing a delimiter format allowing
>> >> > each individual TRILL packet/payload to be read out of the TCP byte stream.
>> >> > In other words, a normal implementation has no way of ensuring that the
>> >> > TCP payload starts with the start of a new TRILL payload. Multiple small
>> >> > TRILL payloads may be included in the same TCP payload, and also only
>> >> > parts, as TCP is one way of dealing with TRILL packets that are larger than
>> >> > the IP+Encapsulation MTU that actually will work.
>> >> >
>> >> > This comment is based on there appearing to be no length fields included
>> >> > in the TRILL header. The most straightforward delimiter is a 2-byte length
>> >> > field for the TRILL payload to be encapsulated.
>> >>
>> >> Right. It might also be useful to include some sort of check field, as
>> >> is done in BGP, to detect if you are out of sync in parsing the TCP
>> >> stream.
>> >>
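A rough Python sketch of the framing discussed above: a 2-byte length
field in front of each TRILL payload, plus a fixed check value so a
receiver can notice that it has lost sync with the TCP byte stream. The
marker value, the field order, and all names here are hypothetical
choices for illustration; nothing in the draft or this thread fixes
them.

```python
import struct

MAGIC = 0x5452                    # hypothetical 2-byte sync marker, not from the draft
HEADER = struct.Struct("!HH")     # network byte order: marker, payload length

def frame(payload: bytes) -> bytes:
    """Prefix one TRILL payload with the marker and its 2-byte length."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit a 2-byte length field")
    return HEADER.pack(MAGIC, len(payload)) + payload

class Deframer:
    """Recover whole TRILL payloads from arbitrary chunks of a TCP stream."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes; return every payload that is now complete."""
        self._buf += data
        payloads = []
        while len(self._buf) >= HEADER.size:
            magic, length = HEADER.unpack_from(self._buf)
            if magic != MAGIC:
                raise ValueError("out of sync while parsing the TCP stream")
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # wait for the rest of this payload
            payloads.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return payloads
```

Because the deframer buffers partial input, it does not matter how TCP
segments the stream: several small payloads in one segment and one large
payload spread over many segments are both handled.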
>> >> Another point is that, while with UDP it seems fine to send packets
>> >> with assorted QoS, you don't want to encourage re-ordering of TCP
>> >> packets in a stream. So if TCP encapsulation is being used, you want
>> >> to use the same DSCP value for the packets in a particular TCP stream.
>> >> So, generally, you need to have a TCP connection per priority handling
>> >> category. Mapping the 8 priority levels into a smaller number of
>> >> handling categories is a normal thing to do so you certainly don't
>> >> necessarily need 8 TCP connections. Adding material on this should not
>> >> be too hard.
>> >>
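One way to picture the "TCP connection per priority handling category"
idea is the Python sketch below. The grouping of the eight priorities
into four categories, the category names, and the DSCP chosen for each
category are all placeholders invented for this example (each DSCP is
simply the draft's default for one priority in the group); a deployment
would configure its own. The socket calls are the standard Python ones.

```python
import socket

# Hypothetical grouping of TRILL priorities into handling categories.
CATEGORY_OF_PRIORITY = {
    1: "background",
    0: "default", 2: "default", 3: "default",
    4: "elevated", 5: "elevated",
    6: "control", 7: "control",
}

# Placeholder DSCP per category (illustrative values only).
DSCP_OF_CATEGORY = {
    "background": 0b000000,
    "default": 0b001000,
    "elevated": 0b100000,
    "control": 0b110000,
}

def connection_key(peer: str, priority: int) -> tuple:
    """Identify which TCP connection to a peer carries a given priority."""
    return (peer, CATEGORY_OF_PRIORITY[priority])

def dscp_for_priority(priority: int) -> int:
    """Every priority in a category shares that category's single DSCP."""
    return DSCP_OF_CATEGORY[CATEGORY_OF_PRIORITY[priority]]

def open_connection(peer: str, port: int, category: str) -> socket.socket:
    """Open one IPv4 TCP connection whose packets all carry one DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # DSCP occupies the upper six bits of the IPv4 TOS octet.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS,
                    DSCP_OF_CATEGORY[category] << 2)
    sock.connect((peer, port))
    return sock
```

With this layout a sender looks up connection_key(peer, priority), opens
the connection on first use, and never mixes DSCPs within one TCP
stream, so four connections per peer suffice instead of eight.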
>> >> > Section 5.6:
>> >> >
>> >> > TCP endpoint requirements. I do wonder if an application like TRILL
>> >> > actually would need to discuss performance impacting implementation choices
>> >> > or limitations. For example, use of Nagle, and the requirements on buffer
>> >> > sizes in relation to bandwidth-delay products, as buffer memory in an
>> >> > RBridge will impact performance.
>> >>
>> >> Well, I'm not sure how deeply this document should get into such
>> >> performance issues. What about just saying something about
>> >> consideration being given to tuning TCP for performance and pointing
>> >> to one or a few other RFCs that talk about this?
>> >>
>> >> > Congestion Control
>> >> > ------------------
>> >> > First thanks for the effort here.
>> >>
>> >> You're welcome.
>> >>
>> >> > 8.1.2 In Other Environments
>> >> >
>> >> >    Where UDP based encapsulation headers are used in TRILL over IP in
>> >> >    environments other than those discussed in Section 8.1.1, specific
>> >> >    congestion control mechanisms are commonly needed.  However, if the
>> >> >    traffic being carried by the TRILL over IP link is already congestion
>> >> >    controlled and the size and volatility of the TRILL IS-IS link state
>> >> >    database is limited, then specific congestion control may not be
>> >> >    needed. See [RFC8085] Section 3.1.11 for further guidance.
>> >> >
>> >> > This is correct, however my question is if the RBridges have any way of
>> >> > knowing which traffic is actually congestion controlled, considering that
>> >> > TRILL provides a layer 2 abstraction. I wonder if there should be any type
>> >> > of white list of the types of layer 2 payloads that can be assumed to be
>> >> > congestion controlled, and thus okay to forward over IP paths? I am worried
>> >> > that the lack of any recommendation to prevent uncontrolled traffic from
>> >> > being forwarded can lead to congestion issues.
>> >> >
>> >> > The other issue I think may exist is the one that serial unicast emulation
>> >> > of broadcast/multicast creates. As this amplifies the outgoing packet rate
>> >> > by a factor of how many addresses are configured for serial unicast, this
>> >> > can be significant traffic expansion. Thus, I think additional
>> >> > considerations are needed here, and maybe rate limiting of the amount of
>> >> > traffic to be multicast.
>> >>
>> >> OK. We can think about those issues.
>> >>
>> >> > Flow and ECMP
>> >> > -------------
>> >> >
>> >> > Section 8.3:
>> >> >
>> >> > For example, for TRILL
>> >> >    Data, this entropy field could be based on some hash of the
>> >> >    Inner.MacDA, Inner.MacSA, and Inner.VLAN or Inner.FGL.
>> >> >
>> >> > I would appreciate clearer references to what these fields are.
>> >>
>> >> In a TRILL Data packet, the payload after the TRILL Header looks like
>> >> an Ethernet frame except that there is always either a VLAN tag or,
>> >> alternatively, where the VLAN tag would be, a Fine Grained Label
>> >> [RFC7172]. (The preceding is the view in the TRILL RFCs, but there is
>> >> an equivalent and equally valid view in which all the fields through
>> >> and including the VLAN or FGL tag are part of the TRILL Header.) The
>> >> TRILL base protocol specification focuses on Ethernet as a link
>> >> technology between TRILL switches, in which case there will be a link
>> >> header including an Outer.MacDA and Outer.MacSA fields and possibly an
>> >> Outer.VLAN, all before the TRILL Header. See Figure 1 and Figure 2 in
>> >> RFC 7172.
>> >>
>> >> Some of the above could be added to the draft for clarity.
>> >>
>> >> > If I understand this correctly, the idea here is to look into the inner
>> >> > layer 2 frames, and use the flow equivalents that exists on that level and
>> >> > hash that into value that maps the flows onto the source port range.
>> >>
>> >> Yes.
>> >>
>> >> > I think this text should include a summary of the principle and be sure to
>> >> > note the important requirement that what is considered a flow in the inner
>> >> > frames must not be striped over multiple source ports, as this may lead to
>> >> > reordering issues due to packets taking different paths.
>> >>
>> >> Well, we can add some text. But when would the relative ordering
>> >> matter for two TRILL Data packets where the two inner native payloads
>> >> have different values for any one or more of these three fields
>> >> (Inner.MacDA, Inner.MacSA, and inner VLAN/FGL tag) ? If any of those
>> >> fields are different, you are talking about different streams.
>> >>
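A small Python sketch of the entropy idea from Section 8.3 as discussed
above: hash Inner.MacDA, Inner.MacSA, and the inner VLAN or FGL into the
UDP source port, so that every packet of one inner flow gets the same
port and therefore follows the same ECMP path. The choice of CRC-32 as
the hash, the 49152-65535 port range, and the function name are
assumptions made for this example, not requirements taken from the
draft.

```python
import zlib

# Assumed source port range for entropy (the dynamic/private port range).
PORT_MIN, PORT_MAX = 49152, 65535

def entropy_source_port(inner_mac_da: bytes,
                        inner_mac_sa: bytes,
                        inner_label: bytes) -> int:
    """Map the inner flow identifiers onto a UDP source port.

    inner_label is the inner VLAN ID or Fine Grained Label as raw bytes.
    The mapping is deterministic, so one inner flow never spreads over
    several source ports.
    """
    if len(inner_mac_da) != 6 or len(inner_mac_sa) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    digest = zlib.crc32(inner_mac_da + inner_mac_sa + inner_label)
    return PORT_MIN + digest % (PORT_MAX - PORT_MIN + 1)
```

Two packets that differ in any of the three inputs may or may not share
a port, but two packets that agree on all three always do, which is the
ordering property Magnus asks to have stated.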
>> >> > NAT and TRILL over IP:
>> >> > Section 8.5:
>> >> >
>> >> > If one would like to use TRILL over IP through a NAT, then there are some
>> >> > very important considerations that are missing. First, the need for static
>> >> > binding configurations or the need for determining one's external
>> >> > address(es) and being able to communicate that to the peer RBridges, and in
>> >> > addition ensuring that one has keep-alives so that the NAT binding never
>> >> > times out.
>> >>
>> >> I think those are good points. There is an additional problem that
>> >> TRILL Hellos detect neighbors with which they have 2-way connectivity
>> >> by indicating, inside the Hellos that are sent, from what neighbors
>> >> Hellos have been received on that port. If a NAT is involved, these
>> >> neighbor addresses inside Hellos need to be mapped.
>> >>
>> >> > Next is the issue that there is almost zero chance of getting an IP/UDP
>> >> > encapsulated TRILL payload through the NAT if it results in IP
>> >> > fragmentation, as NATs don't defragment and refragment on the internal
>> >> > side, and an IP fragment lacks the UDP port and thus can't be matched to
>> >> > a binding.
>> >>
>> >> So perhaps the recommendation should be to configure the port to use
>> >> TCP if there will be fragmentation.
>> >>
>> >> > Also, if you would like to run IP/ESP through a NAT, then you most likely
>> >> > need the IP/UDP/ESP encapsulation (https://tools.ietf.org/html/rfc3948).
>> >> > Note that this will restrict the MTU even further and thus ensure that the
>> >> > 1470 requirement cannot be fulfilled even without additional tunnels over a
>> >> > 1500 byte MTU Ethernet infrastructure.
>> >> >
>> >> > I would note that firewalls also likely have issues with IP fragments for
>> >> > the same reason; they require a significant amount of state to verify
>> >> > whether fragments should be let through.
>> >> >
>> >> > In general I think you should create a configuration that has a chance to
>> >> > work through most middleboxes, but I think you should require static
>> >> > bindings. I think that configuration is, and don't laugh now,
>> >> > IP/UDP/ESP/TCP/TRILL; otherwise you will not be able to have both security
>> >> > and reliable fragmentation of TRILL packets.
>> >>
>> >> OK. Thanks again for this review. It has pointed out a number of
>> >> problems and in thinking about those, I believe a couple of further
>> >> problems have come to mind that I mentioned above. We'll work on a
>> >> revised draft.
>> >>
>> >> Thanks,
>> >> Donald
>> >> ===============================
>> >>  Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>> >>  155 Beaver Street, Milford, MA 01757 USA
>> >>  d3e3e3@gmail.com
>> >>
>> >> > Cheers
>> >> >
>> >> > Magnus Westerlund
>> >>