Re: [nvo3] Review ptA: Technical draft-ietf-nvo3-gue-04

Tom Herbert <tom@herbertland.com> Thu, 20 October 2016 19:21 UTC

In-Reply-To: <67e76ab1-2f5b-4906-4cce-f7c176fd49a0@bobbriscoe.net>
References: <67e76ab1-2f5b-4906-4cce-f7c176fd49a0@bobbriscoe.net>
From: Tom Herbert <tom@herbertland.com>
Date: Thu, 20 Oct 2016 12:21:39 -0700
Message-ID: <CALx6S34J_UayTEyYDPCcyGDb963myo1zg=Ytm3KhbVvZ3KwXJQ@mail.gmail.com>
To: Bob Briscoe <ietf@bobbriscoe.net>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/AsmBnEbuWphzg0mUsBeCa1eylt0>
Cc: "nvo3@ietf.org" <nvo3@ietf.org>, Osama Zia <osamaz@microsoft.com>, Lucy Yong <lucy.yong@huawei.com>
Subject: Re: [nvo3] Review ptA: Technical draft-ietf-nvo3-gue-04

Hi Bob,

Thanks for the detailed review. Comments are inline.

>
> A/ TECHNICAL PROBLEMS/COMMENTS
>
> 1/ ADDRESSING ARCHITECTURE
>
> 1.1/ Inferring Connection Semantics: the rule not the exception
>
> The draft assumes that, as a general rule, the UDP dst. port of a GUE packet
> will be fixed (6080) and that flow entropy will come from the source port
> (see the two quoted sections below).
>
Yes.

> S. 5.11.1. Flow classification
>
>    " ... When a packet is encapsulated with
>     GUE, the source port in the outer UDP packet is set to a flow
>     entropy value ...
>
Yes.

> S.5.11.2 Flow entropy properties
>
>         The flow entropy is the value set in the UDP source port of a
>         GUE packet. Flow entropy in the UDP source port should adhere to
>         the following properties:
>
>
> Nonetheless, the draft recognises there will be cases where "connection
> semantics" have to be applied in order to traverse middleboxes such as
> firewalls and NATs (but only mentioned in the relevant parts of 5.6.1 &
> 5.6.2 quoted below).
>
> Such middleboxes generally only allow "ingress" UDP datagrams if they look
> like responses to recent "egress" datagram(s). So there has to be a concept
> of an "initiator" end of the GUE tunnel. Only once the initiator end has
> sent an "egress" datagram with src:dst ports e:G (from ephemeral port e to
> the GUE port G), then the GUE encap at the remote "responder" end would be
> able to traverse the middlebox using "ingress" datagrams with src:dst ports
> reversed (G:e).
>
Yes.
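
The pin-hole behaviour described above can be sketched as a toy model. This
is only an illustration: the class name, the 30 second timeout and the
example port numbers are assumptions, and addresses are ignored so that only
the e:G / G:e port reversal is shown.

```python
GUE_PORT = 6080  # well-known GUE port, "G" in the text above


class StatefulMiddlebox:
    """Toy firewall: admits ingress UDP only as a reply to recent egress."""

    def __init__(self, timeout_s=30.0):  # timeout value is an assumption
        self.timeout_s = timeout_s
        self.pinholes = {}  # (inside_port, outside_port) -> time of last egress

    def egress(self, src_port, dst_port, now):
        # An outgoing datagram opens (or refreshes) a pin-hole.
        self.pinholes[(src_port, dst_port)] = now

    def ingress(self, src_port, dst_port, now):
        # An incoming datagram must carry the ports reversed (G:e).
        last = self.pinholes.get((dst_port, src_port))
        return last is not None and now - last <= self.timeout_s


fw = StatefulMiddlebox()
e = 49152  # ephemeral port chosen by the initiator (example value)

blocked_before = fw.ingress(GUE_PORT, e, now=0.0)  # no egress seen yet
fw.egress(e, GUE_PORT, now=1.0)                    # initiator sends e:G
allowed_reply = fw.ingress(GUE_PORT, e, now=2.0)   # responder sends G:e
entropy_reply = fw.ingress(51000, e, now=2.0)      # source port used as entropy
expired = fw.ingress(GUE_PORT, e, now=100.0)       # pin-hole has timed out
```

In this model only the G:e reply inside the timeout window gets through; a
reply whose source port carries flow entropy does not match the pin-hole.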

> S.5.6.1. Inferring connection semantics:
>
>    A middlebox may infer bidirectional connection semantics
>    [...] To operate in
>    this environment, a GUE tunnel must assume connected semantics  [...]
>    The source port set in the UDP
>    header must be the destination port the peer would set for replies.
>    In this case the UDP source port for a tunnel would be a fixed value
>    and not set to be flow entropy as described in section 5.11.
>
>    The selection of whether to make the UDP source port fixed or set to
>    a flow entropy value for each packet sent should be configurable for
>    a tunnel.
>
Yes.

> S. 5.6.2. NAT
>
>    In
>    the case of stateful NAT, connection semantics must be applied to a
>    GUE tunnel as described in section 5.6.1.
>
> [BTW, I suggest changing the final sentence of the first para in S.5.6.1.
> (quoted above) to:
>
>    Therefore, in the ingress direction, the destination UDP port would
>    provide flow entropy, while the source port would take the fixed
>    value of 6080 (the converse of the case in section 5.11).
>
The destination port can't be used for flow entropy of the inner
packet. If the ports are fixed for a UDP flow (i.e. A,B in one direction
and B,A in reverse), then we don't get per-inner-flow entropy for ECMP
on the outer packet. Appropriate use of the IPv6 flow label, with ECMP
support for it, solves this problem, but that obviously applies only to
IPv6.
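
To make the trade-off concrete, here is a minimal sketch of how an encap
might choose the outer UDP ports. The function names, the CRC32 hash and the
49152-65535 port range are assumptions for illustration; the draft does not
mandate any particular hash.

```python
import zlib

GUE_PORT = 6080
PORT_MIN, PORT_MAX = 49152, 65535  # assumed range for entropy values


def flow_entropy_port(src_ip, dst_ip, proto, sport, dport):
    """Map an inner flow's 5-tuple to a stable outer UDP source port."""
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    span = PORT_MAX - PORT_MIN + 1
    return PORT_MIN + zlib.crc32(key) % span


def outer_ports(inner_flow, connected, fixed_src=GUE_PORT):
    """Return (source, destination) ports for the outer UDP header."""
    if connected:
        # Connection semantics: the same port pair for every inner flow,
        # so the outer header gives ECMP nothing to distinguish flows by.
        return (fixed_src, GUE_PORT)
    return (flow_entropy_port(*inner_flow), GUE_PORT)


flow_a = ("10.0.0.1", "10.0.0.2", 6, 40000, 80)
flow_b = ("10.0.0.3", "10.0.0.4", 6, 40001, 443)
```

With connected=True both flow_a and flow_b map to the same outer port pair;
with connected=False each inner flow gets its own stable source port (hash
collisions aside).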

> ]
>
> The text quoted from both sections 5.6.1 & 5.6.2 above implies
> a) that the operator of tunnel endpoint(s) can somehow know whether there
> are any middleboxes within the tunnel.
> b) that applying connection semantics is feasible.
>
> Connection semantics feasibility:
> *  transport encap: relatively easy - it was simple to implement connection
> semantics in GUT (see code or  example in Figure 4 in
> draft-manner-tsvwg-gut-02, or see description later under A3.2/ "Transport
> encap with Connection Semantics: Flow state management"). Nonetheless,
> without connection semantics, GUE/GUT is even simpler, because it can be
> stateless.
> * network encap: harder (see separate email for my proposed design: C1/
> "Stateless Connection Semantics", but until there's a working implementation
> we have to allow for the possibility that it's not feasible).
>
> Regarding the first question - whether middleboxes (such as firewalls) exist
> on a path:
> * most operators of tunnel endpoints don't know for sure, but they do know
> that firewalls, etc. are very likely, so they would have to turn on the
> "middleboxes exist" parameter.
> * in one or two important (but private) data centres, the admin might know
> that there are no firewalls (and certainly no NATs), so she can turn off the
> "middleboxes exist" parameter. However, that is the exception not the rule.
>
> In summary, connection semantics are essential wherever there might be
> middleboxes. This implies:
> * transport encap: connection semantics are relatively simple, so why not
> solely standardize this case? The few cases where the operator knows for
> certain that there are no middleboxes don't need to use connection
> semantics, but they are in private networks, so they shouldn't be the
> primary use-case for standardization.

Yes, for transport encap connection semantics would be required. The
use of transport layer encapsulation is most interesting on the
Internet anyway. See draft-herbert-transports-over-udp-00.

> * network encap: Will connection semantics work? Two possibilities:
>   a) if no, the GUE network encap will be pretty useless, given nearly all
> real networks contain firewalls, etc. There will be no point standardizing
> the network encap just for a few special private networks that have no
> middleboxes.
>   b) if yes, they will be needed in most real networks, so it should be the
> default case that is standardized. Then the IETF has to ask, is there any
> point standardizing a GUE network encap without connection semantics, just
> for a few controlled environments where the operator knows for sure that
> there are no middleboxes?
>
But then we don't get any value from using the source port for flow
entropy. This is a common technique in probably all the proposed UDP
encapsulations; in fact, it is considered a major benefit of using UDP
in the first place.

> Corollary of all this: A packet is a "GUE packet" if either src or dst port
> = 6080.
>
If connection semantics are used, every GUE packet would have 6080 as
either its src or dst port. However, it should be mentioned that the
converse is not true, i.e. "If the src or dst port = 6080, then it may
or may not be a GUE packet."

> 1.2/ A Firewall or NAT in front of both ends
>
> Most firewalls / NATs only allow an incoming UDP datagram in response to a
> recent outgoing datagram. If there are two such middleboxes each
> "protecting" a different endpoint of a GUE tunnel (network or transport
> encap), then neither end can send an initial GUE datagram.
>
> To operate in such an environment, GUE endpoints will need to support STUN
> [RFC5389].
>
Okay.

> 1.3/ Multiple GUE servers (transport encap) not possible behind a NAT-PT
> with one external IP
>
> Two cases:
>
> * For transport encap: every GUE server has to have its own public IP
> address.
> Reason: if a NAT-PT with one external IP address (A) sits in front of
> multiple GUE servers, only one can be reached on the well-known GUE port
> (6080). Because there will be only one address:port combination to address
> packets to (A:6080). (Dan Wing pointed out this same problem with GUT on the
> tsvwg ML). It's not a killer, but it is a limitation to applicability that
> has to be understood and documented.
>
I don't see how this is any different than any other use case for UDP.
address:port can only refer to one instance of a UDP service. The
point of GUE and other UDP encapsulations is that we can multiplex
over that port.

> * With network encap: Non-issue.
>
>
> 1.4/ Network decap and transport decap problematic on the same (IP)
> interface
>
> A consequence of using the same well-known port for GUE transport and
> network encap is that both decaps cannot be deployed at the same IP address.
>
I'm not sure I understand. The IP layer already carries both IP-in-IP
and transport-layer protocols under the same protocol; why is UDP
encapsulation any different?

> Thought experiment: This might work by implementing a combined
> transport/network decap that checked whether there was another IP header in
> the header chain and:
> * if there was, removed the outer IP and the outer UDP+GUE+option headers
> * if not, removed solely the outer UDP+GUE+option headers, but not the outer
> IP.
>
I'm still missing it; this already works. You can see this in the Linux
implementation.

> However, there is nothing to say that a GUE transport encap should not
> encapsulate a packet that has already been tunnelled in an IP outer (e.g.
> IPsec AH or ESP). That is, the transport encap would insert a UDP and GUE
> header between the outer IP and the inner IP, without adding another IP
> outer.
>
Sure.

> It would be safer to use two different well-known ports for transport and
> network encap. However, I think deploying transport and network encap on the
> same IP is a corner case we just need to rule as inadmissible. Nonetheless,
> a sys-admin would get weird behaviour if this did happen, with lots of
> head-scratching before she realised what had happened. I'm not sure how to
> mitigate this.
>
I agree it would be safer, and we would probably want TOU to run over a
separate port, but I don't see that it is incorrect to run them over
the same port.

> 2/ WIRE PROTOCOL
>
> 2.1/ HLEN too small
> S3.1
> The 5-bit Hlen field (multiplied in 4B units making max header length 128B)
> worries me a lot.
> Let's not make a similar mistake to when we limited TCP option space to 40B,
> which has caused enormous grief.
>
Well, 128 > 40! Also, you need to consider that options in GUE are not
TLVs, so they don't have the wire overhead of type and length. "Too
small" is obviously subjective, but given the current extensions and the
rate at which we are adding them, I think 128 bytes is safe for the
foreseeable future. If we do hit the limit, fixing it would require a
new GUE version (not a new protocol).
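
For reference, the arithmetic behind the 128B maximum quoted above can be
written out as follows. This assumes Hlen counts 4-byte units beyond the
first four bytes of the GUE header, which is my reading of the draft; the
constant names are only illustrative.

```python
HLEN_BITS = 5            # width of the Hlen field
UNIT_BYTES = 4           # Hlen is expressed in 4-byte units
BASE_HEADER_BYTES = 4    # first four bytes, assumed not counted by Hlen
TCP_OPTION_SPACE = 40    # the TCP limit used for comparison

max_hlen = (1 << HLEN_BITS) - 1                       # 31
max_optional_bytes = max_hlen * UNIT_BYTES            # 124
max_header_bytes = BASE_HEADER_BYTES + max_optional_bytes  # 128
```

So the space available for optional fields is 124 bytes, a little over
three times the TCP option space.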

> 2.2/ GUE versions
> S3.1
> The hack in GUE v1 to compress out the GUE header for direct encapsulation
> of IP (v4 or v6) seems neat, but it is also extremely dangerous. If GUE
> becomes successful, it would prevent incremental deployment of any new
> version of IP starting 0b10, 0b11 or 0b00. Because:
> * S.5.4 says drop an unknown version field, so IP cannot be upgraded
> independently from GUE code.
> * A version of IP starting 0b00 would be mistaken for GUE.
>
> The latter might sound unlikely, but bear in mind that:
> * you don't know what ideas might come up in future for using multiple
> versions of IP - the IP version field could become important.
> * a future version of IP might wrap the version field, because 0x0-0x3 are
> no longer used (a version only has to be a unique tag, it doesn't have to
> increase).
>
There are still two values (0101 and 0111) that can be mapped to new
versions of IP. The GUE draft specifies that version 01 is direct IP
encapsulation: 0100 indicates IPv4 and 0110 indicates IPv6. So when
IPv8 comes, we would declare that 0101 indicates IPv8. Before
delivering the packet to the IP stack, we would rewrite this to be 1000.
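
A rough sketch of that dispatch on the first byte of the UDP payload is
below. The function names and return strings are made up for illustration,
and the 0101 mapping is the hypothetical IPv8 case from the paragraph above,
not anything defined in the draft.

```python
def classify(first_byte):
    """Classify a GUE payload from the first byte after the UDP header."""
    gue_version = first_byte >> 6  # top two bits: GUE version
    nibble = first_byte >> 4       # top four bits: IP version if GUE v1
    if gue_version == 0b00:
        return "gue-header"        # GUE version 0: a GUE header follows
    if gue_version == 0b01:        # GUE version 1: direct IP encapsulation
        if nibble == 0b0100:
            return "ipv4"
        if nibble == 0b0110:
            return "ipv6"
        if nibble == 0b0101:
            return "hypothetical-ipv8"  # the mapping suggested above
        return "drop"              # 0b0111 is unassigned
    return "drop"                  # unknown GUE version


def rewrite_ip_version(first_byte, new_version):
    """Replace the top nibble before handing the packet to the IP stack."""
    return (new_version << 4) | (first_byte & 0x0F)
```

For example, a payload starting 0x5A would be classified as the
hypothetical IPv8 case and rewritten to start 0x8A before delivery.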

> [Aside: If you prefer an equally dangerous hack (perhaps because you don't
> believe there will ever be a version of IP beyond v6), you could have
> reduced the Ver field to the first single bit by making GUEv0 the one
> without a GUE header, and GUEv1 the one with. This would have given more
> space for the Hlen field (see my concern in A2.1/ "HLEN too small" above and
> my idea in a separate email to remove the C flag).]
>
> In the separate email about redesign, I'll describe an alternative approach
> that always fits the base GUE protocol into 4B, or even within the 8B UDP
> header (see C6/ Wire Protocol; it comes from an idea to develop GUT into
> what I called Gutless, back in Feb 2010).
>
> 2.3/ No need to interpret the protocol field relative to IPv4
> S3.2.1:
>
>    The protocol number in interpreted relative
>    to the IP protocol that encapsulates the UDP packet (i.e. protocol of
>    the outer IP header).
>
I can't find that in the draft, but yes, the protocol field must be
interpreted per IPv4 or IPv6.

> IPv6 [RFC2460] defines the Next Header field to use the same protocol
> identifier space as IPv4. There are no IPv4 protocol numbers that are
> inappropriate for IPv6 (see the IANA protocol number registry). Therefore,
> this should simply say that the protocol number is interpreted as an IPv6
> protocol number (and therefore the field would be more appropriately called
> "Next Header").
>
> 2.4/ No need to restrict interpretation of the protocol field
> S3.2.1:
> This draft should not state any restrictions (e.g. those in the second and
> third paragraphs quoted below) that preclude certain protocol numbers in
> combination with either an IPv4 or IPv6 outer.
>
>    For an IPv4 header the protocol may be set to any number except for
>    those that refer to IPv6 extension headers or ICMPv6 options (number
>    58). [...]
>
>    For an IPv6 header the protocol may be set to any defined protocol
>    number except Hop-by-hop options (number 0). [...]
>
> Various implementations are capable of understanding an IPv6 extension or
> v6-ICMP within an IPv4 header (e.g. [RFC6145]). And any list of restricted
> header combinations can never deal with newly defined headers. So the only
> test needed is "Does your code for this combination and order of headers
> have the logic for the next header?" GUE then only needs to refer to the
> appropriate action already specified in RFC2460 (quoted below) rather than
> making up its own rules:
>
I mostly agree. The only protocol numbers we disallow are EHs and
ICMPv6 options (except for one specific DO that you refer to below).
Allowing EHs with v4 would just be opening a Pandora's box IMO.

>    The Option Type identifiers are internally encoded such that their
>    highest-order two bits specify the action that must be taken if the
>    processing IPv6 node does not recognize the Option Type:
>    [...]
>
>    If, as a result of processing a header, a node is required to proceed
>    to the next header but the Next Header value in the current header is
>    unrecognized by the node, it should discard the packet and send an
>    ICMP Parameter Problem message to the source of the packet, with an
>    ICMP Code value of 1 ("unrecognized Next Header type encountered")
>    and the ICMP Pointer field containing the offset of the unrecognized
>    value within the original packet.  The same action should be taken if
>    a node encounters a Next Header value of zero in any header other
>    than an IPv6 header.
>
> There is a sentence at the end of S.3.6 (quoted below) that repeats these
> unnecessary restrictions. If you agree with me, please also remove it.
>
>    [...] In this case next
>    header must refer to a valid IP protocol for IPv4. No other extension
>    headers or destination options are permitted with IPv4.
>
Yes, I'll remove that.

>
> 2.5/ Missed opportunity to liberalise interpretation of the protocol field
>
> I believe that GUE offers the opportunity to liberalise, rather than
> restrict, protocol field interpretation. In particular, GUE could allow
> encapsulation of hop-by-hop options (next header number 0). You might wonder
> what a HbH option could possibly mean within a GUE header - see C2.4/ "GUE:
> a potential solution to the IPv6 extension header discard problem" in my
> separate email about how to use GUE to solve the problem where IPv6 packets
> with header extensions are highly prone to discard [RFC7872].
>
I'd rather see vendors fix HBH processing to work. One clarification
being made in 2460bis is that intermediate nodes can ignore HBH; this
should go a long way towards fixing the problems. Another thing I'd
like to avoid is encouraging middleboxes to parse UDP payloads. UDP
port numbers only have specific meaning end to end. For instance, a
UDP packet to port 6080 might be GUE or might be something completely
different. If it's not GUE and some middlebox parses the UDP payload
and modifies it, then the middlebox has systematically introduced
silent data corruption.

> 2.6/ Positioning GUE with respect to existing IPv6 extension headers
>
> The draft needs to state rules for where GUE encapsulation fits in the order
> of a chain of any IPv6 extension headers already present in an arriving IPv6
> packet. Below, this question is considered for both types of encapsulation,
> and in both cases it can be seen that the UDP/GUE header would not
> necessarily be the first header after an IPv6 outer.
>
> * Network encap:
> According to my reading of RFC2473, certain IPv6 extension headers in an
> arriving IPv6 should (theoretically) be copied as extension headers for the
> outer:
>   a) a Hop-by-Hop Options header (depending on the encap configuration, but
> a jumbogram option would have to be copied)
>   b) a Routing header (depending on the encap configuration)
>   c) The Tunnel Encapsulation Limit Option (within a Destination Options
> Extension Header)
>
>   - HbH options are pretty academic these days, given they cause about
> 39-54% discard [RFC7872]. However, if there is one on the inner, I guess we
> should still say that a GUE network encap should copy it to the outer before
> UDP/GUE is added.
>   - I believe RFC2473 was wrong to say a routing header could be copied to
> the outer. Imagine a packet gets tunnelled that has a routing header listing
> addresses D2, D1 & D0 still left to visit. Although it is unclear what it
> means to copy a routing header to the outer, it must mean that these
> addresses would be visited by the tunnelled packet, then visited again after
> decapsulation.
>   - I believe the Tunnel Encapsulation Limit Option is also pretty academic
> these days, but again, if one arrived, a GUE network encap ought to check
> the value, decrement it, and copy the header to the outer.
>
Hmmm, I'm not sure how much we need to say here. These seem like
general questions for any IP-over-IP tunneling. I tend to believe that
the point of tunneling is to create a virtual link, so it can have its
own properties. RFC2460 recommends use of encapsulation to add EHs to
a packet, and we also have cases where tunneling is used to hide EHs
that would otherwise be dropped by middleboxes.

> * Transport encap:
> In this case, I have suggested where the UDP/GUE header should fit in the
> following order of extension headers (copied from RFC2460):
>            IPv6 header
>            Hop-by-Hop Options header
>          +UDP
>          +GUE
>            Destination Options header (note 1)
>            Routing header
>            Fragment header
>            Authentication header (note 2)
>            Encapsulating Security Payload header (note 2)
>            Destination Options header (note 3)
>            upper-layer header
>
That is valid and would work in the implementation. Although
intermediate nodes aren't supposed to be looking at anything but HBH,
we know they do, so hiding EHs in GUE might be a problem, and again we
don't want them trying to parse the UDP payload. Also, for something
like Segment Routing (SR) we probably want the Routing header outside
of GUE, lest we force every node in the list to support GUE.

> The draft ought to mention that if AH has been applied to a packet which is
> then encapsulated by GUE in transport mode, the AH header is not
> recalculated, so it does not cover the UDP/GUE headers. Decapsulation works
> because the UDP/GUE headers are inserted before the authentication header,
> so they will be removed (by a GUE decapsulator in transport mode) before AH
> is verified.
>
Okay.

> Personally I don't know enough about routing headers to make the decision on
> whether they should be above or below the GUE header in the transport encap.
> I believe they are only processed when a packet reaches the destination
> address in the main header, but I am not familiar with all the different
> routing types (I know some are deprecated, and frankly I couldn't be
> bothered to read the others).
>
See above about SR.

> 2.7/ Reliable delivery of control messages
>
> The examples of potential control messages (those with the 'C' flag) given
> in S.3.5.1. (echo request/reply for testing) aim to mimic the data channel,
> so unreliable delivery as a GUE datagram is appropriate.
>
> The draft doesn't define any other tunnel control messages. However, if it
> did, many/most would need to be delivered reliably and in order (e.g. key
> agreement, any necessary configuration agreement, consistent application of
> connection semantics, etc).
>
> Therefore, reliable ordered delivery for control messages will need to be
> defined (see C3.2/ "Reliable delivery of control messages" in separate email
> for a suggested design).
>
That can be implemented in definition of specific control messages.
I'm pretty sure OAM doesn't require reliability in the encapsulation
layer so we should be okay there.

> 2.8/ Extensibility of the flags and optional fields scheme: doesn't work

Please see my previous email on this. Self describing fields (e.g.
TLVs) are exactly what we are trying to avoid at this layer. We have
almost 40 years of experience with IPv4 options and 20 years of
experience with IPv6 EHs and they aren't widely deployed. GRE is the
only example of a low level protocol we could find with extensibility
that is widely supported.

> S3.3:
> This is meant to be "the primary mechanism of extensibility in GUE".
> However, for extensibility to work, GUE needs to distinguish between:
> * options: the base set of flags+options defined from the start and required
> in all GUE code
> * "extensions" (my term): future extensions to the flags and options.
>
> The current GUE flags scheme only works for options, but it inherently puts
> extensions into a chicken-and-egg stand-off, because:
> a) S5.4 says an implementation MUST drop a packet with an unknown flag. So,
> if the IETF later defines bit 7, until a very large proportion of GUE decap
> implementations have been upgraded with logic that understands bit 7, the
> packet is going to be dropped with high probability. So no encap is going to
> want to set bit 7 on a packet, so there is no motivation for a decap to
> implement the code for bit 7.
> b) For such unknown flags, we cannot change "MUST drop" to "MUST ignore",
> because the lengths of the fields are not self-describing - they have to be
> hard-coded into an implementation. So if one GUE implementation only has
> logic about the flags up to bit 6, but a packet arrives with bit 8 set, the
> implementation doesn't know how large the "Fields" field is, so it doesn't
> know where the private data starts.
>
> For proper extensibility, each new GUE flagged option needs to be
> self-describing, i.e. with additional fields to say:
> a) Whether nodes that do not have the logic to understand the option should
> drop or ignore the packet, separately for:
>   - nodes on the path
>   - nodes at the dest. (decap) of the GUE datagram.
> b) Whether the option is intended to change on path (in which case it should
> not be covered by integrity or authentication codes).
> c) Whether the option should be copied or not by a GUE-in-GUE tunnel encap
> (see A4.4/ "Tunnels in Tunnels" later).
> d) The length of the option
> e) Additionally you might want to borrow the IPv6 idea of controlling
> whether there needs to be an error message or not, but personally I believe
> that is overkill (the intention was for silent failure to be impossible for
> critical features, but it is very hard to deliver error messages reliably
> anyway).
>
> The above shows that attempting to invent a new extensibility scheme usually
> ends in tears. The IETF and others have developed tried-and-tested
> extensibility approaches like TLV, CBOR. Even then, they still have
> problems. The above points draw lessons from all this, particularly:
> * action codes and change codes in the initial bits of IPv6 HbH & DO options
> [RFC2460]
> * TRILL extension word flags: critical and non-critical separately for
> hop-by-hop and ingress-to-egress (see [RFC7179] updated by [RFC7780]).
> * 'Self-describing objects', including type and size, is listed as
> 'Architectural Principle of the Internet' number 3.12 in [RFC1958]
>
> 2.9/ Hard-coded option lengths do not scale
>
> By hard-coding the length of each option in an RFC and in the GUE code
> (rather than self-describing in the packet), you are stuck with a certain
> size option for ever. Experience has proven that fields such as message
> authentication codes (MACs), fragment IDs, etc. have to scale. Admittedly,
> we could define flags for larger fields later, but I have shown above that
> new flags would be undeployable.
>
I don't see how this would be more undeployable than TLVs. Suppose we
implemented fragmentation with 32-bit IDs and realized later that they
need to be 64 bits. With flag-fields we would define a new 64-bit field
and change senders and receivers. With a TLV we could start sending the
fragment TLV with 64 bits. That would be easily expressed in the length
field, but why would we expect the receiver to be able to deal with
that if it implemented the feature for 32 bits? With both TLVs and
flag-fields the implementations need to change, and it is also implicit
that some negotiation occurs so that both sides agree on the ID size.
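
As a sketch of what the flag-fields side of this looks like, the parser
below walks the flags in order and uses a hard-coded length per flag. The
flag assignments (bit 0 for a 32-bit fragment ID, bit 1 for a 64-bit one)
and all names are hypothetical, chosen only to mirror the example above;
bit 0 is taken to be the most significant of 16 flag bits.

```python
class DropPacket(Exception):
    """Raised when the receiver cannot safely parse the header."""


def parse_fields(flags, data, field_len):
    """Split `data` into fields according to the set flag bits.

    field_len maps a flag bit to the hard-coded length of its field.
    """
    fields = {}
    offset = 0
    for bit in range(16):
        if not flags & (0x8000 >> bit):
            continue
        if bit not in field_len:
            # Length unknown, so every later offset is unknown too.
            raise DropPacket(f"unknown flag bit {bit}")
        end = offset + field_len[bit]
        if end > len(data):
            raise DropPacket("truncated fields")
        fields[bit] = data[offset:end]
        offset = end
    return fields


OLD_RECEIVER = {0: 4}        # hypothetical: knows only the 32-bit ID
NEW_RECEIVER = {0: 4, 1: 8}  # hypothetical: also knows the 64-bit ID
```

An old receiver handed flags=0x4000 (the new 64-bit field) has to drop the
packet, while an upgraded receiver parses it. That is the same "both sides
must change" situation as a TLV receiver that only implemented the 32-bit
length.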

> 2.10/ Random access to options needs motivating
> Quoting S3.3:
>
>    Flags allow random access, for instance [...]
>
Because the order of option processing is relevant. Please look at
section 7 of draft-herbert-gue-extensions-00. For instance, it seems
intuitive that we should process the options that verify or
authenticate the GUE header before processing options that might have
side effects (like doing reassembly).

I will add some text to the GUE draft for this.

> There might be a case for GUE to use a protocol heap rather than a stack
> [Braden03]. If so, please motivate it.
>
> [Braden03] Braden, R., Faber, T. & Handley, M., "From Protocol Stack to
> Protocol Heap: Role-Based Architecture," ACM SIGCOMM Computer Communication
> Review 33:17--22 ACM (January 2003)
>
>
> 3/ STATE
>
> 3.1/ Per-connection state vs. stateless connections but per-tunnel state
>
> The GUE draft does not suggest a mechanism for GUE endpoints to apply
> connection semantics.
>
> * For transport encap the GUT draft suggests an approach that uses per-flow
> state (see the example given in Figure 4 in draft-manner-tsvwg-gut-02).
> * For network encap a stateless approach is proposed in my separate email
> (see C1/ "Stateless Connection Semantics"). Statelessness is important to
> simplify migration during load-balancing, failures etc.
>
This is discussed in detail in draft-herbert-transports-over-udp-00.

> The 'shared fate' resilience principle [Clark88] maintains that a system
> should avoid reliance on flow-state held on the path, preferring to hold
> state solely at the endpoints. One could argue that, in transport encap
> mode, the GUE endpoints are on the end hosts, and therefore, the
> communication path is resilient because if GUE flow state is lost because an
> end host fails, the communication will have failed anyway. However,
> strictly, a GUE endpoint process is likely to be separate (perhaps even in
> NIC hardware) so it could fail independently of the true endpoint process of
> the connection.
>
> So it would be ideal to use a stateless approach for both network and
> transport encap. However, the best stateless approach I could come up with
> (if it works at all) requires some coordination and hence one-off set-up
> latency between the GUE endpoints. Therefore, stateless connections will be:
> * more appropriate for network encap (usually long-lived tunnels); and
> * less useful for transport encap (opportunistic per connection).
>
> To summarize, it is likely that the stateful approach will be used, at least
> for some GUE encapsulators in transport mode. Therefore, for the transport
> encap mode at least, the draft needs to consider per-flow state and its
> management (see following section).
>
> [Clark88] Clark, D.D., "The design philosophy of the DARPA internet
> protocols," Proc. ACM SIGCOMM'88, Computer Communication Review
> 18(4):106--114 (August 1988)
>
> 3.2/ Transport encap with Connection Semantics: Flow state management
>
> Hosts already maintain flow-state for each connection in progress. To
> support GUE in transport encap mode, it is trivial for the hosts at each end
> to associate a little extra state with the existing state of each inner
> flow:
> * At the initiator end, it needs no flow-state to receive GUE packets, but
> in order to send GUE packets, it associates the original (inner) flow's ID
> with the source port it will use in the UDP outer to send every GUE packet.
> * At the responder end, it has to associate the inner flow ID with the
> source port in arriving GUE UDP outer headers. It needs this so that, when
> the inner flow sends out packets, the GUE encapsulator can intercept them
> and encapsulate them with a GUE header, using the stored source port as the
> destination port.
> * Any error messages returned from the responder also need to be
> encapsulated in the same way.
>
> Also, the draft needs to specify:
> * that a GUE transport decap ought to protect itself against DDoS by not
> storing flow state if no associated socket is open;
> * how long to time out unused flow state;
> * what to do with a packet if the necessary flow state is not present;
>
> 3.3/ Keepalives for middlebox flow state
>
> Middleboxes, such as firewalls and NATs time out the pin-hole associated
> with UDP flow-state fairly rapidly, but rarely less than 15s [RFC5405].
> RFC5405 rightly says that an application that uses UDP should be responsible
> for recovering a timed out connection, rather than the stack sending
> keepalives to hold open a connection, when it doesn't actually know whether
> the application still wants the connection open.
>
> Nonetheless, an inner flow will not be aware that it is being tunnelled
> using UDP/GUE. Therefore it seems less inappropriate for the GUE encap to
> keep state alive on behalf of the application, so it ought to send keepalive
> GUE datagrams to hold any pin-hole open. However, if the application has not
> sent anything for some time (whatever that means), the GUE encap should time
> out the connection, rather than holding middlebox flow-state (and its own
> flow-state) open for ever.
>
> If you agree, it might be necessary to specify a keepalive control message
> that a GUE encap can send to the remote end of the GUE tunnel (which would
> also keep any flow-state at the remote end alive). These would only be
> necessary in one direction, and would not need to be reliably delivered.
>
I think disassociated location in TOU will mostly obviate the need
for keep-alives. IMHO keep-alives are a terrible hack for a vaguely
defined problem that just serves to fill the Internet with junk
packets. I would really rather not see such things in GUE.

> See Section 3.1 of draft-manner-tsvwg-gut-02 for the keepalive control
> message defined for GUT.
>
>
> 4/ OPERATION
>
> 4.1/ Transport encap: to GUE or not to GUE?
>
Transport encap needs a whole draft by itself (again, the TOU draft is
probably that). I will add a statement that the specifics of transport
layer encapsulation are out of scope for this draft.

> For transport encap, the draft needs to say how the host decides when to use
> GUE and when not.
> There's text on this in S.4 of the GUT draft, if you want to use it.
>
> 4.2/ Hop limit / TTL processing
> I couldn't find any text about this. Perhaps you intended this sentence in
> S.5.3 to cover it:
>
>    it should follow standard conventions for tunneling of
>    one IP protocol over another
>
> I think it would be best to spell out Hop limit processing. There's text on
> this in S.3.2 of the GUT draft, if you want to use it.
>
>
> 4.3/ Error messages
> S5.4
>
>    No error message is returned
>    back to the encapsulator.
>
> Please go through every type of error and in each case justify why no error
> message to the encap is necessary.
>
I can do that, but it would be nice to have a good example of an error
message system at L3 (I don't believe ICMP is that!). The problem with
error messages is that it's a lot of work to define them, we need to
implement processing, and in practice they might not even be delivered
over the network (like the notorious unreliability of ICMP in the
Internet).

> 4.4/ Tunnels in tunnels
> S5.5 2nd para
>
>    It
>    may encapsulate a GUE packet in another GUE packet, for instance to
>    implement a network tunnel (i.e. by encapsulating an IP packet with a
>    GUE payload in another IP packet as a GUE payload).
>
> A number of problems here:
>
> 1) A "GUE packet" has not been defined. I assume any UDP header with either
> src or dst UDP port = 6080 (see A1.1/ "Inferring Connection Semantics: the
> rule not the exception").
>
> 2) There is an incremental deployment problem here. Existing tunnels won't
> check within the outer IP for whether a UDP port is a GUE port. They will
> just add a new outer IP header without the UDP or GUE.
>
They shouldn't check the UDP ports either unless the node is the one
that created the inner packet. Again, intermediate nodes cannot safely
parse UDP payloads.

> 3) Whatever, if a tunnel is GUE-aware, this para needs to be clear exactly
> which headers it should copy with the outer IP:
> * Do you intend this to mean that all the following should be copied to the
> outer IP header:
>   - the outer UDP,
>   - any v0 GUE header
>   - plus any GUE options or private data.
> * Is it appropriate to copy all the options and private data? I think only
> some (e.g. perhaps the VNID in certain circumstances?). Others would not
> have the correct semantics if blindly copied (e.g. fragment options,
> coverage of MACs, etc).
> * How does a GUE-in-GUE encapsulator know which to copy?
>
Generally, I don't think anything should be copied. Each encapsulation
header is idempotent in that it describes just that instance of
encapsulation. Looking at the GUE extensions defined and the header, I
don't see anything that would make sense to copy.

> Also, should any extension headers on an arriving IPv6 outer also be copied
> to be associated with the new outer? If so, which ones, and how does the
> encapsulator know? Do the same rules apply whether using transport or
> network encapsulation?

See points above about EH and placement in header with GUE.

>
> I have been arguing since about 2009 that, when adding a new IP outer, each
> IP (at least IPv6) extension header should self-describe which headers
> should be copied to the outer on encap. At present RFC2473 lists some
> extension headers that might be copied and says it depends on the
> configuration of the encapsulator. But a hard-coded list precludes
> introduction of any new extension that needs to be copied. And certainly it
> doesn't work for extensions like GUE that don't fit into the original mould
> of what an IPv6 extension looks like. The behaviour needs to be somehow
> self-declared in each header, not in a standard.
>
> It is tough to solve this problem in a way that will work with existing
> tunnels. It needs solving more generally, not just for GUE. However, as long
> as GUE encapsulators address this problem from day-1, GUE presents an
> opportunity to solve the general problem in environments where all
> encapsulations are GUE-based (see my proposed solution in C4.1/ "Ensuring
> certain GUE headers are copied when a GUE packet is tunnelled" within my
> separate email on redesign). Then other encapsulation approaches might
> follow.
>
Or just say no header is copied. Consider we might be tunneling a
GUE/IPIP packet in GUE. Would we expect that implementations dig into
the IPIP packet to see if there's an encapsulated GUE packet? The
number of combinations of encapsulating encapsulated packets is
enormous; IMO it's going to be easier to just say every encapsulation
is independent (modulo existing standards for propagating IP fields
like diff-serv to outer headers).

> 4.5/ SHOULD adjust MTU?
>
>     An operator may set MTU to account for encapsulation overhead
>     and reduce the likelihood of fragmentation.
>
> I would expect "SHOULD" here.
>
Okay.

> You might want to refer to draft-ietf-tram-stun-pmtud for a way to do PMTUD
> with UDP (for STUN, but I think it would be similar for GUE).
>
Okay.

> 4.6/ Is orig-proto field necessary in the fragmentation option?
> S4.3 of draft-herbert-gue-extensions-00
>
> Why does the original protocol of a fragmented packet need to be visible
> before reassembly by declaring it in the GUE fragmentation option of each
> fragment? The GUE protocol field will be available once the fragments are
> reassembled, and I can't see why it would be needed before that.
>
> It is not good security practice to create multiple fields that are all
> intended to be set to the same value. Even if the implementation uses these
> orig-proto fields before reassembling the fragments, it will still have to
> check that they all match the GUE protocol field when the packet has been
> reassembled. And if any are not the same, it will raise security concerns
> about any action that had previously been taken based on an inconsistent
> value.
>
This is because if we put the original protocol in the proto field of
fragments, intermediate nodes that don't understand the fragment option
would misinterpret the packet. The protocol field in each fragment must
match or the packet isn't reassembled; I don't see how that is bad
security ;-)

> 4.7/ Congestion Control: reductio ad absurdum
> S5.9
> I suggest you remove the para about DCCP being appropriate for tunnel
> congestion control. I appreciate you are trying to comply with RFC5405, but
> it is impossible for tunnel specs to do so without looking absurd. The more
> you try, the more it will look like you are the ones that are absurd.
> RFC5405 gives no guidance on how to comply with its requirement about
> congestion control of non-IP traffic across a tunnel... because there is no
> running code for tunnel congestion control, or for a network circuit
> breaker.
>
> It has been suggested in the past that DCCP should be used across tunnels.
> DCCP is intended for a single flow and all the DCCP profiles defined so far
> ensure a DCCP "flow" will consume about as much capacity as a TCP flow. If
> DCCP were to be applied across a GUE tunnel it would reduce the rate of the
> aggregate of all flows across the tunnel to roughly the same as a /single/
> TCP flow (see the intro of RFC7893 "Pseudowire Congestion Considerations").
>
> One might imagine that RFC5405 means that a tunnel protocol designer would
> have to detect roughly how many flows a tunnel aggregate consisted of at any
> one time (say N flows) and attempt to design a congestion control (e.g. a
> DCCP profile) to consume roughly as much capacity as N TCP flows. However,
> this would probably cause horror for some in the transport area at the
> thought of the IETF endorsing a congestion control that can be N times as
> greedy as TCP.
>
> To further reduce the idea of a tunnel encap applying congestion control to
> absurdity, it would need:
> a) a huge buffer to absorb incoming packets whenever they arrived faster
> than the tunnel rate. All packets (in small and large flows) would back up
> behind this huge queue, which would be called buffer bloat, which would
> cause horror for most people in the transport area.
> b) ideally, a time machine (a negative buffer) to bring packets forward in
> time whenever the arrival rate of all the flows was insufficient to satisfy
> the desired aggregate rate of the tunnel.
> c) the addition of feedback channel(s) and a huge amount of extra
> processing.
>
I don't quite understand. Wouldn't DCCP just provide a
congestion-controlled context for a flow? In this case the flow is a
tunnel, but shouldn't that fact be irrelevant to DCCP?

In any case I'm not an expert in DCCP; if the consensus is that it's
not useful for tunnels we can remove it.

> [As you can see, I don't support the idea in RFC5405 that a tunnel becomes
> responsible for congestion control of traffic that it encapsulates.
> Otherwise, to be consistent, an Ethernet link would become responsible for
> congestion control of traffic it encapsulates. However, I accept that
> consistency with RFC5405 is currently a hurdle your draft has to cross
> before it can be approved. If you feel you have to suggest a mechanism, IMO
> a policer makes sense - either a rate policer or a congestion-rate policer.]
>
>
> 4.8/ Multicast outer -> Implosion on inner destination
> S.5.10
> Consider an inner flow of unicast packets, src-IP A, dst-IP B. Consider the
> encap adds an outer addressed to multicast address M, and consider n
> decapsulators subscribe to group M. This will cause the network to duplicate
> each packet n times. As each decap forwards the inner, n duplicates of each
> packet will converge on B.
>
> This might make sense with unicast inner packets for a small number of
> decaps (e.g. two for redundancy). And a multicast overlay could make sense
> for multicast inner packets as long as the multicast routing was aware of
> the P2MP tunnel (with suitable grouping of multicast groups).
>
> I think the text should say that a multicast outer is not precluded, because
> it is a theoretical possibility, but it should not be attempted without a
> safety harness and an empty bladder.
>
Okay.

> 4.9/ Deriving flow entropy from the inner is contrary to "GUE permits encap
> of arbitrary IP protocols" claim
> S.5.11.1
> The general idea for creating flow entropy seems to be for the GUE encap to
> map inner flows of possibly "atypical IP protocols" to individual UDP outer
> flows, on the assumption that switches or routers that implement ECMP etc.
> will understand UDP but not "atypical IP protocols". Let's examine this
> claim by taking network encap and transport encap separately.
>
> 1) Network encap
> Imagine that a GUE encap has been implemented that understands TCP, UDP,
> SCTP, DCCP, ICMP, RSVP, IPsec and ESP.

For everything in this list except TCP and UDP I would view these as
"atypical IP protocols"-- most devices don't know how to parse these
for ECMP. This is one of the big values of UDP encapsulation,
encapsulating such packets in UDP increases chances of
interoperability and usefulness of optimizations like ECMP.

> Then researchers implement NewSexyTP, with a new IP protocol number. Every
> GUE encap in the world doesn't have any logic to understand or locate the
> flow ID fields of NewSexyTP. So GUE does not "permit encap of arbitrary IP
> protocols" as claimed in the motivation section.

Okay, I will change the statement not to be all inclusive.

>
> Further, why will GUE implementations be updated with logic to understand
> NewSexyTP any faster than the ECMP code in general-purpose switches and
> routers? One GUE implementation might be updated, but other developers might
> not so diligently track the latest transport protocols. One cannot even
> really argue that the ECMP code in switches and routers is implemented in
> hardware, so it will be harder to change than GUE code. Because the
> forwarding performance of GUE tunnel encap will need to be no different to
> the performance of forwarding in general switches and routers, so if
> hardware is necessary for one it will be necessary for the other.
>
It's for getting flow-based ECMP. If NewSexyTP is deployed today
(without encapsulation) it might work, but ECMP would only be based on
the 3-tuple, which can lead to poor load balancing of the protocol.
Encapsulating in UDP with good entropy in source port for inner flow
can fix this. As I mentioned before, setting IPv6 flow label can also
address this, that _does_ work for arbitrary transport protocols if
the sender sets it per flow and switches support it for ECMP.
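
As a rough illustration of the source-port entropy idea, here is a
minimal Python sketch. The function name, the use of SHA-256 and the
49152-65535 port range are assumptions for illustration only; they are
not taken from the GUE draft. When the encapsulator cannot locate flow
fields in the inner protocol it passes no flow_id, so the result
degrades to the 3-tuple case described above.

```python
import hashlib
import ipaddress
import struct

# Assumed port range for entropy values (not specified by the draft).
PORT_MIN = 49152
PORT_MAX = 65535


def entropy_source_port(src_ip, dst_ip, proto, flow_id=b""):
    """Map an inner flow to an outer UDP source port.

    flow_id holds whatever per-flow bytes the encapsulator can extract
    from the inner packet (for example the inner ports). If the inner
    protocol is not understood it stays empty, and the port is derived
    from the 3-tuple (addresses plus protocol number) only.
    """
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!B", proto)
        + flow_id
    )
    digest = hashlib.sha256(key).digest()
    value = int.from_bytes(digest[:4], "big")
    # Fold the hash into the assumed port range.
    return PORT_MIN + value % (PORT_MAX - PORT_MIN + 1)


# Inner TCP flow: inner ports are available as flow_id.
tcp_port = entropy_source_port(
    "192.0.2.1", "198.51.100.2", 6, struct.pack("!HH", 12345, 80)
)
# Unknown inner protocol (253 is an experimental number): 3-tuple only.
unknown_port = entropy_source_port("2001:db8::1", "2001:db8::2", 253)
```

Every packet of the same inner flow maps to the same outer source port,
so devices that only hash the outer IP/UDP headers still keep the flow
on one path.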

> 2) Transport encap.
> If GUE encap is implemented as a centralized daemon process on a host or
> centralized in a NIC, it will suffer from the same lack of forward
> compatibility with new transport protocols as the network encap -
> particularly if it is implemented in NIC hardware. I.e., if an operator
> installs NewSexyTP in their OS, they will also have to wait for a GUE update
> that supports NewSexyTP. This is the case with or without connection
> semantics.
>
This is actually the easier problem. TOU can only be done by the end
host, not intermediate devices, so the transport implementation can set
the UDP source port itself for entropy. In this case, we don't even
need a hash on transmit; an entropy value is just saved in the flow
context and used in each packet. The NIC doesn't need to know about
any of this; all it sees are UDP packets, and it can perform its
functions like packet steering just based on information in the
IP/UDP header.

> However, it might be possible to implement GUE transport encap (including
> with connection semantics) so that each instance of a protocol stack is
> associated with an instance of GUE (warning: I have no idea yet whether this
> will be possible). In this case, each GUE instance would consistently add
> the same outer port number to the inner protocol instance it was associated
> with, without needing to understand how to identify a flow ID in any
> particular protocol.
>
> In summary, certainly for net encap, but possibly not for transport encap,
> GUE only helps "atypical IP protocols" that a particular GUE encap
> implementation already understands.
>
> 4.10/ Flow entropy from encrypted data could weaken the crypto?
> S.5.11.1
>
>      o If a node is encrypting a packet using ESP tunnel mode and GUE
>         encapsulation, the flow entropy could be based on the contents
>         of clear-text packet. For instance, a canonical five-tuple hash
>         for a TCP/IP packet could be used.
>
> I'm not a crypto expert, but it sounds dangerous to take some clear-text
> from a known position in the data, hash it with a function that is not
> strongly one-way, then send this hash along with the cipher text.
>
> I think the SPI can be used as a unique consistent per-flow value, can't it?
> The SPI has been suitably randomised so that it reveals nothing about the
> flow ID.
>
Some of this is discussed in the TOU draft. IMO anything sent in plaintext
in the packet that could be used in tracking should be periodically
rotated for security at least on the Internet. When we are sourcing
TOU this is easy because the value used to derive entropy is not based
on a hash, it's just a random number that we can change over time.
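
A minimal sketch of that, assuming a hypothetical FlowContext class, a
49152-65535 port range and a ten-minute rotation interval (none of
these come from the TOU draft):

```python
import secrets
import time

# Assumed port range for entropy values (not specified by any draft).
PORT_MIN = 49152
PORT_COUNT = 16384  # covers 49152..65535


class FlowContext:
    """Per-flow state holding a random entropy value as UDP source port."""

    def __init__(self, rotate_after=600.0, now=time.monotonic):
        self._now = now                   # injectable clock, eases testing
        self.rotate_after = rotate_after  # seconds before picking a new value
        self._pick()

    def _pick(self):
        # The value is a plain random number, not a hash of packet contents.
        self.source_port = PORT_MIN + secrets.randbelow(PORT_COUNT)
        self.picked_at = self._now()

    def port_for_next_packet(self):
        # Rotate the value once it has been in use for rotate_after seconds.
        if self._now() - self.picked_at >= self.rotate_after:
            self._pick()
        return self.source_port
```

Changing the port mid-flow will probably move the flow to a different
ECMP path and may reorder packets in flight, so the rotation interval
is a trade-off between tracking resistance and path stability.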

> 4.11/ No need to constrain flow entropy distribution
> S.5.11.2
>
>       o The flow entropy should have a uniform distribution across
>         encapsulated flows.
>
> Equal distribution of flows is not necessarily appropriate for all
> scenarios. Flows have a distribution of sizes, and altho ECMP is generally
> done randomly, an operator might want to (somehow) bias the hash algorithm
> to allow for the flows with the highest rate, which might otherwise
> unbalance the load. See for instance:
> "Engineered Elephant Flows for Boosting Application Performance in
> Large-Scale CLOS Networks" Broadcom White Paper (March 2014)
>
Well, it is a "should".

> 4.12/ No need to constrain flow entropy interpretation
>
>         Decapsulators, or any networking devices, should not attempt to
>         interpret flow entropy as anything more than an opaque value.
>
> This seems unnecessarily constraining. This might not be a good idea, but if
> someone finds a use for it, there's no need to stop them - if it's useful
> they'll ignore you anyway, so why bother saying it? Perhaps you intended to
> explain why doing this could be problematic, rather than precluding it?
>
Again, it's a should. The problem is that if they do this they may end
up assuming semantics that happen to arise from the implementation,
not the protocol. If the implementation changes and devices continue
to assume the wrong semantics, this breaks things. I believe this text
is consistent with the requirements of the IPv6 flow label.

> 5/ SECURITY
>
> 5.1/ Addresses that are both visible and hidden? Have your GUE and eat it
> too?
>
> S.7.  In the following sentence,
>
>    Existing network security
>    mechanisms, such as address spoofing detection, DDOS mitigation, and
>    transparent encrypted tunnels can be applied to GUE packets.
>
> This should point out that an existing set of address spoofing detection
> rules would not work with GUE. I think you meant that existing rules and
> mechanisms could be modified to check the packets encapsulated by GUE
> without using radically new techniques.

Okay.

>
> However, if GUE is in network encap mode and it encrypts the IP headers of
> the inner packets, address spoofing detection and DDoS mitigation will not
> be possible over the length of the GUE tunnel. You cannot both claim that
> GUE can hide information, and that GUE allows existing security techniques
> to work that rely on access to the hidden information.
>
I think the meaning here is that security mechanisms work on GUE
packets. Will clarify.

> 5.2/ How can the Security option protect a UDP/GUE header from being moved
> or removed?
>
It can't. The receiver can require its presence as a condition for
accepting a packet.

> The Security option is "used to provide integrity and authentication of the
> GUE header."
> I assume you envisage this would be complemented by other authentication
> techniques such as IPsec AH to provide integrity and authentication of the
> rest of the packet.
>
Yes, that is why we specifically say it only covers the GUE header.

> However, it occurs to me that the two together do not protect the integrity
> of the structure of the packet as a whole (whether network or transport
> encap). An on-path attacker could still move the UDP/GUE header within the
> packet (it might be possible to construct a valid packet with altered
> semantics), or remove the UDP/GUE header completely. I can't immediately
> think whether any damage could be done with such an attack, or how to
> prevent it. However, I'm sure there will be a crypto expert for whom this is
> not a new problem.
>
Yes, if integrity of the whole packet is required then AH can be used
over the whole packet.

> Also, the 32B max length of the security option is insufficient. I looked
> for a MAC protocol where a larger field is needed, and the first one I
> picked required a larger field: RFC4383 "TESLA in Secure RTP" requires 34B,
> and that's just for the default sizes, not even the maximum. I picked TESLA
> because I knew each datagram needs a lot of authentication space. TESLA
> provides multicast message authentication, so as well as a key index and a
> MAC, each packet reveals a continually changing key.
>
I noticed that use of HMAC often includes a keyid, so we probably want
+4 bytes for that also. We can add another field to supplement,
or increase the existing ones.

> 5.3/ What happens when a port scan sends a datagram to port 6080?
>
> When a port scan (that doesn't necessarily know about GUE) sends a datagram
> to port 6080, if the datagram has a body, and the body starts with a zero
> bit, the GUE daemon will start processing it.
> If the first 4 octets happen (randomly) to be set to values that would be a
> valid GUE header (see S.5.4), it will be decapsulated and forwarded to a
> protocol handler.
>
> Not a show-stopper, but worth documenting?
>
I'm not sure what the relevance of the question is. If we receive a
packet on 6080 (or any UDP port for that matter) it is up to the
receiving application to validate or authenticate the packet. Why is
a port scan special? This is why security mechanisms in GUE are
first-class citizens in the protocol.
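
To make the point concrete, a receiver on the GUE port might start
with structural checks like the following before any authentication.
This is only a sketch: the field layout used here (2-bit version, C
bit, 5-bit Hlen counted in 32-bit words beyond the first four bytes,
8-bit proto/ctype, 16-bit flags) is an assumption about the version 0
header and should be checked against the draft, and the function name
is hypothetical.

```python
import struct


def parse_gue_base_header(datagram):
    """Return the base header fields, or None if the datagram is malformed."""
    if len(datagram) < 4:
        return None  # too short to hold the 4-byte base header
    first, proto_ctype, flags = struct.unpack("!BBH", datagram[:4])
    version = first >> 6            # assumed: top 2 bits
    control = bool(first & 0x20)    # assumed: C bit
    hlen_words = first & 0x1F       # assumed: low 5 bits, 32-bit words
    header_len = 4 + 4 * hlen_words
    if version != 0:
        return None  # only version 0 is handled in this sketch
    if header_len > len(datagram):
        return None  # header claims more bytes than were received
    return {
        "control": control,
        "proto_ctype": proto_ctype,
        "flags": flags,
        "header_len": header_len,
    }
```

Checks like these only reject malformed datagrams. A random payload
can still pass them, which is why authentication (for example
requiring the security option) is what actually decides whether a
packet is accepted.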

> 5.4/ Firewalls will still block new/atypical protocols
> Few firewalls allow incoming UDP. So GUE will not enable deployment of
> servers using atypical/new protocols, which will still face a deployment
> problem.
>
There's counter-evidence to that: QUIC deployment and deployment of
other UDP-based protocols in the Internet.

> If a firewall opens a pin-hole to allow incoming UDP to access the
> well-known GUE port it would allow attackers to reach servers of any
> protocol while bypassing the firewall. E.g. an attacker could access a
> TCP-server by encapsulating TCP in GUE in order to bypass the firewall.
> Therefore, a firewall will only open a pin-hole to a GUE server, if it also
> inspects the packet encapsulated by GUE and applies all its normal rules to
> that as well.
>
TBH, I really don't want firewalls or anyone else in the network
cracking open GUE packets. I don't want firewalls looking at my TCP
options and trying to "optimize" my connections by rewriting rcvwin or
TS or whatever else they do. One of the major features of GUE is that
we can use DTLS to encrypt the whole encapsulated packet, and in the
case of TOU this would be encrypting the L4 headers. This is the only
known solution to protocol ossification. See TOU draft for discussion.

> This is why I have said elsewhere that the draft should state that firewall
> bypass by new/atypical protocols is a non-goal of GUE.
>
As far as firewalls are concerned, GUE packets are UDP packets and so
they can apply whatever rules are appropriate for UDP. If they really
need to know state or information about the payload then that needs to
be done in the context of something like SPUD or PLUS. Again, we
intend to encrypt transport layers so there's no general concept of
firewalls parsing GUE to find transport layer information.

> 5.5/ Transport Encap: Two Passes through a Local Firewall?
>
> GUE in transport mode resubmits the encapsulated packet to the host's IP
> stack. But it needs to make sure it re-injects the packet at the correct
> point in relation to any local firewall.
>
> * If the firewall includes rules to inspect the packet encapsulated with GUE
> (as discussed in the previous point), it would make sense to re-submit the
> packet above the local firewall.
> * If not, GUE should resubmit the packet so that it passes through the local
> firewall again.
>
> The latter mode would make more sense if GUE was also decrypting the inner
> packet. So, rather than have two options, a local firewall could work
> co-operatively with GUE in transport mode, so it doesn't have to inspect the
> inner in both passes.
>
The number of passes to make through a firewall at an encapsulator is
an implementation issue. In Linux I believe we can make one or two
passes; if it is just one, the inner packet is what is checked.

> 6/ Implementation
>
> 6.1/ Practical Large Receive Offload Requirements
> Appendix A.4 says:
>
>    The conservative approach to supporting LRO for GUE would be to
>    assign packets to the same flow only if they have identical five-
>    tuple and were encapsulated the same way. That is the outer IP
>    addresses, the outer UDP ports, GUE protocol, GUE flags and fields,
>    and inner five tuple are all identical.
>
> Rant: It is sad if such a conservative approach to LRO is still necessary.
> Any API to LRO hardware needs to be able to be given the locations of
> certain header fields that are deliberately intended to vary, so it can
> offer the facility to separately report these for each packet. A MAC of the
> encapsulating headers is a good case in point. ECN is an even better example
> of a varying field, because it has been a standard part of the IP header
> since 2001, long before LRO hardware was designed.

We (Linux community at least) are pressing NIC vendors to provide
protocol-specific offloads that work across various encapsulation
protocols. LRO is one of the hardest because it requires the device to
parse the encapsulation protocol. I think the answer here is to move
to programmable NICs (via BPF maybe).

Generally though, the fewer fields that vary the better for NIC
offloads. You might want to look at partial-GSO work by Alexander Duyck
which generalizes GSO HW offload for a variety of encapsulations.

Tom

>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/