Re: [IPsec] Mirja Kuehlewind's Discuss on draft-ietf-ipsecme-tcp-encaps-09: (with DISCUSS)

Hello all,

Here's some proposed text for:

- Clarifying the configuration model around ports
- Clarifying the role of the stream prefix
- Expanding the TCP performance considerations. 

Changes are in bold.

Thanks,
Tommy

-------

2.  Configuration

   One of the main reasons to use TCP encapsulation is that UDP traffic
   may be entirely blocked on a network.  Because of this, support for
   TCP encapsulation is not specifically negotiated in the IKE exchange.
   Instead, support for TCP encapsulation must be pre-configured on both
   the TCP Originator and the TCP Responder.

   Implementations MUST support TCP encapsulation on TCP port 4500, 
   which is reserved for IPsec NAT Traversal. 

   Beyond a flag indicating support for TCP encapsulation, the configuration
   defined on each peer can include the following optional parameters:

   o  Alternate TCP ports on which the specific TCP Responder listens for
      incoming connections.  Note that the TCP Originator may initiate
      TCP connections to the TCP Responder from any local port. 

  ...
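
   (Illustration only, not proposed draft text.)  A minimal sketch of
   how a peer's configuration could be modeled, assuming a simple
   in-memory structure.  The names TcpEncapsConfig, enabled,
   alternate_ports and responder_ports are hypothetical and do not come
   from the draft; only port 4500 does.

```python
from dataclasses import dataclass, field
from typing import List

# TCP port 4500 is the port the proposed text requires implementations to support.
DEFAULT_TCP_PORT = 4500


@dataclass
class TcpEncapsConfig:
    """Hypothetical per-peer configuration for TCP encapsulation."""

    # Flag indicating that TCP encapsulation is supported at all.
    enabled: bool = False
    # Optional alternate ports on which the TCP Responder listens.
    alternate_ports: List[int] = field(default_factory=list)

    def responder_ports(self) -> List[int]:
        """Return ports in the order a TCP Originator might try them.

        Port 4500 always comes first; alternates follow in configured
        order, with duplicates removed.
        """
        if not self.enabled:
            return []
        ports = [DEFAULT_TCP_PORT]
        for port in self.alternate_ports:
            if port not in ports:
                ports.append(port)
        return ports
```

   Putting 4500 first in the returned list is a design choice of this
   sketch, in the spirit of favoring the reserved port where possible.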

4.  TCP-Encapsulated Stream Prefix

   Each stream of bytes used for IKE and IPsec encapsulation MUST begin
   with a fixed sequence of six bytes as a magic value, containing the
   characters "IKETCP" as ASCII values.  This value is intended to
   identify and validate that the TCP connection is being used for TCP
   encapsulation as defined in this document, and to avoid conflicts
   with the previous non-standard protocols that commonly used TCP port
   4500.  This value is sent only once, by the TCP Originator, at the
   beginning of any stream of IKE and ESP messages.
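
   (Illustration only, not proposed draft text.)  A minimal sketch of
   how a TCP Responder might validate the stream prefix before parsing
   any IKE or ESP messages.  The helper names read_exact and
   check_stream_prefix are hypothetical, and the stream is assumed to be
   any object with a blocking read(n) method.

```python
import io

# The six-byte magic value described above, as ASCII.
STREAM_PREFIX = b"IKETCP"


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, since a stream read may return fewer."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed before prefix was complete")
        buf += chunk
    return buf


def check_stream_prefix(stream) -> bool:
    """Return True if the stream begins with the expected prefix.

    The prefix is consumed from the stream; it is sent only once, so
    everything after it is IKE/ESP message framing.
    """
    try:
        return read_exact(stream, len(STREAM_PREFIX)) == STREAM_PREFIX
    except EOFError:
        return False


# Usage with an in-memory stream standing in for a TCP connection.
good = io.BytesIO(STREAM_PREFIX + b"\x00\x10rest-of-stream")
bad = io.BytesIO(b"GET / HTTP/1.1\r\n")
```

   A Responder that sees a mismatch would presumably close the
   connection or hand it to whatever other handler it has for that port.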

...

12.  Performance Considerations

   Several aspects of TCP encapsulation for IKE and IPsec packets may
   negatively impact the performance of connections within a tunnel-mode
   IPsec SA.  Implementations should be aware of these impacts and take
   them into consideration when determining when to use TCP encapsulation.
   Due to these performance impacts, implementations SHOULD favor using 
   direct ESP or UDP encapsulation over TCP encapsulation whenever
   possible.

12.1.  TCP-in-TCP

   If the outer connection between IKE peers is over TCP, inner TCP
   connections may suffer effects from using TCP within TCP.  Running
   TCP within TCP is discouraged, since the TCP algorithms
   generally assume that they are running over an unreliable datagram
   layer.

   If the outer (tunnel) TCP connection experiences packet loss, this loss
   will be hidden from any inner TCP connections, since the outer connection
   will retransmit to account for the losses. This means that loss on an outer 
   connection will be interpreted only as delay by inner connections. Since
   the outer TCP connection will deliver the inner messages in order, any messages
   after a lost packet may have to wait until the loss is recovered. This increases
   the burstiness of inner traffic, since a large number of inner packets will be delivered
   across the tunnel at once. This effect is especially of concern when there is a high 
   level of loss on the outer connection.
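
   (Illustration only, not proposed draft text.)  A toy model of the
   effect described above, assuming evenly spaced inner packets, a fixed
   one-way delay, and a single loss that the outer connection recovers
   after a fixed extra delay.  All parameter names and values are made
   up for illustration.

```python
def in_order_delivery(arrival_times):
    """Given per-packet arrival times at the outer TCP receiver,
    return the times at which each packet can be delivered in order.

    A packet cannot be delivered before every earlier packet has
    arrived, so its delivery time is the running maximum of arrivals.
    """
    delivered = []
    latest = 0.0
    for t in arrival_times:
        latest = max(latest, t)
        delivered.append(latest)
    return delivered


def simulate(n_packets=10, interval=10.0, one_way_delay=50.0,
             lost_index=2, recovery_delay=200.0):
    """Return delivery times (ms) when one outer packet is lost."""
    arrivals = []
    for i in range(n_packets):
        t = i * interval + one_way_delay
        if i == lost_index:
            # The retransmission arrives recovery_delay ms late.
            t += recovery_delay
        arrivals.append(t)
    return in_order_delivery(arrivals)


times = simulate()
# Packets 0 and 1 arrive on schedule; packet 2 and everything queued
# behind it are released together once the retransmission arrives.
burst_size = times.count(max(times))
```

   With these defaults the inner endpoints see no loss at all, only a
   stall followed by eight packets delivered at the same instant.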

   TCP-in-TCP can also lead to increased buffering, or bufferbloat. This can occur when the
   window size of the outer TCP connection is reduced, and becomes smaller than the window
   sizes of the inner TCP connections. This can lead to packets backing up in the outer TCP
   connection's send buffers. In order to limit this effect, the outer TCP connection should have
   limits on its send buffer size, and on the rate at which it reduces its window size.
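
   (Illustration only, not proposed draft text.)  One way to bound the
   outer connection's send buffer is the standard SO_SNDBUF socket
   option.  A minimal sketch follows; the function name and the 64 KB
   default are arbitrary assumptions, not values from the draft.

```python
import socket


def make_outer_socket(send_buffer_bytes: int = 64 * 1024) -> socket.socket:
    """Create a TCP socket for the outer connection with a capped
    send buffer.

    The operating system may round or adjust the requested size, so
    the effective value should be read back with getsockopt().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_buffer_bytes)
    return sock


# Usage: create the socket and inspect what the OS actually applied.
outer = make_outer_socket(32 * 1024)
effective = outer.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
outer.close()
```

   This only addresses the buffer-size half of the recommendation; the
   rate at which the window shrinks is not something this option
   controls.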

   The inner TCP's round-trip-time estimation will also be affected by the burstiness of the outer 
   TCP connection.  This will make loss-recovery of the inner TCP traffic less reactive and more
   prone to spurious retransmission timeouts.

   Note that any negative effects will be shared between all flows going through the outer TCP
   connection. This is of particular concern for any latency-sensitive or real-time applications 
   using the tunnel. If such traffic is using a TCP encapsulated IPsec connection, it is recommended
   that the number of inner connections sharing the tunnel be limited as much as possible.

...

12.5.  Tunnelling ECN in TCP

   Since there is not a one-to-one mapping between outer IP packets and inner ESP/IP messages
   when using TCP encapsulation, the markings for Explicit Congestion Notification (ECN) cannot
   be simply mapped. However, any ECN markings on inner messages should be preserved through
   the tunnel.

   Implementations SHOULD follow the ECN compatibility mode as described in RFC 6040. The
   outer TCP connection SHOULD mark its packets as not ECN-capable, but SHOULD NOT clear
   any ECN markings on inner packets.
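
   (Illustration only, not proposed draft text.)  A minimal sketch of
   the behavior in the paragraph above, assuming the ECN field is the
   low two bits of the IPv4 TOS / IPv6 Traffic Class octet as defined in
   RFC 3168.  The function name is hypothetical.

```python
# ECN codepoints from RFC 3168 (low two bits of the TOS / Traffic Class octet).
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11
ECN_MASK = 0b11


def encapsulate_ecn(inner_tos: int, outer_tos: int):
    """Return (inner_tos, outer_tos) as they should leave the encapsulator.

    The inner octet is passed through untouched so that any ECN marking
    on the inner packet survives the tunnel.  The outer octet has its
    ECN bits cleared, i.e. the outer TCP packets are marked Not-ECT.
    """
    return inner_tos, outer_tos & ~ECN_MASK & 0xFF


# Usage: an inner packet already marked CE, and an outer octet that
# happened to carry ECT(0) before encapsulation.
inner, outer = encapsulate_ecn(inner_tos=0b00000011, outer_tos=0b10111010)
```

   After the call, the inner octet still carries CE and the outer octet
   keeps its upper six bits with the ECN field set to Not-ECT.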

...

14.  IANA Considerations

   This memo includes no request to IANA.

   TCP port 4500 is already allocated to IPsec for NAT Traversal.  This port SHOULD
   be used for TCP encapsulated IKE and ESP as described in this document.

> On Apr 28, 2017, at 4:41 AM, Mirja Kühlewind <ietf@kuehlewind.net> wrote:
> 
> Hi Tero,
> 
> a few quick replies but we also discussed this yesterday at the telechat and agreed on a way forward.
> 
> On 27.04.2017 16:12, Tero Kivinen wrote:
>> Mirja Kühlewind writes:
>>>> I agree that this kind of port squatting is regrettable, but I also don't
>>>> think it really helps to not publish RFCs that document widely used
>>>> protocols because we are sad they port-squatted.
>>>> 
>>>> I proposed a way to deal with this in an earlier e-mail. Would that be
>>>> satisfactory to you? To retransmit, we add the following:
>>>> 
>>>> "Note: While port 4500 is the reserved port for this protocol, some existing
>>>> implementations
>>>> also use port 443. The Stream Prefix provides some mitigation against
>>>> cross-protocol
>>>> attacks in this case, however, the use of port 443 is NOT RECOMMENDED"
>>>> 
>>>> We could do something similar for port 80.
>>>> 
>>>> Would that work?
>>> 
>>> This is already good, but I think it's not enough. As Tero noted, the working
>>> group thought that they would rather specify a generic scheme, which I find
>>> problematic. Currently the draft says:
>>> 
>>> "This document leaves the selection of TCP ports up to
>>>    implementations.  It is suggested to use TCP port 4500, which is
>>>    allocated for IPsec NAT Traversal."
>>> 
>>> Which sounds to me like an invitation to squat on any open port regardless of
>>> what the port is supposed to be used for (hoping that the magic number would
>>> avoid any collision). I don't think that is a good thing to write in an RFC.
>> 
>> Note that configurations can only use ports which are defined to be
>> used by the responder. I.e., if the responder is configured to listen
>> on port 4500 and port 443 (with TLS wrapping), there is no point in the
>> initiator trying port 80, as it will not work.
> 
> I understood this. My point boils down to (a) the case where you would use two services at the same time on the same port (see below), and (b) some language changes that favor using 4500 where possible and do not explicitly advise using other assigned ports, along the lines of what Ekr already proposed.
> 
>> 
>> I.e., in practice this will end up with the operators picking a suitable
>> set of port numbers they will configure their system to respond on,
>> and most likely they will try to use ports of services which are not
>> normally in use by the responder. I.e., if the responder is running a
>> web server, it cannot very easily also use port 80 (or 443) for
>> the IPsec traffic, thus it might pick ports 4500, 8000, 8080 or some
>> other random ports which might go through the filters which are
>> already blocking all UDP traffic, and at least port 4500 for TCP.
> 
> Actually this is another good point. If you have multiple open ports, you can give more advice on which one to try when and in which order, e.g. always try 4500 first (first UDP, then TCP, which I believe the draft already says), then others.
> 
>> 
>> So this is the same thing we have now on the web. I am quite often running web
>> servers on random ports just because some places have had filters
>> blocking some ports. For example, a friend's company network
>> blocked connections to port 8080, but did not block connections to port
>> 2020, so my web server was running (also) on that port...
>> 
>> So unless you are saying "TCP/UDP ports on all protocols MUST NOT be
>> configurable, and only port numbers reserved for those services are
>> allowed" there is nothing we can really do.
>> 
>> (and even if you say that, nothing is going to change, as people are
>> so used to getting past stupid firewalls).
>> 
>>> Now given the text you propose above, I actually assume that the text I just
>>> cited will be removed but the whole document is written with this assumption
>>> and therefore there are a couple more places where wording probably needs to
>>> change.
>>> 
>>> I do understand well the problem and that 443 is used in practice. However,
>>> to match reality I would rather like to see a document that specifies the
>>> approach of encapsulating in TLS/TCP on port 443 that is used today and pure
>>> TCP encapsulation for use with port 4500 only. Again I think that's where
>>> your proposed text is heading to but I think it needs more changes; in this
>>> case it would also make sense to add the TLS part back in the main document
>>> for 443 only.
>> 
>> This is what the document tries to say. I.e., for some ports (like
>> 443), there might be some other framings (like TLS) in place, and for
>> other ports there might not be. As IPsec will require
>> pre-configuration, the information about which ports are used, and what
>> framing protocol is used, comes from the configuration.
>> 
>>> Further, I have one more question: The document is written in a way that
>>> allows the implementation of multiple services on the used port. Is that
>>> actually done in reality?
>> 
>> Yes and no. There are some old IKEv1 based proprietary TCP
>> encapsulation methods, and I think some of those might actually use
>> port 500 and perhaps also port 4500. So a vendor implementing those
>> might want to do multiplexing based on whether there is the stream
>> prefix in front or not, to see whether the old proprietary method is
>> used, or the one defined in this document is used.
> 
> There is a difference between distinguishing other/old versions of IKE and trying to distinguish IKE from other protocols. The point is that for IKE you can define the magic number as reserved, but you probably don't really want to reserve the same bits for all other protocols that you may run in parallel on the port. As long as you don't 'officially' make this reservation by updating the respective specs of the other protocols, you can never be sure that there will not be a conflict in a future version of these protocols.
> 
>> 
>> Another issue might be that the SGW terminating the TCP 443
>> connection might also support some proprietary TLS VPN implementation,
>> and differentiating from that is also something I can see that vendors
>> might want to do.
>> 
>> So I do not know if anybody is really doing it now, but I can see reasons
>> why people might want to multiplex the proprietary things they are now
>> running on those ports 500/4500/443 with the standard we are defining
>> here.
> 
> That's fine as long as the multiplexing is between versions of IKE only.
> 
>> 
>>> If we could restrict the use of this encapsulation
>>> to servers that are only IKE servers (at least for the used port), you
>>> would actually not need the magic number anymore. I guess you can still have
>>> the magic number if you really want it because that makes it easier to
>>> distinguish valid IKE/IPsec traffic from other random traffic that might be
>>> sent to this port, but the other service running on this port (on other
>>> servers) does not need to know about the magic number because it is supposed
>>> to never see any IKE/IPsec TCP-encapsulated traffic.
>> 
>> If it is operationally possible, and there are no old proprietary
>> protocols to be supported, I would assume most implementations would
>> not want to bother with multiplexing different protocols on one
>> port.
> 
> Then let's just not support this case and let's specify in the draft that you cannot run another service on the used port.
> 
>> 
>> I.e., new implementations and new setups will most likely then move,
>> for example, the administrative interface over https to a separate
>> IP address, just to make sure that it is easier to implement and
>> operate.
>> 
> 