Re: [tsvwg] draft-ietf-tsvwg-natsupp-12 - review comments

Michael Tuexen <Michael.Tuexen@lurchi.franken.de> Mon, 13 July 2020 15:45 UTC

Return-Path: <Michael.Tuexen@lurchi.franken.de>
X-Original-To: tsvwg@ietfa.amsl.com
Delivered-To: tsvwg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 4D0753A13B3 for <tsvwg@ietfa.amsl.com>; Mon, 13 Jul 2020 08:45:35 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.898
X-Spam-Level:
X-Spam-Status: No, score=-1.898 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QAw6OvUBcz0E for <tsvwg@ietfa.amsl.com>; Mon, 13 Jul 2020 08:45:31 -0700 (PDT)
Received: from drew.franken.de (mail-n.franken.de [193.175.24.27]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 910463A136C for <tsvwg@ietf.org>; Mon, 13 Jul 2020 08:44:10 -0700 (PDT)
Received: from [10.211.21.224] (unknown [194.95.11.175]) (Authenticated sender: lurchi) by mail-n.franken.de (Postfix) with ESMTPSA id 049CE721BE029; Mon, 13 Jul 2020 17:44:07 +0200 (CEST)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\))
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <3513086b88ce45eabd985e48fbd303d8@mera.com>
Date: Mon, 13 Jul 2020 17:44:05 +0200
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <157CF4E4-F433-4614-AA08-9353464AC80B@lurchi.franken.de>
References: <2f399a790bcf45b8af1203d99cc8589b@mera.ru> <C6EDF205-CBD6-4D39-856F-88C2A122C328@fh-muenster.de> <8deaf1ab80ab49c895b845f7705b4416@mera.com> <D969CA3F-2AE5-493E-B2F5-6C7A38E1CBAD@fh-muenster.de> <3513086b88ce45eabd985e48fbd303d8@mera.com>
To: Maksim Proshin <maksim.proshin@mera.com>
X-Mailer: Apple Mail (2.3608.80.23.2.2)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/lmGuWSWaIaCyTWf_wOGklHT1YQQ>
Subject: Re: [tsvwg] draft-ietf-tsvwg-natsupp-12 - review comments
X-BeenThere: tsvwg@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Transport Area Working Group <tsvwg.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tsvwg>, <mailto:tsvwg-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tsvwg/>
List-Post: <mailto:tsvwg@ietf.org>
List-Help: <mailto:tsvwg-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tsvwg>, <mailto:tsvwg-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Jul 2020 15:45:35 -0000

> On 11. Mar 2020, at 13:07, Maksim Proshin <maksim.proshin@mera.com> wrote:
> 
> Hi Michael,
> 
> I think we're almost done with my comments. There are two follow-up things below that would require your attention. The rest is fine.
Hi Maxim,

hopefully I have addressed the remaining issues. Please let me know.

Best regards
Michael
> 
> BR, Maxim
> 
> -----Original Message-----
> From: Michael Tuexen <tuexen@fh-muenster.de> 
> Sent: Monday, March 9, 2020 9:35 PM
> To: Maksim Proshin <maksim.proshin@mera.com>
> Cc: tsvwg@ietf.org
> Subject: Re: [tsvwg] draft-ietf-tsvwg-natsupp-12 - review comments
> 
>> On 19. Nov 2019, at 10:56, Maksim Proshin <maksim.proshin@mera.com> wrote:
>> 
>> Hi Michael,
>> 
>> Thanks for your feedback and addressing my comments! 
>> I have some follow-up below inline (with [Maxim] tag). Please take a look. 
>> I think all major comments have been resolved, maybe 1 comment about SHOULD vs MUST still remains below but that should not stop the document to go to WGLC in my view.
> Hi Maxim,
> 
> see my comments inline. I removed the text, which wasn't commented by you or doesn't require an answer from me.
> 
> Best regards
> Michael
>> 
>>> 2.
>>> I think the mechanisms in this document are really complex for the NAT device. The document basically requires the NAT to be able to build SCTP chunks. I think it could be optional as in this case there will not be a problem rather than it may take more time. Perhaps also worse debugging. Anyway I don't see it as super critical. E.g.. the NAT must send ABORT in case when INIT has the same Initial Tag. I think the NAT might drop it without ABORT. In that case SCTP will retransmit INIT several times and then abort the establishment with the same Initial Tag. Next time it will randomly choose a new VTag. 
>>> Or do I miss something?
>> We want to avoid the substantial delay introduced by the timer based retransmissions and give the end-point an indication
>> why the INIT dropped. That way the end-point can just start a new association right away, even without involving the SCTP user.
>> [Maxim] I understood that reason but I don't like MUST here. Can it be replaced by SHOULD or you have strong arguments against that?
> My argument is that I don't want application writers to think about SCTP connection setup
> timeouts are not meaning the peer is dead, but there is a NAT in between which might need
> another connection retry.
> 
> The alternate text would be "SHOULD" send an ABORT, MAY drop the packet which results in
> very long connection setups and requires application writers to deal with connection setup
> timeouts as an indication what retrying it should be done. So the detection that the peer
> is not reachable will take even longer.
> [Maxim] I prefer SHOULD here because it brings an alternative approach for implementation and doesn’t require a dependency from NAT implementation. Yes, it might bring some cons but both approaches have them. As a compromise you can additionally recommend an approach with updates in NAT.
> 
> Do you really think that sending such packets is very hard? I added some code to a firewall
> some time ago and it was not very complex. Don't some NAT boxes send TCP RST segments?
> [Maxim] It's not just about difficulty. It's about MUST that is required to be followed by implementers. If a mechanism can be implemented without MUST then I think it shouldn't be used. 
OK. Changed it to a SHOULD. The sending wouldn't be reliable anyway.
>> 
>>> Also it seems in most of the cases mapping of Int-VTag to Priv-Addr should be enough. Is it really required to keep the rest like External VTag in the table?  
>> Yepp. To map incoming packets with the T-bit set.
>> [Maxim] Yes, it's reasonable but not mandatory in my view compared to e.g. Int-VTag.  That was my point.  Do you think it's a bad idea to add a sentence explaining why Ext-VTag is needed? Btw, I propose to use "T bit" instead to align with RFC4960.
> Fixed.
> [Maxim] Ok
>> 
>>> -----
>>> 3.
>>> It seems the case when SCTP is behind the NAT and it's in the Server mode (accepts INITs) is not described clearly.. There is only one paragraph about it:
>>> 
>>>   For an incoming packet containing an INIT-chunk a table lookup is
>>>   made only based on the addresses and port numbers.  If an entry with
>>>   an External-VTag of zero is found, it is considered a match and the
>>>   External-VTag is updated.
>>> 
>>> I think it’s not enough. What should be the behavior when an entry is not found? When it’s found but
>> You can't do anything...
>>> External VTag is not zero?
>> Drop it.
>> [Maxim] Would be good to mention it in the draft.
> OK. I added:
> If an entry with a non-matching External-VTag is found or no entry is found,
> the incoming packet is dropped. If an entry with a matching External-VTag
> is found, the incoming packet is forwarded.
> [Maxim] Ok
>> 
>>> As far as I understood the idea of the document is that the NAT is kind of a firewall for packets received from unknown external IP address (no entry found including incoming INIT case). Is it really so? Can it be improved in cases when the NAT trusts to some external IPs, i.e. there is a method to configure from which external IP addresses it should accept incoming INITs. In that case the NAT could also map incoming associations.
>> You can also have mechanisms not described in this document to handle exposed ports...
>> [Maxim] Yes, we discussed it and I'm fine with it. Anyway, for me it would be really useful if the draft in the beginnig has a sentence explaining it. 
> I added the last sentence to the following paragraph in Section 4.3:
> 
> <t>For the SCTP NAT processing the NAT device has to maintain a NAT binding table
> of Internal-VTag, Internal-Port, External-VTag, External-Port, Private-Address,
> and whether the restart procedure is disabled or not.
> An entry in that NAT binding table is called a NAT state control block.
> The function Create() obtains the just mentioned parameters and returns
> a NAT-State control block.
> A NAT device MAY allow creating NAT-State control blocks via a management
> interface.</t>
> [Maxim] Ok
>> 
>>> Otherwise I see it as a serious limitation of the proposed solution and it should be clearly stated at the beginning of the document.
>>> -----
>>> 5.
>>> The document focuses on cases when entries are added into the table but doesn’t describe when and how they should be removed.
>>> I think processing of ABORT and SHUTDOWN COMPLETE/ACK should be added.
>>> The document also recommends a timer in Section 10 which I guess results in entries removal. I wonder if it should also be mentioned that if the NAT device is able to track liveness of SCTP instances associated by the private address, it should also remove entries when it’s dead..
>> I think the best way to remove the NAT state is to do it by a timer. You have to have one anyway. And taking
>> specific chunks into account makes the implementation harder in case of the packet containing that chunks is lost
>> after NAT processing. However, the ID does not forbid this. So if you have a setup where such processing makes sense,
>> you are free to add it.
>> [Maxim] Ok, I thought that was the idea of the draft actually but not described well. At least NAT implementations with SCTP support that I saw so far try to use ABORT/SHUTDOWN chunks and the timer, i.e. both ways. I'm actually fine if the draft describes the timer-based approach only but I thought it's logical to also use the chunks as the draft already prescribes to handle them specifically.
> You can add that. It is not forbidden. But you see the SCTP association normally only on one
> path. So in the multihoming case, you need to use the timer approach on all other paths.
> [Maxim] Ok
>> 
>> 
>>> -----
>>> 7. 
>>> The cases when INIT and ASCONF are retransmitted, i.e. sent from the same Priv-Addr are not covered. I think it’s better to clearly state or even have examples that in those cases there will be matching in the table also including Priv-Addr matching and the NAT should handle it like any other chunk/packet, i.e. replace the source address and send it further.
>> We describe where state is added, collisions are detected and so on. So we don't explicitly describe the case of retransmissions,
>> which means you find a matching entry. This is assumed all over the text.
>> [Maxim] For me it would be worth to mention it once in the beginning.
> I added:
> <t>If an outgoing SCTP packet contains an INIT or ASCONF chunk and a matching
> NAT binding table entry is found, the packet is processed as a normal
> outgoing packet.</t>
> [Maxim] Ok
>> 
>>> -----
>>> 8.
>>> I think the “Disable Restart” mechanism might bring issues and not all use cases are explained in the document..
>>> E.g. it might be the case that one of the SCTP hosts behind the NAT is lost and the server side will not discover it immediately. When the SCTP host is alive back it will try to establish the association back. In that case the NAT should drop INIT or not if it still has that entry? I guess it shouldn’t drop it or if there is no entry in the NAT already or if the association was re-established from another private address, SCTP on the server side will create a new association and will have the stale one and the new one simultaneously. If SCTP on the server side is idle and not sending HBs, it will have that stale association forever.
>>> I believe there might be other such cases and I think for the “Disable Restart” mechanism there should be recommendations like enabled Path Monitoring.  
>>> It also contradicts with the idea of RFC4960 to match the association by IP addresses and ports and I think some clarifications should be added, e.g. for OOTB processing.  
>> Yes, it does. I think the requirements are stated in the last sentence of Section 6.3.2.
>> [Maxim] Perhaps I misunderstand because I don't see how Section 6.3.2 addresses the problem that I described above. In my view there is some conflict here. In one hand, in order to work with different clients behind the NAT, Disable Restart feature should be enabled and this is what Section 6.3.2 prescribes. On the other hand, when Disable Restart is enabled, the server can't detect the restart and will always treat it as a new association and might keep the old one forever. Is it the intention of the proposed mechanism? If so, I propose to clearly describe this issue in the draft and propose some mitigation actions, at least path monitoring.  Isn't it a potential security attack when one attacker client disables restart and retriggers the same association many many times consuming only resources for that one association until the server dies due to lack of resources?  
> What is the difference to a client issuing a lot of associations to a server using different port numbers
> or IP(v6) addresses? The server has to protect itself against too many SCTP associations. This is not related
> to the DISABLE_RESTART feature.
> [Maxim] The difference is that an attacker uses the same IP and port. I.e. if in your system you have a filtering mechanism you would limit the number of IPs and ports which could be used so an attacker can't use any IP or port and limited. DISABLE_RESTART makes it worse because an attacker will need to know one IP and one port from the "white" list to perform the attack. I think it would be good to explain in the draft that when the DISABLE_RESTART feature is enabled, the server should care about "old" associations, i.e. remove them or so, due to potential resource issues. 
I have added to the security considerations section:
<t>SCTP endpoints disabling the restart procedure, should monitor the status
of all associations to mitigate resource exhaustion attacks by establishing
a lot of associations sharing the same IP addresses and port numbers.</t>
>> 
>>> -----
>>> 12.
>>> Section 1, Imp:
>>> I propose to clearly state that this document is addressed to NAT and SCTP and that all SCTP scenarios will only work if both SCTPs (local and remote) and the NAT support this document.
>> There is now a table describing this.
>> [Maxim] Well, it was already in version -12 that I reviewed so I'm not sure what exactly I wanted to improve. Perhaps I wanted the draft from the beginning to clearly state that it describes mechanisms that should be implemented in NAT, SCTP client and SCTP server. 
> The Introduction contains:
> 
> This document specifies procedures allowing a NAT to support SCTP by
> providing similar features to those provided by a NAPT for TCP and
> other supported protocols.  The document also specifies a set of data
> formats for SCTP packets and a set of SCTP endpoint procedures to
> support NAT traversal.
> 
> We can't control who is implementing what, but we can only clearly describe
> what needs to be implemented in the NAT device and in the end points.
> 
> We have never distinguished between end-points acting as clients or servers.
> I wold prefer to keep it that way.
> [Maxim] Ok
>> 
>>> -----
>>> 18.
>>> Section 1, Imp:
>>> 
>>> At each level, the Internal Host, External Host and NAT
>>>   may or may not support the features described in this document.
>>> 
>>> I think you actually mean “SCTP on the Internal Host” and “SCTP on the External Host”. I propose to improve it. There are other places like this.
>> Hmm, I think this is clear, since the "features described in this document" refers to an SCTP stack.
>> [Maxim] Actually I think the draft uses "host" in two different meanings.. Anyway, it's not really a serious problem so up to you. 
> I prefer to keep things as they are right now.
> [Maxim] Ok
>> 
>>> -----
>>> 28.
>>> Section 4.3, Imp:
>>> It’s not described what to do in case of INIT is sent from the same private IP with the same ports with the same or different Initiate TAG and there is the entry. 
>> The same initiate tag would be a retransmission, so no action required. If there is a different
>> initiate tag, it means that the end-point has restarted. However, to avoid attacks, no state change
>> should occur. That is why we haven't described it.
>> [Maxim] I think it would be good to clarify it in the draft. Is it possible?
> Sure. I added (see your seventh point above):
> <t>If an outgoing SCTP packet contains an INIT or ASCONF chunk and a matching
> NAT binding table entry is found, the packet is processed as a normal
> outgoing packet.</t>
> [Maxim] Ok
>> 
>>> -----
>>> 35.
>>> Section 4.3, Ma:
>>> 
>>>   For an incoming packet containing an INIT-chunk a table lookup is
>>>   made only based on the addresses and port numbers.  If an entry with
>>>   an External-VTag of zero is found, it is considered a match and the
>>>   External-VTag is updated.
>>> 
>>> It should be clearly stated what the NAT should do when the entry is not found and when it's found but External-VTag is not zero.
>> I think most NAT boxes will drop the packet. Some may have something specific handling configured.
>> [Maxim] Do you mean that it was not described intentionally? This is the reason I have related questions or proposals for clarification. Perhaps it would be worth to mention in the beginning that processing of SCTP chunks in use cases not described in the draft are up to implementations. 
> See response to your third issue above.
> [Maxim] Ok
>> 
>>> -----
>>> 62. 
>>> Section 10, Ma/Imp:
>>> 
>>>   State maintenance within a NAT is always a subject of possible Denial
>>>   Of Service attacks.  This document recommends that at a minimum a NAT
>>>   runs a timer on any SCTP state so that old association state can be
>>>   cleaned up.
>>> 
>>> Not sure how that risk is actually possible as the NAT will not accept incoming INITs but anyway according to my experience it's not just enough to have the timer as in this case a NAT can mistakenly remove entries. It also requires SCTP to help the NAT to keep the state up to date by e.g. sending packets, e.g. HBs, which also requires timers synchronization between SCTP and the NAT.
>> That is correct. Since you have a default value for the HEARTBEAT.Interval, you can configure the NAT timer, Do you think the HB timer
>> should be configured differently?
>> [Maxim] No, I don't think it should be different. That was one my point that it would be good if the draft recommends to configure NAT timer based on HB.Interval. It's not needed to provide a precise formula. Moreover as I wrote earlier it should also recommend/prescribe to enable Path monitoring on SCTP ends. Anyway, my initial point here was whether it's really a subject of possible DoS attacks because NAT only creates entries from locally originated INITs.
> You can have an attacker in the private network issuing lots of associations or a remote one, which
> uses some application layer protocol to trigger lots of connection requests. If you don't run a timer,
> a remote attacker could never tear down associations completely.
> [Maxim] Ok
>