Re: [nvo3] Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00

Dan Wing <dwing@vmware.com> Wed, 20 September 2017 22:27 UTC

From: Dan Wing <dwing@vmware.com>
To: Linda Dunbar <linda.dunbar@huawei.com>
CC: Sami Boutros <sboutros@vmware.com>, "draft-ietf-nvo3-encap@ietf.org" <draft-ietf-nvo3-encap@ietf.org>, NVO3 <nvo3@ietf.org>
Thread-Topic: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
Date: Wed, 20 Sep 2017 22:27:35 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/xl7f7Tawp9d2HXR8c3P8tA2pmYU>

On Sep 20, 2017, at 2:57 PM, Linda Dunbar <linda.dunbar@huawei.com> wrote:
> Dan and Sami, 
> 
> Since the draft is only about the consideration, does it need to dive into the detailed solution? 
> 
> I feel the actions at the initiating VTEP and returning VTEP are part of the solution. The bottom line is that the consideration needs to point out the issues with traversing existing NATs. 

Yes, I agree some text discussing NAPT and firewall port forwarding is necessary.  I have provided text.

I object to the proposal that VTEP implementations send traffic to a port other than the IANA-assigned destination port, because it forces both VTEPs to maintain per-flow state and introduces new brittleness.  The proposal still requires the remote VTEP to have NAPT port forwarding, so it can't achieve its goal of eliminating port forwarding anyway.
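
To illustrate the per-flow state concern, here is a minimal sketch, not taken from any draft or implementation. The names `reply_destination_fixed` and `LearnedReplyTable` are hypothetical, as is the choice to key the table by an inner-flow tuple. It contrasts replying to the IANA-assigned port with replying to a learned, NAPT-translated port.

```python
IANA_PORT = 4789  # VXLAN; Geneve would use 6081

def reply_destination_fixed(peer_public_ip):
    """Stateless reply: always target the IANA-assigned port."""
    return (peer_public_ip, IANA_PORT)

class LearnedReplyTable:
    """Hypothetical per-flow state a VTEP would keep if it had to reply
    to whatever source port the NAPT chose for each flow."""

    def __init__(self):
        # inner flow key -> (outer public IP, translated UDP port)
        self.entries = {}

    def learn(self, inner_flow, outer_src_ip, outer_src_port):
        # Record the translated source seen on an incoming packet.
        self.entries[inner_flow] = (outer_src_ip, outer_src_port)

    def reply_destination(self, inner_flow):
        # Returns None when no state exists (e.g. after a restart).
        return self.entries.get(inner_flow)

table = LearnedReplyTable()
for i in range(1000):
    # 1000 distinct inner flows, each seen with its own translated port.
    table.learn(("10.1.0.%d" % (i % 250), i), "203.0.113.1", 20000 + i)
```

In this sketch the learned table grows by one entry per flow and cannot answer for a flow once its state is gone, while the fixed-port function needs no table at all.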

-d


> Linda
> 
> -----Original Message-----
> From: Dan Wing [mailto:dwing@vmware.com] 
> Sent: Wednesday, September 20, 2017 4:18 PM
> To: Linda Dunbar <linda.dunbar@huawei.com>
> Cc: Sami Boutros <sboutros@vmware.com>; draft-ietf-nvo3-encap@ietf.org; NVO3 <nvo3@ietf.org>
> Subject: Re: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
> 
> 
>> On Sep 20, 2017, at 2:09 PM, Linda Dunbar <linda.dunbar@huawei.com> wrote:
>> 
>> Sami,
>> 
>> Answers inserted below:
>> 
>> From: Sami Boutros [mailto:sboutros@vmware.com] 
>> Sent: Tuesday, September 19, 2017 5:23 PM
>> To: Linda Dunbar <linda.dunbar@huawei.com>; Dan Wing <dwing@vmware.com>; draft-ietf-nvo3-encap@ietf.org
>> Cc: NVO3 <nvo3@ietf.org>
>> Subject: Re: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
>> 
>> Hi Linda,
>> 
>> There are some more clarifications needed before adding the text.
>> 	• Does the VTEP initiating the traffic need to maintain state for each source port it puts on the packet, i.e., the inner payload flow info that maps to this source port, so that when it receives return traffic it knows this is return traffic for the Geneve tunnel from the other VTEP?
>> [Linda] I don’t think the VTEP initiating the traffic needs to make any changes to its current implementation.
> 
> 
> It would need to change, because it would need to listen for incoming packets on a different port than 4789.  How long does it listen?  How many ports does it listen to?  What happens if the NAPT loses mapping state (reboots)?  Does it need to send UDP keepalive traffic?  
> 
>> 	• Does the VTEP sending only the return traffic also need to maintain state between the NAT-translated source port (set by the NAT GW) and the inner payload? Which source should it use for return traffic? 
>> [Linda] The VTEP sending the return traffic may need some changes, especially if VxLAN is used, because the destination port can no longer be 4789. 
> 
> It would need to remember which destination port it needs to reply on, which is effectively a NAPT table for the entire network of the inner IP addresses.  That's a very significant amount of state, on par with large-scale NAT.  If we want high availability, that state would need to be synchronized with an adjacent VTEP, just like with a large-scale NAT.
> 
> 
> Instead, let's just do what's easy:  send packets to the IANA-assigned UDP port, which keeps the VTEP implementation simple.
> 
> -d
> 
> 
>> 
>> Thanks,
>> 
>> Sami
>> From: Linda Dunbar <linda.dunbar@huawei.com>
>> Date: Friday, September 15, 2017 at 12:42 PM
>> To: Sami Boutros <sboutros@vmware.com>, Dan Wing <dwing@vmware.com>, "draft-ietf-nvo3-encap@ietf.org" <draft-ietf-nvo3-encap@ietf.org>
>> Cc: NVO3 <nvo3@ietf.org>
>> Subject: RE: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
>> 
>> Sami,
>> Your wording only describes the scenario of one VTEP initiating the traffic, but doesn’t describe the issues within the NAT device when receiving the return encapsulated traffic from the remote VTEP.  
>> 
>> Specifically, if the overlay tunnel uses VxLAN encapsulation [RFC7348], the IANA-assigned port 4789 is used in the outer frame’s destination port field and a hash of the inner packet is used in the outer frame’s source port field [RFC7348].  This approach can cause trouble when traversing NAT.
>> 
>> In addition, your wording of “The Other VTEP … should use the translated UDP source port as the destination port for the reverse traffic” basically eliminates the VxLAN RFC7348 encapsulation as a valid tunnel method (is that a problem?).
>> 
>> IMHO, this document should only describe the issue, not the solutions.
>> 
>> 
>> Therefore, I suggest some minor changes to your wording:
>> -------------------------------------------------------------------------
>> NAT traversing consideration:
>> 
>> For scenarios where traffic over an NVO3 tunnel can be initiated from only one VTEP side, and the NVO3 tunnel encapsulation traverses a NAT GW, the NAT GW will change both the VTEP source IP address and the UDP source port of the tunnel IP/UDP header for the traffic initiated from one VTEP, and it expects the return traffic to have a certain destination UDP port (instead of the RFC7348-specified 4789).   
>> 
>> The other VTEP end of the tunnel, which MUST NOT initiate any traffic but only send the return traffic, should use the translated UDP source port as the destination port for the reverse traffic sent over the tunnel to the translated NAT VTEP source address.
>> 
>> 
>> Thanks, Linda
>> 
>> From: Sami Boutros [mailto:sboutros@vmware.com] 
>> Sent: Thursday, September 14, 2017 4:37 PM
>> To: Linda Dunbar <linda.dunbar@huawei.com>; Dan Wing <dwing@vmware.com>; draft-ietf-nvo3-encap@ietf.org
>> Cc: NVO3 <nvo3@ietf.org>
>> Subject: Re: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
>> 
>> Hi Linda,
>> 
>> Thanks for the response. 
>> 
>> I will add a section as follows; please let me know if this is OK.
>> 
>> NAT traversing consideration:
>> 
>> For scenarios where traffic over an NVO3 tunnel can be initiated from only one VTEP side, and the NVO3 tunnel encapsulation traverses a NAT GW, the NAT GW will change both the VTEP source IP address and the UDP source port of the tunnel IP/UDP header.
>> 
>> The other VTEP end of the tunnel, which MUST NOT initiate any traffic but only send the return traffic, should use the translated UDP source port as the destination port for the reverse traffic sent over the tunnel to the translated NAT VTEP source address.
>> 
>> Thanks,
>> 
>> Sami
>> From: Linda Dunbar <linda.dunbar@huawei.com>
>> Date: Thursday, September 14, 2017 at 12:33 PM
>> To: Sami Boutros <sboutros@vmware.com>, Dan Wing <dwing@vmware.com>, "draft-ietf-nvo3-encap@ietf.org" <draft-ietf-nvo3-encap@ietf.org>
>> Cc: NVO3 <nvo3@ietf.org>
>> Subject: RE: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
>> 
>> Sami,
>> 
>> Sorry for the delayed response.
>> 
>> Answers are inserted below:
>> 
>> From: Sami Boutros [mailto:sboutros@vmware.com] 
>> Sent: Monday, September 11, 2017 12:24 PM
>> To: Linda Dunbar <linda.dunbar@huawei.com>; Dan Wing <dwing@vmware.com>; draft-ietf-nvo3-encap@ietf.org
>> Cc: NVO3 <nvo3@ietf.org>
>> Subject: Re: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
>> 
>> Hi Linda,
>> 
>> I have a few clarifications and one question.
>> 
>> When an overlay tunnel traverses an SNAT GW, the source address of the VTEP will change to be the NAT source address. So, if we have an overlay tunnel between 2 endpoints and we are doing SNAT in one direction, the other endpoint will only see the translated NAT address.  
>> 
>> [Linda] Correct. China Telecom needs this type of deployment.
>> 
>> 
>> An overlay tunnel is bidirectional and doesn’t make an assumption that the overlay traffic carried over the tunnel will only be initiated from one side. 
>> 
>> [Linda] The tunnel is bidirectional, but the traffic is initiated from the CPE side for this scenario.
>> 
>> 
>> It seems that what you are asking is to add directionality to the overlay tunnel, in which one endpoint cannot send any overlay traffic unless it receives traffic from the other endpoint of the overlay tunnel. In addition, this endpoint should use the UDP source port received from the initiating endpoint as the destination UDP port for the tunnel, to accommodate SNAT traversal.
>> 
>> [Linda] The so-called “overlay tunnel” is simply the encapsulation done by both endpoints. In the scenario depicted, the traffic is initiated from the CPE to the Gateway, and can be returned by the Gateway. Therefore, the tunnel is bidirectional.  I don’t see any extra work needed to add “directionality to the overlay tunnel”.
>> 
>> Linda 
>> 
>> Is my understanding correct?
>> 
>> Thanks,
>> 
>> Sami
>> 
>> From: Linda Dunbar <linda.dunbar@huawei.com>
>> Date: Friday, September 8, 2017 at 2:31 PM
>> To: Dan Wing <dwing@vmware.com>, "draft-ietf-nvo3-encap@ietf.org" <draft-ietf-nvo3-encap@ietf.org>, Sami Boutros <sboutros@vmware.com>
>> Cc: NVO3 <nvo3@ietf.org>
>> Subject: Suggested wording for the "NAT Traversing Consideration" to be added to the Section 6 of draft-ietf-nvo3-encap-00
>> 
>> Sami, et al:
>> 
>> Following up on the discussion during IETF 99, I took a stab at putting together wording to describe why it is an issue for VxLAN-encapsulated data frames to traverse NAT. I think this sub-section should be included in Section 6 of draft-ietf-nvo3-encap.
>> 
>> Linda Dunbar
>> 
>> ----------------------------------------------------------------------
>> 
>> 6.10 NAT Traversal consideration
>> Here is an example of encapsulated traffic traversing NAT (Network Address Translation) or, to be precise, NAPT (Network Address and Port Translation):
>> <image001.jpg>
>> Figure x – Scenario of Overlay Tunnel traversing NAT
>> 
>> If the overlay tunnel uses VxLAN encapsulation [RFC7348], the IANA-assigned port 4789 is used in the outer frame’s destination port field and a hash of the inner packet is used in the outer frame’s source port field [RFC7348].  Different source port numbers are used to achieve better traffic distribution among multiple paths.  This approach is fine within one data center, but can cause trouble when traversing NAT.
>> Many NAT devices use the source address and port number of the data frames coming from private addresses to derive the source port number for the frames sent to the public network, as shown below (assuming VxLAN is used for encapsulation between the CPEs and the Gateway).  The NAT public source ports H1/H2/H3/H4 must all differ; otherwise the NAT device can’t tell whether the return data frames should be sent to CPE1 or CPE2.
>> 
>> Private Source  Private Source  Private Destination  NAT Public   NAT Public   NAT Public
>> Address         Port            Port                 Source Addr  Source Port  Destination Port
>> --------------  --------------  -------------------  -----------  -----------  ----------------
>> CPE1            A1              4789                 Public A     H1           4789
>> CPE1            A2              4789                 Public A     H2           4789
>> CPE2            A3              4789                 Public A     H3           4789
>> CPE2            A4              4789                 Public A     H4           4789
>> 
>> Per RFC 7348, the destination port number of the data frames returned from the Gateway should be 4789. When the NAT device receives data frames (with destination port = 4789) from the public network (i.e., from the Gateway in the figure above), the NAT device has to drop them because destination port 4789 is not one of H1/H2/H3/H4.
>> Of course, we can always modify the NAT devices to accommodate the VxLAN encapsulated data frames. The problem is traversing the existing NAT devices already deployed.
>> --------------------------------------  
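
The NAPT behavior described in the proposed text can be sketched as a toy model. This is a simplification under stated assumptions: the `Napt` class, its method names, the starting public port 20000, and the address 203.0.113.1 are all hypothetical, and real NAPTs also filter on the remote address and time out mappings.

```python
VXLAN_PORT = 4789

class Napt:
    """Toy NAPT: dynamic outbound mappings plus optional static forwards."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}       # (private host, private port) -> public port
        self.back = {}      # public port -> (private host, private port)
        self.forwards = {}  # statically forwarded public port -> private target

    def outbound(self, priv_host, priv_port):
        # Allocate a distinct public source port per private (host, port).
        key = (priv_host, priv_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key])

    def add_port_forward(self, public_port, priv_host, priv_port):
        self.forwards[public_port] = (priv_host, priv_port)

    def inbound(self, dst_port):
        # Returns the private target, or None if the packet is dropped.
        if dst_port in self.forwards:
            return self.forwards[dst_port]
        return self.back.get(dst_port)

napt = Napt("203.0.113.1")
h1 = napt.outbound("CPE1", 50001)[1]   # source port A1 -> H1
h3 = napt.outbound("CPE2", 50001)[1]   # source port A3 -> H3
dropped = napt.inbound(VXLAN_PORT)     # no mapping for 4789 yet
napt.add_port_forward(VXLAN_PORT, "CPE1", VXLAN_PORT)
forwarded = napt.inbound(VXLAN_PORT)
```

In this model, return traffic addressed to port 4789 is dropped until a static port forward exists, and a single public address can forward 4789 to only one private VTEP.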
>> 
>> I sincerely hope the editors of draft-ietf-nvo3-encap can add the suggested text to the draft.
>> 
>> Linda Dunbar
>> 
>> _____________________________________________
>> From: Linda Dunbar 
>> Sent: Thursday, July 13, 2017 2:31 PM
>> To: 'Dan Wing' <dwing@vmware.com>
>> Cc: NVO3 <nvo3@ietf.org>; draft-ietf-nvo3-encap@ietf.org
>> Subject: RE: [nvo3] draft-ietf-nvo3-encap-00 should add considerations of traversing NAPT
>> 
>> 
>> Dan,
>> 
>> Thanks for pointing out the Geneve's processing for traversing VxLAN.
>> 
>> First of all, I wasn't proposing using a reduced set of source ports. I think that the general document needs to add the consideration for traversing NAT because it is an issue for IPv4. Joe Touch suggested adding the entropy in the IPv6 flow ID as a solution for IPv6.
>> 
>> As for your suggested approach in Geneve:
>> The reverse traffic is sent to the Geneve destination port (6081), and a firewall or NAT or NAPT mapping is necessary for UDP/6081 traffic -- on both datacenters, which both probably have their own underlay NAPTs.  Those firewalls (or NATs or NAPTs) need to have appropriate pinholes for UDP/6081.
>> 
>> Does it mean all reverse traffic only uses UDP port 6081?
>> Or do all the NAT devices convert the NVE’s UDP port 6081 to multiple port numbers?
>> 
>> Thanks,
>> Linda
>> 
>> -----Original Message-----
>> From: Dan Wing [mailto:dwing@vmware.com] 
>> Sent: Wednesday, July 12, 2017 6:09 PM
>> To: Linda Dunbar <linda.dunbar@huawei.com>
>> Cc: NVO3 <nvo3@ietf.org>; draft-ietf-nvo3-encap@ietf.org
>> Subject: Re: [nvo3] draft-ietf-nvo3-encap-00 should add considerations of traversing NAPT
>> 
>> 
>>> On Jul 12, 2017, at 12:37 PM, Linda Dunbar <linda.dunbar@huawei.com> wrote:
>>> 
>>> Sami, et al,
>>> 
>>> 
>>> 
>>> The draft-ietf-nvo3-encap-00 is written very clearly.
>>> 
>>> 
>>> 
>>> However, Section 6 (Common Encapsulation Considerations) should add a sub-section on the consideration of traversing NAPT.  Encapsulated traffic could go to different data centers or over a WAN, and could go through Network Address Port Translation devices.
>>> 
>>> 
>>> 
>>> Using VxLAN as an example: the VxLAN specification [RFC 7348] uses a set of port numbers to achieve better traffic distribution among multiple paths, which is fine within one data center but causes trouble when traversing NAPT.
>> 
>> You're describing a problem with Geneve, which mimics VXLAN in that both of them suggest using a wide range of UDP ports to help underlay ECMP and to help receiver CPU load balancing, specifically this text of https://tools.ietf.org/html/draft-ietf-nvo3-geneve: 
>> 
>>   Source port:  A source port selected by the originating tunnel
>>      endpoint.  This source port SHOULD be the same for all packets
>>      belonging to a single encapsulated flow to prevent reordering due
>>      to the use of different paths.  To encourage an even distribution
>>      of flows across multiple links, the source port SHOULD be
>>      calculated using a hash of the encapsulated packet headers using,
>>      for example, a traditional 5-tuple.  Since the port represents a
>>      flow identifier rather than a true UDP connection, the entire
>>      16-bit range MAY be used to maximize entropy.
>> 
>> If a reduced set of source ports is used instead, as you propose, the ECMP and CPU load balancing benefits are lost.  That seems problematic.
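
The source-port selection quoted above can be sketched as follows. This is an illustration only: the function name `entropy_source_port`, the use of CRC32 as the hash, and the `low`/`high` parameters are assumptions, not anything specified by Geneve or VXLAN.

```python
import zlib

def entropy_source_port(src_ip, dst_ip, proto, src_port, dst_port,
                        low=0, high=65535):
    """Map an inner 5-tuple to an outer UDP source port in [low, high].

    CRC32 is an arbitrary stand-in for whatever hash an implementation
    uses; it is deterministic, so every packet of one flow gets the same
    outer source port and stays on one path.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return low + zlib.crc32(key) % (high - low + 1)

flow = ("10.0.0.1", "10.0.0.2", 6, 33000, 443)
port_first = entropy_source_port(*flow)
port_again = entropy_source_port(*flow)
# Restricting to a narrower range is possible, at the cost of less entropy.
port_narrow = entropy_source_port(*flow, low=49152, high=65535)
```

Narrowing `low`/`high` shrinks the set of possible outer source ports, which is the trade-off against ECMP and receiver load balancing mentioned above.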
>> 
>>> NAPT uses the port number to map back to the source address. With a set of port numbers, NAPT can’t easily figure out the reverse-direction traffic’s final IP addresses.
>> 
>> The reverse traffic doesn't use the inverted 5-tuple.  The reverse traffic is sent to the Geneve destination port (6081), and a firewall or NAT or NAPT mapping is necessary for UDP/6081 traffic -- on both datacenters, which both probably have their own underlay NAPTs.  Those firewalls (or NATs or NAPTs) need to have appropriate pinholes for UDP/6081.
>> 
>>> In addition, since the IP of packets change through NAPT device, it can mess up the learning of the peer NVE used in encapsulation.
>> 
>> The underlay did the NAPT, so I don't see a problem with the NVE overlay getting confused.  Could you explain in more detail?
>> 
>>> STUN can be used to get the changed IP and port from the NAPT device, but it requires that the NAPT device support STUN.
>> 
>> NAPT devices are not expected to implement STUN.  STUN is expected to be implemented in the hosts behind the NAT and on a server on the other side of the NAT (usually on a server on the Internet).  See Figure 1 on page 6 of https://tools.ietf.org/html/rfc5389#page-6 .
>> 
>>> That’s not available in some scenarios. Furthermore, it can’t solve the aforementioned five-tuple issue.
>>> 
>>> 
>>> 
>>> VXLAN over IPSec may be used to deal with the above problems,
>> 
>> Both Geneve and VXLAN run over UDP, and both use a fixed destination port (rather than inverted 5-tuple) for return traffic.  Not sure how VXLAN succeeds at dealing with the above problems, but I would love to learn.
>> 
>>> but IPSec brings up to 88 bytes of overhead plus the key distribution management, which can lower the efficiency.
>> 
>> Should be able to use IPsec transport mode, which is closer to 40 bytes of overhead.
>> 
>> -d
>> 
>> 
>>> 
>>> 
>>> 
>>> 
>>> Suggestion: Add Section 6.10 Traversing NAPT consideration.
>>> 
>>> I can help to provide the text if you all think the suggestion is acceptable.
>>> 
>>> 
>>> 
>>> We can discuss more in Prague.
>>> 
>>> 
>>> 
>>> Thanks, Linda Dunbar
>>> 
>>> 
>>> 
>>> Huawei USA IP Technology Lab
>>> 
>>> 5340 Legacy Drive,
>>> 
>>> Plano, TX 75024
>>> 
>>> Tel: +1 469-277 - 5840
>>> 
>>> Fax: +1 469 -277 - 5900
>>> 
>>> 
>>> 