[bmwg] RFC 4814 pseudorandom port numbers versus RoCEv2

Gábor LENCSE <lencse@hit.bme.hu> Tue, 29 June 2021 20:52 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/bmwg/3RfMYcr80DbXRbF_t5wggU7akGQ>

Dear BMWG Members,

I wonder whether you have already encountered the following problem.

RFC 4814 requires the use of pseudorandom source and destination port
numbers. The recommended range for destination port numbers is
[1, 49151]. However, UDP destination port 4791 identifies RoCEv2
(https://en.wikipedia.org/wiki/RDMA_over_Converged_Ethernet).

If you use a NIC that supports RoCEv2 (and RoCEv2 is enabled), then about
20 out of every 1 million test frames disappear. Here are the results of
a 60-second throughput test at a rate of 4,000,000 fps:

root@hp1:~/siitperf# ./build/siitperf-tp 84 4000000 60 2000 2 2
EAL: Detected 32 lcore(s)
EAL: No free hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: PCI device 0000:5d:00.0 on NUMA socket 0
EAL:   probe driver: 14e4:16d7 net_bnxt
PMD: Broadcom Cumulus driver bnxt
PMD: 1.10.1:214.4.91
PMD: Driver HWRM version: 1.5.1
PMD: bnxt found at mem e6e10000, node addr 0x7f56c0000000M
EAL: PCI device 0000:5d:00.1 on NUMA socket 0
EAL:   probe driver: 14e4:16d7 net_bnxt
PMD: 1.10.1:214.4.91
PMD: Driver HWRM version: 1.5.1
PMD: bnxt found at mem e6e00000, node addr 0x7f56c0112000M
PMD: Port 0 Link Down
PMD: Port 0 Link Up - speed 10000 Mbps - full-duplex
PMD: Port 1 Link Down
PMD: Port 1 Link Up - speed 10000 Mbps - full-duplex
Info: Left port and Left Sender CPU core belong to the same NUMA node: 0
Info: Right port and Right Receiver CPU core belong to the same NUMA node: 0
Info: Right port and Right Sender CPU core belong to the same NUMA node: 0
Info: Left port and Left Receiver CPU core belong to the same NUMA node: 0
Info: Testing started.
Info: Reverse sender's sending took 59.9999998300 seconds.
Reverse frames sent: 240000000
Info: Forward sender's sending took 59.9999999983 seconds.
Forward frames sent: 240000000
Reverse frames received: 239995097
Forward frames received: 239995159
Info: Test finished.
root@hp1:~/siitperf# echo $(((240000000-239995097)/240))
20

That is, about 20 out of every 1,000,000 frames are lost. (At first, I
found this very strange: with fixed port numbers there was no frame
loss, but with pseudorandom port numbers 20 of every 1,000,000 frames
were lost, even at significantly lower frame rates.)
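
The figure of about 20 per million is what one would expect if exactly one
destination port in [1, 49151] is being dropped. The following is only a
rough sanity check in Python, not part of siitperf. It assumes that the
destination ports are drawn uniformly and independently from the whole
range, which is an assumption about the generator rather than something
stated above.

```python
from math import sqrt

ROCEV2_PORT = 4791            # UDP destination port that identifies RoCEv2
PORT_LO, PORT_HI = 1, 49151   # destination port range quoted above
FRAMES_SENT = 240_000_000     # frames sent per direction in the test above

# Assumption: every frame picks its destination port uniformly at random.
n_ports = PORT_HI - PORT_LO + 1
p_hit = 1 / n_ports                               # chance of drawing 4791
loss_per_million = 1_000_000 * p_hit
expected_lost = FRAMES_SENT * p_hit
sigma = sqrt(FRAMES_SENT * p_hit * (1 - p_hit))   # binomial std deviation

# Lost frames per direction, taken from the counters printed above.
observed = {
    "reverse": 240_000_000 - 239_995_097,
    "forward": 240_000_000 - 239_995_159,
}

print(f"expected loss per million frames: {loss_per_million:.2f}")
print(f"expected lost frames: {expected_lost:.0f} +/- {sigma:.0f}")
for direction, lost in observed.items():
    deviation = (lost - expected_lost) / sigma
    print(f"{direction}: {lost} lost ({deviation:+.2f} sigma)")
```

Under that assumption, both measured loss counts lie within one standard
deviation of the expected value, which is consistent with a single
blackholed port.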

I debugged the issue by repeatedly halving the destination port range,
and found that when the destination port range was set to
[4791, 4791], ALL test frames were lost!
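
The halving procedure can be sketched as a binary search. This is only an
illustration in Python, not the steps actually used: `find_dropped_port`
and `loses_frames` are hypothetical names, and the callback stands for
"run one test with destination ports drawn from [lo, hi] and report
whether any frame was lost". Here it is replaced by a simulation in which
only port 4791 is dropped.

```python
from typing import Callable, List, Tuple


def find_dropped_port(loses_frames: Callable[[int, int], bool],
                      lo: int = 1, hi: int = 49151) -> int:
    """Narrow [lo, hi] down to the one destination port that causes loss.

    Assumes exactly one offending port exists in the initial range.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if loses_frames(lo, mid):
            hi = mid          # loss seen: the culprit is in the lower half
        else:
            lo = mid + 1      # no loss: the culprit is in the upper half
    return lo


# Stand-in for a real measurement: pretend only port 4791 is dropped.
tested_ranges: List[Tuple[int, int]] = []


def simulated_test(lo: int, hi: int) -> bool:
    tested_ranges.append((lo, hi))
    return lo <= 4791 <= hi


port = find_dropped_port(simulated_test)
print(f"offending port: {port}, test runs needed: {len(tested_ranges)}")
```

Since the range holds 49151 ports, at most 16 test runs are needed to
isolate a single port this way.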

I just wanted to save you a few hours of debugging by letting you know
about this issue. :-)

Best regards,

Gábor