Re: [Tofoo] VXLAN (UDP tunnel protocols) and non-zero checksums

Joe Touch <> Sat, 03 May 2014 04:07 UTC


On 5/2/2014 6:47 PM, Tom Herbert wrote:
>> I don't know how you can make this claim.
>> You don't know it's a corrupted packet - esp. because it's highly unlikely
>> that the checksum would be zero'd solely by corruption. What you know - at
>> best - is that the source decided to send a zero-checksum packet.
> Factually, I only know that I received a packet with checksum of zero,
> not that one was sent. If I have external information that says the
> sender has not disabled checksums then something must have gone awry.

Absolutely. Let's assume you follow the VXLAN draft. In that case:

    The UDP checksum field SHOULD be transmitted as zero.  When a packet
    is received with a UDP checksum of zero, it MUST be accepted for
    decapsulation.

The proposed spec says that the source can either enable or disable the 
checksum (SHOULD doesn't mean MUST).  It says nothing about tracking 
whether the source has disabled zero checksum. It's very clear that when 
you get a zero checksum, you should accept it.

If you want to track things the way you're proposing, then you need to 
update that draft.
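
As a rough illustration of the receive rule in the quoted draft text, here is a minimal sketch. The function name and the `verifies` argument are hypothetical, and the handling of non-zero checksums (verify, drop on mismatch) is an assumption, since the quoted text only covers the zero case. Note that nothing in it consults per-source state.

```python
def accept_per_quoted_draft(checksum_field: int, verifies: bool) -> bool:
    """Return True if the packet should be accepted for decapsulation.

    checksum_field: the 16-bit UDP checksum as received on the wire.
    verifies: result of checking a non-zero checksum (ignored when zero).
    """
    if checksum_field == 0:
        # Quoted draft text: a zero checksum MUST be accepted.
        # There is no lookup of what the source claimed it would send.
        return True
    # Assumption (not in the quoted text): a non-zero checksum is verified.
    return verifies
```

Tracking whether a given source has promised non-zero checksums would need an extra input here, which is the change to the draft described above.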


However, the case you're proposing is bizarre - it would happen only if 
the endpoint said one thing and did another. It *could* happen from an 
error, but that's unlikely.

So let's say it DID happen - and that the spec says to check (which it 
currently doesn't). You then have conflicting info - on one hand, the 
endpoint said it would not zero the checksum, and on the other hand it did.

So what do you do? You can treat *EITHER* the packet or the info about 
the endpoint as incorrect - and you have *no* information that tells you 
which. You want to treat the packet as an error, but the info you have 
about the endpoint is equally suspect.

IMO, this is a case that's useless to consider in the spec; at that 
point, you have no reason to trust anything you know about that 
endpoint at all, and you might as well cut it off completely -- 
regardless of what it sends.

> This is not just the rare case of corrupted checksum value, but
> unfortunately zero is the likely value if the sender is not properly
> setting the checksum.

Protocol specs are not designed to address or avoid all implementation
errors.

Your protocol ought to care whether the checksum is zero - or not care. 
The current draft doesn't care.

> Since the checksum is always zeroed in the
> packet before computation, there are many opportunities for bugs in
> drivers, stacks, and HW where the checksum is not actually written
> correctly (especially possible in presence of TSO and checksum offload
> in NICs)-- in this case packets may be sent incorrectly with a
> checksum of zero.

If you want to prevent your protocol from being susceptible to this kind 
of implementation error, you need to change the spec so you always drop 
packets when the checksum is zero.

However, checksums aren't there to find implementation errors; 
they're there to find transmission or copying errors.
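
For reference, the mechanism in the quoted paragraph (the checksum field is zeroed, then the sum is computed and written back) can be sketched for UDP over IPv4 as below. This is a simplified illustration, not any particular stack's code; the function names are hypothetical. Per RFC 768, a computed checksum of zero is transmitted as all ones, so a zero on the wire means no checksum was written, whether by choice or by a bug.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum; odd-length input is zero-padded."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum_ipv4(src_ip: bytes, dst_ip: bytes,
                      src_port: int, dst_port: int, payload: bytes) -> int:
    """Compute the UDP checksum over pseudo-header, header, and payload."""
    length = 8 + len(payload)
    # IPv4 pseudo-header: addresses, zero byte, protocol 17 (UDP), UDP length.
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    # The checksum field is zero while the sum is computed. If the result
    # is never written back, this zero is what goes out on the wire.
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    # RFC 768: a computed zero is sent as 0xFFFF; zero means "no checksum".
    return csum or 0xFFFF
```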

> This condition could be difficult to detect since
> everything might otherwise appear okay. I would take this into
> consideration when contemplating use of zero checksums.

This discussion wasn't about zero checksums, but if that's what you want 
to discuss, here's my view:

	- allow zero checksums when you don't care about checksums

	- if you care about checksums, don't allow zero checksums

Anything else, esp. per endpoint, is madness, IMO. The use of checksums 
ought to depend on whether you care about the errors they find - not the 
failure modes that might generate zero checksums.
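
The two-rule view above can be sketched as a single switch for the whole protocol rather than per endpoint. The names are hypothetical, and verifying non-zero checksums in both modes is an assumption that the text above does not spell out.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    DROP = "drop"


def receive_policy(care_about_checksums: bool,
                   checksum_field: int, verifies: bool) -> Verdict:
    """Decide what to do with a received UDP tunnel packet.

    care_about_checksums: one setting for the protocol, not per endpoint.
    checksum_field: the 16-bit UDP checksum as received.
    verifies: result of checking a non-zero checksum.
    """
    if checksum_field == 0:
        # Zero is allowed only when checksums do not matter.
        return Verdict.DROP if care_about_checksums else Verdict.ACCEPT
    # Assumption: a non-zero checksum is verified in either mode.
    return Verdict.ACCEPT if verifies else Verdict.DROP
```

The decision depends only on whether the errors a checksum catches matter, not on guesses about why a zero appeared.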