Re: [Int-area] WG Adoption Call: IP Fragmentation Considered Fragile

Tom Herbert <> Mon, 30 July 2018 15:11 UTC

From: Tom Herbert <>
Date: Mon, 30 Jul 2018 08:11:49 -0700
Message-ID: <>
To: Joe Touch <>
Cc: Ole Troan <>, "" <>, "" <>
List-Id: IETF Internet Area Mailing List <>

On Sun, Jul 29, 2018 at 9:22 AM, Joe Touch <> wrote:
> On Jul 29, 2018, at 9:11 AM, Tom Herbert <> wrote:
> ...
> That said, there’s no real problem with a NAT *IF* it acts as a host on the
> Internet
> (see Touch, J.: Middlebox Models Compatible with the Internet. USC/ISI
> (ISI-TR-711), 2016.)
>> Joe,
>> It's still a problem though. A NAT (or any stateful device in the
>> network) forces the requirement in network architecture that all
>> packets of a flow are routed through the same device.
> I didn’t make that requirement. The Internet does - it’s what it *means* to
> have an IP address.
It's not a requirement of the Internet and certainly not a core IETF
requirement that packets always follow the same path. It's an ad hoc
requirement imposed by _some_ solutions that have been deployed.

> A NAT *has* the address of the packets it sources; if it isn’t the sink of
> that address, then it’s being used incorrectly. If it doesn’t reassemble
> those packets before translating them (i.e., by translating only
> unfragmented packets and dropping fragmented ones), then it is broken and
> ought to be returned for a refund.
You are only considering the return path. Packets sent from the client
origin to the remote server need to always follow the same path to hit
the same device that has the NAT state for the flow. The destination
address is not the address of the NAT device, so the only way this
works is if the routing in the internal network consistently routes
packets through the right device. If routing changes, packets could be
sent to a different NAT device that doesn't have the state to handle
the packet so it will just drop it. This is a fundamental problem.
Vendors have mostly gotten away with it because NAT and firewalls are
often used in very simple networks, like home networks, that only have
one default router. But in a more complex network with multiple egress
points, including increasing use of multi-homing in home networks and
personal devices, getting consistent routing to satisfy the
requirements of stateful network devices is a major issue.
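
To make the failure mode concrete, here is a minimal sketch in Python.
The NatDevice class, the addresses, and the port allocation are all
made up for illustration, and the drop-on-missing-state behavior is an
assumption matching the case described above; real NAT implementations
differ in the details.

```python
from typing import Dict, Optional, Tuple

# Hypothetical flow key: (src_ip, src_port, dst_ip, dst_port, proto)
Flow = Tuple[str, int, str, int, str]


class NatDevice:
    """Toy stateful NAT: per-flow mappings live only on this device."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.next_port = 40000  # arbitrary starting port for the sketch
        self.state: Dict[Flow, int] = {}

    def outbound(self, flow: Flow, is_first_packet: bool) -> Optional[Tuple[str, int]]:
        """Return the translated (ip, port), or None if the packet is dropped."""
        if flow not in self.state:
            if not is_first_packet:
                # Assumption: a mid-flow packet with no mapping is dropped.
                return None
            self.state[flow] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.state[flow])


# Two egress NATs in the same internal network (documentation addresses).
nat_a = NatDevice("192.0.2.1")
nat_b = NatDevice("198.51.100.1")
flow: Flow = ("10.0.0.5", 5555, "203.0.113.9", 443, "tcp")

first = nat_a.outbound(flow, is_first_packet=True)       # state created on nat_a
same_path = nat_a.outbound(flow, is_first_packet=False)  # same device, same mapping
rerouted = nat_b.outbound(flow, is_first_packet=False)   # routing changed: no state
```

The flow only survives while internal routing keeps delivering it to
nat_a; nothing in the packet itself names that device.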

Fragmentation exacerbates the problem because fragments must be routed
precisely the same way that non-fragments are in order to hit the same
egress device. If ECMP hashes non-fragments using ports and fragments
without ports, the two can take different paths. Using the flow
label instead of ports for ECMP is the best way to ensure all packets
(fragments and non-fragments) of a flow follow the same route (or at
least produce the same hash for load balancing and packet steering
algorithms).
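
A rough sketch of the difference, in Python. The Packet type, the key
layouts, and the use of CRC32 are assumptions chosen for illustration;
actual routers use their own hash inputs and functions. The fragment
here stands for any fragment whose ports are not available to the
hash.

```python
import zlib
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    src: str
    dst: str
    proto: int
    flow_label: int
    src_port: Optional[int]  # None when the ports are not visible
    dst_port: Optional[int]


def ports_key(p: Packet) -> str:
    """Hash input that uses ports when present, else falls back to 3 fields."""
    if p.src_port is None or p.dst_port is None:
        return f"{p.src}|{p.dst}|{p.proto}"
    return f"{p.src}|{p.dst}|{p.proto}|{p.src_port}|{p.dst_port}"


def label_key(p: Packet) -> str:
    """Hash input built only from fields present in every packet of the flow."""
    return f"{p.src}|{p.dst}|{p.flow_label}"


def pick_path(key: str, n_paths: int) -> int:
    """Map a hash input to one of n_paths equal-cost next hops."""
    return zlib.crc32(key.encode()) % n_paths


whole = Packet("2001:db8::1", "2001:db8::2", 17, 0x12345, 5555, 443)
frag = whole._replace(src_port=None, dst_port=None)

# Port-based hashing feeds different inputs for the two packets, so
# nothing ties them to the same next hop.
inputs_differ = ports_key(whole) != ports_key(frag)

# Flow-label hashing feeds identical inputs, so the path is identical.
same_path = pick_path(label_key(whole), 4) == pick_path(label_key(frag), 4)
```

With port-based inputs the fragment and the non-fragment may still
land on the same path by chance, but only the flow-label input
guarantees it.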

>> This has killed
>> our ability to use multi-homing and multi-path.
> No, the Internet supports multi path between two IP endpoints and allows
> multihoming for a single address when managed by a single endpoint (physical
> or virtual).
See above explanation.


> The disconnect is a failure to understand that a NAT *is* an IP endpoint.
> The term “middlebox” is wrong in that sense, at least it’s not a middle box
> to the Internet (it is to the device behind the NAT).
>> The best way for an
>> intermediate device to deal with transport layer state is to be an L4
>> proxy. The intermediate is a host endpoint for the proxy connections,
>> but then that has its own problems since it breaks E2E functionality
>> (like TCP auth). So the only real solution is to eliminate transport
>> state from the network.
> That would work only if the network didn’t look at or modify transport
> information - and it did work when that was the case.
>> I'm still holding out hope that IPv6 will
>> start to obsolete use of NAT! FAST (draft-herbert-fast-02) is intended
>> to provide a viable alternative to stateful firewalls.
> Getting rid of NATs is only part of the problem. Anything that does DPI is a
> problem when it discards messages it can’t parse because they’re fragmented.
> Joe