Re: [Int-area] WG Adoption Call: IP Fragmentation Considered Fragile

Joe Touch <> Wed, 29 August 2018 15:07 UTC

Date: Wed, 29 Aug 2018 08:07:23 -0700
From: Joe Touch <>
To: Toerless Eckert <>
Cc: Christian Huitema <>, Tom Herbert <>, int-area <>,
List-Id: IETF Internet Area Mailing List <>

Hi, Toerless, 

Overall, I think that it's OK for the doc to remind us of what is
*already* required and best practice: 

- IPv4 hosts SHOULD avoid enabling in-net fragmentation (needed, in
part, for IP ID compliance at high rate per RFC 6864) 

- IP routers MUST support forwarding of fragments 

    - a router that forwards on information not available in the first
fragment is acting as a proxy for a host; in that case, it SHOULD
maintain enough context to forward fragments (which may involve delaying
non-initial fragments and/or keeping state equivalent to reassembly
between fragments) 

- NATs MUST support forwarding of fragments in both directions because
they act as proxies for the private side host (done in the same manner
as for routers, noted above) 

- transport layers SHOULD use PLPMTUD rather than PMTUD, if
fragmentation is supported at the transport layer  
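
As a rough illustration of the "maintain enough context to forward
fragments" point above, here is a minimal Python sketch. It assumes a
device that picks a next hop from the destination port, which appears
only in the first fragment. The names Fragment, FragmentForwarder and
port_map are hypothetical and not taken from any RFC or implementation,
and the sketch leaves out the timeouts and memory limits that a real
device would need.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Fragments of one datagram share (source, destination, protocol, IP ID).
FragKey = Tuple[str, str, int, int]


@dataclass
class Fragment:
    src: str
    dst: str
    proto: int
    ip_id: int
    offset: int                      # fragment offset, in 8-byte units
    more_fragments: bool
    dst_port: Optional[int] = None   # visible only when offset == 0

    @property
    def key(self) -> FragKey:
        return (self.src, self.dst, self.proto, self.ip_id)


@dataclass
class FragmentForwarder:
    # Hypothetical policy table: destination port -> next hop name.
    port_map: Dict[int, str]
    # Forwarding decision remembered per datagram.
    decisions: Dict[FragKey, str] = field(default_factory=dict)
    # Non-initial fragments held until the first fragment arrives.
    pending: Dict[FragKey, List[Fragment]] = field(default_factory=dict)

    def receive(self, frag: Fragment) -> List[Tuple[Fragment, str]]:
        """Return the (fragment, next_hop) pairs that can be forwarded now."""
        key = frag.key
        if frag.offset == 0:
            # First fragment: it carries the transport header, so decide here.
            hop = self.port_map.get(frag.dst_port, "default")
            self.decisions[key] = hop
            released = [(frag, hop)]
            # Release any fragments that arrived before the first one.
            released += [(f, hop) for f in self.pending.pop(key, [])]
            return released
        hop = self.decisions.get(key)
        if hop is None:
            # No context yet: delay this non-initial fragment.
            self.pending.setdefault(key, []).append(frag)
            return []
        return [(frag, hop)]
```

The only state kept is one decision per (src, dst, proto, ID) plus any
fragments that arrive ahead of the first one, which is less than full
reassembly but still per-datagram state.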

IMO, all other recommendations are premature at best.  

Pushing transport IDs into the IP header doesn't help; the transport
layer already has EXACTLY the extra state needed (the frag ID), and new
IP options are both costly and not robust -- adding them at best
"shuffles the deck chairs". 


On 2018-08-28 15:09, Toerless Eckert wrote:

> Thanks, Joe
> This has gotten pretty long. Let me summarize my suggestions upfront:
> For the draft itself, how about it also considers recommendations not only
> for IPv6 but also for IPv4, such as simply doing only what we've accepted to be
> feasible for IPv6: never rely on in-network fragmentation, but
> only use DF packets.
> The remaining considerations are more generic, and i wonder if this draft
> wants to have the guts to even mention them. Probably not. But IMHO
> we will continue to be architecturally stuck in the hole we are in if we do not
> tackle them for poor middleboxes:
> The IETF may think they are bad and not needed, but reality does need middleboxes
> that function at linerate of only per-packet inspection and not at the lower
> speed achievable with "virtual reassembly based inspection". Therefore we
> need options to have in every fragment all the same context that middleboxes
> reasonably should be able to inspect. 
> It's IMHO a clean pragmatic solution to say that this additional context can
> be in the higher-layer protocol and that layer willingly exposes it to middleboxes -
> and encrypts everything else. This view then requires that higher-layer
> protocol to perform packet layer fragmentation. Like we can do with
> TCP. That's why that approach is IMHO a good recommendation to use. If
> other transport protocols want to support the same degree of interaction
> with middleboxes, fine. If not, fine too.
> Given how we do have TCP PLPMTUD, i think it would be nice to suggest the use
> of it instead of relying on IP layer fragmentation to make TCP flows more
> middlebox friendly. In this draft. AFAIK, it's the only option that does not
> require new spec work, that's why it could make the cut for this draft.
> For longer term architectural evolution of middlebox friendly IP fragmentation,
> we could indeed have a new IP option carrying that context, and fragmentation
> would put this option into every fragment. The definition of this context could
> be per-transport proto.
> I think there could be middlebox contexts better than port numbers,
> but to make fragments work better for existing TCP/UDP middlebox
> functions, those 32 bits are it. But given how we can expect exposure of
> information only from willing higher layers, they will have a much easier
> way to get what they want to support by packet layer fragmentation. A
> simple generic packet layer fragmentation for UDP would therefore be nice IMHO
> so that UDP applications wanting to be friendly would not have to
> reinvent that wheel.
> If we actually ever do such an IP option, it MUST be a destination option,
> because the insufficient RFCs defining the treatment of hop-by-hop options
> burned any ability to deploy those.
> For IPsec, IP in IP or similar higher layer protocols, i would either
> use them as the key beneficiary of generic UDP fragmentation (IPsec/IP-UDP-IP)
> for pragmatic short term solutions, or else the IP option would equally
> be applicable to them (interesting discussion what the best context for
> them would be, but the two port numbers would make them be most compatible
> with those typically very TCP/UDP centric middlebox functions).
> Specific answers to your points below
> Cheers
> Toerless
> On Sun, Aug 26, 2018 at 08:01:07PM -0700, Joe Touch wrote: 
>> IPv6. IP options.
> IMHO hop-by-hop options got burned and are undeployable because the
> RFCs never made it mandatory enough to have them never impact forwarding
> performance randomly and badly. We had to abandon perfectly good
> hop-by-hop inspection solutions because there were so many stupid router
> implementations out there that didn't even have the feature but would still
> punt those packets to slow-path, then the operators saw those packets as
> DoS packets and filtered them. 
>> And (perhaps) any new proposed solution.
> The main issue of everything written on top is that the main
> business interests the IETF works against is that of non-middlebox
> friendly participants.
>> We have to aim at what network components *need* to do to participate. IP fragmentation is exactly that.
> See above. The technical question is how to enable sufficient per-packet context.
>> That's easy to say, but since any host might be an IPsec tunnel endpoint, you're back where we are now - needing IP fragmentation support everywhere, ultimately.
> asked and answered.
>> My view is simple - if you fix what we KNOW is wrong with NATs and firewalls - in KNOWN ways - then not only don't we have to solve this problem, we don't have a lot of other problems either (e.g., lack of state when a flow takes a different path into an enterprise).
> That's an orthogonal discussion. The goal in this fragmentation
> thread is purely to eliminate virtual-reassembly complexity on
> middleboxes. However good or bad it is what they do.
> Actually, it's not fully orthogonal. By eliminating virtual
> reassembly needs, we make the middlebox also work for
> cases where fragments use different paths.
> And yes, that would enable
> me to make NAT and firewalls (for the firewall functions i think make sense)
> for host stack traffic something that does not require to bother about
> fragmentation and could therefore be done easier at higher speed
> and architecturally as something only in the network layer. 
> You're optimizing a long-term impact solution for a short-term limitation. That's a bad idea; protocols last a very long time.

The difference between per-packet operation and
is orders of magnitude of complexity. It's the same order of magnitude
a business problem for network device development as those unnecessary
in pre-QUIC transport are for companies trying to make money
from ADD millennials on web pages.

>> The draft in question argues to limit what future work should do
>> within the existing requirements, which is fine. I was merely
>> pointing out that we could move more into what i think would be 
>> a useful evolution if we also went beyond our current arch
>> and evolved it.
> I am fine with encouraging the *search* for new solutions, as long as *in the meantime* we also call out firewalls and NATs for how they are already broken. Until IP fragmentation is deprecated, that has to be our position as a community.

Probably easier trying to figure out what subset of NAT/FW/middleboxes
we'd find to be worthwhile enough to support explicitly. Especially
for firewalls, many business interests say NONE and they will only
revert position when e.g.: they fail to get enough market share
with that position.

>> I think fragmentation is best pushed up on the stack.
> Again, please tell me how to do that for IPsec tunnels - which, again, can start/end anywhere in the network.

See above. I was talking about the pragmatic approach for consenting
hosts and
middleboxes to get away without having to enhance IP.

> If i wouldn't have to worry about such proxy forwarding plane capabilities,
> i definitely would prefer models like SOCKS. If i have to think about them
> it becomes certainly difficult to even model this well. When you find a complete model better than the Internet, propose it.
> Until then...

HTTPS over DWDM with application layer proxies on every hop.
You didn't define how to measure "better" ;-) 
Agreed, but that's exactly where you're headed when you say "kick
the can down the road to the upper layer protocol". 
I am still not sure about the meaning of your answers to Ole, but i
may have tried to argue on this point ("i like SOCKS") maybe in
the direction of how you may view middleboxes in your paper, e.g.: doing
a lot
more higher layer. But that's a much larger story independent of the
problem of virtual fragment reassembly, which i think should be the
only relevant issue in this fragmentation specific thread (and
not the general bitching or improvements on firewall semantics).