Re: [Int-area] WG Adoption Call: IP Fragmentation Considered Fragile

Tom Herbert <tom@herbertland.com> Sun, 29 July 2018 16:11 UTC

In-Reply-To: <A248CA44-B568-4CB9-B450-067B1845AF9B@strayalpha.com>
References: <F227637E-B12D-45AA-AD69-74C947409012@ericsson.com> <0466770D-C8CA-49BB-AC10-5805CFDFB165@strayalpha.com> <6EDF0F79-C8F3-4F05-8442-FF55576ADDD0@employees.org> <alpine.DEB.2.20.1807271530280.14354@uplift.swm.pp.se> <CALx6S35LthDLRry7k-pF8KSoX4BXBA8kyArOpDUAcJMDCoLQpQ@mail.gmail.com> <alpine.DEB.2.20.1807280811540.14354@uplift.swm.pp.se> <8640DCF6-A525-4CF7-A89D-2DEDBF0FADC8@strayalpha.com> <FFF1C23B-7A24-46BC-929E-DD56C77D69A2@employees.org> <A248CA44-B568-4CB9-B450-067B1845AF9B@strayalpha.com>
From: Tom Herbert <tom@herbertland.com>
Date: Sun, 29 Jul 2018 09:11:13 -0700
Message-ID: <CALx6S36w=5J0-=JQqrX0_PR7254V0HrhJct7oomPKdxSOSU43w@mail.gmail.com>
To: Joe Touch <touch@strayalpha.com>
Cc: Ole Troan <otroan@employees.org>, "internet-area@ietf.org" <int-area@ietf.org>, "intarea-chairs@ietf.org" <intarea-chairs@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/Sqp7a-x7Di3ymdyUcCmM4OcrO38>
Subject: Re: [Int-area] WG Adoption Call: IP Fragmentation Considered Fragile

On Sun, Jul 29, 2018 at 8:38 AM, Joe Touch <touch@strayalpha.com> wrote:
>
>
> On Jul 28, 2018, at 11:24 AM, Ole Troan <otroan@employees.org> wrote:
>
> Here’s the thing about fragmentation:
>
> 1. all links have a maximum packet size
> 2. all tunneling/encapsulation/layering increases payload size
>
> 1+2 implies there is always the need for fragmentation at some layer:
>
>
> 1 implies that.
> There is enough head room designed in 1 to accommodate 2.
>
>
> As was noted, there’s never enough headroom because you can’t control (or
> even know) how many layers of tunnels your traffic experiences.
>
>
>
> 3. fragmentation always splits info across packets
>
> And there’s something important about layering:
>
> 4. layering intends to isolate the behavior of one layer from another, such
> that
> it will always be impossible for an upper layer to know exactly what is
> going on below,
> i.e., to determine that limiting size across an entire path of possibly
> virtual tunnels
>
> The next two are where we get into trouble:
>
> 5. network devices increasingly WANT to inspect contents beyond the layer at
> which they are intended to operate
>
>
> not that network devices have an intent in themselves, but yes, it seems
> like network operators want to inspect content or are forced into it because
> of the necessity of IPv4 address sharing.
>
>
> They were “forced” into differentiating commercial from home customer
> pricing. NATs didn’t need to exist when they were first deployed.
>
> That said, there’s no real problem with a NAT *IF* it acts as a host on the
> Internet
> (see Touch, J: Middlebox Models Compatible with the Internet. USC/ISI
> (ISI-TR-711), 2016.)

Joe,

It's still a problem though. A NAT (or any stateful device in the
network) imposes the architectural requirement that all packets of a
flow be routed through the same device. This has killed our ability to
use multi-homing and multi-path. The best way for an intermediate
device to deal with transport layer state is to be an L4 proxy. The
intermediate device is then a host endpoint for the proxied
connections, but that has its own problems since it breaks E2E
functionality (like TCP auth). So the only real solution is to
eliminate transport state from the network. I'm still holding out hope
that IPv6 will start to obsolete the use of NAT! FAST
(draft-herbert-fast-02) is intended to provide a viable alternative to
stateful firewalls.
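
To make the same-device requirement concrete, here is a minimal toy
sketch in Python. It is not modeled on any real NAT implementation: the
class StatefulNat, its methods, the port range, and the addresses are
all hypothetical and chosen only for illustration. It assumes two NAT
devices on two paths to the same site, each keeping its own flow table,
with no state sharing between them.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulNat:
    """Toy NAPT (hypothetical): flow state lives only in this one device."""
    public_ip: str
    next_port: int = 40000  # assumed start of the translated port range
    # (proto, public_port) -> (private_ip, private_port)
    inbound: dict = field(default_factory=dict)

    def outbound(self, proto, src_ip, src_port):
        """Translate an outgoing packet and record the mapping locally.

        Simplification: every call allocates a fresh public port.
        """
        public_port = self.next_port
        self.next_port += 1
        self.inbound[(proto, public_port)] = (src_ip, src_port)
        return self.public_ip, public_port

    def inbound_lookup(self, proto, dst_port):
        """Return the private endpoint for a reply, or None if unknown."""
        return self.inbound.get((proto, dst_port))


# Two devices on two different paths to the same site (multi-homing).
nat_a = StatefulNat("192.0.2.1")
nat_b = StatefulNat("192.0.2.1")

# The first packet of the flow leaves through device A.
pub_ip, pub_port = nat_a.outbound("tcp", "10.0.0.5", 51515)

# A reply routed back through A finds the mapping.
reply_via_a = nat_a.inbound_lookup("tcp", pub_port)

# The same reply routed through B finds nothing and would be dropped.
reply_via_b = nat_b.inbound_lookup("tcp", pub_port)

print(reply_via_a, reply_via_b)
```

In this sketch the reply only gets through when it returns via the
device that saw the outbound packet, which is the constraint on routing
described above. Sharing the table between devices would avoid that,
but it only moves the transport state around rather than removing it
from the network.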

Tom

>
> 6. inspecting contents ultimately means reassembly, at some level
>
>
> _some_ content inspection would require that, but I don't think you can make
> that the general rule.
> e.g. a NAT or an L4 ACL only needs access to the L4 header.
>
>
> It’s a general rule; you merely refer to an instance that is relevant ONLY
> when the L3 header at which a router operates is followed (or expected to be
> followed) by an L4 header.
>
> That may have been the Internet of 10 years ago, but it is less and less the
> Internet moving forward.
>
>
> Which brings us to the punchline:
>
> 7. but network device vendors want to save money, so they don’t want to
> reassemble at any layer
>
>
> We'd all wish it to be that simple. Can you substantiate that claim?
> You can easily make the speculation that customers don't want to pay what it
> costs to be able to do reassembly at terabit speeds...
> Or accept that it's technically hard.
>
>
> I’m not claiming it isn’t hard; I’m claiming that it’s always cheaper to
> *not* do something.
>
> My concern isn’t that reassembly isn’t being done; it’s that vendors sell
> devices that don’t make this clear - AND that they don’t pass on packets
> they don’t/can’t reassemble.
>
> The implementations of e.g. NATs, IPv4 address sharing implementations I'm
> aware of do flavours of network layer reassembly.
>
>
> If they did so, we would not be here talking about considering IP
> fragmentation fragile.
>
> However much money you throw at it, you can't reassemble fragments
> travelling on different paths, nor can you trivially make network layer
> reassembly not be an attack vector on those boxes.
>
>
> Agreed, but here’s the other point:
>
> Any device that inspects L4 content can do so ONLY as a proxy for the
> destination endpoint.
>
> I.e., I know vendors WANT to sell devices they say can be deployed anywhere
> in the network, and operators believe that, but it’s wrong.
>
> Basically, if you’re not at a place in the network where you represent that
> endpoint, you have no business acting as that endpoint - “full stop”.
>
>
>
> So I agree, IP fragmentation has its flaws - but those flaws are created not
> only because it leaves out the transport port numbers, but also because DPI
> and NAT devices don’t reassemble. And they don’t because it’s cheaper to
> sell devices that say they run at 1 Gbps (e.g.) that don’t bother to
> reassemble.
>
>
> I don't agree with your conclusion.
> NATs extend the network layer to include the L4 ports. NAT implementations
> of course do reassemble.
>
>
> See Touch, J: Middlebox Models Compatible with the Internet. USC/ISI
> (ISI-TR-711), 2016.
>
> NATs do not extend the network layer to include L4; they act as endpoints in
> the public net and routers in the private net - for the reason above
> (multipath), among others.
>
>
> I.e., it will never matter what layering we add to fix this - GRE, GUE,
> Aero, etc. - ultimately, we’re doomed to need fragmentation support down to
> IP exactly because:
>
> a. #1-4 mean we need frag/reassembly at any tunnel ingress
> b. vendors want to sell #5 at a price that is too low for them to support #6
> (i.e., point #7)
>
>
>
> So pushing this to another layer will never solve it. What will solve it
> will only be a compliance requirement for #6 - which could be done right
> now, and has to be done for ANY solution to work.
>
>
> For IPv4 address sharing specifically removing network layer fragmentation
> would be a solution.
>
>
> For ONE problem there is ONE patch until a new device looks further (to do
> port filtering on UDP tunneled traffic, e.g.). Then we’re back here talking
> about fragmentation over GUE considered fragile.
>
>
> NOTE: even rewriting EVERY application won’t fix this, nor will deploying a
> new layer at any level.
>
>
> For some type of content inspection that would require reassembling the
> whole application context.
> But that's quite different from IPv4 address sharing, which we have
> unfortunately made an integral part of the Internet architecture.
>
>
> The only difference is that the impact of the current L4 problem is being
> currently discussed. If you consider why it’s happening, it might be more
> clear that this problem will cycle around again and cannot be solved by
> patching this way.
>
> Joe
>