[rrg] weighing core network v edge network obstacles

Paul Jakma <paul@jakma.org> Sun, 14 February 2010 13:34 UTC

On Sun, 14 Feb 2010, Robin Whittle wrote:

> Your proposal involves using IPv4 header options.  These are
> theoretically compatible with IPv4, but they can't be relied upon
> across the DFZ.  If all routers were updated, then your system could
> proceed,

Not all - some, likely a /very/ few (see below). That study shows
that most paths are fine (exactly how representative that study is,
who knows - more samples might show the problem is greater or lesser
in extent). However, aren't you advocating upgrading /all/ routers?
:)

> The techniques Fred and I developed are different, but both can 
> cope with there being PMTU problems between ITRs and ETRs which do 
> not result in PTBs being sent to the ITR.

This is an interesting point. Which problem is worse:

1. some DFZ routers having problems with IP options
2. some edge networks doing extremely dumb things and causing PMTU
    problems

?

Personally, I'll go for 1, because I know those routers are generally
well-managed by very clueful people. The anecdotal evidence available
to me suggests PMTU problems are caused by certain strange ideas
about the importance of blocking everything but (tcp and dst port 80)
having become ingrained in well-meaning but less than clueful edge
network administrators (companies selling firewall/network security
products bear a lot of responsibility for instilling that idea).

I see you have a protocol to deal with 2. Am I correct in thinking
it's inter-TR, and exists to deal with PMTU problems between the
xTRs? If so, then doesn't it leave the problem of a PMTU hole near
the remote, non-CES, edge network? E.g.:

                                 remote network
host--ETR---ITR-------------------[FW----host]


Host is "CESed". "Remote network" unfortunately is behind a PMTU hole
- i.e. FW drops incoming PTB ICMPs, so the remote-network host can
never do PMTU discovery properly. This is why many consumer routers
clamp TCP MSS to local MTU - 40.

The problem is that there's now another tunnel in the way, and it has
eaten 20 bytes from the PMTU. 'host' (or its local access router)
doesn't know about that at all, and so can't automatically adjust its
MTU for it, and that 20 bytes is larger than the 8 bytes of PPPoE
overhead that routers are usually adjusting for.

So this creates another PMTU problem when remote-host tries sending a
too-large packet to host and never receives the PTB from the ITR.
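
A minimal sketch of the arithmetic in this scenario, in Python. The
1500-byte Ethernet MTU, the helper names, and the assumption that the
xTR tunnel adds exactly one 20-byte IPv4 header are illustrative
choices, not taken from any of the drafts under discussion:

```python
# Header sizes in bytes, assuming no IP or TCP options.
IPV4_HEADER = 20
TCP_HEADER = 20

ETHERNET_MTU = 1500     # assumed MTU of the links in the core
PPPOE_OVERHEAD = 8      # what access routers usually adjust for
TUNNEL_OVERHEAD = 20    # assumed: the ITR prepends one IPv4 header


def clamped_mss(link_mtu):
    """MSS a router would clamp to: local MTU - 40."""
    return link_mtu - IPV4_HEADER - TCP_HEADER


def fits_path(mss, path_mtu):
    """True if a full-sized TCP segment fits within path_mtu."""
    return mss + IPV4_HEADER + TCP_HEADER <= path_mtu


# The access router only knows about its own PPPoE link.
access_mtu = ETHERNET_MTU - PPPOE_OVERHEAD
mss = clamped_mss(access_mtu)

# What actually fits through the ITR-ETR tunnel.
tunnel_mtu = ETHERNET_MTU - TUNNEL_OVERHEAD

print("clamped MSS:", mss)
print("fits through tunnel:", fits_path(mss, tunnel_mtu))
print("MSS that would fit:", clamped_mss(tunnel_mtu))
```

Under these assumptions the PPPoE-only clamp still produces packets
larger than the tunnel can carry, so the ITR has to send a PTB, which
is exactly the message the FW drops.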

In theory, this problem seems like it should be much easier to solve 
than fixing core routers to be nicer to things like IP options. 
However, I have my doubts :).

I notice you also have a draft that gets around this by rewriting the 
IP header. Though, that requires upgrading the vast majority of 
routers on the internet.

So it seems the problems we can choose between come down to:

1. Fixing the majority of those routers that are causing the problems
    for IP options on 15 to 20% of paths.

    If each tested path was completely independent and each path was
    10 hops on average, and the problems on each path were down to one
    router, then we're talking about something like 2% of routers
    that need to be fixed. If the average was 16 hops, then we're
    talking closer to 1%. It'd be interesting to see field studies to
    pin this number down.

2. Upgrading all end-host software

3. Fixing the dumb edge networks that are causing the PMTU problems.

4. Upgrading all DFZ routers to handle a modified IP header.
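
A rough check of the estimate under option 1, as a Python sketch. It
assumes a simple model that is not from the study itself: every router
is independently faulty with the same probability p, so a path of n
hops is affected with probability 1 - (1 - p)^n. The function name is
made up for illustration:

```python
def bad_router_fraction(path_failure_rate, hops):
    """Solve 1 - (1 - p)**hops == path_failure_rate for p."""
    return 1.0 - (1.0 - path_failure_rate) ** (1.0 / hops)


for hops in (10, 16):
    for rate in (0.15, 0.20):
        p = bad_router_fraction(rate, hops)
        print(f"{hops} hops, {rate:.0%} of paths affected: "
              f"{p:.2%} of routers")
```

With 15 to 20% of paths affected, this model puts the share of
routers needing a fix at somewhere between roughly 1% and a little
over 2%, depending on the hop count. Simply dividing the path failure
rate by the hop count gives much the same answer.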

It's hard to think 3 is achievable. It boils down to trying to
re-educate people who don't really have much interest in any of this
anyway. We've had years and years of this problem, and of awareness
of it, and yet it's still there, and there are still widely used
security products shipping that block ICMP by default and do all
kinds of stupid things, IME. Also, the workaround for 3 is for
end-hosts to clamp down their MSS even further. So adding tunneling
to the internet risks an ever-escalating "race to 0 MTU" (PPPoE,
VPNs, CES, etc. - all of which can stack!).

So then it boils down to 1, 2 and 4. I think 1 is hard, but
achievable. If 1 is considered so hard as to be unachievable, then
surely 4 must be impossible?

For me, the order (1,2,4,3) runs from least hard to hardest. I.e. if
you have to choose which of those problems your solution must
overcome, then I'd obviously choose 1.

Obviously the choice may be more complex than that. E.g. it could be 
"1 AND 2" (e.g. having end-hosts add an IP option in some kind of 
CEE, or CEEish solution) versus "3 or 4" (e.g. CES). Then it's not so 
clear which is worse.

NB: I have finally found your terminology draft, so hopefully I can 
abuse the terminology less.

regards,
-- 
Paul Jakma	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
"I will make no bargains with terrorist hardware."
-- Peter da Silva