Re: Warren Kumari's No Record on draft-ietf-6man-rfc1981bis-06: (with COMMENT)

Warren Kumari <warren@kumari.net> Tue, 09 May 2017 09:46 UTC

From: Warren Kumari <warren@kumari.net>
Date: Tue, 09 May 2017 05:45:38 -0400
Message-ID: <CAHw9_iKXtcB5PgY2L7qb6=w_j5qTAJHv3eAfVJLp8Ev7HSYJ3A@mail.gmail.com>
Subject: Re: Warren Kumari's No Record on draft-ietf-6man-rfc1981bis-06: (with COMMENT)
To: Suresh Krishnan <suresh.krishnan@gmail.com>
Cc: The IESG <iesg@ietf.org>, Ole Trøan <otroan@employees.org>, 6man-chairs@ietf.org, IPv6 List <ipv6@ietf.org>, draft-ietf-6man-rfc1981bis@ietf.org

On Tue, May 9, 2017 at 12:54 AM, Suresh Krishnan
<suresh.krishnan@gmail.com> wrote:
> Hi Warren,
>   Thanks for your comments. Please see responses inline.
>
> On Mon, May 8, 2017 at 4:01 PM, Warren Kumari <warren@kumari.net> wrote:
>> Warren Kumari has entered the following ballot position for
>> draft-ietf-6man-rfc1981bis-06: No Record
>>
>> When responding, please keep the subject line intact and reply to all
>> email addresses included in the To and CC lines. (Feel free to cut this
>> introductory paragraph, however.)
>>
>>
>> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
>> for more information about IESG DISCUSS and COMMENT positions.
>>
>>
>> The document, along with other ballot positions, can be found here:
>> https://datatracker.ietf.org/doc/draft-ietf-6man-rfc1981bis/
>>
>>
>>
>> ----------------------------------------------------------------------
>> COMMENT:
>> ----------------------------------------------------------------------
>>
>> [ Reminder to myself - I'm leaving this as No Record while looking
>> more... ]
>>
>> I sent email about this to the authors on Feb 23rd - I seem to still have
>> many of the same questions...
>>
>> Comments:
>> 1: Sec 1: "Path MTU Discovery relies on such messages to determine the
>> MTU of the path."
>>  -- it is unclear which "such" refers to. Perhaps s/such/ICMPv6/ (or
>> PTB).
>
> This was new text added in the -bis document. In the context of the
> sentence (regarding ICMPv6 filtering leading to black hole connection)
> I think ICMPv6 works better than singling out PTB.

Yup, WFM.


>
>>
>> 2: Sec 3: "Upon receipt of such a message, the
>>    source node reduces its assumed PMTU for the path based on the MTU
>> of
>>    the constricting hop as reported in the Packet Too Big message" --
>> this says that it reduces it *for the path*. But (as somewhat alluded to
>> later in the draft) the node doesn't know what the path *is* -- it can
>> decrease for the destination, or flow, or even interface, but (unless it
>> is strict source routing) it doesn't control or really know the path (see
>> also #4)
>
> Right. This is explored further in Section 5.2
>
> "  However, in most cases a node will not have enough
>    information to completely and accurately identify such a path.
>    Rather, a node must associate a PMTU value with some local
>    representation of a path.  It is left to the implementation to select
>    the local representation of a path."
>
> Would you like some additional text in this section?
>

I don't really know -- I have a weird feeling that the document is
explicit about what you must do under certain circumstances, but then
handwaves away the fact that you cannot really tell. I think that it
would be helpful to point at Sec 5.2 from here.


>>
>> 3: Sec 4: "The recommended setting for this timer is twice its minimum
>> value (10 minutes)." - as above. This was from 1996 - were these metrics
>> discussed at all during the -bis? I suspect that the average flow is much
>> shorter these days (more web traffic, fatter pipes, etc) and so a flow of
>> 10 minutes seems really long (to me at least).
>
> AFAIK, there was no discussion regarding this number. I have a hard
> time seeing how this number relates to the (shortened) length of the
> flow, though.

It's fine if it wasn't discussed (these are just comments), but my
thinking goes:

In 1996 I believe that average flows were much longer and so flows
would actually hit the 10 minute timer and attempt the MTU increase
process, but these days, with much shorter flows, you will hit it
much less often.

Also, in 1996 a 45Mbps DS3 / T3 was fast. If I started transferring a
large file and my first few packets went down a path with a 1280-byte
MTU, which then changed to a path with a 1500-byte MTU, after 10
minutes I would have transferred ~3.3GB and 2.6 million packets, for
a "wastage" of 580MB (2.6m * (1500 - 1280)).
On a 10GE, in the same situation, after 10 minutes I would have
transferred 750GB and 585 million packets, for a "wastage" of almost
130GB.
(Note: Back of the envelope calculations, may have missed a zero somewhere)
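
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch under simplifying assumptions that are not in the original message: the link is fully utilised for the whole 10 minutes, every packet is exactly 1280 bytes, header overhead is ignored, and units are decimal (1 GB = 10^9 bytes). The function name transfer_stats is made up for illustration.

```python
def transfer_stats(link_bps, seconds, small_mtu, large_mtu):
    """Return (bytes_sent, packets, byte_difference) for a saturated link.

    byte_difference follows the formula used in the message:
    packets * (large_mtu - small_mtu).
    """
    bytes_sent = link_bps * seconds / 8           # bits per second -> bytes
    packets = bytes_sent / small_mtu              # packets sent at the smaller MTU
    difference = packets * (large_mtu - small_mtu)
    return bytes_sent, packets, difference


if __name__ == "__main__":
    for name, bps in (("DS3 (45 Mbps)", 45e6), ("10GE", 10e9)):
        sent, pkts, diff = transfer_stats(bps, 600, 1280, 1500)
        print(f"{name}: {sent / 1e9:.1f} GB sent, "
              f"{pkts / 1e6:.1f} M packets, {diff / 1e9:.2f} GB difference")
```

Because every term is linear in the link speed, the 10GE figures are simply the DS3 figures scaled by 10e9 / 45e6, or roughly 222 times.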

>
>>
>> 4: Sec 5.2: "The packetization layers must be notified about decreases in
>> the
>>    PMTU.  Any packetization layer instance (for example, a TCP
>>    connection) that is actively using the path must be notified if the
>>    PMTU estimate is decreased.
>>       Note: even if the Packet Too Big message contains an Original
>>       Packet Header that refers to a UDP packet, the TCP layer must be
>>       notified if any of its connections use the given path."
>>  - this is related to #2 -- I don't know *which* path my packets take -
>> once I launch them into the void, they may be routed purely based upon
>> destination IP address, or they may be hashed based upon some set of
>> header fields to a particular ECMP link or LSP. Once packets hit a load
>> balancer, it is probably even *likely* that the UDP and TCP packets end
>> up on different things. So, if I get a PTB from a router somewhere, I can
>> probably guess that other packets to the same destination address will
>> also follow that path, but I cannot know that for sure. I'm fine to
>> decrease MTU towards that destination IP, but is that what this is
>> suggesting? If so, please say that. If not, please let me know what I
>> should do. The above is even more tricky / fun when I'm using flow id as
>> the flow identifier -- if I get a PTB for flow 0x1234, what do I do?
>
> Yes. You are right. Any out of band mechanism for probing the MTU will
> likely end up having the same problem if the probe packets are treated
> differently from the payload packets by the load balancers and other
> middleboxes.


Yup. This is related to #2. The definition of "path" is very vague
- if I use flow id as my path representation, how do I know what else
to notify?
I think that this should have some better description of what exactly
I adjust if I get the PTB...

If this were a new protocol, which wasn't already implemented
basically everywhere, I'd say that this needs to be much much clearer
and more specific -- but, the fact that there are already
implementations which a: work, and b: new implementers can look at
means that I'm not too fazed about this...
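
To make the question concrete, here is a minimal sketch of one possible "local representation of a path": a PMTU cache keyed purely by destination address. It is an illustration only, not what the draft mandates or what any particular stack does. The class and method names are hypothetical, and the two guards (never going below 1280 and never raising the estimate on a Packet Too Big) reflect my reading of the draft and should be treated as assumptions.

```python
from collections import defaultdict

IPV6_MIN_MTU = 1280  # IPv6 minimum link MTU


class PmtuCache:
    """Toy PMTU cache that uses the destination address as the 'path'."""

    def __init__(self, first_hop_mtu):
        self.first_hop_mtu = first_hop_mtu
        self.pmtu = {}                      # destination -> PMTU estimate
        self.listeners = defaultdict(list)  # destination -> notify callbacks

    def register(self, dest, callback):
        """A packetization layer (TCP connection, UDP socket, ...) signs up."""
        self.listeners[dest].append(callback)

    def lookup(self, dest):
        """Current estimate; defaults to the first-hop link MTU."""
        return self.pmtu.get(dest, self.first_hop_mtu)

    def on_packet_too_big(self, dest, reported_mtu):
        """Handle a PTB for dest; return True if the estimate was lowered."""
        new_mtu = max(reported_mtu, IPV6_MIN_MTU)  # assumption: clamp at 1280
        if new_mtu >= self.lookup(dest):
            return False                           # assumption: PTB never raises
        self.pmtu[dest] = new_mtu
        for notify in self.listeners[dest]:        # every user of this "path"
            notify(dest, new_mtu)
        return True
```

With this representation, a PTB triggered by a UDP packet lowers the estimate for every TCP connection to the same destination as well. Keying on the flow label instead would shrink the set of listeners to that one flow, which is exactly where "what else do I notify?" stops having an obvious answer.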
>
>>
>>  5: Sec 5.3: "Once a minute, a timer-driven procedure runs through all
>> cached PMTU values, and for each PMTU whose timestamp is not "reserved"
>> and is older than the timeout interval ...". Please consider providing
>> clarifications here. The wording implies that I should set a timer to
>> fire on the minute, and trigger the behavior. If all of the (NTP synced!)
>> machines in my datacenter do this, and all try to send bigger packets (on
>> 1/10th of long flows) their first hop router will get many, many
>> over-sized packets and it will severely rate-limit the PTBs.
>
> I am not sure why the first hop router will send PTBs. At timer
> expiry, the PMTU estimate will only be set to the MTU of the first-hop
> link and hence the first hop router should be able to handle this, no?


Nope - example scenario...

┌───────────┐               .───────.             ┌────────────┐
│ Computer  │──MTU:1500───▶( Router  )──MTU:1400──▶ "Internet" │
└───────────┘               `───────'             └────────────┘

The outgoing interface on my router has an MTU of 1400 bytes. This means that
every connection will get back a PTB when it tries to reach the
internet - I *should* change the default MTU on my machine, but, well,
I'm a user...

In many scenarios the lower MTU will be close to the sending machine -
if you synchronize the "try to redo PMTUD" step you run the risk of spikes.
Many machines (like my Apple laptop) are now NTP synced - even if the
MTU bottleneck is far away, when everyone tries to use 1500-byte
packets at the same time, whatever device has a smaller MTU will have
a bad day...
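
One way an implementation could avoid this kind of synchronization is to schedule the once-a-minute scan relative to its own start time and add random jitter, rather than firing on the wall-clock minute. The sketch below is an assumption about how that might look, not text from the draft. The names next_scan_time and expire_entries and the 50% jitter fraction are made up for illustration, and None stands in for the draft's "reserved" timestamp.

```python
import random

SCAN_INTERVAL = 60.0   # "once a minute"
PMTU_TIMEOUT = 600.0   # the recommended 10 minute timer


def next_scan_time(now, rng=random, jitter_fraction=0.5):
    """Pick the next scan time as now + 60s, plus or minus random jitter."""
    jitter = rng.uniform(-jitter_fraction, jitter_fraction) * SCAN_INTERVAL
    return now + SCAN_INTERVAL + jitter


def expire_entries(cache, now, first_hop_mtu, timeout=PMTU_TIMEOUT):
    """Reset stale PMTU estimates to the first-hop MTU.

    cache maps destination -> (pmtu, timestamp); a timestamp of None
    plays the role of "reserved" and is skipped. Returns the list of
    destinations that were reset, so their users can be notified.
    """
    reset = []
    for dest, (pmtu, stamp) in list(cache.items()):
        if stamp is not None and now - stamp > timeout:
            cache[dest] = (first_hop_mtu, None)
            reset.append(dest)
    return reset
```

With jitter, hosts that booted together or share an NTP source spread their probes over a window of about a minute instead of all sending 1500-byte packets in the same instant.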

W



>
>>
>>
>> Nits (Some of these are purely academic.)
>> I understand that you are trying to limit the changes, so feel free to
>> ignore these:
>
> I will let the editor chime in on these.
>
> Thanks
> Suresh



-- 
I don't think the execution is relevant when it was obviously a bad
idea in the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair
of pants.
   ---maf