Re: [Taps] MTU / equivalent at the transport layer

Michael Welzl <michawe@ifi.uio.no> Tue, 13 December 2016 12:53 UTC

From: Michael Welzl <michawe@ifi.uio.no>
In-Reply-To: <584FC7DF.5040108@erg.abdn.ac.uk>
Date: Tue, 13 Dec 2016 13:53:18 +0100
To: "<gorry@erg.abdn.ac.uk> Fairhurst" <gorry@erg.abdn.ac.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/taps/fodjKfMAy2WoBKaYXUNvxtmAQ4c>
Cc: Joe Touch <touch@isi.edu>, "taps@ietf.org" <taps@ietf.org>
Subject: Re: [Taps] MTU / equivalent at the transport layer

> On 13 Dec 2016, at 11:05, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
> 
> On 13/12/2016 09:13, Michael Welzl wrote:
>> Hi,
>> 
>> This direction definitely makes sense to me, too. I see some tension here, though - on the one hand, Joe is (as usual) arguing "cleanliness", i.e. keep layering right. On the other hand, applications tend to want to know a message size that doesn't get fragmented along an IPv4 path (as identified by the authors of draft-trammell-post-sockets and draft-mcquistin-taps-low-latency-services).
>> Raising the abstraction level is fine, but I think Joe's suggestion below misses something.
>> 
>> In an earlier email, Joe wrote about these two sizes:
>> 
>> ***
>> 1) the size of the message that CAN be delivered at all
>> 
>> 2) the size of the message that can be delivered without network-layer
>> fragmentation
>> ***
>> and stated that 2) should not be exposed.
>> 
>> So, in the proposal below, "largest transmission size" is 1) from above, and sending it would fail if it's bigger than 2) above AND "native transmission desired" is set to TRUE. So this is how the application would then do its own form of PMTUD.
>> 
>> Given that we don't know which protocol we're running over, probing strategies that involve common MTU sizes (like using the table in section 7.1 of RFC1191) can't work. So it's not the world's most efficient PMTUD that applications will be using, to eventually find the value of 2).
>> A protocol like SCTP is even going to do PMTUD on its own, so it could provide a number for 2), which would have less overhead than requiring applications to do their own PMTUD. So if we have to "go dirty" anyway, which we already do by exposing the binary "native transmission desired", why not offer the value of 2) as well?
>> In other words: how is this boolean better than offering 2) ?
>> 
>> Cheers,
>> Michael
>> 
>> 
>> 
>>> On 12 Dec 2016, at 21:53, Gorry (erg)<gorry@erg.abdn.ac.uk>  wrote:
>>> 
>>> This is fine - it looks like what I pointed to in the DCCP spec. But specifically, I agree you don't need the DF flag visible - if you have a way to convey the info needed to set the flag at the transport (and anything else appropriate, as you note). I am all in favour of such appropriate abstraction.
>>> 
>>> Gorry
>>> 
>>>> On 12 Dec 2016, at 19:09, Joe Touch<touch@isi.edu>  wrote:
>>>> 
>>>>> On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:
>>>>>> IMO, the app should never need to play with DF. It needs to know what it
>>>>>> thinks the transport can deliver - which might include transport
>>>>>> frag/reassembly and network frag/reassembly.
>>>>> How does the App handle probes for path MTU then in UDP?
>>>>> 
>>>>> Gorry
>>>> I think there needs to be two parts to the API:
>>>> 
>>>> - largest transmission size
>>>> - native transmission desired (true/false)
>>>> 
>>>> If the app says "YES" to native transmission size, then that would suggest that UDP would do *nothing* and pass that same kind of flag down to IP, where IP would not only set DF=1, but also not source fragment.
>>>> 
>>>> I.e., I don't think it's the app's job to know how to explicitly control a mechanism two layers down, and DF isn't really what you want anyway. DF isn't the same as "don't source fragment".
>>>> 
>>>> Joe
>>>> _______________________________________________
>>>> Taps mailing list
>>>> Taps@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/taps
> So I'd like to return to RFCs that have been through part of this discussion before,
> 
> (1) I think we need a parameter returned to the App that is equivalent to Maximum Packet Size, MPS, in DCCP (RFC4340). It is useful to know how many bytes the app can send with a reasonable chance of unfragmented delivery.

I agree; that seems to be what I ended up proposing above.
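
To make concrete what an application is left doing when only the boolean is exposed, here is a rough Python sketch of application-level probing. Everything in it is hypothetical: `try_send` stands in for "send with native transmission desired = TRUE and report whether it was accepted", and the sketch assumes that acceptance is monotonic in size and that no probe is lost, which a real path does not guarantee.

```python
from typing import Callable, List, Tuple


def probe_unfragmented_size(try_send: Callable[[int], bool],
                            largest_transmission_size: int) -> int:
    """Binary-search the largest size that try_send accepts.

    try_send(size) is a hypothetical stand-in for sending `size` bytes
    with "native transmission desired" set to TRUE; it returns False
    when the send fails because fragmentation would be needed.
    Returns 0 if no size is accepted.
    """
    lo, hi = 0, largest_transmission_size
    while lo < hi:
        mid = (lo + hi + 1) // 2      # round up so the loop always makes progress
        if try_send(mid):
            lo = mid                  # mid fits; look for something larger
        else:
            hi = mid - 1              # mid does not fit; look below it
    return lo


def make_fake_path(unfragmented_limit: int) -> Tuple[Callable[[int], bool], List[int]]:
    """Build a fake try_send with a fixed limit, recording every probe size."""
    attempts: List[int] = []

    def try_send(size: int) -> bool:
        attempts.append(size)
        return size <= unfragmented_limit

    return try_send, attempts
```

Each probe here is a real send attempt by the application, which is the overhead that goes away if the transport simply reports the value of 2).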


> (2) It's helpful for Apps to be able to retrieve the upper size allowed with potential fragmentation - that could be useful in determining probe sizes for an application. Apps should know the hard limit. In DCCP this is called the current congestion control maximum packet size (CCMPS), the largest permitted by the stack using the current congestion control method. That's bound to be less than or equal to what is permitted for the local Interface MTU. This limit lets the App also take into consideration other size constraints in the stack below the API.

Agreed; I think that was Joe's item 1) ("the size of the message that CAN be delivered at all").


> (3) Apps need to be allowed to fragment datagrams more than MPS - This is not expected as the default, the stack needs to be told.
> 
> (4) Apps need to be allowed to not allow datagram fragmentation - The stack needs to be told. You could do this by using the DF semantics (i.e., don't source fragment a DF-marked packet). Thinking more, this seems the easiest.

These two are hard to parse, making me wonder if they mean what was intended. E.g. for (3): applications are always allowed to fragment their data as they wish, right?  Did you mean to say "Apps need to be allowed to allow to fragment datagrams more than MPS" ?  :-)   I think so...


> Sorry if this goes over what I said before, but I think we should first explore the approaches that have already been put forward in RFCs (albeit these were not RFCs about UDP).

Makes perfect sense to me. Anyway it seems (if I parse them right) that (3) and (4) are just Joe's "native transmission desired" boolean.
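
As a rough sketch of how I read the pieces fitting together (Python, purely illustrative): the names `SizeInfo`, `max_unfragmented_size` and `send_allowed` are made up for this example and come from no draft or RFC; `largest_transmission_size` and `native_transmission_desired` just spell out Joe's two API parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeInfo:
    # Joe's 1): the size of the message that CAN be delivered at all
    # (the hard limit, comparable in role to DCCP's CCMPS).
    largest_transmission_size: int
    # Joe's 2): the size that can be delivered without network-layer
    # fragmentation (comparable in role to DCCP's MPS). Hypothetical name.
    max_unfragmented_size: int


def send_allowed(size: int, info: SizeInfo,
                 native_transmission_desired: bool) -> bool:
    """Return True if a message of `size` bytes would be accepted."""
    if size > info.largest_transmission_size:
        return False                  # larger than anything the stack can deliver
    if native_transmission_desired and size > info.max_unfragmented_size:
        return False                  # would need fragmentation, which the app refused
    return True
```

With `info = SizeInfo(65507, 1472)` (example numbers only), `send_allowed(1473, info, True)` is False while `send_allowed(1473, info, False)` is True; that is the whole effect of the boolean in this reading.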

So in conclusion, IIUC, this is just a way of saying that:
your item 1: you (as I) also think we should have a "Maximum Packet Size" type thing in addition to what Joe said,
your items 2, 3 and 4: you (as I) agree to offer the other things that Joe also thinks should be offered

Right?

Cheers,
Michael