Re: [Taps] MTU / equivalent at the transport layer

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Tue, 13 December 2016 13:35 UTC

Message-ID: <584FF8E7.9070905@erg.abdn.ac.uk>
Date: Tue, 13 Dec 2016 13:34:31 +0000
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Organization: University of Aberdeen
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: Michael Welzl <michawe@ifi.uio.no>
References: <5F2E34E4-7D32-4BDB-B762-2ADB7994672B@ifi.uio.no> <c6b1d261-8c3c-ed50-78e1-9b5e472815fc@isi.edu> <0213051A-C761-43B3-8750-1B999A8A893A@ifi.uio.no> <00e457fb-4708-6b43-46d4-e065b14dedd9@isi.edu> <19C25F7D-BB05-457C-89E2-450A1E808FDD@ifi.uio.no> <3a5ce4a7-5365-59d9-0ff2-eadeebae4d0a@isi.edu> <4FF8823B-4C6E-41B4-9C30-C6028A2EDBD6@ifi.uio.no> <db79e464-0e14-bbdf-30e0-988d874fd836@erg.abdn.ac.uk> <725460cb-73ab-8c16-8647-9d8711b48fa7@isi.edu> <aafaea3a-75ce-957e-a9d9-8390647966cf@erg.abdn.ac.uk> <3fb7b303-91da-83af-9b9d-14b2ea196c9d@isi.edu> <7EDE888D-1E12-4725-849E-D080D6998D6D@erg.abdn.ac.uk> <8EFE0C27-355F-4E63-8E1F-D7E059D0E249@ifi.uio.no> <584FC7DF.5040108@erg.abdn.ac.uk> <9746E98C-5187-413B-B773-FC839ACDA7D5@ifi.uio.no>
In-Reply-To: <9746E98C-5187-413B-B773-FC839ACDA7D5@ifi.uio.no>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/taps/_qd_nAVXpbYivIxAZ6EsBQVj624>
Cc: Joe Touch <touch@isi.edu>, "taps@ietf.org" <taps@ietf.org>
Subject: Re: [Taps] MTU / equivalent at the transport layer

On 13/12/2016 12:53, Michael Welzl wrote:
>> On 13 Dec 2016, at 11:05, Gorry Fairhurst<gorry@erg.abdn.ac.uk>  wrote:
>>
>> On 13/12/2016 09:13, Michael Welzl wrote:
>>> Hi,
>>>
>>> This direction definitely makes sense to me, too. I see some tension here, though - on the one hand, Joe is (as usual) arguing "cleanliness", i.e. keep layering right. On the other hand, applications tend to want to know a message size that doesn't get fragmented along an IPv4 path (as identified by the authors of draft-trammell-post-sockets and draft-mcquistin-taps-low-latency-services).
>>> Raising the abstraction level is fine, but I think Joe's suggestion below misses something.
>>>
>>> In an earlier email, Joe wrote about these two sizes:
>>>
>>> ***
>>> 1) the size of the message that CAN be delivered at all
>>>
>>> 2) the size of the message that can be delivered without network-layer
>>> fragmentation
>>> ***
>>> and stated that 2) should not be exposed.
>>>
>>> So, in the proposal below, "largest transmission size" is 1) from above, and sending it would fail if it's bigger than 2) above AND "native transmission desired" is set to TRUE. So this is how the application would then do its own form of PMTUD.
>>>
>>> Given that we don't know which protocol we're running over, probing strategies that involve common MTU sizes (like using the table in section 7.1 of RFC1191) can't work. So it's not the world's most efficient PMTUD that applications will be using, to eventually find the value of 2).
>>> A protocol like SCTP is even going to do PMTUD on its own, so it could provide a number for 2), which would have less overhead than requiring applications to do their own PMTUD.  =>    If we have to "go dirty" anyway, which we already do by exposing the binary "native transmission desired", why not offer the value of 2) as well?
>>> In other words: how is this boolean better than offering 2) ?
>>>
>>> Cheers,
>>> Michael
>>>
>>>
>>>
>>>> On 12 Dec 2016, at 21:53, Gorry (erg)<gorry@erg.abdn.ac.uk>   wrote:
>>>>
>>>> This is fine - it looks like what I pointed to in the DCCP spec. But specifically, I agree you don't need the DF flag visible if you have a way to convey the info needed to set the flag at the transport (and anything else appropriate, as you note). I am all in favour of such appropriate abstraction.
>>>>
>>>> Gorry
>>>>
>>>>> On 12 Dec 2016, at 19:09, Joe Touch<touch@isi.edu>   wrote:
>>>>>
>>>>>> On 12/12/2016 10:58 AM, Gorry Fairhurst wrote:
>>>>>>> IMO, the app should never need to play with DF. It needs to know what it
>>>>>>> thinks the transport can deliver - which might include transport
>>>>>>> frag/reassembly and network frag/reassembly.
>>>>>> How does the App handle probes for path MTU then in UDP?
>>>>>>
>>>>>> Gorry
>>>>> I think there needs to be two parts to the API:
>>>>>
>>>>> - largest transmission size
>>>>> - native transmission desired (true/false)
>>>>>
>>>>> If the app says "YES" to native transmission size, then that would suggest that UDP would do *nothing* and pass that same kind of flag down to IP, where IP would not only set DF=1, but also not source fragment.
>>>>>
>>>>> I.e., I don't think it's the app's job to know how to explicitly control a mechanism two layers down, and DF isn't really what you want anyway. DF isn't the same as "don't source fragment".
>>>>>
>>>>> Joe
>>>>> _______________________________________________
>>>>> Taps mailing list
>>>>> Taps@ietf.org
>>>>> https://www.ietf.org/mailman/listinfo/taps
>> So I'd like to return to RFCs that have been through part of this discussion before,
>>
>> (1) I think we need a parameter returned to the App that is equivalent to Maximum Packet Size, MPS, in DCCP (RFC4340). It is useful to know how many bytes the app can send with a reasonable chance of unfragmented delivery.
> I agree; that seems to be what I ended up proposing above.
>
>
>> (2) It's helpful for Apps to be able to retrieve the upper size allowed with potential fragmentation - that could be useful in determining probe sizes for an application. Apps should know the hard limit. In DCCP this is called the current congestion control maximum packet size (CCMPS), the largest permitted by the stack using the current congestion control method. That's bound to be less than or equal to what is permitted for the local Interface MTU. This limit lets the App also take into consideration other size constraints in the stack below the API.
> Agreed; I think that was Joe's item 1) ("the size of the message that CAN be delivered at all").
>
>
>> (3) Apps need to be allowed to fragment datagrams more than MPS - This is not expected as the default, the stack needs to be told.
>>
>> (4) Apps need to be allowed to not allow datagram fragmentation - The stack needs to be told. You could do this by using the DF semantics (i.e., don't source fragment a DF-marked packet). Thinking more, this seems the easiest.
> These two are hard to parse,
Sorry - trying hard.
> making me wonder if they mean what was intended. E.g. for (3): applications are always allowed to fragment their data as they wish, right?  Did you mean to say "Apps need to be allowed to allow to fragment datagrams more than MPS" ?  :-)   I think so...

Rewritten as:

(3) Apps need to be able to ask the stack to try hard to send datagrams
larger than the current MPS. This is not expected as the default; the
stack needs to be told to enable this use of source/router fragmentation
and to send IPv4 datagrams with DF=0. (For some IPv4 paths the PMTU, and
hence the MPS, can be very small.)

(4) Apps need to be able to ask the stack to send datagrams larger than
the current MPS, but NOT if this results in source fragmentation. Such
packets need to be sent with DF=1. This is not expected as the default;
the stack needs to be told to enable it. For UDP, it would be needed to
perform PMTUD. I think that is what has just been called
"native transmission desired".
>
>> Sorry, if this goes over what I said before, but I think we should first explore the approaches that have already been put forward in RFCs (albeit these were not RFCs about UDP).
> Makes perfect sense to me. Anyway it seems (if I parse them right) that (3) and (4) are just Joe's "native transmission desired" boolean.
> So in conclusion, IIUC, this is just a way of saying that:
> your item 1: you (as I) also think we should have a "Maximum Packet Size" type thing in addition to what Joe said,
> your items 2, 3 and 4: you (as I) agree to offer the other things that Joe also thinks should be offered
>
> Right?
We may now be in agreement.
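
To make items (1)-(4) above concrete, here is a rough sketch of how a
stack might decide what to do with a datagram of a given size. It is
Python purely for illustration. All names (SendMode, Decision, decide,
and the mps / hard_limit parameters) are hypothetical and come from no
RFC or existing API. Leaving DF unspecified in the default mode is also
an assumption, since the thread does not settle it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SendMode(Enum):
    """Hypothetical per-socket setting; not from any RFC or real API."""
    DEFAULT = 1              # nothing requested: stay within MPS
    ALLOW_FRAGMENTATION = 2  # item (3): try hard, send with DF=0
    NATIVE = 3               # item (4): DF=1 and no source fragmentation


@dataclass(frozen=True)
class Decision:
    accepted: bool            # does the stack take the datagram at all?
    df: Optional[int]         # IPv4 DF bit; None = rejected or left to the stack
    source_fragment_ok: bool  # may the sending host fragment it?


def decide(size: int, mps: int, hard_limit: int, mode: SendMode) -> Decision:
    # Item (2): nothing above the hard limit (CCMPS in DCCP terms) is sent.
    if size > hard_limit:
        return Decision(False, None, False)
    # Item (4): send as-is with DF=1; above the PMTU this acts as a probe.
    if mode is SendMode.NATIVE:
        return Decision(True, 1, False)
    # Item (3): source/router fragmentation is permitted, so DF=0.
    if mode is SendMode.ALLOW_FRAGMENTATION:
        return Decision(True, 0, True)
    # Default, item (1): only sizes up to the MPS are accepted.
    if size <= mps:
        return Decision(True, None, False)
    return Decision(False, None, False)
```

With mps=1200 and hard_limit=9000, a 1400-byte datagram would be refused
by default, sent with DF=0 under (3), and sent unfragmented with DF=1
under (4), which is the case an application doing its own PMTUD over UDP
would use.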
> Cheers,
> Michael
Gorry