RE: flow control and DATAGRAM

"Lubashev, Igor" <ilubashe@akamai.com> Tue, 30 October 2018 03:55 UTC

From: "Lubashev, Igor" <ilubashe@akamai.com>
To: "tpauly@apple.com" <tpauly@apple.com>, "jri.ietf@gmail.com" <jri.ietf@gmail.com>
CC: "quic@ietf.org" <quic@ietf.org>, "martin.thomson@gmail.com" <martin.thomson@gmail.com>, "ianswett@google.com" <ianswett@google.com>
Subject: RE: flow control and DATAGRAM
Thread-Topic: flow control and DATAGRAM
Date: Tue, 30 Oct 2018 03:54:51 +0000
Message-ID: <4b7b02d110f74139bacae8e01c54a4af@usma1ex-dag1mb6.msg.corp.akamai.com>
References: <CABkgnnU26aYD=TybuD0FZGYtfEa6np-Sk3Jo6t7LRp0wzKh3Lg@mail.gmail.com> <CAKcm_gMDOyJC0AwOo2knN6AMxVbySjGsrLvjcC9A8UA9xcvcWw@mail.gmail.com> <19D3595B-8845-45B6-A60D-9E934DD49FAC@apple.com> <CACpbDcfk+GXb3aL5LG0wQM87thRGO5Y4Q+9cbXf5YuW1=jWCig@mail.gmail.com> <B7BE2454-2A4D-4323-B0A5-0D73BD4B819F@apple.com>, <CACpbDcd5YWcxyEH3TS1FssgAT13oqeBBQg92+J0uJqhxPz2qkA@mail.gmail.com>
In-Reply-To: <CACpbDcd5YWcxyEH3TS1FssgAT13oqeBBQg92+J0uJqhxPz2qkA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/b6X2R_DMv8TwRnTTc33EXvGN5OE>

Jana,

> The simplest scheme would be to reserve a stream ID for DATAGRAM data -- [...]
> -- and then MAX_STREAM_DATA uses this
> stream ID for flow control of DATAGRAM octets.

There are many dragons that way. Flow control indicates a commitment to buffer data. When the sender does not retransmit, when does the receiver give up its commitment to wait for (and buffer) earlier data, so it can advance MAX_STREAM_DATA and allow the sender to send more? Can a signal from the sender tolerate loss/reordering relative to STREAM frames (assuming a datagram can be split across multiple QUIC packets)? These dragons can be conquered (see my partially reliable streams draft, v2 or v3), but this is not simple. :(
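
As a toy illustration of the stalled-window dragon (all names here are invented for illustration, not from any QUIC draft): stream-style flow control only returns credit as contiguously received bytes are consumed, so a range that is lost and never retransmitted blocks the advertised limit forever unless extra signaling fills the gap.

```python
# Hypothetical sketch: stream-style flow control applied to unreliable data.
# Credit advances only past contiguously received bytes, so a range that is
# lost and never retransmitted stalls the advertised window indefinitely.

class StreamStyleReceiver:
    def __init__(self, window=10):
        self.window = window          # bytes of buffer we committed to
        self.contiguous = 0           # highest offset with no gaps before it
        self.received = []            # (start, end) ranges received so far

    def max_stream_data(self):
        # The limit we may advertise: contiguous progress plus our window.
        return self.contiguous + self.window

    def on_data(self, start, end):
        self.received.append((start, end))
        self.received.sort()
        # Advance past any ranges that are now contiguous with the front.
        for s, e in self.received:
            if s <= self.contiguous:
                self.contiguous = max(self.contiguous, e)

rx = StreamStyleReceiver(window=10)
rx.on_data(0, 4)    # bytes [0,4) arrive
rx.on_data(6, 10)   # bytes [4,6) were lost and will never be retransmitted
print(rx.max_stream_data())   # stuck at 14: the gap at offset 4 never fills
```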

- Igor

-----Original Message-----
From: Jana Iyengar [jri.ietf@gmail.com]
Received: Monday, 29 Oct 2018, 10:24PM
To: tpauly@apple.com [tpauly@apple.com]
CC: Ian Swett [ianswett@google.com]; QUIC WG [quic@ietf.org]; Martin Thomson [martin.thomson@gmail.com]
Subject: Re: flow control and DATAGRAM

Hey Tommy,

I'm not suggesting any changes to when the packet gets ack'ed—specifically, I was responding to Martin's hypothetical of having an ACK to a DATAGRAM frame meaning that the application had processed the frame. My impression is that the ACK to a packet containing a DATAGRAM frame means:
- The packet made it across the network to the QUIC endpoint on the other side

... and was processed by the QUIC receiver (not necessarily by the application). This is the same semantic as the ack of a regular frame.

- The QUIC implementation will deliver the DATAGRAM frame to application (it won't drop it locally)

In which case you will need flow control. If you agree, then we're on the same page so far.

Having a separate outstanding-data limit for the DATAGRAM "stream" is an interesting point in the solution space. It would then have the nice property of not looking like traditional flow control. It could even be measured in number of frames, rather than bytes (depending on what the limiting factors are).
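
A per-frame rather than per-byte budget could be sketched like this (the limiter and its fields are hypothetical, not from any draft): the receiver grants a count of DATAGRAM frames, and the sender decrements it per frame regardless of size.

```python
# Hypothetical sketch of a frame-count limit for DATAGRAMs: credit is
# granted and consumed in units of frames, not bytes.

class FrameCountLimiter:
    def __init__(self, max_frames):
        self.remaining = max_frames   # DATAGRAM frames we may still send

    def try_send(self):
        if self.remaining == 0:
            return False              # out of frame credit
        self.remaining -= 1
        return True

    def on_credit(self, frames):
        self.remaining += frames      # peer granted more frame credit

lim = FrameCountLimiter(max_frames=2)
print(lim.try_send(), lim.try_send(), lim.try_send())  # True True False
```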

In terms of flow control, I don't think DATAGRAM flow control is any different than sending stream data on any stream and then canceling it -- there are no retransmissions, but the sender still accounts for it in flow control. My thought here was basically to have exactly the same flow control as stream-level flow control, and allow for DATAGRAM bytes to fall within connection-level flow control as normal.

The simplest scheme would be to reserve a stream ID for DATAGRAM data -- this could be 0 or 2^62-1 -- and then MAX_STREAM_DATA uses this stream ID for flow control of DATAGRAM octets. Alternatively, define a new frame called MAX_DATAGRAM_DATA that carries the largest allowed offset that can be sent as a DATAGRAM, and introduce an offset field in the DATAGRAM frame.
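
The alternative could be sketched roughly as follows (MAX_DATAGRAM_DATA is not a real QUIC frame, and the field names here are invented): each DATAGRAM carries a cumulative byte offset, and the peer advertises the largest offset the sender may reach, analogous to MAX_STREAM_DATA on a reserved stream.

```python
# Hypothetical sketch of offset-based DATAGRAM flow control with an
# invented MAX_DATAGRAM_DATA limit; not part of any published draft.

class DatagramSender:
    def __init__(self):
        self.next_offset = 0          # cumulative bytes sent as DATAGRAMs
        self.max_datagram_data = 0    # peer's advertised limit

    def on_max_datagram_data(self, limit):
        # Limits never move backwards, as with MAX_STREAM_DATA.
        self.max_datagram_data = max(self.max_datagram_data, limit)

    def try_send(self, payload):
        end = self.next_offset + len(payload)
        if end > self.max_datagram_data:
            return None               # blocked: drop, or wait for more credit
        frame = {"offset": self.next_offset, "data": payload}
        self.next_offset = end
        return frame

tx = DatagramSender()
tx.on_max_datagram_data(8)
print(tx.try_send(b"hello"))   # fits under the 8-byte limit
print(tx.try_send(b"world"))   # would end at offset 10 > 8, so blocked (None)
```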

- jana

Thanks,
Tommy

On Oct 29, 2018, at 12:37 PM, Jana Iyengar <jri.ietf@gmail.com<mailto:jri.ietf@gmail.com>> wrote:

Tommy,

Changing the semantics of an acknowledgment to include delivery up to the application is a fundamental change to the QUIC machinery, and it doesn't work. First, an ACK frame acknowledges packets, and you can't have different semantics of an acknowledgment for different frames that are carried in the same packet. Second, it interferes with RTT measurement, and it conflates flow control with congestion control, which gets messy. (This conflation is an interesting problem to consider theoretically, but not one for us at this time IMO.)

I am wondering if applying a stream-level flow control for DATAGRAMs makes sense instead. Meaning that you treat DATAGRAMs as a separate stream for flow control purposes. You might benefit from having an offset in the DATAGRAM frame for this purpose.

- jana

On Mon, Oct 29, 2018 at 8:21 AM Tommy Pauly <tpauly@apple.com<mailto:tpauly@apple.com>> wrote:
Hi Martin, Ian,

Yes, very good points!

My tendency would be to prefer what Ian's implementation does of passing these DATAGRAM frames up immediately to the application. I don't think that the acknowledgment needs to indicate that the frame was processed by the application, but merely that it has been delivered to the application (that is, the application doesn't get to do anything with the frame that can influence the acknowledgment).

The current draft indicates that the content of the DATAGRAM frames contributes to the limit used for MAX_DATA, and that if that amount is reached, the frames are blocked along with STREAM data. I think this works fine for the sender, while the receiver gets into the discussion you present. On the sender side, reaching MAX_DATA could mean dropping the DATAGRAM frames when unable to send more (and sending BLOCKED instead). Since the frames are unreliable, they can be dropped in this situation without violating the API contract.
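
The sender-side behavior described above could look roughly like this (the class and helper names are invented for illustration): DATAGRAM bytes count against connection-level MAX_DATA, and when credit runs out the unreliable frame is simply dropped and a BLOCKED signal queued.

```python
# Hypothetical sketch: DATAGRAM bytes charged against connection-level
# MAX_DATA. Unlike STREAM data, an unreliable DATAGRAM that finds no credit
# can be dropped without violating the API contract.

class ConnectionSender:
    def __init__(self, max_data):
        self.max_data = max_data      # connection-level limit from the peer
        self.sent = 0                 # flow-controlled bytes already sent
        self.blocked_sent = False     # whether a BLOCKED signal is queued

    def credit(self):
        return self.max_data - self.sent

    def send_datagram(self, payload):
        if len(payload) > self.credit():
            # No credit: drop the datagram and signal the peer with BLOCKED.
            self.blocked_sent = True
            return False
        self.sent += len(payload)
        return True

tx = ConnectionSender(max_data=12)
print(tx.send_datagram(b"0123456789"))  # True: 10 bytes sent, 2 bytes left
print(tx.send_datagram(b"abcdef"))      # False: dropped, BLOCKED queued
```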

On the receiver side, I agree that queuing the DATAGRAM frames to let the application drive flow control in the way it does for STREAM frames adds complexity and diminishes the utility of the frame and ACKs. However, I can imagine taking a fairly simplistic approach in which the data limit is automatically increased upon reception of the frame (and the frame is immediately passed to the application). This allows the initial_max_data to put a cap on the amount of data in a given flight of DATAGRAMs, and allows the size of a flight of DATAGRAM frames to be limited by the amount of room left over from STREAM data that may be consuming the connection-wide flow control.
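
That simplistic receiver could be sketched as follows (names invented for illustration): each DATAGRAM is handed to the application immediately and its credit returned at once, so initial_max_data caps data per round trip rather than in total.

```python
# Hypothetical sketch: auto-returning receiver credit. The frame is never
# queued; MAX_DATA advances by the frame's size as soon as it arrives, so
# the advertised window bounds a flight of DATAGRAMs, not a cumulative total.

class AutoCreditReceiver:
    def __init__(self, initial_max_data):
        self.max_data = initial_max_data   # limit currently advertised

    def on_datagram(self, payload, deliver):
        deliver(payload)                   # passed up immediately, not queued
        # Return the credit: advance MAX_DATA by what was just consumed.
        self.max_data += len(payload)
        return self.max_data               # new limit to advertise to peer

delivered = []
rx = AutoCreditReceiver(initial_max_data=100)
new_limit = rx.on_datagram(b"x" * 30, delivered.append)
print(new_limit)   # 130: credit restored as soon as the frame arrives
```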

Perhaps this approach needs a clearer name other than "flow control", since it has a somewhat different meaning in effect.

As for ACKs, if we never discard on the receiver side, the ACK is pretty useful for detecting if there was network-based packet loss.

Thanks,
Tommy

On Oct 29, 2018, at 5:32 AM, Ian Swett <ianswett@google.com<mailto:ianswett@google.com>> wrote:

Good catch Martin, I missed that in the draft as well, and I also think it's impossible with the proposed design.

And yes, I think Martin's proposed solution is likely the only practical one.  In my implementation, the frame is passed up to the application immediately, so technically QUIC processed it, and it's the application's job to decide what to do with it.

On Mon, Oct 29, 2018 at 1:16 AM Martin Thomson <martin.thomson@gmail.com<mailto:martin.thomson@gmail.com>> wrote:
Hi Tommy,

Your slides - <https://github.com/quicwg/wg-materials/blob/master/ietf103/IETF103-QUIC-Datagram.pdf>
- say that DATAGRAM frames respect connection-level flow control.  I
missed that in the draft, and I don't know how they can do that in the
face of packet loss, especially when you don't necessarily retransmit
lost DATAGRAM frames.

For that to work, you would need a bunch more machinery to make the
connection-level flow control sync between endpoints in the case that
packets are lost.  A disagreement about how much flow control is used
causes things to break down badly.  Ian and I discussed this point at
the last meeting and quickly agreed that while it might be nice to
have flow control for this stuff, the increase in complexity is
considerable and (at the time) we thought it wouldn't be worth it.

The problem that introduces is that you could end up having too many
DATAGRAM frames arrive.  The receiver has to drop something at the
point that it can't handle them.  And we say that when you acknowledge
something, you processed it.  That's tricky.

It might be easier to say that a QUIC acknowledgment for a DATAGRAM
frame doesn't mean that it was received and processed by an
application.  An endpoint might discard these frames before passing
them on to applications if it doesn't have space.  In other words,
acknowledgment of DATAGRAM means that QUIC got it, not that the
application got it.  Sadly, that means that the QUIC acknowledgment
machinery doesn't help the application that uses DATAGRAM all that
much.  Also, the lower bound on reliability is 0, which isn't the best
thing ever.

Hard choices, I know.  I don't have a good design for maintaining
connection-level flow control (or any back pressure mechanism with
equivalent properties) that doesn't add both complexity and overhead.

Cheers,
Martin