Re: Deadlocking in the transport

Roberto Peon <fenix@fb.com> Fri, 12 January 2018 18:53 UTC

From: Roberto Peon <fenix@fb.com>
To: "Lubashev, Igor" <ilubashe@akamai.com>, Jana Iyengar <jri@google.com>
CC: QUIC WG <quic@ietf.org>, Martin Thomson <martin.thomson@gmail.com>
Subject: Re: Deadlocking in the transport
Thread-Topic: Deadlocking in the transport
Thread-Index: AQHTidq/6Mg61g6LuEipz9qoFIGKPqNtCOaAgAJivQD//3yugIAAjUGAgAEj3RI=
Date: Fri, 12 Jan 2018 18:53:02 +0000
Message-ID: <SN6PR1501MB2191934520466DD727CA0BDCCD170@SN6PR1501MB2191.namprd15.prod.outlook.com>
References: <CABkgnnUSMYRvYNUwzuJk4TQ28qb-sEHmgXhxpjKOBON43_rWCg@mail.gmail.com> <E55BA3F8-39ED-404D-9165-C5E68362206E@fb.com> <CAGD1bZa5H0jf1NqwoA7d0-TREXNdRECeT3BJtgW0rN3Vt-04GA@mail.gmail.com> <18061ACE-08F7-4219-81E6-0B73BBC56170@fb.com>, <a891c4d345e446e3b172c7a9d7e815b3@usma1ex-dag1mb5.msg.corp.akamai.com>
In-Reply-To: <a891c4d345e446e3b172c7a9d7e815b3@usma1ex-dag1mb5.msg.corp.akamai.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/69lZ9NuFvsHzPwqHa8ZlWmjsyKc>
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>

I'm talking about this in the context of e2e application-level deadlocks.

If we simply increase the global flow control window when things are blocked, the wrong stream may make forward progress; there is no way to indicate that a particular stream should make forward progress.

This may be OK: the standard way this is resolved today is a timeout, an error, and a retry.

Since detecting and resolving all of these races is not reasonably solvable, this (timeout, retry) is the ultimate mechanism for any application to resolve issues anyway.

IMHO, the 'blocked on flow control' message, if reliably sent by blocked senders, will give information on shared-resource (i.e. proxy -> server) deadlocks, and thus better inform actors' timeouts.

-=R





-------- Original message --------
From: "Lubashev, Igor" <ilubashe@akamai.com>
Date: 1/11/18 5:28 PM (GMT-08:00)
To: Roberto Peon <fenix@fb.com>, Jana Iyengar <jri@google.com>
Cc: QUIC WG <quic@ietf.org>, Martin Thomson <martin.thomson@gmail.com>
Subject: RE: Deadlocking in the transport

Transport cannot solve the problem of deadlocking due to a higher protocol’s data priority on its own.  The application would need to cooperate carefully.

The problem is that the receiver is not the only one influencing flow control, because the sender's transport has buffers as well.  The sender's transport can refuse to accept higher-priority data from the application if its buffers are already full of lower-priority data.

So you can end up with the receiver refusing to grant additional credit for stream Y (because it cannot process stream Y without stream X).  If the sender's buffers are full of stream Y data, the sender cannot send that data and cannot accept any additional data (for stream X).

The application must be careful not to enqueue stream Y data to the transport before it enqueues all data from stream X that stream Y data depends on.
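
A minimal sketch of the enqueue-order point above, assuming a toy sender-side buffer with a fixed byte capacity. The names `SendBuffer`, `enqueue`, `drain`, and `run`, and the 100-byte sizes, are hypothetical and do not come from any QUIC implementation. The receiver is modelled as granting no credit for stream Y until X has arrived.

```python
from collections import deque


class SendBuffer:
    """Hypothetical bounded send buffer inside the sender's transport."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()  # entries are (stream, nbytes)
        self.used = 0

    def enqueue(self, stream, nbytes):
        """Accept data from the application only if it fits in the buffer."""
        if self.used + nbytes > self.capacity:
            return False
        self.queue.append((stream, nbytes))
        self.used += nbytes
        return True

    def drain(self, credit):
        """Send every queued chunk that the per-stream credit allows."""
        sent = []
        kept = deque()
        for stream, nbytes in self.queue:
            if credit.get(stream, 0) >= nbytes:
                credit[stream] -= nbytes
                self.used -= nbytes
                sent.append(stream)
            else:
                kept.append((stream, nbytes))
        self.queue = kept
        return sent


def run(order):
    """Enqueue one 100-byte chunk per stream in `order`, then try to send."""
    buf = SendBuffer(capacity=100)
    credit = {"X": 100, "Y": 0}  # assumption: no credit for Y before X arrives
    accepted = [buf.enqueue(stream, 100) for stream in order]
    return accepted, buf.drain(credit)
```

With `run(["Y", "X"])` the buffer fills with Y, X is refused, and nothing can be sent. With `run(["X", "Y"])` X is accepted and sent, after which the receiver could process X and grant credit for Y.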


- Igor


From: Roberto Peon [mailto:fenix@fb.com]
Sent: Thursday, January 11, 2018 8:03 PM
To: Jana Iyengar <jri@google.com>
Cc: QUIC WG <quic@ietf.org>; Martin Thomson <martin.thomson@gmail.com>
Subject: Re: Deadlocking in the transport

The receiver would be the one to provide the override; the sender could hint to the receiver that it wants one by signaling that it is blocked (or deadlocked) on flow control.
-=R

From: Jana Iyengar <jri@google.com>
Date: Thursday, January 11, 2018 at 4:53 PM
To: Roberto Peon <fenix@fb.com>
Cc: Martin Thomson <martin.thomson@gmail.com>, QUIC WG <quic@ietf.org>
Subject: Re: Deadlocking in the transport

Roberto,

I'm not convinced that we need something as heavy-weight as this... specifically, flow control so far has been in the receiver's control, which is also the thing that owns the resource it is trying to limit exhaustion of. This moves it out to the sender, which would require a receiver to now commit memory based on what a sender believes is important. Reasoning about this seems non-trivial. I worry about how all this freedom will get abused, and about potential sharp edges.

If we can ensure priority consumption of the shared resource, that seems adequate to resolve the problem at hand. Do you think something more is required?

- jana

On Wed, Jan 10, 2018 at 12:26 PM, Roberto Peon <fenix@fb.com> wrote:
Another option:
Allow a flow-control ‘override’ which allows a receiver to state that they really want data on a particular stream, and ignore the global flow control for this.

How you’d do it:
A receiver can send a flow-control override for a stream. This includes the stream id to which the global window temporarily does not apply, the receiver’s current stream flow-control offset, and the offset the receiver would wish to be able to receive.
A receiver must continue to (re)send the override (i.e. retransmit it) until it is ack'd. It cannot send other flow-control for that stream until the override is ack'd.
Thus:
  global-flow-control-override: <stream-id>, <current-flow-control-offset>, <override-flow-control-offset>

Upon receipt of the override, the sender (which receives the override) credits the global flow control with the amount of data it has sent beyond the receiver's currently-known flow-control offset.
This synchronizes the global state between the receiver and the sender.
The sender can then send the data on the stream (without touching any other flow control data).
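
A rough sketch of the sender side of this, under one reading of the text above: bytes on the named stream between the receiver's current offset and the override offset do not count against the global window, and anything already sent in that range is credited back. The `Override` and `Sender` types and their fields are hypothetical, per-stream flow control is left out entirely, and acks/retransmission of the override are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Override:
    stream_id: int
    current_offset: int   # receiver's current stream flow-control offset
    override_offset: int  # offset the receiver wishes to be able to receive


@dataclass
class Sender:
    global_limit: int                                  # global window granted by the receiver
    global_used: int = 0                               # bytes charged against that window
    stream_sent: dict = field(default_factory=dict)    # stream_id -> bytes sent so far
    exempt_until: dict = field(default_factory=dict)   # stream_id -> override offset

    def _charged(self, stream_id, nbytes):
        """Bytes of this write that count against the global window."""
        sent = self.stream_sent.get(stream_id, 0)
        exempt_end = self.exempt_until.get(stream_id, 0)
        exempt = max(0, min(sent + nbytes, exempt_end) - sent)
        return nbytes - exempt

    def send(self, stream_id, nbytes):
        """Send if the global window allows it; return True on success."""
        charged = self._charged(stream_id, nbytes)
        if self.global_used + charged > self.global_limit:
            return False
        self.global_used += charged
        self.stream_sent[stream_id] = self.stream_sent.get(stream_id, 0) + nbytes
        return True

    def on_override(self, frame):
        """Credit back data already sent beyond the receiver's known offset."""
        sent = self.stream_sent.get(frame.stream_id, 0)
        credit = max(0, sent - frame.current_offset)
        self.global_used -= credit
        self.exempt_until[frame.stream_id] = frame.override_offset
        return credit
```

For example, if stream 7 has used the whole global window, `send(3, 50)` fails; after `on_override(Override(3, 0, 50))` the same call succeeds without touching `global_used`, and no other stream gains anything from the override.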

Why:
This allows a receiver to resolve priority inversions which would otherwise lead to deadlock, even when the data dependency leading to this was not known to the transport. This applies to such issues well beyond header compression.

Since the global flow-control exists to protect the app from resource exhaustion, this poses no additional risk to the application.
Simply increasing the global flow control provides weaker guarantees: any stream might consume it, which doesn't resolve the dependency inversion. Rejiggering priorities can help to resolve this, but would require the sender to send priorities to the client, which is problematic w.r.t. races and just a web of ick.

Having a custom frame type is also a weaker guarantee, as it requires the knowledge that the dependency exists to be present at the time of sending, which is often impossible.

-=R

On 1/9/18, 10:17 PM, "QUIC on behalf of Martin Thomson" <quic-bounces@ietf.org on behalf of martin.thomson@gmail.com> wrote:

    Building a complex application protocol on top of QUIC continues to
    produce surprises.

    Today in the header compression design team meeting we discussed a
    deadlocking issue that I think warrants sharing with the larger group.
    This has implications for how people build a QUIC transport layer.  It
    might need changes to the API that is exposed by that layer.

    This isn't really that new, but I don't think we've properly addressed
    the problem.


    ## The Basic Problem

    If a protocol creates a dependency between streams, there is a
    potential for flow control to deadlock.

    Say that I send X on stream 3 and Y on stream 7.  Processing Y
    requires that X is processed first.

    X cannot be sent due to flow control but Y is sent.  This is always
    possible even if X is appropriately prioritized.  The receiver then
    leaves Y in its receive buffer until X is received.

    The receiver cannot give flow control credit for consuming Y because
    it can't consume Y until X is sent.  But the sender needs flow control
    credit to send X.  We are deadlocked.

    It doesn't matter whether the stream or connection flow control is
    causing the problem; either produces the same result.

    (To give some background on this, we were considering a preface to
    header blocks that identified the header table state that was
    necessary to process the header block.  This would allow for
    concurrent population of the header table and sending of messages that
    depend on the header table state that is under construction.  A
    receiver would read the identifier and then leave the remainder of the
    header block in the receive buffer until the header table was ready.)
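
The basic problem above can be reproduced with a toy model. This is only a sketch: it assumes a single shared credit pool, a receiver that releases credit only when it consumes a complete item, and a fixed send order standing in for whatever ends up on the wire. The function name `simulate` and the sizes used are hypothetical.

```python
def simulate(window, x_size, y_size, send_order):
    """Return True if X and Y are both consumed, False if the model deadlocks."""
    sizes = {"X": x_size, "Y": y_size}
    credit = window                  # flow control credit held by the sender
    pending = dict(sizes)            # bytes not yet sent
    buffered = {"X": 0, "Y": 0}      # bytes sitting in the receive buffer
    consumed = set()
    progress = True
    while progress and len(consumed) < 2:
        progress = False
        # Sender: push as much as the credit allows, in the given order.
        for name in send_order:
            n = min(pending[name], credit)
            if n:
                pending[name] -= n
                buffered[name] += n
                credit -= n
                progress = True
        # Receiver: consume complete items, but Y only after X.
        for name in ("X", "Y"):
            complete = buffered[name] == sizes[name] and name not in consumed
            if complete and (name == "X" or "X" in consumed):
                consumed.add(name)
                credit += buffered[name]  # consuming releases the credit
                buffered[name] = 0
                progress = True
    return len(consumed) == 2
```

With a 100-byte window, a 60-byte X, and a 100-byte Y, `simulate(100, 60, 100, ["Y", "X"])` stalls with Y parked in the receive buffer, while `simulate(100, 60, 100, ["X", "Y"])` completes.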


    ## Options

    It seems like there are a few decent options for managing this.  These
    are what occurred to me (there are almost certainly more options):

    1. Don't do that.  We might concede in this case that seeking the
    incremental improvement to compression efficiency isn't worth the
    risk.  That is, we might make a general statement that this sort of
    inter-stream blocking is a bad idea.

    2. Force receivers to consume data or reset streams in the case of
    unfulfilled dependencies.  The former seems like it might be too much
    like magical thinking, in the sense that it requires that receivers
    conjure more memory up, but if the receiver were required to read Y
    and release the flow control credit, then all would be fine.  For
    instance, we could require that the receiver reset a stream if it
    couldn't read and handle data.  It seems like a bad arrangement
    though: you either have to allocate more memory than you would like or
    suffer the time and opportunity cost of having to do Y over.

    3. Create an exception for flow control.  This is what Google QUIC
    does for its headers stream.  Roberto observed that we could
    alternatively create a frame type that was excluded from flow control.
    If this were used for data that had dependencies, then it would be
    impossible to deadlock.  It would be similarly difficult to account
    for memory allocation, though if it were possible to process on
    receipt, then this *might* work.  We'd have to do something to address
    out-of-order delivery though.  It's possible that the stream
    abstraction is not appropriate in this case.

    4. Block the problem at the source.  It was suggested that in cases
    where there is a potential dependency, then it can't be a problem if
    the transport refused to accept data that it didn't have flow control
    credit for.  Writes to the transport would consume flow control credit
    immediately.  That way applications would only be able to write X if
    there was a chance that it would be delivered.  Applications that have
    ordering requirements can ensure that Y is written after X is accepted
    by the transport and thereby avoid the deadlock.  Writes might block
    rather than fail, if the API wasn't into the whole non-blocking I/O
    thing.  The transport might still have to buffer X for other reasons,
    like congestion control, but it can guarantee that flow control isn't
    going to block delivery.
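
Option 4 might look roughly like this from the application's side. It is a sketch only, assuming a non-blocking write call that consumes credit at write time; `CreditedTransport`, `try_write`, `on_credit`, and `write_in_order` are hypothetical names, not an existing API, and stream 3 / stream 7 follow the X / Y example above.

```python
class CreditedTransport:
    """Hypothetical transport whose writes consume flow control credit at once."""

    def __init__(self, credit):
        self.credit = credit
        self.outbox = []  # accepted (stream_id, data) pairs awaiting transmission

    def try_write(self, stream_id, data):
        """Accept `data` only if current credit covers all of it."""
        if len(data) > self.credit:
            return False  # refused; the caller retries once credit arrives
        self.credit -= len(data)
        self.outbox.append((stream_id, data))
        return True

    def on_credit(self, nbytes):
        """Record additional credit granted by the receiver."""
        self.credit += nbytes


def write_in_order(transport, x, y):
    """Write Y on stream 7 only after X on stream 3 has been accepted."""
    if not transport.try_write(3, x):
        return []
    written = [3]
    if transport.try_write(7, y):
        written.append(7)
    return written
```

Because Y is never handed to the transport before X has been accepted, Y cannot end up holding credit that X needs. A blocking variant would wait inside `try_write` instead of returning False.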


    ## My Preference

    Right now, I'm inclined toward option 4. Option 1 seems a little too
    much of a constraint.  Protocols create this sort of inter-dependency
    naturally.

    There's a certain purity in having the flow control exert back
    pressure all the way to the next layer up.  Not being able to build a
    transport with unconstrained writes is potentially creating
    undesirable externalities on transport users.  Now they have to worry
    about flow control as well.  Personally, I'm inclined to say that this
    is something that application protocols and their users should be
    exposed to.  We've seen with the JS streams API that it's valuable to
    have back pressure available at the application layer and also how it
    is possible to do that relatively elegantly.

    I'm almost certain that I haven't thought about all the potential
    alternatives.  I wonder if there isn't some experience with this
    problem in SCTP that might lend some insights.