Re: Deadlocking in the transport

"Charles 'Buck' Krasic" <> Wed, 10 January 2018 17:47 UTC

From: "Charles 'Buck' Krasic" <>
Date: Wed, 10 Jan 2018 09:47:03 -0800
Message-ID: <>
Subject: Re: Deadlocking in the transport
To: Jana Iyengar <>
Cc: Martin Thomson <>, QUIC WG <>
List-Id: Main mailing list of the IETF QUIC working group <>

On Tue, Jan 9, 2018 at 10:49 PM, Jana Iyengar <> wrote:

> Martin,
> You are right that this isn't a new concern, and that this is worth noting
> somewhere, perhaps in the applicability/API doc.
> The crux of this issue is that there's structure in application data that
> the transport is unaware of. Specifically, there are dependencies among
> application data units that is opaque to the transport. Using the transport
> buffers as a part of handling these dependencies seems like a bad idea,
> especially since the transport is likely to make decisions about flow
> window updates based on rate at which data is consumed out of the receive
> buffer(s). GQUIC does this, and so does every respectable TCP receiver
> implementation.
> The SCTP API avoids this problem by not allowing the application to read
> specific stream data out of the socket buffers. The receiving app receives
> data that could belong to any stream and has to demux after reading out of
> the socket. (Note that SCTP does not have per-stream flow control, so the
> receive side here is more like SPDY/TCP, modulo HoL blocking at the
> transport.)
> Protocols that create inter-stream dependency should be able to express
> that in priorities down to the transport, which I believe is expected to be
> part of the API. I believe that handles this issue, doesn't it?

I thought priorities wouldn't solve the problem, but now I am not so sure.

Here's an example we've been using in the compression design group:

 o   Client sends requests A and B on streams A and B.
 o   Server receives the requests and sends responses A and B.
       o the compression scheme allows inter-stream dependencies, so B's
          compressed header ends up depending on A's.  There is a small
          prefix in each compressed header block that tells the client
          this.  The idea is that the client can read the prefix, but
          will not read the rest until the dependency is satisfied on
          its end.
       o suppose A's header was only partially written so far (this is
          what we assumed, but I'm not sure this makes sense if A has
          priority over B)
       o and that due to sending A's and B's (partial) responses the
          server is now connection-level flow control blocked.
 o   Before the client receives A's header, the client decides to cancel A
       o e.g. the resource A is no longer desired, perhaps because the
          user scrolled and A is no longer needed to render the on-screen
          area.
 o   Server receives STOP_SENDING/RST_STREAM on A and resets its side of A.
       o this will undo A's outstanding flow-control debt, but that will
          not be enough to resend A's full header (on another stream),
          since the adjustment to the connection level only corresponds
          to the partial header sent so far.
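
To make the accounting in that example concrete, here is a rough Python sketch. It is only a toy model under stated assumptions: the ConnectionWindow class, the byte sizes (100, 60, 70, 30) and the stream name "A2" are all made up for illustration and do not come from any QUIC implementation. It also assumes, as the example does, that resetting a stream hands back exactly the credit that stream had consumed.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionWindow:
    """Toy connection-level flow control: one byte budget shared by all streams."""
    limit: int
    charged: dict = field(default_factory=dict)  # stream name -> bytes charged

    def available(self) -> int:
        return self.limit - sum(self.charged.values())

    def send(self, stream: str, nbytes: int) -> int:
        """Send as many of nbytes as the window allows; return the bytes sent."""
        sent = min(nbytes, self.available())
        self.charged[stream] = self.charged.get(stream, 0) + sent
        return sent

    def reset(self, stream: str) -> int:
        """Reset a stream and hand back the credit it was holding (assumption)."""
        return self.charged.pop(stream, 0)


def run_scenario() -> dict:
    header_a = 60    # hypothetical size of A's full compressed header
    response_b = 70  # hypothetical size of B's response, which depends on A
    window = ConnectionWindow(limit=100)

    sent_a = window.send("A", 30)          # A's header is only partially written
    sent_b = window.send("B", response_b)  # B fills the rest of the window
    blocked = window.available() == 0      # server is now connection-level blocked

    # The client read B's prefix but will not consume the rest until A's header
    # arrives, so B's 70 bytes of credit stay pinned in the receive buffer.
    freed = window.reset("A")              # client cancels A; only 30 bytes return

    # The server tries to resend A's full header on another stream.
    resent = window.send("A2", header_a)
    return {
        "sent_a": sent_a,
        "sent_b": sent_b,
        "blocked": blocked,
        "freed": freed,
        "resent": resent,
        "header_complete": resent == header_a,
    }
```

With these made-up numbers the reset frees 30 bytes, the full header needs 60, and so the resend stalls again while B still cannot be consumed.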

I think priorities actually do help solve this, in the following sense:
they amount to a variation of option #4.  A write can occur even without
immediately consuming flow control, but the priority does ensure it will
get flow control when it needs it.
This isn't necessarily HQ priorities though.  A workable, but less
performant, variation might be to say that if the transport does buffer
writes, it must service buffered writes in a global FIFO order.
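
A minimal sketch of what the global FIFO variation could look like, in Python. This is an assumption-laden toy, not a proposal for a real API: the FifoTransport class and its write/add_credit methods are hypothetical names, and credit is modeled as a single connection-level byte count.

```python
from collections import deque


class FifoTransport:
    """Toy transport that buffers writes and spends credit in global write order."""

    def __init__(self, credit: int):
        self.credit = credit    # connection-level flow control credit, in bytes
        self.pending = deque()  # [stream, remaining_bytes] entries in write order
        self.sent = []          # (stream, nbytes) records in the order sent

    def write(self, stream: str, nbytes: int) -> None:
        """Accept a write without requiring credit up front; buffer it."""
        self.pending.append([stream, nbytes])
        self._drain()

    def add_credit(self, nbytes: int) -> None:
        """Called when the peer grants more flow control credit."""
        self.credit += nbytes
        self._drain()

    def _drain(self) -> None:
        # Only the oldest buffered write may consume credit, so a later write
        # (Y) can never be sent ahead of an earlier one (X) it depends on.
        while self.pending and self.credit > 0:
            entry = self.pending[0]
            n = min(entry[1], self.credit)
            self.credit -= n
            entry[1] -= n
            self.sent.append((entry[0], n))
            if entry[1] == 0:
                self.pending.popleft()
```

For example, with 10 bytes of initial credit, writing 30 bytes of X and then 5 bytes of Y sends only the first 10 bytes of X. Y stays buffered until X has been fully sent, so the receiver never ends up holding Y while X is starved of credit.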

With priorities, the part where A sends a partial header seems invalid.
The header compression does require that headers be encoded strictly
sequentially, so a partial header write doesn't make sense there.

The counterexample given was a proxy, but I'm not sure that it is
reasonable to guarantee that QUIC proxies can always be oblivious to
mapping-level semantics.  In the case of HTTP and header compression, it
seems especially counter-intuitive to me, since most cases will not have a
1:1 upstream-to-downstream connection topology.

> - jana
> On Tue, Jan 9, 2018 at 10:17 PM, Martin Thomson <>
> wrote:
>> Building a complex application protocol on top of QUIC continues to
>> produce surprises.
>> Today in the header compression design team meeting we discussed a
>> deadlocking issue that I think warrants sharing with the larger group.
>> This has implications for how people build a QUIC transport layer.  It
>> might need changes to the API that is exposed by that layer.
>> This isn't really that new, but I don't think we've properly addressed
>> the problem.
>> ## The Basic Problem
>> If a protocol creates a dependency between streams, there is a
>> potential for flow control to deadlock.
>> Say that I send X on stream 3 and Y on stream 7.  Processing Y
>> requires that X is processed first.
>> X cannot be sent due to flow control but Y is sent.  This is always
>> possible even if X is appropriately prioritized.  The receiver then
>> leaves Y in its receive buffer until X is received.
>> The receiver cannot give flow control credit for consuming Y because
>> it can't consume Y until X is sent.  But the sender needs flow control
>> credit to send X.  We are deadlocked.
>> It doesn't matter whether the stream or connection flow control is
>> causing the problem, either produces the same result.
>> (To give some background on this, we were considering a preface to
>> header blocks that identified the header table state that was
>> necessary to process the header block.  This would allow for
>> concurrent population of the header table and sending message that
>> depended on the header table state that is under construction.  A
>> receiver would read the identifier and then leave the remainder of the
>> header block in the receive buffer until the header table was ready.)
>> ## Options
>> It seems like there are a few decent options for managing this.  These
>> are what occurred to me (there are almost certainly more options):
>> 1. Don't do that.  We might concede in this case that seeking the
>> incremental improvement to compression efficiency isn't worth the
>> risk.  That is, we might make a general statement that this sort of
>> inter-stream blocking is a bad idea.
>> 2. Force receivers to consume data or reset streams in the case of
>> unfulfilled dependencies.  The former seems like it might be too much
>> like magical thinking, in the sense that it requires that receivers
>> conjure more memory up, but if the receiver were required to read Y
>> and release the flow control credit, then all would be fine.  For
>> instance, we could require that the receiver reset a stream if it
>> couldn't read and handle data.  It seems like a bad arrangement
>> though: you either have to allocate more memory than you would like or
>> suffer the time and opportunity cost of having to do Y over.
>> 3. Create an exception for flow control.  This is what Google QUIC
>> does for its headers stream.  Roberto observed that we could
>> alternatively create a frame type that was excluded from flow control.
>> If this were used for data that had dependencies, then it would be
>> impossible to deadlock.  It would be similarly difficult to account
>> for memory allocation, though if it were possible to process on
>> receipt, then this *might* work.  We'd have to do something to address
>> out-of-order delivery though.  It's possible that the stream
>> abstraction is not appropriate in this case.
>> 4. Block the problem at the source.  It was suggested that in cases
>> where there is a potential dependency, then it can't be a problem if
>> the transport refused to accept data that it didn't have flow control
>> credit for.  Writes to the transport would consume flow control credit
>> immediately.  That way applications would only be able to write X if
>> there was a chance that it would be delivered.  Applications that have
>> ordering requirements can ensure that Y is written after X is accepted
>> by the transport and thereby avoid the deadlock.  Writes might block
>> rather than fail, if the API wasn't into the whole non-blocking I/O
>> thing.  The transport might still have to buffer X for other reasons,
>> like congestion control, but it can guarantee that flow control isn't
>> going to block delivery.
>> ## My Preference
>> Right now, I'm inclined toward option 4. Option 1 seems a little too
>> much of a constraint.  Protocols create this sort of inter-dependency
>> naturally.
>> There's a certain purity in having the flow control exert back
>> pressure all the way to the next layer up.  Not being able to build a
>> transport with unconstrained writes is potentially creating
>> undesirable externalities on transport users.  Now they have to worry
>> about flow control as well.  Personally, I'm inclined to say that this
>> is something that application protocols and their users should be
>> exposed to.  We've seen with the JS streams API that it's valuable to
>> have back pressure available at the application layer and also how it
>> is possible to do that relatively elegantly.
>> I'm almost certain that I haven't thought about all the potential
>> alternatives.  I wonder if there isn't some experience with this
>> problem in SCTP that might lend some insights.

Charles 'Buck' Krasic | Software Engineer | | +1 (408)