Re: Deadlocking in the transport

Martin Thomson <martin.thomson@gmail.com> Sun, 14 January 2018 22:41 UTC

From: Martin Thomson <martin.thomson@gmail.com>
Date: Mon, 15 Jan 2018 09:41:01 +1100
Message-ID: <CABkgnnU+9u3pqzN7QbowAktxwFwj2XqJDhVyB5h5XOszF1CuXg@mail.gmail.com>
Subject: Re: Deadlocking in the transport
To: Roberto Peon <fenix@fb.com>
Cc: QUIC WG <quic@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/RgQdac1cZLnezHmMrCYdUCGIOmA>

I am reminded again that intermediation is why we can't have nice things.

Yes, you are right, an intermediary cannot perfectly propagate flow
control state.  That means that even if either endpoint does what it
can to avoid a deadlock, the intermediary might end up creating that
deadlock anyway.  In that case, the most expedient solution is to
terminate the dependent stream and start over.
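
As a rough illustration of the failure and the recovery, here is a toy
model. It is a sketch only: ToyReceiver, its single shared window, and
the stream names "request" and "update" are all hypothetical and are
not taken from any QUIC implementation. It also assumes each stream
arrives in one piece. A dependent stream fills the window, the stream
it depends on cannot get through, and resetting the dependent stream
frees enough credit to start over.

```python
class ToyReceiver:
    """Hypothetical receiver with a single shared flow control window."""

    def __init__(self, window, depends_on):
        self.window = window          # bytes of credit still available
        self.depends_on = depends_on  # stream -> stream it must wait for
        self.buffered = {}            # stream -> bytes held, not yet consumed
        self.processed = set()        # streams that have been consumed

    def receive(self, stream, nbytes):
        """Accept as much as the window allows; return the bytes accepted."""
        accepted = min(nbytes, self.window)
        if accepted:
            self.window -= accepted
            self.buffered[stream] = self.buffered.get(stream, 0) + accepted
        return accepted

    def process(self):
        """Consume buffered streams whose prerequisite is already processed."""
        progressed = True
        while progressed:
            progressed = False
            for stream in list(self.buffered):
                needed = self.depends_on.get(stream)
                if needed is None or needed in self.processed:
                    # Consuming the data returns its credit to the window.
                    self.window += self.buffered.pop(stream)
                    self.processed.add(stream)
                    progressed = True

    def reset(self, stream):
        """Discard a stream, returning its buffered bytes to the window."""
        self.window += self.buffered.pop(stream, 0)


def demo():
    rx = ToyReceiver(window=10, depends_on={"request": "update"})
    rx.receive("request", 10)             # dependent stream fills the window
    rx.process()                          # nothing consumable: "update" missing
    stuck = rx.receive("update", 4) == 0  # prerequisite cannot get through
    rx.reset("request")                   # kill the dependent stream
    rx.receive("update", 4)               # prerequisite now fits
    rx.process()
    rx.receive("request", 10)             # start the request over
    rx.process()
    return stuck, sorted(rx.processed)


if __name__ == "__main__":
    print(demo())
```

Note that resetting "update" instead would free nothing here, because
all of the credit is held by the stream that is waiting on it.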

In part, it's the ignorance of the intermediary that causes this
particular problem.  If the intermediary were aware of the protocol
details then it might be able to recognize and avoid these situations,
but that seems a little too much to expect of intermediaries. Ruling
out the entire class of intermediary that operates purely at the
transport layer is extremely harsh.

What is more likely here is that we describe this situation, explain
that it is impossible to prevent in the presence of intermediation,
and explain how to kill the right streams in order to ensure forward
progress doesn't stall indefinitely.

I'd caution against an over-reliance on BLOCKED-style messages because
they will mark the blocked stream as the problem, where the best way to
proceed is to kill dependent streams.  That is, the blocked stream is
the one most likely to hold critical state; killing it might not
clear the blockage.  But again, identifying the right stream to kill
requires a degree of additional sophistication, effectively tracking
the dependency graph.  What seems most likely - at least in the hq
case - is that requests will have timers and those are what will
get killed.  That works better if compression updates are on separate
streams from requests, but the other mechanisms work reasonably well
too.
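
To make "tracking the dependency graph" a little more concrete, here
is a minimal sketch. The DependencyTracker class, its method names,
and the stream numbers are assumptions made up for illustration; how
an application actually learns about dependencies is out of scope.
Given the stream reported as blocked, it lists the streams that depend
on it, directly or transitively, which are the candidates to kill.

```python
from collections import defaultdict, deque


class DependencyTracker:
    """Hypothetical record of which streams wait on which other streams."""

    def __init__(self):
        # prerequisite stream -> set of streams that depend on it
        self._dependents = defaultdict(set)

    def add_dependency(self, dependent, prerequisite):
        """Record that `dependent` cannot be processed before `prerequisite`."""
        self._dependents[prerequisite].add(dependent)

    def streams_to_kill(self, blocked):
        """List streams that depend on `blocked`, nearest dependents first.

        The blocked stream itself is never returned: it is the one most
        likely to hold critical state.
        """
        seen = set()
        order = []
        queue = deque([blocked])
        while queue:
            current = queue.popleft()
            # .get() avoids adding empty entries to the defaultdict.
            for stream in sorted(self._dependents.get(current, ())):
                if stream != blocked and stream not in seen:
                    seen.add(stream)
                    order.append(stream)
                    queue.append(stream)
        return order


if __name__ == "__main__":
    tracker = DependencyTracker()
    tracker.add_dependency(4, 2)   # request 4 needs compression update 2
    tracker.add_dependency(8, 2)   # request 8 needs it too
    tracker.add_dependency(12, 8)  # stream 12 needs request 8
    print(tracker.streams_to_kill(2))
```

A timer per request is the cheaper approximation of the same thing: it
kills the dependents without needing to know the graph.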


On Sat, Jan 13, 2018 at 5:01 AM, Roberto Peon <fenix@fb.com> wrote:
> The transport *CANNOT* provide ordering guarantees w.r.t. prioritization because flows are end-to-end, whereas connections are point to point.
> Flow-control, being a property of flows, ends up being asserted end-to-end though the mechanism at each hop is point-to-point.
>
> Consider the ‘standard’ client->proxy->server topology.
> If the server asserts flow control, the proxy must respect it.
> As the client<->proxy hop is asynchronous from the proxy<->server hop in most circumstances with a loadbalancing (i.e. shared flow) proxy, the proxy cannot guarantee flow-control window will be available for sending to the server even when flow control window is available from the client<->proxy.
>
> Preemption/cancellation/retry, the addition of more resources, or finer management of resources can be used to resolve the (transitive) deadlock.
>
>
> The sub-case of compression may be solvable without solving the general case as compression is point-to-point.
> -=R
>
>
> On 1/12/18, 8:04 AM, "QUIC on behalf of Mirja Kühlewind" <quic-bounces@ietf.org on behalf of mirja.kuehlewind@tik.ee.ethz.ch> wrote:
>
>     I disagree. I think it can be good to provide the transport hints about dependencies in application data which then could be used by the transport to optimize the sent-out scheduling, but I don’t think the transport should provide guarantees for ordered processing between streams. For me the whole point of having streams in the transport is to resolve dependencies between independent data.
>
>     Therefore, I think it would probably be good to mention this deadlock problem in the main draft, but I don’t think we should add strict priorities as a transport feature to QUIC. There are several ways an application can avoid such a deadlock (avoid dependencies by using the same stream for dependent data, or wait until data has been sent before sending dependent data) or resolve it (read all streams and implement an additional application-layer buffer), and these should be explained in the applicability document.
>
>     My 2c,
>     Mirja
>
>
>     > On 10.01.2018 at 23:42, Martin Thomson <martin.thomson@gmail.com> wrote:
>     >
>     > On Thu, Jan 11, 2018 at 9:39 AM, Jana Iyengar <jri@google.com> wrote:
>     >> Yes, I think there's easy misinterpretation here -- something I realized
>     >> today as I've had conversations about this. What I meant was specifically at
>     >> the API between the app and the transport, at the sender. This is basically
>     >> saying that to avoid deadlock due to dependencies across streams, the
>     >> transport write API must allow the app to express strict priorities.
>     >
>     > Ahh, excellent, then I think at least we two are in agreement.  It
>     > seems like there is an emerging acceptance of the problem and proposed
>     > approach, I guess we might start considering how to address it.
>     >
>     > I think that this needs to be in the main spec.  Failing to document
>     > this sort of pitfall could be fatal.  Does anyone disagree?
>     >
>
>
>