RE: Questions about QUIC and Anycast

"Lubashev, Igor" <ilubashe@akamai.com> Tue, 11 December 2018 01:39 UTC

From: "Lubashev, Igor" <ilubashe@akamai.com>
To: Martin Thomson <martin.thomson@gmail.com>, "bill@herrin.us" <bill@herrin.us>
CC: QUIC WG <quic@ietf.org>
Subject: RE: Questions about QUIC and Anycast
Date: Tue, 11 Dec 2018 01:39:07 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/eEgrYcggYMgiRXjFO6YPor0-FdQ>

> The details of how to deal with PTB messages are in
> https://quicwg.org/base-drafts/draft-ietf-quic-transport.html#packet-size
> 
> We realize that there is a limitation here: the PTB message won't contain the
> connection ID selected by the server, so it can get misrouted.  Kazuho
> suggested using a dummy packet with a long header on PMTUD probes for
> that reason (those packets can be skipped and contain both connection IDs),
> but I don't think that we've agreed to formalize that technique, so there
> might be some privacy implications we haven't considered.

Bill, note that this problem is not unique to Anycast (assuming that the ICMP PTB landed in the correct anycast cluster).  Any load-balanced cluster that uses Connection IDs to route packets within the cluster will need to address this.

If the cluster uses a consistent hash of the packet's 4-tuple to route the initial packets, the ICMP message can be routed with the same hash, computed over the 4-tuple of the original packet quoted inside the ICMP payload.  Otherwise, one can keep state on the cluster's load balancer (active 4-tuples) or do a rate-limited broadcast of the ICMP message within the cluster.  Apart from keeping centralized state, none of these options guarantees success, but each improves the chances.
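
A minimal sketch of the hash-based option, assuming IPv4, UDP, and a hypothetical list of backend addresses. Rendezvous hashing stands in here for whatever consistent hash the cluster actually uses, and the function names are illustrative only. The key point is that the packet quoted in the ICMP message was sent by the server, so its source and destination must be swapped to reproduce the 4-tuple seen on inbound client packets.

```python
import hashlib
import socket
import struct

# Hypothetical backend pool; a real cluster would load this from configuration.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_backend(client_ip, client_port, server_ip, server_port, backends=BACKENDS):
    """Rendezvous hashing over the inbound 4-tuple (an assumed stand-in
    for the cluster's own consistent hash)."""
    key = f"{client_ip}:{client_port}>{server_ip}:{server_port}".encode()

    def weight(backend):
        return hashlib.sha256(key + b"|" + backend.encode()).digest()

    return max(backends, key=weight)


def four_tuple_from_icmp_quote(quote: bytes):
    """Extract the inbound 4-tuple from the original packet quoted in an ICMP PTB.

    `quote` is everything after the 8-byte ICMP header: the original IPv4
    header followed by at least the first 8 bytes of the original datagram.
    """
    if len(quote) < 20 or quote[0] >> 4 != 4:
        raise ValueError("not an IPv4 quote")
    ihl = (quote[0] & 0x0F) * 4          # IPv4 header length in bytes
    if ihl < 20 or len(quote) < ihl + 4:
        raise ValueError("quote too short for UDP ports")
    if quote[9] != 17:                   # IP protocol 17 is UDP
        raise ValueError("quoted packet is not UDP")
    src_ip = socket.inet_ntoa(quote[12:16])
    dst_ip = socket.inet_ntoa(quote[16:20])
    src_port, dst_port = struct.unpack("!HH", quote[ihl:ihl + 4])
    # The quoted packet travelled server -> client, so swap the roles.
    return dst_ip, dst_port, src_ip, src_port


def route_icmp_ptb(quote: bytes, backends=BACKENDS):
    """Send the ICMP PTB to the backend that inbound packets would reach."""
    return pick_backend(*four_tuple_from_icmp_quote(quote), backends=backends)
```

This only helps when data packets are themselves routed by the 4-tuple hash; once routing switches to Connection IDs, the ports in the quote are no longer enough.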

If the Server's Connection ID is critical to you, you can use a trick for PMTU probe packets: coalesce a dummy long-header packet with a short-header packet.  If a UDP datagram containing these coalesced QUIC packets gets through, the long-header packet will be ignored but the short-header packet will be ACKed (just make sure to include at least one ACK-required frame -- a single PING frame will do the trick).  If that UDP datagram elicits an ICMP PTB message, and that message quotes more of the original datagram than the 8 bytes required by the old IPv4 ICMP standard (enough to cover the long header), the ICMP PTB message will contain your Server's Connection ID.
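
A rough sketch of the receiving side of that trick: pulling both Connection IDs out of the bytes an ICMP PTB quotes after the UDP header. It assumes the long-header layout later published in RFC 9000 (one length byte before each Connection ID); the drafts current when this thread was written encoded the lengths differently, so treat the offsets as an assumption. The function name is hypothetical.

```python
def cids_from_quoted_payload(payload: bytes):
    """Return (dcid, scid) from a quoted QUIC long-header packet, or None.

    `payload` is the part of the ICMP quote that follows the 8-byte UDP
    header. None means the quote was truncated before the Connection IDs
    ended, or the first QUIC packet is not a long-header packet.
    """
    if len(payload) < 6 or not payload[0] & 0x80:   # high bit set = long header
        return None
    pos = 5                                         # first byte + 4-byte version
    dcid_len = payload[pos]
    pos += 1
    if dcid_len > 20 or len(payload) < pos + dcid_len + 1:
        return None
    dcid = payload[pos:pos + dcid_len]
    pos += dcid_len
    scid_len = payload[pos]
    pos += 1
    if scid_len > 20 or len(payload) < pos + scid_len:
        return None
    scid = payload[pos:pos + scid_len]
    return dcid, scid
```

For a probe sent by the server, the Source Connection ID is the one the server chose, so that is the value a load balancer would route on. With 8-byte Connection IDs the quote needs 23 bytes of UDP payload on top of the 8-byte UDP header, which is why a router that quotes only the old 8-byte minimum yields nothing useful here.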


> -----Original Message-----
> From: Martin Thomson <martin.thomson@gmail.com>
> Sent: Monday, December 10, 2018 7:41 PM
> To: bill@herrin.us
> Cc: QUIC WG <quic@ietf.org>
> Subject: Re: Questions about QUIC and Anycast
> 
> Thanks for the questions Bill,
> 
> I'll provide some short answers.  Others might be able to expand a little.
> 
> On Tue, Dec 11, 2018 at 11:18 AM William Herrin <bill@herrin.us> wrote:
> > Question 1: Did I miss anything? Do the QUIC authors envision any
> > additional ways of supporting Anycast-routed servers?
> 
> The two methods you describe are what I understand to be the primary
> mechanisms.  You might find connection ID-based routing (your second
> option) to be necessary, though, for the reasons discussed below...
> 
> > Question 2: Which packet contains the server preferred address?
> 
> In practice, taking some of the less-used exchanges (Retry, Version
> Negotiation, a TLS HelloRetryRequest) out of the picture, the server will
> often include the value in the first flight of packets it sends.
> However, that can be a couple of round trips away from the end of the
> handshake, because the handshake is multiple packets and might need to be
> repaired due to packet loss.
> 
> > Can the client start
> > trying to use the preferred address with its second packet or does it
> > have to wait?
> 
> The client has to wait, because - until the handshake is complete - it can't
> authenticate the preferred address value provided by the server.  Also,
> validation of the path to that address likely takes time.  That basically forces
> the server to use connection IDs to ensure continuity.  A server might rely on
> the route being stable for the duration of the handshake, though that would
> rely more on having timely termination of the connection; keeping in mind
> that our best mechanism, stateless reset, probably isn't viable at this point.
> 
> > Is it allowed to continue using the non-preferred address and, if so,
> > will it fail over to attempting the preferred address if its
> > communication fails in some way (no response or rejection from the
> > server at the non-preferred address)?
> 
> Yes, a client can choose to ignore the preferred address.  And yes, I believe
> that the working group understands the implications of this.
> If someone didn't and finds this distressing, yell.
> 
> > Question 3: What picks the connection ID? Does the API allow the
> > application software to influence the selection of connection ID in
> > any way?
> 
> The server picks connection IDs that the client puts in packets.  That selection
> really has to be done in consultation with load balancers or the folks
> managing routing configurations.
> 
> There's work on a protocol for managing this:
> https://tools.ietf.org/html/draft-duke-quic-load-balancers-03
> 
> > Question 4: Is the connection ID inside or outside the encrypted
> > portion of the packet?
> 
> The value is unencrypted, but it is authenticated.  So it can be read, but not
> changed.
> 
> > Question 5: Fragmentation needed / Packet Too Big messages originate
> > from an intermediate router that often has a different return path
> > than the route taken by packets from the endpoint. How is QUIC
> > intended to deal with packet too big messages which arrive at a
> > different node in the Anycast cluster than the one which sent the
> > overlarge packet?
> 
> The details of how to deal with PTB messages are in
> https://quicwg.org/base-drafts/draft-ietf-quic-transport.html#packet-size
> 
> We realize that there is a limitation here: the PTB message won't contain the
> connection ID selected by the server, so it can get misrouted.  Kazuho
> suggested using a dummy packet with a long header on PMTUD probes for
> that reason (those packets can be skipped and contain both connection IDs),
> but I don't think that we've agreed to formalize that technique, so there
> might be some privacy implications we haven't considered.