Re: [v6ops] An Update to Happy Eyeballs

Erik Nygren <erik+ietf@nygren.org> Tue, 14 March 2017 02:02 UTC

In-Reply-To: <92EEB875-288D-4CF9-B81F-3B5C8EA49F53@apple.com>
References: <148899860042.20118.391380898590855642.idtracker@ietfa.amsl.com> <A609BABB-BDF2-4CCB-8452-F489C019748C@apple.com> <m1clvfj-0000FCC@stereo.hq.phicoh.net> <ABE752F6-895B-431C-9E94-E0CD2FDDB2E3@apple.com> <m1cmTQX-0000IcC@stereo.hq.phicoh.net> <92EEB875-288D-4CF9-B81F-3B5C8EA49F53@apple.com>
From: Erik Nygren <erik+ietf@nygren.org>
Date: Mon, 13 Mar 2017 22:02:46 -0400
Message-ID: <CAKC-DJjeUX1rRB_e99SGJS06RoFZ6E6A8Tpj0hPAvfS6+L+XWA@mail.gmail.com>
To: David Schinazi <dschinazi@apple.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/oTCg2qKKU4yzbaTFn9u2Sm98WP0>
Cc: IPv6 Operations <v6ops@ietf.org>, Erik Nygren - Work <nygren@akamai.com>
Subject: Re: [v6ops] An Update to Happy Eyeballs

It's great to see an updated version of this guidance!

> There is zero reason for making async DNS a MUST.

The bare minimum is that resolvers must do A and AAAA lookups in parallel.
The current behavior of some stacks is to do the AAAA and A lookups in
series.  This effectively means that adding IPv6 connectivity to a client
adds an extra RTT to almost all DNS lookups.
For example, see section 5 in:
https://www.akamai.com/us/en/multimedia/documents/technical-publication/a-case-for-faster-mobile-web-in-cellular-ipv6-networks.pdf
For clients such as mobile devices visiting sites with lots of hostnames,
this can be a very substantial performance hit.  This also shows up
in some RUM-based measurement reports, making IPv6 look slower
(due to clients with IPv6 spending more time doing DNS lookups before
doing page loads).

Doing the lookups in parallel but not waiting for both responses is better
than serial, but still has a perf hit (whatever the recovery time is, plus
an RTT) when the A or AAAA lookup packet is lost.
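
To make the "in parallel" point concrete, here is a minimal sketch in
Python.  The names resolve_family and resolve_both are made up for
illustration, and the injectable lookup parameter is only there so the
function can be exercised without a network.  This version waits for both
answers before returning; a fuller Happy Eyeballs implementation would act
on whichever answer arrives first.

```python
import asyncio
import socket


async def resolve_family(host, port, family):
    """Resolve one address family; return [] if the lookup fails."""
    loop = asyncio.get_running_loop()
    try:
        infos = await loop.getaddrinfo(
            host, port, family=family, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return [info[4][0] for info in infos]


async def resolve_both(host, port, lookup=resolve_family):
    """Start the AAAA and A lookups together so neither waits on the other."""
    v6, v4 = await asyncio.gather(
        lookup(host, port, socket.AF_INET6),
        lookup(host, port, socket.AF_INET),
    )
    return v6, v4
```

Something like asyncio.run(resolve_both("www.example.com", 443)) would
return a pair of address lists, with the total lookup time bounded by the
slower of the two queries rather than by their sum.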

Some additional comments/thoughts after reading through the -01 version:

* It would be good to add a section on failure cases NOT detected/mitigated
  by this form of Happy Eyeballs.  Even if not covering mitigations, it
  would still be good to discuss them for awareness.

    - In particular, PMTUD seems to be the most common, i.e., if there is
      a PMTUD issue between the client and server, then the connect will
      often succeed but the connection will fail to function.
      I think most large-scale IPv6 server operators (plus many of the
      small ones) have broken this at least once.
      One client-side mitigation for TCP might be for the client to offer
      a progressively smaller MSS as it retries different IPs within a
      protocol family.
      (I don't know if anyone does or has tried this?  There is the
      server-side PMTUD probing feature.)
      For UDP protocols, using full-frame packets for the equivalents of
      the SYN and the SYN-ACK (as QUIC does) seems like one approach to
      at least detect breakage early, although QUIC doesn't yet have a
      PMTUD mitigation solution AFAIK.

    - Servers that return different content for IPv6 vs IPv4 (e.g., "404
      not found" due to an unconfigured server on the IPv6 side).
      "Don't do this" as advice to server operators is likely the best
      way to fix it, as hacking around it on the client side is unhelpful.
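
As a rough illustration of the "progressively smaller MSS" idea, the sketch
below steps the offered MSS down toward the value implied by the IPv6
minimum MTU (1280 bytes, minus a 40-byte IPv6 header and a 20-byte TCP
header).  The starting value, the step size and the function names are
assumptions picked for the example, not something from the draft, and I
have not measured whether this helps in practice.

```python
import socket

IPV6_MIN_MTU = 1280   # minimum link MTU that IPv6 requires
IPV6_HEADER = 40      # fixed IPv6 header, bytes
TCP_HEADER = 20       # TCP header without options, bytes
MSS_FLOOR = IPV6_MIN_MTU - IPV6_HEADER - TCP_HEADER  # 1220


def mss_for_attempt(attempt, start_mss=1440, step=110, floor=MSS_FLOOR):
    """MSS to offer on the Nth attempt (0-based) within one address family.

    start_mss and step are illustrative assumptions.
    """
    return max(floor, start_mss - attempt * step)


def make_socket(attempt):
    """Create an IPv6 TCP socket that offers a reduced MSS, where supported."""
    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    # TCP_MAXSEG is not exposed on every platform, so guard the call.
    if hasattr(socket, "TCP_MAXSEG"):
        s.setsockopt(
            socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss_for_attempt(attempt)
        )
    return s
```

With these defaults, attempts 0, 1, 2 and 3 would offer 1440, 1330, 1220
and 1220 bytes respectively.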

* It may make sense to recommend some form of back-off in the retry
  timing.  Rather than a fixed value (e.g., 250ms), adding an increasing
  time value with some jitter into each retry may be safer in the cases
  of overloaded servers or a network connection whose latency is close
  to the retry time.  I've seen congestive failure and lack-of-progress
  scenarios from having a fixed retry timer.  For example, with servers
  that do FIFO queueing of connections to accept, if the queue wait
  becomes longer than the retry period then all clients fail to make
  forward progress and you reach congestive collapse.  The same can
  happen with links that become high-latency.
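
One possible shape for such a back-off, as a sketch only: start from the
250ms value, grow it geometrically up to a cap, and add symmetric jitter.
The growth factor, cap and jitter fraction below are assumptions chosen
for illustration, and attempt_delays is a made-up name.

```python
import random


def attempt_delays(base=0.250, factor=2.0, cap=2.0, jitter=0.25,
                   count=5, rng=random):
    """Yield the wait, in seconds, before each successive attempt.

    Each nominal delay is `factor` times the previous one, capped at
    `cap`, and perturbed by up to +/- `jitter` of its value so that
    clients do not retry in lock-step.
    """
    delay = base
    for _ in range(count):
        spread = delay * jitter
        yield delay + rng.uniform(-spread, spread)
        delay = min(cap, delay * factor)
```

With the defaults the nominal delays are 0.25s, 0.5s, 1s, 2s and 2s, each
varied by up to a quarter of its value.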

* It may be worth adding some guidance on reporting and visibility,
  though I'm not sure how.
  One of the big complaints against Happy Eyeballs is that it masks
  brokenness (latency spikes, but things keep working, so no one
  complains enough to fix the root cause).
  Having a recommendation that stacks or applications at least keep
  counters and telemetry on failures may at least make it more viable
  to debug.
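
For the counters, even something very small would help.  The sketch below
is hypothetical (AttemptStats and its outcome labels are invented for the
example): it tallies connection outcomes per address family so that a
rising IPv6 failure rate stays visible even when the fallback hides it
from users.

```python
from collections import Counter


class AttemptStats:
    """Per-family tallies of connection attempt outcomes."""

    def __init__(self):
        self.counts = Counter()

    def record(self, family, outcome):
        # family: e.g. "ipv6" or "ipv4"; outcome: "ok" or "failed"
        self.counts[(family, outcome)] += 1

    def failure_rate(self, family):
        ok = self.counts[(family, "ok")]
        failed = self.counts[(family, "failed")]
        total = ok + failed
        return failed / total if total else 0.0
```

A stack could call record() at the end of every attempt and export
failure_rate("ipv6") alongside its other metrics.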

* It would be good to provide guidance or a reminder around protocols that
  send data along with a SYN or an initial flight (e.g., TCP Fast Open /
  TFO and TLS 1.3 0-RTT).  In particular, you likely want to send this
  ONLY on one connection attempt (e.g., the IPv6 attempt?), as otherwise
  the operation may be executed twice by the server.  This may cause undue
  server load, and for apps/clients incorrectly using TFO or 0-RTT for
  non-idempotent operations it could cause duplicate actions.
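
A sketch of the "only on one attempt" rule, assuming the client already
has a sorted list of candidate addresses: mark the first attempt as
allowed to carry early data and leave every later attempt to do a plain
handshake.  Attempt and plan_attempts are hypothetical names used only
for this example.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    address: str
    send_early_data: bool


def plan_attempts(addresses):
    """Allow TFO / 0-RTT data only on the first attempt in the list."""
    return [Attempt(addr, i == 0) for i, addr in enumerate(addresses)]
```

If the first attempt loses the race, the winning connection simply sends
the request after its normal handshake, so the server never sees the
early data twice.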

Thanks!  Erik






On Sun, Mar 12, 2017 at 11:53 PM, David Schinazi <dschinazi@apple.com>
wrote:

> Hi everyone,
>
> Thanks a lot for the comments and feedback.
> We've incorporated them into -01, please let us know if they were properly
> addressed.
> https://www.ietf.org/internet-drafts/draft-pauly-v6ops-happy-eyeballs-update-01.txt
>
> Regards,
> David Schinazi
>
>
> On Mar 10, 2017, at 14:54, Philip Homburg <pch-v6ops-6@u-1.phicoh.com>
> wrote:
>
> In your letter dated Fri, 10 Mar 2017 09:29:55 -0800 you wrote:
>
> We can certainly soften some of the language to make it clear that if
> your system has no such option, you are not necessarily out of spec, but
> if such an option is available, we believe that it SHOULD indeed be used.
> This fits with the Happy Eyeballs paradigm: if I am waiting for one of
> the DNS responses to come back, I could have already made my connection
> in that time, getting the user the resource loaded more quickly.
>
>
> If the DNS requirements can be toned down to the point that an application
> can use getaddrinfo if that fits the application, then that's fine
> with me.
>
>
> _______________________________________________
> v6ops mailing list
> v6ops@ietf.org
> https://www.ietf.org/mailman/listinfo/v6ops