Re: [v6ops] <draft-ietf-v6ops-464xlat-optimization-00.txt> - pre-(shepherd-writeup) review

Erik Nygren <erik+ietf@nygren.org> Wed, 15 July 2020 05:53 UTC

From: Erik Nygren <erik+ietf@nygren.org>
Date: Wed, 15 Jul 2020 01:53:16 -0400
Message-ID: <CAKC-DJi8re_SU1u8EehOxwxyXDaDaf1d2c_vO7T5SpBSRQFUvQ@mail.gmail.com>
To: Jen Linkova <furry13@gmail.com>
Cc: Jordi Palet Martínez <jordi.palet@theipv6company.com>, V6 Ops List <v6ops@ietf.org>, V6Ops Chairs <v6ops-chairs@ietf.org>, draft-ietf-v6ops-464xlat-optimization@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/yZUdmjdP3qIoEnCC8cY69MJaySc>
Subject: Re: [v6ops] <draft-ietf-v6ops-464xlat-optimization-00.txt> - pre-(shepherd-writeup) review

On Thu, Jul 9, 2020 at 11:58 PM Jen Linkova <furry13@gmail.com> wrote:

> OK, I"ve read the document and before I finish the write up I'd like
> to discuss a number of comments I have.
> My apologies for not raising them during the WGLC - I was not paying
> attention, got busy with work...
>

Prodded by Jen's response here I also re-read
draft-ietf-v6ops-464xlat-optimization-02.
(My apologies also for not raising these during WGLC -- similar excuses to
Jen...  ;-)

While this continues to close corner-cases, I'm still concerned
that there is significant possibility for breakage in a way that
will result in content that is now dual-stacked getting switched
to be IPv4-only.

It would be good to add some examples as to why
some of the invalidation logic needs to actually work
(and the risk of corner cases where it doesn't).
To that end, it may make sense to also make more
use of normative RFC 2119 language for clearly spelling out behaviors
where there are sharp edges.

Some example cases of breakage or significant risk of breakage:
* A site is IPv4-only (breaks for or denies all IPv6 users) but shares an A
record with a dual-stack site.  If this isn't detected properly, it will
break users who get sent over IPv6 to the IPv4-only site.
* Some CDNs use TLS SNI for demuxing IPv4 traffic but the IPv6 address for
demuxing IPv6 traffic (both to determine which cert to serve).  This means
that the IPv4:IPv6 associations are one to many and using the wrong one
will result in cert errors.
* Some CDNs may use cluster A for IPv4 and cluster B for IPv6 for one
customer/tenant, but then cluster A for IPv4 and cluster C for a different
customer/tenant.  A stale or invalid association will result in going to
the wrong cluster which may not be able to serve the request properly.
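
The SNI case above can be made concrete with a toy Python sketch.  The
hostnames, addresses and the one-entry-per-IPv4 table are all made up for
illustration; no real CDN layout or EAMT implementation is implied.

```python
# Hypothetical CDN layout: one shared IPv4 address (certificate picked by
# SNI) and one dedicated IPv6 address per site (certificate picked by address).
DNS = {
    "site-a.example": ("192.0.2.1", "2001:db8::a"),
    "site-b.example": ("192.0.2.1", "2001:db8::b"),
}
CERT_BY_IPV6 = {
    "2001:db8::a": "site-a.example",
    "2001:db8::b": "site-b.example",
}


def cert_served_over_ipv6(dst_ipv6):
    """Certificate the CDN presents when reached on this IPv6 address."""
    return CERT_BY_IPV6[dst_ipv6]


# A lookup for site-a creates the association 192.0.2.1 -> 2001:db8::a.
eamt = {}
ipv4, ipv6 = DNS["site-a.example"]
eamt[ipv4] = ipv6

# A client that wants site-b connects to the same IPv4 address and is
# translated using the association that was learned for site-a.
wanted = "site-b.example"
dst_ipv4, _ = DNS[wanted]
served = cert_served_over_ipv6(eamt[dst_ipv4])
print(wanted, "got certificate for", served)
```

Unless the association is invalidated before the second client connects,
the request for site-b is answered with the site-a certificate.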

While the recent revisions help some, there are still a bunch of cases
where things could still break.  I fear that when they do break it's
not something that content providers or CDNs or their customers will
debug and the result will be for content to get switched back
to IPv4-only (which is not what we want happening, but will
appear to "fix" issues here).

I think there may also be some significant security issues not called out
that are independent of the DNS spoofing issue.

Another area not discussed is clients caching DNS lookups well past
their TTL.  This is fairly common in some clients (eg, some Java versions).
An entry could expire from the EAMT while such a client keeps using the
cached address, which then gets matched against a new association that
should have been invalid but was never detected as such.

Some other specific concerns on parts of the doc:

> 5.2.1 Detection of IPv4-only hosts or applications
> It needs to be remarked that, if the detection of the IPv4-only
> device or application is done incorrectly (either not detecting it
> or by a false detection), no harm is caused.

I'm not sure that "no harm is caused" is always the case.
There may be harm in cases such as a client no longer being
able to fall back to the IPv4 address it would have used
in a Happy Eyeballs setup.  (ie, connectivity is discussed extensively
here, but application-level breakage due to some of the issues
mentioned above may not be possible to recover from.)

=========

> 5.2.1  Detection of IPv4-only hosts or applications

Will Android doing synchronous A/AAAA lookups break this
detection?  And does this impact other things like STBs built on
Android?  Android does one of the A or AAAA lookups, waits for it
to return, then does the other.  From the perspective of the CPE resolver,
this could look like an IPv4-only host, so would it just always introduce
the 50ms delay?

Some more details may be needed here on the timing heuristics needed
to detect various scenarios like this.
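
A minimal Python sketch of the concern, assuming (this is an assumption,
not the draft's algorithm) that the CPE resolver classifies a client as
IPv4-only when no AAAA query follows its A query within a fixed window.
The function name and the query timestamps are hypothetical.

```python
AAAA_WAIT_WINDOW = 0.050  # seconds; the 50ms delay mentioned above


def looks_ipv4_only(a_query_time, aaaa_query_time, window=AAAA_WAIT_WINDOW):
    """Naive heuristic: IPv4-only if no AAAA query arrives within `window`
    seconds of the A query (aaaa_query_time is None if never seen)."""
    if aaaa_query_time is None:
        return True
    return aaaa_query_time - a_query_time > window


# Parallel lookups: the AAAA query arrives 2 ms after the A query.
print(looks_ipv4_only(0.000, 0.002))  # False

# Sequential lookups: the AAAA query is only sent once the A answer has
# come back, here 80 ms later because of upstream resolution time.
print(looks_ipv4_only(0.000, 0.080))  # True, a false detection
```

With a fixed window, a dual-stack client that serializes its lookups is
misclassified whenever the first answer takes longer than the window.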


=====

> 5.2.3
> 4. TTL
>    "In normal conditions the TTL for both A
>       and AAAA records, of a given FQDN, should be the same, so this
>       ensures a proper behavior if there is any DNS mismatch."

Actually if the resolver is a forwarding resolver talking
to the ISP's recursive resolver then it is likely the TTLs
returned will have degraded some and will be different.
For example, if a CDN is using a 60s TTL then it is likely
that the A RRset could have 12s remaining while the AAAA RRset
has 48s remaining.

Using the minimum of the two TTLs is likely better than assuming they will
match.
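
As a small Python sketch of that suggestion (the function name is
hypothetical):

```python
def eamt_lifetime(a_ttl: int, aaaa_ttl: int) -> int:
    """Lifetime in seconds for an EAMT entry built from one A/AAAA pair."""
    # Use the smaller remaining TTL so the association never outlives
    # either of the records it was derived from.
    return min(a_ttl, aaaa_ttl)


# The forwarding-resolver case above: a 60s TTL that has decayed
# differently for the two RRsets in the upstream cache.
print(eamt_lifetime(a_ttl=12, aaaa_ttl=48))  # 12
```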

It may make sense to keep track of EAMT records somewhat past their
TTL to handle clients caching things longer as well as race conditions.
For example, add a "stale" state which can convert to "invalid" but won't
get used for new connections and which entries stay in for some amount
of time past expiry.
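
One possible shape for such a "stale" state, as a Python sketch.  The
state names, the EamtEntry fields and the 300 second grace period are
assumptions for illustration and are not taken from the draft.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    VALID = "valid"      # may be used for new connections
    STALE = "stale"      # past expiry: kept only to detect mismatches
    INVALID = "invalid"  # a conflicting association was seen


@dataclass
class EamtEntry:
    ipv4: str
    ipv6: str
    expires_at: float  # absolute time in seconds
    state: State = State.VALID


STALE_GRACE = 300.0  # hypothetical grace period in seconds


def refresh(entry: EamtEntry, now: float) -> bool:
    """Update the state for `now`; return False once the entry can be dropped."""
    if entry.state is State.VALID and now >= entry.expires_at:
        entry.state = State.STALE
    return now < entry.expires_at + STALE_GRACE


def usable_for_new_connection(entry: EamtEntry, now: float) -> bool:
    """Only VALID entries may be used to translate new connections."""
    refresh(entry, now)
    return entry.state is State.VALID


def observe_mapping(entry: EamtEntry, ipv6: str, now: float) -> None:
    """A later lookup returned `ipv6` for the same IPv4 address."""
    refresh(entry, now)
    if ipv6 != entry.ipv6:
        # Whether valid or stale, a conflicting AAAA marks the entry invalid.
        entry.state = State.INVALID
```

Keeping the entry around in the stale state means a late conflicting
lookup can still flip it to invalid instead of silently creating a
fresh association.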

Nit: Is this really an "expiry time" rather than a "TTL"?
While mentioned later, it would be good to emphasize
that invalidating or expiring EAMT entries MUST NOT
break existing connections.  (ie, that the 5.2.4 behavior
is absolutely required.)  Implementations getting this wrong
would be really bad in terms of breaking dual-stack content
while IPv4-only content would keep working (ie, due to
the lack of associations).  Hurting reliability for dual-stack
content relative to IPv4-only content would be bad.

Also not covered here is clients caching A/IPv4 lookups for extended time.
For example, the EAMT entry may have expired while the client still has
the A record cached, and then another mismatching EAMT entry might get
created for a different FQDN.
=======

> 5.2.6.
> "The existing EAMT entry for 192.0.2.1 is set as invalid".

Clarify that:
* 5. Existing connections to 192.0.2.1 remapped to 2001:db8::a:b:c:d
  MUST NOT be broken.

=======

> 5.2.7.  Behavior in case of multiple A/AAAA RRs

Agreed with Jen's earlier comment that this needs more clarification.
Example algorithm:

* For each A record on the A RRset, create an EAMT entry with a randomly
  selected record from the AAAA RRset.

* Resolver should ideally randomly permute the A RRset prior to
  returning to the client
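
The example algorithm above, sketched in Python.  The function name and
return shape are hypothetical, and the empty-AAAA fallback is an added
assumption.

```python
import random


def build_eamt_entries(a_rrset, aaaa_rrset, rng=random):
    """Pair every A record with a randomly chosen AAAA record.

    Returns (entries, shuffled_a): entries maps IPv4 -> IPv6 and
    shuffled_a is the A RRset in the order to hand back to the client.
    """
    if not aaaa_rrset:
        # Nothing to associate with: answer as plain IPv4.
        return {}, list(a_rrset)
    entries = {a: rng.choice(aaaa_rrset) for a in a_rrset}
    shuffled_a = list(a_rrset)
    rng.shuffle(shuffled_a)  # randomly permute the A RRset for the client
    return entries, shuffled_a


a = ["192.0.2.1", "192.0.2.2"]
aaaa = ["2001:db8::1", "2001:db8::2", "2001:db8::3"]
entries, order = build_eamt_entries(a, aaaa)
print(entries, order)
```

Passing a seeded `random.Random` instance as `rng` makes the pairing
reproducible for testing.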

==========

> 5.2.10 Behavior in case of Foreign DNS

How will the mismatches that should trigger invalidations
get detected in this case?

* Mismatches won't be detected, and so won't result in invalidations,
  if one device or app is doing lookups via this resolver but another is not.
  Constraining the EAMT entry to the IPv4 client IP could at least bound
  the damage to a single device, but not if some apps on it use different
  resolver schemes.

* I believe Google ships Chromecast to hardcode DNS lookups to 8.8.8.8
  in addition to also sometimes using the local resolver.  Some other STBs
  and mobile apps do the same.  This has significant potential for
  mismatches to not get detected and not get invalidated.
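
A minimal Python sketch of constraining entries to the client IP, as
floated in the first bullet above.  The class and method names are
hypothetical and the table is keyed on (client IPv4, destination IPv4).

```python
class ScopedEamt:
    """EAMT keyed by (client IPv4 address, destination IPv4 address)."""

    def __init__(self):
        self._table = {}

    def add(self, client_ip, dst_ipv4, dst_ipv6):
        """Record an association learned from a lookup made by client_ip."""
        self._table[(client_ip, dst_ipv4)] = dst_ipv6

    def translate(self, client_ip, dst_ipv4):
        """Return the IPv6 destination, or None to leave the flow on the
        normal (non-optimized) path."""
        return self._table.get((client_ip, dst_ipv4))


eamt = ScopedEamt()
eamt.add("192.168.1.10", "192.0.2.1", "2001:db8::a:b:c:d")
print(eamt.translate("192.168.1.10", "192.0.2.1"))  # 2001:db8::a:b:c:d
print(eamt.translate("192.168.1.20", "192.0.2.1"))  # None
```

A device that only uses foreign DNS never gets an entry under its own
address, so associations created by other devices do not apply to it.
This does nothing for a single device whose apps mix resolvers.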

================

> 5.2.13.  Troubleshooting Implications

How do corner-cases around collisions where EAMT entries
don't get properly invalidated get debugged by the ISP
(and content provider!)?

This is opaque to the CDN and could result in lots of escalations from
end-users and customers.  The net result will be for content to get
switched to IPv4-only if that "fixes" the problem which might impact
users world-wide.  (Neither the CDN nor their customer has any
visibility into what is going on here other than that some end users
may be reporting not being able to view dual-stacked content.)

============
General for 5.2.x:

* Should this doc use more normative language in places where warranted?
  (There are a bunch of MUST NOT sharp edges.)
  Especially in 5.2.4.

* Clarify that EAMT table updates MUST be made before A record
  is returned to the client (especially invalidations!).  Otherwise there
  is a potential race condition.

* What happens if the FQDNs mismatch but the A and AAAA RRsets are
  identical?  This is going to be very common as many CDNs use lots of
  names with a smaller number of IPv4 addresses.
  What if the A and AAAA RRsets overlap but aren't identical?
  (Perhaps the EAMT entry MUST be marked invalid if the FQDNs mismatch
  and the EAMT entry does not have both its A *and* AAAA values
  in the returned RRsets.  ie, all EAMT entries must be valid
  within the mismatching FQDN.)
  (May still be corner cases here I'm forgetting?)

* Should we have a recommended maximum cap/threshold on the TTL?
  (even if it's something like 8 hours...)  Otherwise bad entries
  might stick around too long.
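
Two of the points above (the FQDN mismatch rule and the TTL cap),
sketched in Python.  The function names are hypothetical.  One added
assumption: only entries whose IPv4 address appears in the new A RRset
are considered, since otherwise the new answer does not collide with
the entry at all.

```python
MAX_EAMT_LIFETIME = 8 * 3600  # seconds; the "something like 8 hours" figure


def capped_lifetime(ttl):
    """Bound how long any single association can live."""
    return min(ttl, MAX_EAMT_LIFETIME)


def must_invalidate(entry_fqdn, entry_ipv4, entry_ipv6,
                    new_fqdn, a_rrset, aaaa_rrset):
    """Apply the suggested mismatch rule to one existing EAMT entry."""
    if entry_fqdn == new_fqdn:
        return False  # same name: handled by the normal refresh path
    if entry_ipv4 not in a_rrset:
        return False  # the new answer does not touch this entry
    # A different FQDN shares the IPv4 address: the entry survives only
    # if its IPv6 side is also valid for the new name.
    return entry_ipv6 not in aaaa_rrset


# Identical RRsets under a different name: the entry can stay.
print(must_invalidate("a.example", "192.0.2.1", "2001:db8::1",
                      "b.example", ["192.0.2.1"], ["2001:db8::1"]))  # False
# Same IPv4 address but a different AAAA RRset: invalidate.
print(must_invalidate("a.example", "192.0.2.1", "2001:db8::1",
                      "b.example", ["192.0.2.1"], ["2001:db8::2"]))  # True
```

Per the race-condition bullet, a check like this would have to run, and
the table be updated, before the A record is returned to the client.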

==========

Additional security considerations:

(The first of these is minor --- the others seem like they make this too
dangerous to deploy.  It may be worth having a Security AD do some focused
review in this context.)

* Filling the EAMT table with very long TTL records.  (ie, triggering a
  client to do DNS lookups that create lots of long-lived associations)

* There is a serious risk of traffic hijacking by inserting a bogus binding
  (ie, return target's A record and malicious actor's AAAA record).
  In most cases these records should get marked as Invalid,
  but the corner cases here are fragile (eg, with foreign DNS or
  clients caching past the TTL mixed in here).
  An example attack here is that a client has an A record
  for "www.example.com" pointing to 192.0.2.1 cached past
  its TTL so the EAMT record is gone.  An attacker convinces
  another user in the same home network to do a DNS lookup
  that has an A record of 192.0.2.1 (which the attacker doesn't own)
  but a AAAA record for 2001:db8:eeee::eeee (which is attacker-controlled).
  Even if 192.0.2.1 is IPv4-only, if the EAMT entry has expired or never
  existed then the attacker can redirect all traffic to it to an IPv6
  address of their choice.  No DNS spoofing is needed here at all, just making
  a "claim" that the A/AAAA are associated (which has no authentication).
  This is potentially worse as I'd hypothesize that IPv4-only STB
  clients are much less likely to use HTTPS/TLS.

* This previous risk is particularly problematic for hijacking traffic from
  users/devices/clients using Foreign DNS since they won't create
  EAMT entries but will be exposed to clients getting tricked into creating
  them.
  (This could be especially problematic for users using DoH for privacy
  on one device, but then another device in their home allows
  them to be hijacked.  At some point Firefox preferred IPv4 in some
  cases when using DoH so this may be a real exposure
  / vulnerability risk.)
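
The hijacking example above, replayed against a deliberately naive toy
table in Python.  NaiveEamt and "attacker.example" are hypothetical; the
toy has no per-client scoping, stale state or invalidation, so it only
shows the unauthenticated-claim problem and not the draft's actual
behavior.

```python
class NaiveEamt:
    """Toy table that trusts any A/AAAA pair seen in a DNS answer."""

    def __init__(self):
        self.table = {}

    def on_dns_answer(self, fqdn, a_rrset, aaaa_rrset):
        # Create an association for every IPv4 address not already mapped.
        for a in a_rrset:
            if a not in self.table and aaaa_rrset:
                self.table[a] = aaaa_rrset[0]

    def route(self, dst_ipv4):
        """Destination actually used for traffic sent to dst_ipv4."""
        return self.table.get(dst_ipv4, dst_ipv4)


eamt = NaiveEamt()
# The victim still holds 192.0.2.1 for www.example.com in its own cache;
# the matching EAMT entry has expired (or never existed).
# A second device in the home resolves an attacker-owned name whose
# answer claims the target's A record and the attacker's AAAA record:
eamt.on_dns_answer("attacker.example",
                   ["192.0.2.1"], ["2001:db8:eeee::eeee"])
# The victim's next connection to 192.0.2.1 is now translated:
print(eamt.route("192.0.2.1"))  # 2001:db8:eeee::eeee
```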

Sorry again for not sending these earlier --- I think a bunch of these
concerns were raised in my previous feedback and while this version
has made significant improvements, I'm still very concerned that it will
be very risky from an operational fragility and security perspective.

     Erik