Re: [DNSOP] What should ANAME-aware servers do when target records are verifiably missing?

Richard Gibson <richard.j.gibson@oracle.com> Fri, 12 April 2019 15:47 UTC

To: Matthew Pounsett <matt@conundrum.com>
Cc: Bob Harold <rharolde@umich.edu>, dnsop <dnsop@ietf.org>
References: <d8ccad4a-cd0c-4c97-b4d7-2099657351dc@oracle.com> <CA+nkc8BM+mfTBm3XyOaZUF5hMg23t9aSY4nq4Y4=BQ-sjcjkVg@mail.gmail.com> <25b38d21-c572-d782-6b35-a187fa0caae8@oracle.com> <CAAiTEH9Eg0oYw9HR9Ab5pYikFUvcbWXneF39_8xasp6tE9PpCA@mail.gmail.com> <516fda75-bb6e-67c6-cd52-0a5017bc889f@oracle.com> <CAAiTEH-ghaJB1XUm_3NJhzscH4ZTRHs34Rwndm40MVoD5FBGxw@mail.gmail.com>
From: Richard Gibson <richard.j.gibson@oracle.com>
Message-ID: <734743c3-6603-3d93-de58-e6cf347c783f@oracle.com>
Date: Fri, 12 Apr 2019 11:47:31 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1
MIME-Version: 1.0
In-Reply-To: <CAAiTEH-ghaJB1XUm_3NJhzscH4ZTRHs34Rwndm40MVoD5FBGxw@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------F9871F5950EA0B24457E54F3"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/G9dPABkL8o5dSHPphrjBCrdvB6E>
Subject: Re: [DNSOP] What should ANAME-aware servers do when target records are verifiably missing?

On 4/11/19 23:45, Matthew Pounsett wrote:
> On Thu, 11 Apr 2019 at 20:02, Richard Gibson
> <richard.j.gibson@oracle.com> wrote:
>> The first problem is for the owner of the ANAME-containing zone, for whom the upstream misconfiguration will result in downtime that is extended by caching to the MINIMUM value from their SOA, which in many cases will be one to three orders of magnitude greater than the TTL of the ANAME itself.
> [snip]
>
>> But this suffers from both of the problems I have been complaining about: the resolver does not necessarily have the zone SOA, possibly necessitating an inline lookup, and per RFC 2308 the negative response will be cached according to values from the SOA that are unrelated to and far exceed the TTL of the ANAME.
> Ah, I see the confusion.  You're using definitive statements such as
> "will" when what you actually mean is "may."   There's no specific
> mechanism that causes the client to cache the negative response "for
> one to three orders of magnitude greater than the TTL of the ANAME."
> And, the TTL of the SOA doesn't necessarily "far exceed" the TTL of
> the ANAME.  You're just presupposing that will be a common
> configuration?
I am indeed claiming that will be a common configuration, and I have 
access to data supporting that claim from existing use of Oracle+Dyn 
ALIAS functionality. Also, please note that those "will" statements are 
properly understood in the context of the examples they are describing.
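
To make the arithmetic concrete, here is a minimal sketch of the RFC 2308
negative-caching rule, under which a negative response is cached for the
lesser of the SOA record's TTL and the SOA MINIMUM field. The TTL values
are hypothetical and chosen only for illustration; they are not drawn from
the Oracle+Dyn data mentioned above, and the names `Soa` and
`negative_cache_ttl` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Soa:
    ttl: int      # TTL of the SOA record itself, in seconds
    minimum: int  # SOA MINIMUM field, in seconds


def negative_cache_ttl(soa: Soa) -> int:
    """Negative-cache lifetime per RFC 2308: min(SOA TTL, SOA MINIMUM)."""
    return min(soa.ttl, soa.minimum)


# Hypothetical configuration: a short-lived ANAME in a zone whose SOA
# carries a one-hour TTL and a one-hour MINIMUM.
aname_ttl = 30
soa = Soa(ttl=3600, minimum=3600)

neg_ttl = negative_cache_ttl(soa)
ratio = neg_ttl / aname_ttl
print(f"negative answer cached for {neg_ttl}s, {ratio:.0f}x the ANAME TTL")
```

With these assumed values the negative answer outlives the ANAME TTL by
roughly two orders of magnitude.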
> I think we're still talking about misconfigurations here, and
> designing a protocol around people who misconfigure their DNS at the
> expense of people who configure it properly seems like a bad path to
> take.
You're pretty much making my point... it is a bad path to design a 
protocol around people who misconfigure their [ANAME-targeted] DNS at 
the expense of people who configure [ANAMEs with static sibling records] 
properly.
>> Both of these problems can be addressed by allowing/recommending/requiring ANAME-aware servers to preserve ANAME siblings when resolution of ANAME targets results in NXDOMAIN or NODATA responses, rather than replacing them with an empty RRSet... which, to be honest, seems to be always-undesirable behavior anyway—if anyone can think of a scenario where it would be beneficial to dynamically remove ANAME siblings, please share it.
> Yes, I can think of a case where it would be beneficial to remove
> ANAME siblings: when the target of the ANAME is removed from the DNS.
That would take the ANAME-owning domain offline, rather than supporting 
it with its static A and/or AAAA records. How exactly is that beneficial?
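
The two behaviors being compared can be sketched as follows. This is an
illustration only, not text or an algorithm from the ANAME draft: the
names `TargetStatus` and `select_address_records` are hypothetical, and
the `preserve_on_missing` flag exists only to show both options side by
side.

```python
from enum import Enum
from typing import List


class TargetStatus(Enum):
    OK = "ok"              # target resolved to address records
    NXDOMAIN = "nxdomain"  # target name does not exist
    NODATA = "nodata"      # target exists but has no records of this type


def select_address_records(
    static_siblings: List[str],
    target_status: TargetStatus,
    target_records: List[str],
    preserve_on_missing: bool = True,
) -> List[str]:
    """Choose the address records served at the ANAME owner name.

    preserve_on_missing=True models the proposal to keep the static
    siblings when the target is verifiably missing; False models
    replacing them with an empty RRSet.
    """
    if target_status is TargetStatus.OK and target_records:
        # Normal case: the target's records replace the siblings.
        return list(target_records)
    if preserve_on_missing:
        # Target missing: fall back to the zone's own static records.
        return list(static_siblings)
    # Target missing: siblings are dropped and the name goes dark.
    return []


siblings = ["192.0.2.10"]  # documentation-range address, hypothetical
print(select_address_records(siblings, TargetStatus.NXDOMAIN, []))
print(select_address_records(siblings, TargetStatus.NXDOMAIN, [],
                             preserve_on_missing=False))
```

In the second call the owner name ends up with no address records at
all, which is the outage being objected to here.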
>> In such a configuration, the owner of the ANAME will be able to see that clients are using its static sibling records rather than its target (and therefore that they are getting no benefit from the ANAME), and can react accordingly. If your concern is excess queries for the ANAME target, then this seems no different from e.g. CNAME—the owner of the target can issue long-lived negative responses while performing whatever other exploration and/or mitigation they deem fit.
> If the target of the ANAME disappears, the owner of the ANAME will
> similarly be able to recognize the problem and deal with it.  If the
> administrator of the name owning the ANAME is concerned about downtime
> due to misconfigurations by the target, then that administrator can
> either select a different target (presumably by selecting a different
> service provider)
It seems awfully insensitive to bake into a protocol an unnecessary 
requirement that its users make an all-or-nothing commitment to external 
service providers.
> or set their TTLs appropriately to not be subject to
> the potential issue you identified above.
At the unnecessary expense of reducing cache lifetime (and therefore 
increasing undesired traffic) for /all/ negative responses, rather than 
just those associated with the ANAME.
> However, if the spec requires preserving the target in the DNS despite
> the administrator of the target zone removing it, then that is a path
> for abuse by the administrator of the zone containing the ANAME, and
> the owner of the target will have no recourse.  This is what I meant
> by my reference to serve-stale.
The spec requires nothing like "preserving the target in the DNS", with 
or without my proposed changes. The abuse path you mention is already 
present with CNAME, and mitigable by owners of ANAME targets in exactly 
the same way—increase negative caching TTL (and unlike the above, this 
is a scenario where it /would/ make sense to increase it broadly rather 
than for specific RRSets).