Re: [DNSOP] [dns-operations] dnsop-any-notimp violates the DNS standards

Paul Vixie <paul@redbarn.org> Wed, 18 March 2015 08:22 UTC

Message-ID: <550935CD.2050202@redbarn.org>
Date: Wed, 18 Mar 2015 01:22:37 -0700
From: Paul Vixie <paul@redbarn.org>
User-Agent: Postbox 3.0.11 (Windows/20140602)
MIME-Version: 1.0
To: Michael Sinatra <michael@brokendns.net>
References: <20150309110803.4516.qmail@cr.yp.to> <20150309151812.GA14897@xs.powerdns.com> <20150316142350.GB26918@xs.powerdns.com> <55075C41.9000208@brokendns.net> <13D58CB4-95BD-412B-A073-C95617E97BCE@redbarn.org> <55077A64.7050906@brokendns.net> <5507CA2B.5000206@redbarn.org> <55093346.2050005@brokendns.net>
In-Reply-To: <55093346.2050005@brokendns.net>
Archived-At: <http://mailarchive.ietf.org/arch/msg/dnsop/uh4CCnYsdvVrUUMAVtJA9GoJVAQ>
Cc: dnsop@ietf.org
Subject: Re: [DNSOP] [dns-operations] dnsop-any-notimp violates the DNS standards


> Michael Sinatra <mailto:michael@brokendns.net>
> Wednesday, March 18, 2015 1:11 AM
> On 03/16/15 23:31, Paul Vixie wrote:
> ...
>> this is an internet-affecting bug and i hope you report it as such.
>> RRsets are to be purged when the lowest TTL therein expires, but the
>> other RRsets sharing that name should not be affected. can i ask which
>> public facing dns service has mandated that the rest of us juggle their
>> chainsaws for them in this way? (this is even worse than the jerk who
>> decided that EDNS could use IP fragmentation -- but i think that guy has
>> apologized at least.)
>
> I know unbound did do this--I think it still does, but it's late and I
> haven't had a chance to test it.  The public recursive DNS service was
> google, and again, my tests six months ago showed that they were purging
> the entire ANY pseudo-RRset and requerying the authoritative servers.

let's assume for the moment that we're going to work together to find
any implementation of RDNS which purges all RRsets when one RRset
expires, and then use every means at our disposal to correct that
misbehaviour, and further, that we will be successful.
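The purge rule at issue (an RRset expires on its own TTL, and other RRsets at the same owner name are left alone) can be sketched as a cache keyed by (name, type). This is a minimal illustration, not any resolver's actual code; the class name `RRsetCache` and its methods are hypothetical, and records are modelled as plain (ttl, rdata) pairs.

```python
# Hypothetical sketch: each RRset is cached under its own (name, type) key,
# so expiry of one RRset never touches its neighbours at the same name.

class RRsetCache:
    def __init__(self):
        # (owner name, RR type) -> (absolute expiry time, list of rdata)
        self._store = {}

    def put(self, name, rtype, records, now):
        """Cache one RRset. `records` is a non-empty list of (ttl, rdata)."""
        # The whole RRset lives only as long as the lowest TTL inside it.
        lowest_ttl = min(ttl for ttl, _ in records)
        rdata = [r for _, r in records]
        self._store[(name.lower(), rtype)] = (now + lowest_ttl, rdata)

    def get(self, name, rtype, now):
        """Return cached rdata, or None on a miss or after expiry."""
        key = (name.lower(), rtype)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, rdata = entry
        if now >= expires_at:
            # Purge this RRset only; other types at this name stay cached.
            del self._store[key]
            return None
        return rdata


cache = RRsetCache()
cache.put("example.com", "A", [(30, "192.0.2.1")], now=0)
cache.put("example.com", "MX", [(3600, "10 mail.example.com.")], now=0)

a_after_expiry = cache.get("example.com", "A", now=31)    # A has expired
mx_after_expiry = cache.get("example.com", "MX", now=31)  # MX is still cached
```

With this layout, answering a later ANY query from cache would mean collecting whichever per-type entries are still live, rather than treating the ANY response as one unit that expires all at once.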
>> "hammered" sounds like a volume greater than that which would be
>> detected and strangled with DNS RRL. what am i missing here, that makes
>> this your problem, rather than the public recursive dns operator's problem?
>
> I think there are a few issues at play.  google and other public
> recursives will sometimes have multiple backend servers fetch a given RR
> when they receive a single query for that RR on one instance of, say,
> 8.8.8.8.  I am basing this both on observed behavior and on Geoff
> Huston's research (recently presented at NANOG).  And since nothing is
> cached, I get the same servers asking the same query over and over
> again.  Writ large, the result is that I end up with 1-2k of
> simultaneous TCP sessions, per server, per domain.  Nothing I can't
> handle, since usually only 2-3 of my domains are involved.  But it does
> mean that I have to tweak BIND's defaults, since the number of allowed
> simultaneous TCP sessions for that implementation is much lower.

and not unfairly so, given the unfortunate logic of RFC 1035's TCP
connection management text. however, for the purpose of this discussion,
i ask you to assume that you and i will jointly and severally discuss
this matter with google, so that afterward, their cache-miss servers
will not "pile on" in this way.

> Otherwise I deny legitimate clients, since the TCP limit is applied
> across the entire named process, without regard to QTYPE or anything else.

to be fair to the laws of physics, there's no way to refuse a TCP
connection based on the QNAME it has not had a chance to tell you yet.
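To make the ordering concrete: with DNS over TCP, the server sees the QNAME and QTYPE only after the connection has been accepted and the two-byte length prefix plus the message itself have been read. The following is a rough sketch of that parse using only the Python standard library; the function names are hypothetical, and it handles just the first question of a well-formed query.

```python
import io
import struct

def read_exact(stream, n):
    """Read exactly n bytes from a file-like object or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("short read")
    return data

def read_first_question(stream):
    """Return (qname, qtype, qclass) from one length-prefixed DNS message.

    `stream` stands in for an already-accepted TCP connection; nothing
    about the query is known before these reads complete.
    """
    (length,) = struct.unpack("!H", read_exact(stream, 2))
    msg = read_exact(stream, length)
    if len(msg) < 12:
        raise ValueError("message shorter than the DNS header")
    (qdcount,) = struct.unpack("!H", msg[4:6])
    if qdcount < 1:
        raise ValueError("no question in message")
    labels = []
    pos = 12  # the question section starts right after the 12-byte header
    while True:
        n = msg[pos]
        pos += 1
        if n == 0:
            break
        if n > 63:
            raise ValueError("unexpected label type in question")
        labels.append(msg[pos:pos + n].decode("ascii"))
        pos += n
    qtype, qclass = struct.unpack("!HH", msg[pos:pos + 4])
    return ".".join(labels) + ".", qtype, qclass

def build_tcp_query(qname, qtype):
    """Build a minimal length-prefixed query (for the demonstration only)."""
    header = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    name = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    ) + b"\x00"
    msg = header + name + struct.pack("!HH", qtype, 1)  # class IN = 1
    return struct.pack("!H", len(msg)) + msg

QTYPE_ANY = 255
wire = build_tcp_query("example.com", QTYPE_ANY)
question = read_first_question(io.BytesIO(wire))
```

Any per-QTYPE policy therefore has to run after a connection slot is already occupied, which is why a process-wide TCP limit cannot distinguish ANY traffic from anything else at accept time.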
>
> So if someone is good at figuring out where the rate limits lie, and
> what tuple(s) they're based on, they can try to sneak just under the
> radar of any public cache, not just google.  If they spread the tuple
> values out enough, they might have an effective attack.  I have no idea
> how debilitating it is, as I have no visibility into who the actual
> victim is in this case.
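The point about spreading tuple values can be illustrated with a toy per-key limiter. This is not the DNS RRL algorithm or any public resolver's actual limiter; the key shape, the single fixed window, and the numbers are assumptions chosen only to show the arithmetic.

```python
from collections import defaultdict

class PerKeyLimiter:
    """Toy limiter: at most `limit` queries per key within one window."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, key):
        if self.counts[key] >= self.limit:
            return False  # this key has used up its budget for the window
        self.counts[key] += 1
        return True

def simulate(limit, num_keys, queries_per_key):
    """Count how many queries pass when load is spread over num_keys keys."""
    limiter = PerKeyLimiter(limit)
    passed = 0
    for k in range(num_keys):
        key = ("client", k)  # hypothetical tuple the limiter is keyed on
        for _ in range(queries_per_key):
            if limiter.allow(key):
                passed += 1
    return passed

# The same 400 queries in both cases, only the spread differs.
concentrated = simulate(limit=5, num_keys=1, queries_per_key=400)
spread_out = simulate(limit=5, num_keys=100, queries_per_key=4)
```

In this toy model the concentrated sender gets 5 queries through, while the spread-out sender stays under the limit on every key and gets all 400 through.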

i think it's fair to say that there's no one bogey man here. we're going
to have to get the purge behaviour fixed wherever it's broken, and we're
going to have to get the refill behaviour fixed wherever it's broken,
and we're going to have to do this even while knowing all the while that
ANY is a zombie, dead but still walking around looking for a place to
lie down.
>
>> i think there's no saving ANY at this point. we're deciding how it dies
>> and where to bury it, that's all.
>>
>> however, you've provided an example of an ANY attack that can't be
>> trivially switched to TXT, so, thank you.
>
> I am pretty small-time, but given some of the domains I slave are viewed
> as useful for reflection attacks, I will usually see these odd things
> when they crop up.

yes. and it's great that you pay attention to, and share, your experiences.

-- 
Paul Vixie