Re: [DNSOP] New Version Notification for draft-pan-dnsop-swild-rr-type-00.txt

"Ralf Weber" <dns@fl1ger.de> Thu, 17 August 2017 15:16 UTC

From: Ralf Weber <dns@fl1ger.de>
To: Lanlan Pan <abbypan@gmail.com>
Cc: Ted Lemon <mellon@fugue.com>, dnsop WG <dnsop@ietf.org>
Date: Thu, 17 Aug 2017 11:16:20 -0400
Message-ID: <78BBBA80-9FAA-4F18-9EDC-F93A2790E226@fl1ger.de>
In-Reply-To: <CANLjSvWcscFQSH9KdmZPgOb9vC5immDoJZjG3msQB5TqfiHkDQ@mail.gmail.com>
References: <CANLjSvWFh0ER47=SFJB-3rkTJKT_OxcjKwcD9-DUkDDxJTo=+g@mail.gmail.com> <201708151341.v7FDfNqR039481@calcite.rhyolite.com> <CAPt1N1=2eFRBCHYptn6W=3ruFisN0xRcMQSPPakgZXnmsaTS5w@mail.gmail.com> <CANLjSvWkDTgqTg+fy2jZzfcaY7e1VWB11yiWMzO3MfcrCGVLSQ@mail.gmail.com> <FFA80661-78A3-40B8-8DBC-FE79E873BCAF@fugue.com> <CANLjSvWcscFQSH9KdmZPgOb9vC5immDoJZjG3msQB5TqfiHkDQ@mail.gmail.com>
MIME-Version: 1.0
X-Mailer: MailMate (1.9.6r5347)
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/ySxf5xYbNvBCskLgiBVqrW_XZak>
Subject: Re: [DNSOP] New Version Notification for draft-pan-dnsop-swild-rr-type-00.txt

Moin!

On 17 Aug 2017, at 0:09, Lanlan Pan wrote:
> Yes, I agree, in fact the *online cache rate* is small (0.12% queries), LRU
> & TTL works fine.
> SWILD not save many online cache size, because of the queries rate.
> And Temporary Domain Names/ All Names: 41.7% for 7 days statistics,  the
> rate can be about 10% for 1 day statistics. Because temporary domain names
> expire after TTL time.  Ralf has similar curious.
Since you mention me: with the data you supplied, it is highly unlikely that
many of these records will still be active in the cache, given the TTL and
how least recently used (LRU) eviction works.

> The problem is:
>
> 1)  cache size
> Recursives commonly cache "all queried domain in n days" for some
> SERVFAIL/TIMEOUT condition, which has been documented in
> https://tools.ietf.org/html/draft-tale-dnsop-serve-stale-01
That is not what the draft suggests, and it is not what the current
implementations of this or similar features do. They all rely on a cache
with a fixed size and, if a record that would normally have expired is
still in the cache, extend its lifetime when it is queried. The records you
mention are not queried again and thus would be gone anyway, because other
records that are queried more frequently would have overwritten them.
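
A minimal Python sketch of that behaviour, for illustration only. It is not
taken from the draft or from any real resolver; the class name, the
`upstream_ok` flag and the 30 second extension are all assumptions made for
the example.

```python
from collections import OrderedDict


class StaleCache:
    """Fixed-size LRU cache that can keep answering from an expired record.

    Hypothetical model: a record is only served stale if it is still in the
    cache, it is actually queried, and the authoritative server is failing.
    """

    def __init__(self, capacity, stale_extension=30):
        self.capacity = capacity
        self.stale_extension = stale_extension  # assumed value, in seconds
        self.entries = OrderedDict()  # name -> (data, expires_at)

    def put(self, name, data, ttl, now):
        self.entries[name] = (data, now + ttl)
        self.entries.move_to_end(name)
        while len(self.entries) > self.capacity:
            # Evict the least recently used record, expired or not.
            self.entries.popitem(last=False)

    def get(self, name, now, upstream_ok=True):
        entry = self.entries.get(name)
        if entry is None:
            return None  # never cached, or already overwritten by LRU
        data, expires_at = entry
        if now >= expires_at:
            if upstream_ok:
                return None  # caller re-resolves and calls put() again
            # Authoritative server is failing: extend the record's lifetime.
            self.entries[name] = (data, now + self.stale_extension)
        self.entries.move_to_end(name)  # mark as recently used
        return data
```

In this model a name that is queried once and never again gets no special
treatment: it simply moves toward the eviction end of the cache as other
names are asked for.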

There also is nearly no harm if these queries fail when the
authoritative server is not responding, as most of the queries you describe
are computer generated, not human generated. The draft above and similar
techniques were developed because of the twitter.com problem. Now I can
assure you that twitter.com will always be hot (asked at least every
couple of seconds) in a regular resolver at your ISP or a provider
of DNS services, and thus the expired record will probably still be
in the cache.

> The subdomain wildcards cache are needlessly,  we can use heuristics
> algorithm for deciding what to cache, or just use simple rule like "select
> domain which queies time > 5 in last n days".
> We can use SWILD to optimize it, not need to detecting, just remove items
> which SWILD marked, to save cost.
The cost of sending a query now and then is very low: resolvers do that all
the time, and the rate at which they have to do it is very low. However, to
actually save costs you would have to deploy your proposal on both the
authoritative servers that have that behaviour and the resolvers. Good luck
with that. I also assume some of the authorities are actually interested in
the queries, so they would not implement your proposal even if they could,
making the theoretical improvement of 0.12% even smaller.

> 2) cache miss
> All of temporary subdomain wildcards will encounter cache miss.
> Query xxx.foo.com,  then query yyy.foo.com, zzz.foo.com, ...
> We can use SWILD to optimize it,  only query xxx.foo.com for the first time
> and get SWILD, avoid to send yyy/zzz.foo.com queries to authoritative
> server.
See above.

> 3) DDoS risk
> The botnet ddos risk and defense is like NSEC aggressive wildcard, or NSEC
> unsigned.
> For example,  [0-9]+.qzone.qq.com is a popular SNS website in China, like
> facebook. If botnets send "popular website wildcards" to recursive,  the
> cache size of recursive will rise, recursive can not just simply remove
> them like some other random label attack.
PRSD (Pseudo Random Subdomain) attacks, as I call them, or waterfall attacks,
as others call them, usually ask for every subdomain only once (and these
botnets take great care to do this), so the record would be removed by the
least recently used (LRU) algorithm when other records that are used more
are queried.
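
The effect can be illustrated with a small simulation. This is only a sketch
under simplifying assumptions: a plain LRU cache with no TTLs, a made-up
traffic mix, and hypothetical names and parameters (`simulate`, `hot_every`).

```python
from collections import OrderedDict


def simulate(capacity, hot_names, attack_count, hot_every=3):
    """Replay once-only attack names mixed with repeatedly queried hot names.

    Returns the final cache contents (an OrderedDict, oldest entry first).
    """
    cache = OrderedDict()

    def touch(name):
        if name in cache:
            cache.move_to_end(name)  # cache hit: mark as recently used
        else:
            cache[name] = True  # cache miss: insert the new record
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used

    for i in range(attack_count):
        touch(f"{i}.qzone.qq.com")  # each attack name is asked exactly once
        if i % hot_every == 0:
            for name in hot_names:  # legitimate names keep being asked
                touch(name)
    return cache


cache = simulate(capacity=10, hot_names=["twitter.com", "example.org"],
                 attack_count=1000)
survivors = [name for name in cache if name.endswith(".qzone.qq.com")]
print(len(cache), len(survivors))
```

However many once-only names are sent, the number of them held at any moment
is bounded by the cache size, and names that keep being queried stay in.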

While these attacks on recursive resolvers can stress the recursive to
authoritative part of the resolver, there are techniques to limit the
exposure to clients. I gave a talk on that topic at the DNS-OARC 2015
Spring Workshop in Amsterdam:
	https://indico.dns-oarc.net/event/21/contribution/29
and the summary of it is that all major vendors of recursive resolvers
handle this. So again, while your solution would be one once universally
deployed, there are already solutions to the problem out there, so why do
another one?

So long
-Ralf