Re: [DNSOP] [Ext] I-D Action: draft-ietf-dnsop-serve-stale-03.txt

> On 5 Mar 2019, at 2:59 pm, Paul Wouters <paul@nohats.ca> wrote:
> 
> On Tue, 5 Mar 2019, Mark Andrews wrote:
> 
>>> On Tuesday, 5 March 2019 02:21:42 UTC Christopher Morrow wrote:
>>>> can I ask, what happens when a domain is intentionally down though? For
>>>> instance, take .eg... ~4yrs back? (maybe 5?) Someone requested that the
>>>> master/shadow NS go down, hard. All public auth servers eventually (in a
>>>> day or so) went dark too.
> 
> 	"If the recursive server is unable to contact the
>         authoritative servers for a query "
> 
> So make the DNS server reachable, but return ServFail or NXDOMAIN. If
> the owner doesn't cooperate and there is legal standing, talk to the
> parent to do this for the delegation.
> 
> I don't think this draft stops a domain from being brought down
> intentionally.
> 
>>> i already raised that question, very far up-thread. got no answer.
>>> 
>>>> If someone is 'ordered' to make a zone dark, there may be reasons for that
>>>> action, and real penalties if the request is not honored.
>>>> Is this draft suggesting that the DNS operations folk go against the wishes
>>>> of the domain owner by keeping a domain alive after the auth servers have
>>>> become unreachable? How would a recursive resolver know that the auth is
>>>> down: "By mistake" vs: "By design" ?
> 
> The DNS resolvers who want to accommodate their governments need to
> manually override their resolvers anyway with new (forged) data. This
> draft does not change that.
> 
> If the owner itself wants to bring the domain down, they just need to
> make its auth servers reachable.
> 
> If the DNS hoster wants to bring it down, they just need to modify the
> data it serves resulting in NXDOMAIN, ServFail or 127.0.0.1 A records.
> 
>>> this is the essence of the argument against utility for this entire proposal. no
>>> data should be served beyond its TTL unless some new leasing protocol is first
>>> defined, to obtain permission and to provide a cache invalidation method.
> 
> I don't really follow this reasoning. Are you saying that:
> 1) if the domain owner wants their domain to be reachable
> 2) and they have lost their auth servers due to a DDOS attack
> 3) they might prefer to be down over extending the TTL a bit
> 
> In the non-DDOS case, the auth server is reachable and none of the data
> is getting additional TTL added:
> 
>   Answers from authoritative servers that have a DNS Response Code of
>   either 0 (NOERROR) or 3 (NXDOMAIN) MUST be considered to have
>   refreshed the data at the resolver.  In particular, this means that
>   this method is not meant to protect against operator error at the
>   authoritative server that turns a name that is intended to be valid
>   into one that is non-existent, because there is no way for a resolver
>   to know intent.
> 
> Although perhaps it should also explicitly state this regarding
> ServFail?
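
To make the quoted rule concrete, here is a rough sketch of how a resolver
cache might apply it. The names (CacheEntry, handle_response) are made up
for illustration and are not from the draft or any implementation. The
draft text above only covers rcodes 0 and 3; treating ServFail and timeouts
as "does not refresh" is an assumption of this sketch, which is exactly the
point the draft could spell out.

```python
from dataclasses import dataclass
from typing import Dict, Optional

NOERROR = 0
SERVFAIL = 2
NXDOMAIN = 3

# Per the quoted draft text, only these rcodes refresh cached data.
REFRESHING_RCODES = {NOERROR, NXDOMAIN}


@dataclass
class CacheEntry:
    rcode: int
    data: Optional[str]   # None for a cached NXDOMAIN
    expires_at: float     # absolute time at which the TTL runs out


def handle_response(cache: Dict[str, CacheEntry], name: str,
                    rcode: Optional[int], data: Optional[str],
                    ttl: float, now: float) -> Optional[CacheEntry]:
    """Apply one upstream result; rcode=None models a timeout.

    NOERROR and NXDOMAIN overwrite the cache, so an intentional
    NXDOMAIN takes a name down. Anything else leaves the old entry
    in place (assumption: this includes SERVFAIL), where it may be
    served stale.
    """
    if rcode in REFRESHING_RCODES:
        entry = CacheEntry(rcode, data if rcode == NOERROR else None, now + ttl)
        cache[name] = entry
        return entry
    return cache.get(name)
```

With this shape, an operator who wants a name gone answers NXDOMAIN and the
stale entry is replaced; only silence (or, under the assumption above,
ServFail) leaves old data available.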
> 
>> And one can do that if we add 2 TTLs to each DNS record. One for total time to
>> live and one for freshness (old clients get this).  It will require EDNS to signal
>> that multiple TTLs are desired and are present in the response and may require
>> using the last DNS flag bit to move the OPT record in front of the question
>> section to make parsing easier (no trial and error).
>> 
>> Yes, this is radical but it will work and is incrementally deployable.
> 
> See above. It seems a bit overkill for a strange corner case. But even
> so, one could specify the maximum ever allowed TTL to be like 3 days? Which
> I think is kind of enforced by most DNS resolvers anyway?
> 
> Adding more TTLs would just make this more complicated and more error
> prone and not lead to reduced outage times which is the goal of this
> draft.

For those with long existing TTLs it would allow them to refresh the records
earlier allowing them to exist in caches longer during the DoS event.

For those with short existing TTLs it would allow them to control how long
the records exist past reachability of the DNS servers.  Most of the time
short-lived TTLs are there "just in case we need to change them" or they
are doing short term load balancing between a set of servers which are
actually at stable addresses.  This would allow those functions to continue
but keep the addresses in caches longer under error conditions.
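
As a rough illustration of the two-TTL idea, a cache could treat each record
as below. This is only a sketch: the record layout, the field names
(freshness_ttl, total_ttl) and the functions are hypothetical, and no wire
format or EDNS option is implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TwoTtlRecord:
    data: str
    stored_at: float      # time the record was last refreshed
    freshness_ttl: float  # what old clients see as "the" TTL
    total_ttl: float      # how long the owner permits the data to live at all


def classify(rec: TwoTtlRecord, now: float) -> str:
    """Return 'fresh', 'stale-usable' or 'expired' for a cached record."""
    age = now - rec.stored_at
    if age < rec.freshness_ttl:
        return "fresh"
    if age < rec.total_ttl:
        return "stale-usable"
    return "expired"


def answer(rec: TwoTtlRecord, now: float, auth_reachable: bool) -> Optional[str]:
    """Data to serve from cache, or None if the cache cannot answer.

    Past the freshness TTL the record is only served when the
    authoritative servers cannot be reached; past the total TTL it is
    never served.
    """
    state = classify(rec, now)
    if state == "fresh":
        return rec.data
    if state == "stale-usable" and not auth_reachable:
        return rec.data
    return None
```

For example, a record with a 60 second freshness TTL and a one day total TTL
still rebalances every minute in normal operation, but stays answerable from
cache for up to a day if the servers become unreachable.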

> I don't think the "4 years" is a realistic problem case.
> 
> I can see how people want to get a few hours or a few days of usage
> beyond the TTL to accommodate errors. Although, it is likely that
> moving up the error this way will also delay the error from being
> detected before the extra time has expired, and we are just moving
> the goal post with no effective gain. But in the case of a DDOS
> attack, the draft's feature is surely useful.

Monitoring will detect problems.

> Paul
> 

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742              INTERNET: marka@isc.org