[DNSOP] state management related to TTL

Paul Vixie <paul@redbarn.org> Wed, 15 November 2017 06:44 UTC

Message-ID: <5A0BE237.5090104@redbarn.org>
Date: Tue, 14 Nov 2017 22:44:07 -0800
From: Paul Vixie <paul@redbarn.org>
User-Agent: Postbox 5.0.20 (Windows/20171012)
MIME-Version: 1.0
To: "dnsop@ietf.org" <dnsop@ietf.org>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/zRuuXkwmklMHFvl_Qqzn2N0SOGY>
Subject: [DNSOP] state management related to TTL
Precedence: list

tonight's exchanges here related to "use-stale" seem discordant to me. 
i'd like to play the straight man for a moment and ask some indulgent 
person to bring me up to speed by way of correcting my impressions.

the DNS TTL field is a state management variable. in this case the held 
state is in the form of cached RRsets, and the TTL associated with an 
RRset describes the period of time during which it can be reused. by 
the original DNS specifications, after this reuse period, the RRset 
is to be discarded, and if the data is still needed, it is re-fetched.

in practice, TTL expiry is often not discovered until the records are 
about to be reused; this avoids the cpu and memory bandwidth costs of 
sweeping the cache periodically in search of expiration-ready RRsets, 
and avoids the additional state requirements of threading these RRsets 
by TTL in addition to the standard cost of threading them by recency of 
use (to facilitate LRU based purge when the cache reaches its limit.) 
what this practice leads to is a "sudden concurrent need" for the RRset 
at the precise moment when it is being discarded.
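
a minimal sketch of the lazy-expiry practice described above, in python. 
the class name LazyCache, its methods, and the caller-supplied "now" 
clock are all invented for illustration and are not taken from any real 
resolver. entries are threaded by recency of use only, and TTL is 
examined only when a lookup happens:

```python
from collections import OrderedDict


class LazyCache:
    """Toy RRset cache: threaded by recency of use, TTL checked only on lookup."""

    def __init__(self, capacity):
        self.capacity = capacity
        # key -> (rrset, expires_at); least recently used entry comes first
        self.entries = OrderedDict()

    def put(self, key, rrset, ttl, now):
        self.entries[key] = (rrset, now + ttl)
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # LRU purge at the size limit

    def get(self, key, now):
        hit = self.entries.get(key)
        if hit is None:
            return None
        rrset, expires_at = hit
        if now >= expires_at:
            # expiry is only discovered here, at the moment of need
            del self.entries[key]
            return None
        self.entries.move_to_end(key)
        return rrset
```

note that an expired entry stays in memory until someone asks for it, 
which is exactly when its absence hurts most.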

in order to avoid simultaneous "not having" and "great need", some RDNS 
servers do in fact sweep their caches or perhaps thread their RRsets by 
TTL expiration, in order to pre-launch a refreshment query when the TTL 
still has some fraction (like 5%) or period (like one minute) remaining. 
this is non-ideal since we often find that we're refreshing data that 
will not be used soon or perhaps ever. work is underway by several teams 
to find a "tuning set" of variables and thresholds which will better 
predict reuse in order to avoid refresh costs for non-reuse.
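
as a rough illustration of the two thresholds mentioned above, here is a 
python sketch of the refresh decision. the function name should_prefetch 
and its parameters are hypothetical, and the defaults simply echo the 
"5%" and "one minute" examples; real implementations may decide quite 
differently:

```python
def should_prefetch(original_ttl, remaining, min_fraction=0.05, min_period=None):
    """Decide whether to launch a refresh query before the TTL runs out.

    All times are in seconds. If min_period is given it is used as an
    absolute threshold; otherwise min_fraction of the original TTL is used.
    """
    if remaining <= 0:
        return False  # already expired: this is an ordinary re-fetch, not a prefetch
    if min_period is not None:
        return remaining <= min_period
    return remaining <= original_ttl * min_fraction
```

nothing in this test knows whether the RRset will ever be asked for 
again, which is the weakness the tuning work is trying to address.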

another method that's been deployed of avoiding simultaneous "don't 
have" with "great need" is to liberally reinterpret TTL such that RRsets 
can be reused beyond their explicit TTL lifetime, while their refresh 
queries proceed in the background. commonly, the authority servers 
responsible for answering these refresh events are down or unreachable 
at the time of most acute need. hence the term "serve stale", which 
indicates a state management method whereby stale (beyond its TTL) data 
is served for some period of time, measured in minutes or hours, until 
the authority server can be reached to either refresh the RRsets or 
verify that they have in fact disappeared.
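
a simplified python sketch of that behaviour follows. Entry, lookup, 
max_stale and the try_refresh callback are all hypothetical names. two 
simplifications are assumed: the refresh is attempted inline rather than 
in the background, and a "the data has disappeared" answer from the 
authority is not modelled:

```python
from dataclasses import dataclass


@dataclass
class Entry:
    rrset: list
    expires_at: float  # absolute time at which the TTL runs out


def lookup(entry, now, max_stale, try_refresh):
    """Return (rrset, status) for a cached entry.

    try_refresh() is a caller-supplied stand-in for querying the authority:
    it returns (rrset, ttl) on success or None if the authority is unreachable.
    """
    if now < entry.expires_at:
        return entry.rrset, "fresh"
    refreshed = try_refresh()
    if refreshed is not None:
        entry.rrset, ttl = refreshed
        entry.expires_at = now + ttl
        return entry.rrset, "refreshed"
    if now < entry.expires_at + max_stale:
        return entry.rrset, "stale"  # served beyond its TTL
    return None, "expired"
```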

the danger of TTL stretching is that reuse beyond TTL may cause RRsets 
that are in fact supposed to be unreachable to remain effectively 
reachable. examples include security-related takedown of criminal DNS 
servers or networks, or failover strategies where end systems will not 
try to reach their backup servers unless they cannot reach their primary 
servers, and the unreachability of those primary servers is hidden from 
them by TTL stretching. fundamentally, an RRset and its TTL are the 
property of the zone administrator, and it's controversial for any other 
party to use this data beyond its specified use parameters.

all of this trouble comes from DNS's use of a single state variable 
(TTL) to represent usability lifetime, rather than two such variables, 
one indicating the periodicity of refresh, the other indicating the 
periodicity of discard. many of us would like our data to be rechecked 
hourly by all caching servers who store it, but used for days or weeks 
if we become unreachable by some or all of those servers. using one 
variable for two purposes represents an inconvenient compromise which 
often provides "no right answer" as to setting. therefore an idealized 
solution would be to provide a second variable, and where that second 
variable is present, the meaning of the existing variable (TTL) could be 
subtly altered to support a two-variable setting.
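
to make the two-variable idea concrete, here is a small python sketch. 
the names cache_action, refresh_after and discard_after are invented for 
this example; no such fields exist in the DNS today:

```python
def cache_action(age, refresh_after, discard_after, authority_reachable):
    """What a cache might do with an RRset of the given age (all in seconds)."""
    if age < refresh_after:
        return "use"        # within the refresh interval: no recheck needed
    if authority_reachable:
        return "refresh"    # recheck is due and the authority can be asked
    if age < discard_after:
        return "use-stale"  # cannot recheck, but still within the discard limit
    return "discard"
```

with refresh_after=3600 and discard_after=604800, data would be rechecked 
hourly while the authority is reachable, and used for up to a week when 
it is not.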

therefore a "serve stale" team within IETF-DNSOP was convened, to try to 
standardize the methods and signal patterns necessary to extend the 
usability lifetime of records when their authority servers are not 
reachable at the time of normal TTL-based expiry. most of us recognize 
that TTLs will continue to be stretched no matter what changes are or 
are not made to the specification, and so we expect the resulting RFC to 
document current practice _without recommending it_ and to also document 
a new practice _with recommendations_ as to its proper uses.

there are hangups in signaling options due to the sloppy specification 
for EDNS, about which the author of EDNS0 feels just awful, believe me. 
however, we are all relatively sure that EDNS can be used to encode a 
desire for new state management behaviour, within the limitation that 
EDNS must first be signaled by the initiator before it can be answered 
by a responder, and we might wish it otherwise. that's why it was 
important to realize that if _any_ EDNS option is provided by an 
initiator, then _any_ EDNS option can be provided by a responder. in 
theory this means we could provide state management options in a 
response without having heard any state management options in a request 
-- so long as some form of EDNS was in fact used in the request. it's 
not yet clear that this evasive maneuver will be required, however.

the most straightforward signaling would be for an RD=0 initiator 
(normally a recursive DNS server) to ask some or all of its responders 
(normally authority servers) for permission to stretch the TTL. some 
responders will not answer this signal at all, some will say no, and 
some will say yes and give maximum tension values for the RRsets 
contained in the answer and authority sections -- but not for the 
additional section since that data might have a different authority 
server and may only be present as "glue". the new tension variable might 
be "maximum stretch interval" in which case the RRset's TTL _in this 
answer or authority section_ would be interpreted as a refresh interval. 
this system would allow gradual insertion of the new state management 
logic on an opportunistic basis. motivated authority and recursive 
server operators, which would include CDN operators who must perform 
both services perfectly, would be early adopters, and like ECS before 
it, the "hot" part of the community would be upgraded years earlier than 
the last outlier.
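
one possible reading of that signaling, sketched in python. no wire 
format exists for this, so the function name lifetimes, the max_stretch 
parameter, and the choice to measure the discard time as ttl plus 
max_stretch are all assumptions made for illustration:

```python
def lifetimes(ttl, section, max_stretch=None):
    """Return (refresh_after, discard_after) in seconds for one RRset.

    section is "answer", "authority" or "additional". max_stretch is None
    when the responder ignored the signal or said no; otherwise it is the
    granted maximum stretch interval.
    """
    if section == "additional" or max_stretch is None:
        return ttl, ttl  # classic behaviour: refresh and discard coincide
    return ttl, ttl + max_stretch  # TTL becomes the refresh interval
```

a responder that never answers the signal and one that says no are 
treated alike here: the TTL keeps its original meaning.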

no one has proposed any new signaling between the stub and the recursive, 
but it's possible that a stub may want a true TTL and so we might add 
signaling from the stub (as initiator) saying, don't stretch, or perhaps 
saying, if this is a stretched TTL, tell me so explicitly.

if this understanding isn't wrong or incomplete, then i fail to see why 
there would be any drama that would prevent the construction of a draft.

-- 
P Vixie
