Re: [DNSOP] 5011-security-considerations and the safetyMargin

Michael StJohns <> Fri, 17 November 2017 08:55 UTC

To: Bob Harold <>
Cc: IETF DNSOP WG <>, Matthijs Mekking <>

1-something (some variant of Option 1).  I also believe it should scale with
the size of the subscriber base in some way.  Basically, this should answer
the question: “Given a set of subscribers of size N, a per-request failure
rate of f, and a retry time of R, how long do you have to wait until P% of
the subscribers have had a successful query?”

So with P = 99.99%, R being the fast retry time, f being (say) 5%, and N
being 10M, it should be possible to calculate how long to wait.  This is
basically a cumulative distribution function.  When I get back to my laptop,
I’ll check in Excel and see if I can give you an idea of what this looks like.
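
A rough sketch of that calculation in Python follows. It is not the spreadsheet mentioned above, and it rests on assumptions that are not in this thread: each query fails independently with probability f, retries happen at a fixed interval R, and the first attempt happens at time zero. The function names and the one-hour value for R are illustrative only. Two readings of the question are shown: the number of attempts until an expected fraction P of subscribers has succeeded (which does not depend on N), and the number of attempts until all N subscribers have succeeded with a given confidence (which grows with log N).

```python
import math


def attempts_for_fraction(p_target: float, f: float) -> int:
    """Smallest k such that the expected success fraction 1 - f**k >= p_target."""
    return math.ceil(math.log(1.0 - p_target) / math.log(f))


def attempts_for_all(n: int, f: float, confidence: float) -> int:
    """Smallest k such that (1 - f**k)**n >= confidence, i.e. all n succeed."""
    # 1 - confidence**(1/n), computed with expm1 to avoid precision loss for large n.
    allowed_miss = -math.expm1(math.log(confidence) / n)
    return math.ceil(math.log(allowed_miss) / math.log(f))


def wait_seconds(attempts: int, retry_seconds: float) -> float:
    """Time of the k-th attempt, assuming the first attempt is at t = 0."""
    return (attempts - 1) * retry_seconds


if __name__ == "__main__":
    N = 10_000_000   # subscribers (value from the message)
    F = 0.05         # per-request failure rate (value from the message)
    P = 0.9999       # target fraction (value from the message)
    R = 3600.0       # ASSUMED retry interval in seconds, for illustration only

    k = attempts_for_fraction(P, F)
    print("attempts for P:", k, "wait (s):", wait_seconds(k, R))
    print("expected stragglers after k attempts:", N * F ** k)

    k_all = attempts_for_all(N, F, 0.99)  # 0.99 confidence is an assumed choice
    print("attempts for all N:", k_all, "wait (s):", wait_seconds(k_all, R))
```

Under these assumptions, the second function makes the dependence on N explicit: the required number of attempts is roughly log(N) / log(1/f), so the wait is a logarithm of the population size multiplied by the retry time.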

That gives you a reasonable lower bound for the uptake AFTER dealing with
expiring old records, and represents N resolvers doing their last query to
validate the new trust anchor.

I estimated it as a log function times the retry time in an earlier email.

Later, mike

On Fri, Nov 17, 2017 at 13:27 Bob Harold <> wrote:

> On Thu, Nov 16, 2017 at 8:33 AM, Matthijs Mekking <>
> wrote:
>> Wes,
>> My preference is to include safetyMargin and have text to explain it
>> exists because of network delays etcetera, and also reference to RFC 5011's
>> retryTime. So that's some mix of 1B 1C or 2B I guess :)
>> Best regards,
>>   Matthijs
>> On 15-11-17 02:49, Wes Hardaker wrote:
>>> The discussion has been long with respect to the safetyMargin in the
>>> 5011 security considerations document.  There hasn't been a huge
>>> conclusion and many different ideas have been floated by, and we're now
>>> at the point where we need to pick between them.  Below, I try to lay
>>> out the primary and sub-options available based on discussions so far.
>>> Please provide your opinions on your preference so I can wrap up this
>>> draft.
>>> Background: This document is not intended to provide operational
>>> guidance on what you SHOULD do.  It's intended to draw the timing line
>>> below which you MUST NOT venture.  The safetyMargin was introduced to
>>> prevent race conditions based on network delays, eg, which can mean that
>>> a RFC5011 Resolver operating at the same time as a PEP Publisher making
>>> a change at exactly at the minimum addWaitTime or addWallClockTime
>>> values would lead to a failure.  So the primary question today is "how
>>> do we want to deal with this issue of real-world speed-of-light and
>>> other issues?".  To complicate this a bit further, packets are never
>>> guaranteed to be delivered and network losses can entirely prevent a
>>> 5011 Resolver from succeeding at all for a given operation.
>>> Option 1.  Include a safetyMargin of some value.
>>>             1A) safetyMargin = MAX(2*TTL, 1.5Hr)   -- current draft
>>>             1B) safetyMargin = something based on the retryTime,
>>>                 (an example solution was suggested by MSJ)
>>>             1C) Your value here
>>> Option 2.  Don't include a safetyMargin
>>>             2A) Just ignore the issues entirely
>>>             2B) Explain that this document does not cover operational
>>>                 complexities like retries (already in the -07 version),
>>>                 network delays and other operational issues.
>>> After thinking about this for far far too long, I've now switched my own
>>> opinion to that of 2C for the principal reason that this is the
>>> line-in-the-sand document, and to be honest people should be using
>>> values much larger than this, per MSJ's guidance on how 5011 should be
>>> used.  Thus, it makes more sense to define this as the "MUST NOT go
>>> below this line" without trying to be precise about a value that can
>>> never be perfectly accurate, by definition.
>>> But, forget my opinion.  What's yours?  If nothing else, pick one of the
>>> [12][ABC] options above please, even without any text defining why you
>>> think so (until someone pokes you).
> I prefer 1A since the reasoning is well documented.
> or  MAX(1A,1B), but that is more complex for little gain.
> --
> Bob Harold