Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-08.txt

Wes Hardaker <wjhns1@hardakers.net> Tue, 12 December 2017 01:03 UTC

From: Wes Hardaker <wjhns1@hardakers.net>
To: Michael StJohns <msj@nthpermutation.com>
Cc: Wes Hardaker <wjhns1@hardakers.net>, dnsop@ietf.org
References: <151199364931.4845.3034001091375154653@ietfa.amsl.com> <yblvahshg6z.fsf@wu.hardakers.net> <9c71768d-4807-3d0a-b4b1-0ac8d066fe9f@nthpermutation.com> <yblindiavlm.fsf@w7.hardakers.net> <6d239b9a-fd1e-46a3-c705-6851dd8ffe0a@nthpermutation.com>
Date: Mon, 11 Dec 2017 17:03:09 -0800
In-Reply-To: <6d239b9a-fd1e-46a3-c705-6851dd8ffe0a@nthpermutation.com> (Michael StJohns's message of "Fri, 8 Dec 2017 00:43:36 -0500")
Message-ID: <ybl8te8kbaq.fsf@wu.hardakers.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/b4-REY66Td9nMvZpUrpDYpBtijY>

Michael StJohns <msj@nthpermutation.com> writes:

Hi Mike,

Thanks for explaining your thinking.  After reading it, I think we're
actually in agreement, but using different terms for where to put the
slop you're worried about.

Specifically:

> A perfectly operating resolver with perfect clock and perfect
> connectivity and no outages MIGHT possibly keep a perfect interval
> between each query it makes (making your activeRefreshOffset
> meaningful), but 10000 resolvers ALL keeping perfect intervals?

Yes, I agree.  But this is why I want the majority of the equation to
define the mathematically exact case.  Then, *after* that, we add the
operational slop factor (safetyMargin) to account for both problems and
reality (you forgot to add "speed of light issues" in your text above,
for example).

Thus, I break the equation into two critical parts:

addWallClockTime = lastSigExpirationTime
                   + addHoldDownTime
                   + activeRefresh                   ^
                   + activeRefreshOffset             |
                                                     |
Precise Math                                         |
-----------------------------------------------------|
Needed Fuzz                                          |
                                                     |
                   + safetyMargin                    |
                                                     v
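
As a rough sketch of that split (the function name, the use of seconds,
and the sample values are illustrative assumptions, not taken from the
draft), the precise bound and the padded bound can be computed
separately, so the slop never leaks into the exact part:

```python
DAY = 24 * 60 * 60  # seconds


def add_wall_clock_time(last_sig_expiration, add_hold_down, active_refresh,
                        active_refresh_offset, safety_margin):
    """Return (precise, padded) times in seconds."""
    # Precise math: what a perfectly timed resolver needs.
    precise = (last_sig_expiration + add_hold_down
               + active_refresh + active_refresh_offset)
    # Needed fuzz: operational slop, added only after the precise bound.
    padded = precise + safety_margin
    return precise, padded


# Hypothetical sample values, chosen only for illustration.
precise, padded = add_wall_clock_time(
    last_sig_expiration=10 * DAY,
    add_hold_down=30 * DAY,
    active_refresh=1 * DAY,
    active_refresh_offset=0,
    safety_margin=2 * DAY,
)
print(precise // DAY, padded // DAY)
```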


I.e., consider a perfect resolver querying an RFC 5011 zone with an
activeRefresh that divides evenly into 30 days.  It:

  1) queries at T--- = lastSigExpirationTime - .000001
  2) queries at T+1--- = lastSigExpirationTime - .000001 + activeRefresh
  3) notes that it just saw a new key (assuming the worst case, where #1 is replayed)
  4) starts timer
  5) will query again at lastSigExpirationTime + 30 days - .000001
  6) notes this is still in waiting period
  7) will query again at lastSigExpirationTime + 30 days - .000001 + activeRefresh
  8) now notes that it's been 30 days and accepts key

There is only one activeRefresh in that sequence, and that's what's in
the equation, because the time between #2 and #7 is still exactly 30
days when the queries are timed with sub-nanosecond precision.
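
The eight steps can be checked with a small simulation.  This is only a
sketch under stated assumptions: time is in whole seconds, epsilon=1
stands in for the ".000001" above, and perfect_resolver is a
hypothetical helper, not anything defined by RFC 5011 or the draft.

```python
DAY = 24 * 60 * 60  # seconds


def perfect_resolver(last_sig_expiration, add_hold_down, active_refresh,
                     epsilon=1):
    """Return (first_seen, accepted) query times for an idealized resolver."""
    t = last_sig_expiration - epsilon  # 1) last query that can still be replayed
    t += active_refresh                # 2) next query, 3) sees the new key
    first_seen = t                     # 4) hold-down timer starts here
    while t - first_seen < add_hold_down:
        t += active_refresh            # 5)-7) keep querying on schedule
    return first_seen, t               # 8) hold-down has elapsed, key accepted


# lastSigExpirationTime = 0, 30-day hold-down, 1-day activeRefresh.
first_seen, accepted = perfect_resolver(0, 30 * DAY, 1 * DAY)
gap = accepted - first_seen            # time between #2 and #7
extra_refreshes = (accepted + 1 - 30 * DAY) // DAY
print(gap // DAY, extra_refreshes)
```

With these inputs the gap between #2 and #7 comes out to exactly the
hold-down time, and acceptance lands one activeRefresh (less epsilon)
past lastSigExpirationTime + addHoldDownTime.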

On top of that come a bunch of real-world delays: network timeouts, etc.
That's where the safetyMargin should come in and catch all the issues
caused by the imprecision of the real world.  Now, if you want to add an
activeRefresh to the suggested safetyMargin term that is already
defined, I'm willing to consider that.  But for clarity of the security
analysis, it shouldn't be listed as part of anything but the slop term.

Would you like to add more time to the safetyMargin to deal with the
imperfect world, including clock drift caused by delays across a bunch
of back-to-back queries, or any other reason?


A closing note about the precise timeline: when 30 days isn't evenly
divisible by the activeRefresh, you need to add the other term we
haven't talked about much, the activeRefreshOffset, which accounts for
this case.
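
To illustrate why such a term is needed, here is a minimal sketch.  It
assumes the offset is the extra wait from hold-down expiry to the next
scheduled query; that definition is derived from the timeline in this
message, not quoted from the draft, and refresh_offset is a hypothetical
name.

```python
DAY = 24 * 60 * 60  # seconds


def refresh_offset(add_hold_down, active_refresh):
    """Extra wait until the first scheduled query at or after hold-down expiry.

    Assumed definition: queries happen every active_refresh seconds, so the
    resolver can only act on the next multiple of active_refresh.
    """
    # Python's % returns a non-negative result for a positive divisor, so
    # negating the hold-down gives the distance up to the next multiple.
    return (-add_hold_down) % active_refresh


print(refresh_offset(30 * DAY, 1 * DAY) // DAY)  # 1 day divides 30 days evenly
print(refresh_offset(30 * DAY, 7 * DAY) // DAY)  # 7 days does not
```

In the 7-day case the hold-down expires between two scheduled queries,
so the resolver accepts the key only at the following one.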

Cheers,
Wes