Re: [DNSOP] Responding to MSJ review of the previous rfc5011-security-considerations

Wes Hardaker <wjhns1@hardakers.net> Mon, 16 October 2017 21:35 UTC

From: Wes Hardaker <wjhns1@hardakers.net>
To: Michael StJohns <msj@nthpermutation.com>
Cc: dnsop@ietf.org
References: <yblmv5ya5yc.fsf@wu.hardakers.net> <840723c3-899a-f57b-caa1-48f14b3686e8@nthpermutation.com>
Date: Mon, 16 Oct 2017 14:34:59 -0700
In-Reply-To: <840723c3-899a-f57b-caa1-48f14b3686e8@nthpermutation.com> (Michael StJohns's message of "Fri, 29 Sep 2017 14:49:06 -0400")
Message-ID: <yblpo9mepz0.fsf@wu.hardakers.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/jkZ-UrNAC5I_ZfLjf5SWnTDS_lc>


Mike,

Here are some responses to your comments from last time out.  I'm
only including the ones that needed a response or had an actionable
item.


1.12 FIXED Section 4.1:  This doesn't actually describe what's in 5011 -
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  specifically bullet 3 was modified to clarify your concerns, but isn't
  what was in 5011.  You need to have both here.

  + Result: Intro paragraph added to note we're paraphrasing 5011 at a
    very high level only.

  + MSJ responds:

    Nope - not fixed.  You have "This document discusses the following
    scenario, which is one of many possible combinations of operations
    defined in Section 6 of RFC5011:" followed by "3. Begin to
    exclusively use recently published DNSKEYs to sign the appropriate
    resource records."

    5011 does not define this as a possible operation.

    Instead: "This document discusses the following scenario which is a
    combination of the operations of sections 6.1 and 6.2 but ignores
    the guidance of the first paragraph of section 6 which recommends at
    least two keys per trust point at all times."

  + Result: changed the sentence in question to:

    This document discusses the following scenario, which is the principal
    way RFC5011 is currently being used (even though Section 6 of
    RFC5011 suggests having a stand-by key available)


1.16 FIXED2 Section 5.1.1 - You're missing a *very* important point here -
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  that DNSKEY RRSets may be signed ahead of their use.  You need to
  assume that once signed, they are available to be published - even by
  an attacker.  So wherever you have "signature lifetime" you want
  something like "latest signature expiration of any DNSKEY RRSet not
  containing the new key" or at least you want to calculate the
  date/time value based on that.

  + Result: There are four issues here:
    1. When we discuss the exact requirements for publication, we should
       be very very clear about the timing requirements.  I agree.
    2. We're trying to pass on the concept of the attack in this
       section, not necessarily a description that exactly covers all
       possibilities.  So, though I'm all for making it as accurate as
       we can, I don't think we should make the example text so
       confusing to cover all the corner cases that no one can follow
       it.
    3. It doesn't benefit an attacker to publish the signatures ahead of
       time.  So although you're right that anyone can publish new
       signatures, that doesn't affect how long the publisher must
       wait, which is the point of this draft.
    4. The important takeaway from your text is that any delay
       between signing and publication will affect the length of time to
       wait, and I'm sure this is what you mean by needing to calculate
       via wall-clock (since everything should be based on
       sigexpiration).

  With this goal in mind, I've cleaned up the text to make it a bit
  clearer.

  + MSJ Responds about point 3:

    The attacker doesn't publish the signatures - the publisher has
    signatures it won't be using....    the publisher signs stuff way in
    advance of publication because getting people together and getting
    the HSM unlocked to sign things is a big huge expensive
    production. If the publisher doesn't think to modify its signing
    schedule in advance of a 5011 action, then your interval
    calculations are less than useless.

  + Fair point, thanks for clarifying.  I've added the following text to
    5.1.1:

    Note that for simplicity we assume the signer is not pre-signing and
    creating "valid in the future" signature sets that may be stolen and
    replayed even later.

    I've also changed the terminology of sigExpirationTime:

    sigExpirationTime: The amount of time remaining before any existing
    RRSIG's Signature Expiration time is reached.  Note that for
    organizations pre-creating signatures this time may be fairly
    lengthy unless they can be significantly assured their signatures
    cannot be replayed at a later date.  sigExpirationTime will
    fundamentally be the RRSIG's Signature Expiration time minus the
    RRSIG's Signature Inception time when the signature is created.
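
    A minimal sketch of the sigExpirationTime definition above.  It
    assumes "any existing RRSIG" means the latest Signature Expiration
    among all signatures that already exist, including pre-signed ones
    that have not been published yet.  The function name, the argument
    names, and the example dates are hypothetical and do not come from
    the draft.

```python
from datetime import datetime, timedelta, timezone


def sig_expiration_time(rrsig_expirations, now):
    """Time remaining until the latest existing RRSIG expires.

    rrsig_expirations: Signature Expiration times of every RRSIG that
    already exists over the DNSKEY RRSet (published or merely signed).
    Returns a timedelta that is never negative.
    """
    latest = max(rrsig_expirations)
    return max(latest - now, timedelta(0))


# Hypothetical example values, for illustration only.
now = datetime(2017, 10, 16, tzinfo=timezone.utc)
published = now + timedelta(days=10)   # signature currently being served
presigned = now + timedelta(days=90)   # signed in advance, not yet published

print(sig_expiration_time([published], now))
print(sig_expiration_time([published, presigned], now))
```

    With these example values, the pre-signed signature is what
    dominates the result, which is the "fairly lengthy" case the
    definition warns about.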


1.17 NOTHINGTODO Section 5.1.1 doesn't actually apply if you use the 5011 rollover
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  approach (section 6.3).  E.g. K_old (and any RRSets it signed) will be
  revoked the first time K_new is seen and K_standby is the signing key.
  At this point this reduces to a normal denial of service attack (where
  you prevent new data from being retrieved by the resolver).  You'd
  need a different section to do that analysis. [And thinking about it,
  why is there any practical difference between this attack and a normal
  denial of service attack in the first place?]

  + Result: As we've both agreed in the past, the attack described in
    our 5.1.1 section only applies when you sign exclusively with a key
    that is too new.  So, yes when you are using K_standby, the attack
    in question doesn't work.  We're only describing the case where
    there either isn't a K_standby, or when K_standby is also newer than
    our 'waitTime' time.

  + And, yes by preventing a new key from being accepted as a trust
    anchor, this is a denial of service attack.  Though one with
    potentially serious ramifications since it may require manual
    intervention on all the devices affected by it (unlike a
    network-based DDoS attack, it doesn't stop when the attacker stops
    sending packets; this is long lived until the configuration is
    manually fixed).

  + MSJ responds:

    Section 5.1.1 does not apply to preventing a resolver from seeing a
    revocation.  The calculations are different.   You could add a new
    section describing the revocation attack, but I think all you need
    to do is note that at the beginning of 5.1.1 and point to section
    6.2 as the mitigation math.

  + Result: 5.1.1 does not now say that it applies to revocation and
    specifically discusses "The timing schedule listed below is based on
    a Trust Anchor Publisher publishing a new Key Signing Key (KSK),
    with the intent that it will later become a trust anchor."


1.19 FIXED Section 6 - the formulas are wrong.  I also  don't understand
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  where you got MAX(originalTTL/2,15 days) - there's no support for this
  in the text.

  You're misreading the commas.  One of the terms in the outer max
  clause is "MAX(original TTL of K_old DNSKEY RRSet) / 2".  This is
  slightly different than what is in the "1/2*OrigTTL" clause in 5011
  itself.  This is because if the publisher changes TTLs over the course
  of signing, you have to take the maximum value of any of them, not
  just the most recent.  (To be super-accurate, though, you need to do
  some math, which we might want to describe, about when a given TTL is
  published vs. when K_new is introduced.)
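
  A minimal sketch of the difference described above.  It starts from
  the RFC5011 Section 2.3 query interval, MAX(1 hr, MIN(15 days,
  1/2*OrigTTL, 1/2*RRSigExpirationInterval)), and replaces OrigTTL with
  the maximum of every TTL the publisher has used.  The function and
  argument names are hypothetical, and the sketch ignores the finer
  timing question of when each TTL was published relative to the new
  key.

```python
from datetime import timedelta


def query_interval(orig_ttls, rrsig_expiration_interval):
    """RFC5011-style query interval using the largest TTL ever published.

    orig_ttls: every original TTL used for the K_old DNSKEY RRSet.
    rrsig_expiration_interval: Signature Expiration minus Inception.
    """
    max_ttl = max(orig_ttls)
    return max(
        timedelta(hours=1),
        min(timedelta(days=15), max_ttl / 2, rrsig_expiration_interval / 2),
    )


# Hypothetical example: the publisher lowered the TTL from 2 days to 1 day.
ttls_seen = [timedelta(days=2), timedelta(days=1)]
sig_interval = timedelta(days=20)

print(query_interval(ttls_seen, sig_interval))       # uses the 2-day TTL
print(query_interval(ttls_seen[-1:], sig_interval))  # most recent TTL only
```

  Using only the most recent TTL gives a shorter interval than some
  resolvers may actually be using, which is why the maximum is taken.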

  Anyway, in the end, the formula in our draft directly derives from
  what is in yours.  We do take into account the possibility of multiple
  TTLs for a given signature set, which 5011 doesn't take into account
  (and to some extent, it's less important, but only further shows how
  much variance a resolver might have before accepting a new trust
  anchor).

  A clear piece of advice for an eventual BCP would be to not change
  TTLs at the same time you start any 5011 publication or revocation
  process.

  + MSJ responds:

  Note that in 6.1 you have 5 terms, but in the fully expanded equation
  in 6.1.6 you have 4.  You're missing the safetyMargin which you didn't
  actually define completely in section 6.1.5.

  + Result: I think you mean activeRefreshOffset, as safetyMargin was defined
    (though we had to change it again due to the possibility of
    extremely short TTLs).  But I have added the expansion of
    activeRefreshOffset to the equation; thanks for catching that.  I
    haven't changed the definition since I don't see any missing pieces
    to it (or to the safetyMargin definition).

-- 
Wes Hardaker
USC/ISI