Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc2845bis-08.txt

Stephen Morris <sa.morris8@gmail.com> Mon, 04 May 2020 18:07 UTC

Return-Path: <sa.morris8@gmail.com>
X-Original-To: dnsop@ietfa.amsl.com
Delivered-To: dnsop@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B58B13A0E88 for <dnsop@ietfa.amsl.com>; Mon, 4 May 2020 11:07:24 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.848
X-Spam-Level:
X-Spam-Status: No, score=-1.848 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id KiLFcmxqVAvu for <dnsop@ietfa.amsl.com>; Mon, 4 May 2020 11:07:20 -0700 (PDT)
Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id C9DE23A0E83 for <dnsop@ietf.org>; Mon, 4 May 2020 11:07:19 -0700 (PDT)
Received: by mail-wm1-x332.google.com with SMTP id g12so530821wmh.3 for <dnsop@ietf.org>; Mon, 04 May 2020 11:07:19 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:content-transfer-encoding:mime-version:date:subject:references :to:message-id; bh=Mzq5pb7kOQL2SlLVi7N9dRc4wa7txZUGJv65CW5uRJg=; b=O84mh2bGKdEYJla3N0wghHzdrMiH7CyJtYrwsAaJC0BIurTCbVMhq1kPKAWLTcP/hh JcREoF4Q4t0iQLNdgld2PcGTgOaefrxg+igjt8p3S7e47i46bxZfe86H5WGajT1dV2t4 ZR2NR9Gp81a8acbxqdcm8qquJIk049Q0lkLwxz8zRSkftNj+bE6b79lTgZeYCR2y0UUC ZAQJgycL4ILD2xxZDMqhzDPfnS7v7nF/wNv0l6BlcZZbhAiluupM3V40AbWPxR+wyoaP MsG/dKkQJcOn2dhnSfxNRXC/0HWyXYX1JwIWEzgF68gjlZQS0kYa+8M6j4hlL9kjJzOf nGSg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:content-transfer-encoding:mime-version:date :subject:references:to:message-id; bh=Mzq5pb7kOQL2SlLVi7N9dRc4wa7txZUGJv65CW5uRJg=; b=hmQ9vxMcv/Q4DLCKkrLVuuXx4Ve6GjUWlBBJvCREg5y/K090I4A78TRg4gbdA0hVuB AAkgwftxJaRWnPiIgOJbDhJ5Dxt2t4unfXULTZkX9D80noUh9TNBnE8gzXLDHZGRK0/b reN1UD7+LQm//i2JqWN90LEAXvtbkE2SLVWxGe6E33AwehPHBslh0rSmiFu1vimoab/A z8WVZ7olszAb2u2ocLYjU5xb3QueeiuD9pB2d36GupEI2PKcAFhwKiNvvU8cCcV/M1BP hqwPAD8YtOfiHI92D5dAtbpfCuI7gcSuvfwbpMfo9NaK5mHGeKbMiXuNzLJAECrqxLD9 4m0w==
X-Gm-Message-State: AGi0PuazkiG1wkebnam9AN/T+vPlLhPJArwRZzn7HSa2N/4I2ghUA9y4 /RcB7Z36rJVlhFUS/uHnC4D/Gd5s
X-Google-Smtp-Source: APiQypI/qDCjH23wGTScjsVvpYFVjWEPN1TLa90AsmG6peQ8VdfMb4N3gAy7l/mrcwwpccWmLzkCjg==
X-Received: by 2002:a1c:41d7:: with SMTP id o206mr16069931wma.89.1588615636839; Mon, 04 May 2020 11:07:16 -0700 (PDT)
Received: from [192.168.0.10] (cpc117486-shep15-2-0-cust40.8-3.cable.virginm.net. [82.8.78.41]) by smtp.gmail.com with ESMTPSA id q184sm306947wma.25.2020.05.04.11.07.13 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 04 May 2020 11:07:14 -0700 (PDT)
From: Stephen Morris <sa.morris8@gmail.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\))
Date: Mon, 04 May 2020 19:07:12 +0100
References: <C14C5908-2CAE-4A56-A84A-C6CC546D7B17@gmail.com>
To: dnsop@ietf.org
Message-Id: <80827E5F-F0DF-4A3C-BD2D-9E57991337FD@gmail.com>
X-Mailer: Apple Mail (2.3445.104.11)
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/QgKToRCFoN3PXBdyd8FTuYvtLQ8>
Subject: Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc2845bis-08.txt
X-BeenThere: dnsop@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: IETF DNSOP WG mailing list <dnsop.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dnsop>, <mailto:dnsop-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dnsop/>
List-Post: <mailto:dnsop@ietf.org>
List-Help: <mailto:dnsop-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dnsop>, <mailto:dnsop-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 04 May 2020 18:07:25 -0000

> On 4 May 2020, at 19:00, internet-drafts@ietf.org wrote:
> 
> 
> A New Internet-Draft is available from the on-line Internet-Drafts directories.
> This draft is a work item of the Domain Name System Operations WG of the IETF.
> 
>        Title           : Secret Key Transaction Authentication for DNS (TSIG)
>        Authors         : Francis Dupont
>                          Stephen Morris
>                          Paul Vixie
>                          Donald E. Eastlake 3rd
>                          Olafur Gudmundsson
>                          Brian Wellington
> 	Filename        : draft-ietf-dnsop-rfc2845bis-08.txt
> 	Pages           : 29
> 	Date            : 2020-05-04


The update addresses to the draft comments made during the IESG review.  There were a fair number of these which led to a number of changes to the document.  These are listed below.

One significant change is that the list of acceptable algorithms has been extended, and WG approval for this is sought.

Stephen




Comments from Roman Danyliw
---
> ** Section 1.3.  Per “In 2017, two nameservers  strictly following that document (and the related [RFC4635]) were discovered to have security problems related to this feature”, consider providing a reference to the published vulnerabilities (i.e., CVE-2017-3142 and CVE-2017-3143)

I’ve added these (and one other related CVE) as informative references.  Using just the CVE number as a reference seemed a bit spartan, so I’ve added a title to each incident. As the description of the CVEs in Mitre’s database don’t contain a title (only a description, which can be rather long), I’ve taken the title from ISC’s KnowledgeBase (for the BIND CVEs) and from the Knot release notes (for the Knot CVE).


> ** Section 6.  Per “SHA-1 collisions have been demonstrated so the MD5 security considerations apply to SHA-1 in a similar manner.  Although support for hmac-sha1 in TSIG is still mandatory for compatibility reasons, existing uses should be replaced with hmac-sha256 or other SHA-2 digest algorithms [FIPS180-4], [RFC3874], [RFC6234].
> 
> -- It’s worth repeating those MD5 security considerations here

I’m not really keen on doing this, since the security considerations in RFC 6151 cover two paragraphs and including them - or even a summary of them - would detract from the flow of the document.  However, I have explicitly included a reference to RFC 6151 in the text.


> -- (from Magnus Nystrom’s SECDIR review, thanks Magnus!) it’s worth including references to the recent SHA-1 cryptoanalysis provided in the SECDIR review

Added references to just one of these papers (”SHA1 is a Shambles”).  We’re not doing a general analysis of the algorithms, so a simple reference to a paper than describes a SHA1 collision is all that is needed.  (As an aside, the paper doesn't give the date of publication, so I had to search the Web looking for references to it.  I think I’ve got the date correct, but I’d be grateful for a double-check.)


> -- The SHA-2 family should be a normative SHOULD (or RECOMMENDED).

Done - see Table 2


> ** Section 10.  Per “For all of the message authentication code algorithms listed in this document, those producing longer values are believed to be stronger”, as noted in Magnus’s SECDIR review, this could be misconstrued as the algorithm choice not the digest length provides the security.  Recommend rephrasing (or making some statement  

I’ve reworded this paragraph (in section 10).  It now reads:

  "Although the strength of an algorithm determines its security, there
  have been some arguments that mild truncation can strengthen a MAC by
  reducing the information available to an attacker.  However, excessive
  truncation clearly weakens authentication by reducing the number of
  bits an attacker has to try to break the authentication by brute
  force [RFC2104]."


> ** Editorial
> -- Section 4.3.2.  Per “When verifying an incoming message, this is the message after the TSIG RR and been removed and the ARCOUNT field has been decremented.”, this sentence doesn’t parse (is missing a word).
> 
> -- Section 4.3.2.  Per “A whole and complete DNS message in wire format.”, this isn’t a sentence.

Section 4.3.2 has been reworded to address these issues.



Comments from Benjamin Kaduk

> Thanks for putting together this update; it's good to see the security
> issue getting closed off in the udpated spec, and progression to full
> Internet Standard!  I do have several substantive comments (as well as
> some minor/nit-level ones), many of which are listed here at the top but
> a few of which are interspersed in the per-section comments.
> 
> 
> I considered making this a Discuss point, but it should be pretty
> uncontroversial and I trust that the right thing will happen even if I
> don't: there's a couple lingering remnants of SHA-1 being the
> preferred algorithm that need to be cleaned up, in Sections 5.2.2.1 and
> 10 (further detailed in the per-section comments).

The document now mentions SHA1 collisions and notes that the MD5 security considerations apply to SHA1.  It also mentions that support for SHA1 is mandatory for compatibility reasons but in existing uses it should be replaced by a stronger one.


> I also initially had made the following point a Discuss-level point, but
> decided to not do so since I don't remember any BCP-level guidance
> relating to cross-protocol attacks.  Nevertheless, I strongly encourage
> the authors to consider that cryptographic best practice is to use any
> given key with exactly one cryptographic algorithm.  The record format
> listed in Section 4.2 has the key name and algorithm as separately
> conveyed, which would allow for a given key to be used with all
> implemented algorithms.  We should include some discussion that it's
> best to only use a single algorithm with any given key.

The following text has been added to the Security Considerations section:

  "To prevent cross-algorithm attacks, there SHOULD only be one
    algorithm associated with any given key name."


> We also have a 16-bit wide field for "Fudge", which (since it counts
> seconds) corresponds to a maximum window of something like +/- 18 hours;
> it's hard to believe that we really want to give people the rope to
> allow for that much time skew.  (Yes, I understand that implementations
> will set something sane in practice but that doesn't necessarily mean
> that the protocol still has to allow it.)

That was set in the original RFC 2845.  Although I agree with the comments, changing the size of that field would be a significant undertaking.  However, section 10 does state that the RECOMMENDED fudge value in most circumstances is 300 seconds.


> Our authoritative list of algorithm names (Table 1) is rather divorced
> from the references to be consulted for the individual hash algorithms
> to be used with the HMAC procedure.  The ones used here are sufficiently
> well-known that I'm not terribly concerned about it, though.

The first paragraph of the IANA considerations section lists the algorithms and references to where they are described, and I didn’t want to duplicate the information.


> Abstract
> 
> The title says "DNS" but maybe the body of the abstract should as well?

In the abstract, changed:

"It can be used to authenticate dynamic updates as coming from an approved client"

to

"It can be used to authenticate dynamic updates to a DNS zone as coming from an approved client"


> Section 1.1
> 
> Some of this language feels like it might not age terribly well, e.g.,
> "this can provide" or "[t]here was a need".
> 
>   addresses that need.  The proposal is unsuitable for general server
>   to server authentication for servers which speak with many other
>   servers, since key management would become unwieldy with the number
>   of shared keys going up quadratically.  But it is suitable for many
>   resolvers on hosts that only talk to a few recursive servers.

Changed to:

        "The authentication mechanism proposed here provides a
        simple and efficient authentication between clients and local servers
        by using shared secret keys to establish a trust relationship between
        two entities.  Such keys must be protected in a manner similar to
        private keys, lest a third party masquerade as one of the intended
        parties (by forging the MAC).  The proposal is unsuitable for general
        server to server authentication for servers which speak with many
        other servers, since key management would become unwieldy with the
        number of shared keys going up quadratically. But it is suitable for
        many resolvers on hosts that only talk to a few recursive servers.”


> Should zone transfers be mentioned here as well?

Zone transfers are mentioned in the preceding paragraph.


> Section 1.2
> 
> I don't understand the motivation for changing terminology from MACs to
> "signatures"; they're still MACs even though they're transaction-based.

The mention of signatures in section 1.2 is intended as a link between the name of the RR (TSIG or Transaction Signature) and the term MAC used in the rest of the document.


>   MAC of the query as part of the calculation.  Where a response
>   comprises multiple packets, the calculation of the MAC associated
>   with the second and subsequent packets includes in its inputs the MAC
>   for the preceding packet.  In this way it is possible to detect any
>   interruption in the packet sequence.
> 
> I suggest mentioning the lack of mechanism to detect truncation of the
> packet sequence.

That is a fair point.  I’ve modified the last sentence to read:

   "In this way it is possible to detect any interruption in the packet sequence,
   although not its premature termination."


> Section 4.2
> 
>   NAME  The name of the key used, in domain name syntax.  The name
>         should reflect the names of the hosts and uniquely identify the
>         key among a set of keys these two hosts may share at any given
>         time.  For example, if hosts A.site.example and 
> B.example.net
> 
>         share a key, possibilities for the key name include
>         <id>.A.site.example, <id>.B.example.net, and
>         <id>.A.site.example.B.example.net.  It should be possible for
>         more than one key to be in simultaneous use among a set of
>         interacting hosts.
> 
> I'd suggest adding a note along the lines of "This allows for periodic
> key rotation per best operational practices, as well as algorithm
> agility as indicated by [BCP201].”

Added.


>         The name may be used as a local index to the key involved and
>         it is recommended that it be globally unique.  Where a key is
> 
> (nit?): this feels more like a "but" than an "and", to me.

Well spotted.  I also think it is more “but” than “and”


>         *  MAC Size - an unsigned 16-bit integer giving the length of
>            MAC field in octets.  Truncation is indicated by a MAC size
>            less than the size of the keyed hash produced by the
>            algorithm specified by the Algorithm Name.
> 
> nit: I would suggest "output size", as there are potentially a few
> different sizes involved (key size, block size, and output size, for
> starters, though I think the possibility of confusion here is low).

I disagree here. “MAC Size” is an unambiguous reference to this field, and the other size mentioned - that of the keyed hash is, I think, also unambiguous.


>         *  Other Len - an unsigned 16-bit integer specifying the length
>            of the "Other Data" field in octets.
> 
>         *  Other Data - this unsigned 48-bit integer field will be
>            empty unless the content of the Error field is BADTIME, in
>            which case it will contain the server's current time as the
>            number of seconds since 00:00 on 1970-01-01 UTC, ignoring
>            leap seconds (see Section 5.2.3).
> 
> I'm slightly confused at the interplay between the explicit length field
> and the "empty unless" directive.  Does this mean that the only allowed
> values in the "Other Len" are 0 and 6?  Does "empty" mean "length-zero”?

I’ve rewritten this slightly in a bid to make it clearer that “Other Data” being empty means that “Other Len” is zero.


> Section 4.3.1
> 
>   Only included in the computation of a MAC for a response message (or
>   the first message in a multi-message response), the validated request
>   MAC MUST be included in the MAC computation.  If the request MAC
>   failed to validate, an unsigned error message MUST be returned
>   instead.  (Section 5.3.2).
> 
> side note: while Section 5.3.2 specifies how to create an unsigned error
> message, it looks like Section 5.2 (and subsections lists specific
> RCODEs that are to be used.

That is the case.  Section 4 is a description of the TSIG RR and its fields.  Section 5 describes the contents of the fields in various situations (client transmission, server reception, server transmission, client reception).


> Section 4.3.2
> 
>   When verifying an incoming message, this is the message after the
>   TSIG RR and been removed and the ARCOUNT field has been decremented.
>   If the message ID differs from the original message ID, the original
>   message ID is substituted for the message ID.  (This could happen,
>   for example, when forwarding a dynamic update request.)
> 
> I trust (based on this text having survived while going for full IS)
> that there are no interesting record-keeping considerations with respect
> to knowing the message ID(s) to substitute, in the "forwarding a
> dynamic-update request" case, presumably since this is just the field
> from the TSIG RDATA and not some externally retained state.

That is correct - the original ID is stored in the TSIG record’s RDATA and it is from there that the original ID is retrieved when a substitution is made.


> Section 4.3.3
> 
>   The RR RDLEN and RDATA MAC Length are not included in the input to
>   MAC computation since they are not guaranteed to be knowable before
>   the MAC is generated.
> 
> I appreciate that this is explicitly noted; there are some security
> considerations regarding the non-inclusion of the (truncated) mac length
> as input, though.  The local truncation policy helps here, but not 100%.

OK


>   For each label type, there must be a defined "Canonical wire format"
> 
> Just to check my understanding: label types only come into play for the
> two fields that are in domain name syntax, key name and algorithm name?

There was actually an error in the text here: RFC 4034 section 6.1 is concerned with Canonical Name Order.  It is section 6.2 that details the canonical wire format - I’ve changed the reference.

Going back to the original point, yes, that is correct.  Label type 00 is uncompressed name. (11 is a compressed name, and label types 01 and 10 are discouraged - see RFC 6891 section 5).


> Section 5.1
> 
>   the server.  This TSIG record MUST be the only TSIG RR in the message
>   and MUST be last record in the Additional Data section.  The client
> 
> (Is there anything else that tries to insist on being the last record in
> the additional data section?  I guess it doesn't really make sense to
> try to Update: 1035 just to note this requirement.)

Not to our knowledge.


>   MUST store the MAC and the key name from the request while awaiting
>   an answer.
> 
> [This is going to end up alongside the request-ID that it has to store
> already, right?]

Yes.  The request MAC is included as one of the components used by the server to generate the response - which should be encoded using the same key as used in the request.


>   Note that some older name servers will not accept requests with a
>   nonempty additional data section.  Clients SHOULD only attempt signed
>   transactions with servers who are known to support TSIG and share
>   some algorithm and secret key with the client -- so, this is not a
>   problem in practice.
> 
> (The context in which this "SHOULD" appears makes it feel like it's
> repeating an admontion from elsewhere and not the only instance of the
> requirement, in which case a reference might be appropriate.)

I’ve removed this paragraph.  The reference to older name servers is now completely out of date (it was a carry-over from the original 20-year old text).  The rest of the paragraph seems superfluous - rather like stating that TCP clients should only connect to servers that support TCP.  


> Section 5.2
> 
>   If the TSIG RR cannot be understood, the server MUST regard the
>   message as corrupt and return a FORMERR to the server.  Otherwise the
>   server is REQUIRED to return a TSIG RR in the response.
> 
> As written, this could be read as an attempt to make a normative
> requirement of servers that do not implement this spec.  Presumably it's
> just restating a requirement of the core protocol, though?

It is the core protocol.

I see your point though.  However, by implication we are talking about a server that implements TSIG.


> Section 5.2.2
> 
>   Using the information in the TSIG, the server should verify the MAC
>   by doing its own calculation and comparing the result with the MAC
>   received.  If the MAC fails to verify, the server MUST generate an
> 
> Is there any other way to verify the MAC?  (Should this be a "MUST
> verify”?)

Well spotted, it should be a “MUST” - changed.


> Section 5.2.2.1
> 
>   When space is at a premium and the strength of the full length of a
>   MAC is not needed, it is reasonable to truncate the keyed hash and
>   use the truncated value for authentication.  HMAC SHA-1 truncated to
>   96 bits is an option available in several IETF protocols, including
>   IPsec and TLS.
> 
> Also Kerberos, where it was the strongest option for a while and we had
> to define a new encryption type to provide a better option (RFC 8009).
> 
> This text seems to be implying that HMAC SHA-1 truncated to 96 bits is a
> recommendable option, which is ... far from clear.  I'd prefer to have a
> note that this specific truncation was appropriate when initially
> specified but may not provide a security level appropriate for all cases
> in the modern environment, or preferrably to just change the reference
> to (e.g.) SHA-384-192 or SHA-256-128.

Added the following an the end of the paragraph:

  However, while this option is kept for backwards compatibility,
  it may not provide a security level appropriate for all cases
  in the modern environment. In these cases, it is preferable to
  use a hashing algorithm such as SHA-256-128, SHA-384-192
  or SHA-256-128 [RFC4868].

I’ve also added the algorithms “hmac-sha256-128”, “hmac-sha384-192” and “hmac-sha512-256” as optional algorithms to the algorithm table.


>       This is sent when the signer has truncated the keyed hash output
>       to an allowable length, as described in [RFC2104], taking initial
>       octets and discarding trailing octets.  TSIG truncation can only
> 
> (Or when an on-path attacker has performed truncation.)

True, but an on-path attacker can manipulate the message in any way possible.  I’m not sure whether adding this caveat will add anything to the document.


> Section 5.2.3
> 
>   (BADTIME).  The server SHOULD also cache the most recent time signed
>   value in a message generated by a key, and SHOULD return BADTIME if a
>   message received later has an earlier time signed value.  A response
> 
> (But there's no fudge factor on this check, other than the inherent
> limit of seconds granularity, as alluded to by the last paragraph of
> this section?)

The last paragraph in the section states that handling this issue should be left up to implementations and that the exact method (and by implication, the size of the fudge factor) be determined by the implementation themselves.  


> Section 5.3.1
> 
>   A zone transfer over a DNS TCP session can include multiple DNS
>   messages.  Using TSIG on such a connection can protect the connection
>   from hijacking and provide data integrity.  The TSIG MUST be included
> 
> (I assume that "hijacking TCP" means a sequence-number-guessing attack
> that would require the attacker to be winning the race against the
> legitimate sender to cause modified data to be introduced into the TCP
> stream?  This is maybe not the best word to use for such a case.)

I’ve changed “hijack” to “attack”.


>   on all DNS messages in the response.  For backward compatibility, a
>   client which receives DNS messages and verifies TSIG MUST accept up
>   to 99 intermediary messages without a TSIG.  The first message is
> 
> (side note: I'm kind of curious what such compatibility is needed with.
> Also, this gives an attacker some flexibility into which to incorporate
> a collision, though given the near-real-time constraints and the
> strength of the HMAC construction I don't expect any practical impact.)

The original RFC 2845 did not require all packets in a message stream to contain a TSIG, it just stated that there be no more than 99 intermediary messages without a TSIG (presumably to cut down on the overhead required in message calculations - remember that RFC 2845 was published 20 years ago).  Although many DNS implementations now add a TSIG to every message, it is by no means certain that all do. (In fact, less than three years ago, a bug was introduced into BIND, causing it to require that all packets in a zone transfer would contain a TSIG.  A fix allowing BIND to accept up to 99 packets between signed ones was released shortly afterwards after reports were received of zone transfers failing.)


> Section 5.3.2
> 
>      Request MAC (if the request MAC validated)
>      DNS Message (response)
>      TSIG Variables (response)
> 
>   The reason that the request is not included in this MAC in some cases
>   is to make it possible for the client to verify the error.  If the
>   error is not a TSIG error the response MUST be generated as specified
>   in Section 5.3.
> 
> This makes it sound like the server excludes the request MAC from the
> digest if it failed to validate (something the client cannot predict),
> so that the client must perform trial verification of both versions in
> order to validate the response.  Is that correct?

No.  If the incoming MAC failed to validate, the server should send back an unsigned response (MAC size == 0 and empty MAC).


> Also, I think that the "in some cases" is not properly placed: as-is, it
> says that the request (not request MAC) is sometimes not included
> (implying that sometimes it is), which does not match up with the
> specification for the digest components.  I presume that the intent is
> to say that in some cases the client could not verify the error, if the
> request itself was always included in the digest.

Changed “request” to “request MAC”.

If the MAC could not be verified, it is possible that it was corrupted, which would prevent the client verifying the response. But a major reason is that an incorrect MAC included in a signed response led to the CVEs that prompted this document update.


> Section 5.4.1
> 
>   If an RCODE on a response is 9 (NOTAUTH), but the response TSIG
>   validates and the TSIG key recognised by the client but different
>   from that used on the request, then this is a Key Error.  The client
> 
> nits: missing words "key is recognized" and "but is different”

It now reads “key is recognized but different"


> Section 5.5
> 
>   destination or the next forwarder.  If no transaction security is
>   available to the destination and the message is a query then, if the
>   corresponding response has the AD flag (see [RFC4035]) set, the
>   forwarder MUST clear the AD flag before adding the TSIG to the
>   response and returning the result to the system from which it
>   received the query.
> 
> Is there anything to say about the CD bit?  (It's independent crypto, so
> I don't expect so, but it seems worth checking.)

No.  CD is just a mechanism by which a client can request that the server not do DNSSEC signature validation on the data and so allow the client to receive the data instead of a SERVFAIL response.


> Section 6
> 
>   The only message digest algorithm specified in the first version of
>   these specifications [RFC2845] was "HMAC-MD5" (see [RFC1321],
>   [RFC2104]).  Although a review of its security [RFC6151] concluded
>   that "it may not be urgent to remove HMAC-MD5 from the existing
>   protocols", with the availability of more secure alternatives the
>   opportunity has been taken to make the implementation of this
>   algorithm optional.
> 
> It seems worth noting that the advice from RFC 6151 is already nine
> years old.

I’ve use the phrasing "Although a review of its security some years ago”.


>   [RFC4635] added mandatory support in TSIG for SHA-1 [FIPS180-4],
>   [RFC3174].  SHA-1 collisions have been demonstrated so the MD5
>   security considerations apply to SHA-1 in a similar manner.  Although
> 
> I'd consider referencing (e.g.) 
> shattered.io
> for the "collisions have
> been demonstrated" claim, though it's pretty optional.

A reference has been made to the “SHA1 is a Shambles” paper.


>   support for hmac-sha1 in TSIG is still mandatory for compatibility
>   reasons, existing uses should be replaced with hmac-sha256 or other
>   SHA-2 digest algorithms [FIPS180-4], [RFC3874], [RFC6234].
> 
> Is this "sha1 mandatory for compatibility" going to age well?  If we
> have two implementations that can operate with sha2 variants, is it
> required to keep this as mandatory (vs. optional with a note about
> deployed reality at time of publication) for progression to Internet
> Standard?

This has been addressed by splitting the “Mandatory/Optional” column in RFC4635 (from which Table 2 was derived) into “Implementation” and “Use”.  SHA1 is required to be implemented (for backwards compatibility) but its use is not recommended.


>                   Optional    hmac-sha224
> 
> It's not clear there's much reason to keep this around, or if we do, it
> could probably be "not recommended".  (I assume that just dropping it
> entirely makes things annoying w.r.t. the IANA registry.)

It has been left in for compatibility reasons, but its use is not recommended.


> Section 9
> 
>   Previous specifications [RFC2845] and [RFC4635] defined values for
>   HMAC MD5 and SHA.  IANA has also registered "gss-tsig" as an
> 
> I'd suggest "HMAC-MD5 and HMAC-SHA-1", as the implied grouping where
> HMAC qualifies both hash algorithms is not terribly clear.

Changed to 

	Previous specifications [RFC2845] and [RFC4635] defined names for
	the HMAC-MD5 and the various HMAC-SHA algorithms.


> Section 10
> 
> I suggest some discussion that the way truncation policy works, an
> attackers ability to forge messages accepted as valid is limited by the
> amount of truncation that the recipient is willing to accept, not the
> amount of truncation used to generate messages by the legitimate sender.

I think this is already taken care of by the reference to a local policy in the “Truncation Check and Error Handling” section (5.2.4).


> There's also some fairly standard content to put in here about allowing
> for an unsigned error response to a signed request, so an attacker (even
> off-path) can spoof such resposnes.  (Section 5.4 indicates that the
> client should continue to wait if it gets such a thing, which helps a
> lot.)

I’ve just added text that an unsigned response is not considered acceptable because can be subject to spoofing and manipulation.


>   TKEY [RFC2930].  Secrets SHOULD NOT be shared by more than two
>   entities.
> 
> I suggest adding "; any such additional sharing would allow any party
> knowing the key to impersonate any other such party to members of the
> group”.

Added.


>   A fudge value that is too large may leave the server open to replay
>   attacks.  A fudge value that is too small may cause failures if
>   machines are not time synchronized or there are unexpected network
>   delays.  The RECOMMENDED value in most situations is 300 seconds.
> 
> Our experience with kerberos in modern network environments has shown
> that 5 minutes is much larger than needed, though it's not clear there's
> a strong need to change this recommendation right now.

The value is recommended (and even then, only in “most situations”), so implementations are free to set their own value.  However, any guidance as to typical time skews measured across a network would be useful in a number of protocols.


>   Significant progress has been made recently in cryptanalysis of hash
>   functions of the types used here.  While the results so far should
>   not affect HMAC, the stronger SHA-1 and SHA-256 algorithms are being
>   made mandatory as a precaution.
> 
> Please revise this note to not imply that SHA-1 is considered "strong”.

Text revised to

	Significant progress has been made recently in cryptanalysis of hash
	functions of the types used here.  While the results so far should not
	affect HMAC, the stronger SHA-256 algorithm is being made mandatory
	as a precaution.


> Section 11.2
> 
> I'm not sure why RFC 2104 is listed as informative.

RFC2104 is an Informational RFC and “Idnits” warns of a possible downref if the reference is placed in the “Normative” section.



Comments from Mirja Kühlewind

> I only had limited time for a quick review of this document, so I might not be aware of all the needed background and details. Still I have two quick questions on retries:
> 
> 1) Sec 5.2.3:
> "Implementations should be aware
>   of this possibility and be prepared to deal with it, e.g. by
>   retransmitting the rejected request with a new TSIG once outstanding
>   requests have completed or the time given by their time signed plus
>   fudge value has passed."
> I might not be aware of all protocol details and maybe this is already specified somewhere: While unlikely, it is possible that a request might be retransmitted multiple times, as that could cause president congestion over time, it's always good practice to define a maximum number of retransmissions. If that is already defined somewhere, maybe adding a note/pointer would be good as well.

If someone can suggest a suitable number (and a reason for it), we can consider doing this.  In the meantime, I’ve merely stated that implementation should limit the number of retransmissions and so leave the choice up to the implementation.


> 2) Sec 5.3.1:
> "   This allows the client to rapidly detect when the session has been
>   altered; at which point it can close the connection and retry."
> When (immediately or after some waiting time) should the client retry and how often?

I think that should be down to the implementation to decide.


> You further say 
> "The client SHOULD treat this the
>   same way as they would any other interrupted transfer (although the
>   exact behavior is not specified here)."
> Where is that specified? Can you provide a pointer in the text?

That was a mistake in transcribing the original RFC2845 text.  The final word “here” has been removed paragraph now reads:

	The client SHOULD treat this the same way as they would any other
	interrupted transfer (although the exact behavior is not specified).


> 3) Sec 8.  Shared Secrets: Would it be appropriate to use more normative language here? There are a bunch of lower case shoulds in this section; is that on purpose?

The context in which the lower-case “should”s appear is very general security advice.  Although this is something users ought to do, it is not really a specific recommendation.



Comments from Barry Leiba

> — Section 4.2 —
> 
>         *  Other Len - an unsigned 16-bit integer specifying the length
>            of the "Other Data" field in octets.
>         *  Other Data - this unsigned 48-bit integer field will be
> 
> Does this mean that “other data” is always 48 bits?  If so, does that mean tgat the value of “other len” is always 6?  If so, then shouldn’t it say that?  If not, then what don’t I understand?

Benjamin Kaduk also made the same comment, it has been addressed above.


> — Section 5.1 —
> 
>   Clients SHOULD only attempt signed
>   transactions with servers who are known to support TSIG and share
>   some algorithm and secret key with the client -- so, this is not a
>   problem in practice.
> 
> Why SHOULD and not MUST?

Benjamin Kaduk also noted an issue here.  The paragraph has been removed.


> — Section 5.3.2 —
> 
>   The server SHOULD also cache the most recent time signed
>   value in a message generated by a key
> 
> I tripped over this until I realized you mean “Time Signed value”.  You capitalize it elsewhere, and it helps the parsing if it’s consistent. There are four uncapitalized instances in this section.

“time signed” has been capitalised.  In addition, in the description of the field, “tims signed” has been replaced with “time the message was signed”.

Elsewhere, a “fudge field” has been replaced by “Fudge field” (although occurrences of “fudge value” have not been capitalised, as the context of that is that it is referring to the contents of the Fudge field). Also, “other data” and “other data length” have been replaced with the (capitalised) field names, “Other Data” and “Other Len”.


> — Section 9 —
> 
>   There is no structure
>   required other than names for different algorithms must be unique
>   when compared as DNS names, i.e., comparison is case insensitive.
> 
> I found this sentence to be really awkward and hard to parse.  May I suggest this?:
> 
> NEW
> There is no structure to the names, and algorithm names are compared as if they were DNS names (the comparison is case-insensitive).
> END
> 
> I don’t think you really need to say that each name is different/unique, right?

Agreed.  The text has been changed to that suggested.


>   other algorithm
>   names are simple (i.e., single-component) names.
> 
> Nitty thing that you can completely ignore, but I would avoid the Latin abbreviation thus: “other algorithm names are simple, single-component names.”

Changed.



Comments from Éric Vyncke

> Thank you for the work put into this document. It is clear and easy to read.
> 
> Please find below some non-blocking COMMENTs and NITs. An answer will be appreciated.
> 
> I hope that this helps to improve the document,
> 
> Regards,
> 
> -éric
> 
> == COMMENTS ==
> 
> There are 6 authors while the usual procedure is to limit to 5 authors. Personally, I do not care.
> 
> -- Section 1.3 --
> It is a little unclear to me whether the "two nameservers" were two implementations or two actual DNS servers.

Agreed, it was unclear.  Changed to “two name server implementations”.


> -- Section 5.2 --
> Suggest to provide some justifications about "copied to a safe location": the DNS message was sent in the clear, why does the TSIG part be copied in a safe location? Please define what is meant by "safe location" (Mainly for my own curiosity)

It is a bit over-specified and has been changed.  The text now reads:

      Upon receipt of
       a message with exactly one correctly placed TSIG RR, a copy of the
       TSIG RR is stored, and the TSIG RR is removed from the DNS Message,
       and decremented out of the DNS message header's ARCOUNT.


> "cannot be understood" is also quite vague.

Changed to “cannot be interpreted”.


> -- Section 5.3 --
> About rejecting request with a time signed value being earlier than the last received value. I wonder what is the value of this behavior if there is no 'fudge' as well... The last paragraph of this section describes this case and push the error handling to the request initiator. Any reason why being flexible on the receiving site was not selected ?

The fudge value is to cope with clock skew between the sender and receiver.  This won't impact on the order in which the packets are received by the receiver, for which the “time signed” is a first-level check. (It is not fool-proof - a number of packets signed by the sender within a second of one another may have the same “Time Signed” field.)

Pushing the error handling to the initiation goes back to the original RFC 2845.


> == NITS ==
> 
> -- Section 4.3.2 --
> Is " A whole and complete DNS message in wire format." a complete and valid sentence?

This was also pointed out by Roman Danyliw and has been addressed.


Other changes made during editing

* There was no description of the contents of the “Error” and “Other Data” fields were in a request.  This has now been corrected (Error is set to 0. The contents of “Other Data” are stated to be undefined to allow for extensions such as draft-andrews-dnsop-defeat-frag-attack.)