Re: [DNSOP] I-D Action: draft-ietf-dnsop-extended-error-07.txt
Wes Hardaker <wjhns1@hardakers.net> Wed, 11 September 2019 03:42 UTC
From: Wes Hardaker <wjhns1@hardakers.net>
To: Loganaden Velvindron <loganaden@gmail.com>
Cc: dnsop <dnsop@ietf.org>
References: <156541402569.2433.16692366614072050737@ietfa.amsl.com> <CAOp4FwTbM+aanhjkbf+FbKTibGGOQzyRCvOsmiqVaDUbDQz3Ew@mail.gmail.com>
Date: Tue, 10 Sep 2019 20:42:32 -0700
In-Reply-To: <CAOp4FwTbM+aanhjkbf+FbKTibGGOQzyRCvOsmiqVaDUbDQz3Ew@mail.gmail.com> (Loganaden Velvindron's message of "Sat, 10 Aug 2019 09:37:47 +0400")
Message-ID: <ybllfuvebx3.fsf@w7.hardakers.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/mddrRPX_EPbiNQqx3q7l3EVJf-0>
Loganaden Velvindron <loganaden@gmail.com> writes:

Hi Loganaden,

Thanks for the comments about the EDE draft. I've marked up your
comments with responses and actions below. Let us know if you have any
questions.

11 Loganaden Velvindron
==================================================

11.1 NOCHANGE pass-through
~~~~~~~~~~~~~~~~~~~~~~~~~~

1) I see at least one more model that needs to be supported, which is
how to handle edns extended codes that are generated by a remote
server, i.e. passthrough. Layering multiple forwarding resolvers behind
each other is common, and some way to notify the end user that the
originating message was not generated by the first resolver would be
important. I don't know if there needs to be some way to indicate how
"deep" the error was away from the end user; it seems just two levels
(locally generated or non-locally generated) would be sufficient with
only minor thought on it.

Re: 1) This is a good point, but implementation will likely run afoul
of existing standards or else require duplicative response codes or use
of an additional flag in the INFO-CODES section. Perhaps a new flag
type, similar to AA, which can be used to say that this recursor will
return this result reliably/deterministically. Attempting to provide
depth is perhaps unlikely, but flags for
stub/forwarder/recursive/intermediate recursive or a subset of those
might make sense. Perhaps a non-descript flag such as 'DR' for
Deterministic Response. Obviously INFO-CODES can support many different
flags, of which IR (Intermediate Resolver) or such could be included at
the point of response generation, with the last server providing actual
data in the chain being the one to authoritatively set the flag, which
then must not be modified by further downstream resolvers in the
process of returning the response.

+ Response: this has been discussed a few times, and the current view
(that at least I hold, and likely others based on past discussions) is
that it would be best to get this out as is, without a pass-through
model while we deploy it and get operational experience with its use.
Pass-through is complex for a bunch of reasons (NAT alone, eg), and
it's unclear we can come up with a solution for all the likely corner
cases to appear. TL;DR: we should definitely work on it, but in the
future.

11.2 DONE network error code needed beyond timeout
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1) SERVFAIL needs another error code to indicate the difference between
a network error (unexpected network response like ICMP, or TCP error
such as connection refused) versus timeout of the remote auth server,
as that is often a confusing issue.

+ Response: looks like a reasonable idea, so it has been added to the
latest draft. thank you!

Re: 2) Specifics as an item in the below list.

11.3 NOCHANGE
~~~~~~~~~~~~~~

1) Really, I'd like to see a definition of some of the EXTRA TEXT
strings here, since that will be almost immediately an issue that would
need to be sorted out before this could be useful. There have been some
discussions (sorry, don't know if it's a draft or just talking) about
browsers consuming "extra" data in DNS responses that can do a number
of things. As an example that is important to Quad9 (or any
blocking-based DNS service) it might be the case that upon receiving a
request for a "blocked" qname/qtype, we would hand back a forged answer
that leads to a splash page as the default result. However, if the
request was made from a resolver stack that had the EDNS extensions, we
might include the "real" result in the EXTRA TEXT field, as well as a
URL that points the user to an explanation of why that particular
qname/qtype was blocked. Or we might add a risk factor, or type of risk
("risk=100, risktype=phishing") or the like.
This allows a single query to be digestible by "dumb" stacks that we
want to have do the most safe thing, but also allow "smart" resolver
stacks to present a set of options to the end user.

+ Again, I suspect that the complexity associated with standardizing on
exactly a structure (including internationalization) of
extra-information in a machine understandable and parsable mechanism is
fraught with a very long discussion period. It might be worthy of
future work, and I certainly think it would be valuable, but (IMHO) it
would be better to get this out and work on that as a follow-on project
*if* we could achieve consensus on it (which, I'll be honest, will be
either difficult or take a long time or both).

Re: 3) Seems reasonable.

11.4 NOCHANGE blocked/censored/retry
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1) I'm confused as to why a "blocked" or "censored" result would have a
retry as mandatory. The resolver gave a canonical answer from the point
of policy.

+ the retry flag is now gone.

Re: 4) See below notes.

Potential inclusions/Adjustments:

11.5 NOCHANGE More retry case thoughts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1.3.1: A use case exists where a stale answer should attempt a retry.
A declarative setting for the Retry bit should not be specified here,
but instead guidance on whether or not the R bit should be set should
be included. For example, when using a front-end load balancer, if the
recursive backends are temporarily inaccessible but are expected to
recover in time to handle a subsequent query, it would be prudent to
include the R bit. No additional load would be generated towards the
Authoritatives in this case, and the Intermediate Recursor may choose
to set the R bit or not based on whether the failure mode appears to be
temporary.

4.1.5: Another area where guidance should be provided.
Some recursive resolvers process requests out of order, asynchronously,
or will retry alternative authoritatives post-processing as part of
infrastructure table management and thus may respond to a subsequent
query, where the initial will fail, likely due to timeouts. In our
specific case, due to our use of multiple recursive backend
technologies, a subsequent query failing DNSSEC validation has a
significant chance of being answered by an alternative recursor. See
also 4.2.1.

4.2.11: SERVFAIL - Network: The SERVFAIL response is being generated
due to what is clearly identifiable to the answering server as a
network issue. R bit should be set.

4.4.3: Abusive: The answering system considers the query in question to
be abusive for reasons other than load, indicating that the specific
requests are undesired. This could provide hints to Network Operators
or simply poorly configured client implementations that the specific
queries may be part of an amplification or other attack and should be
inspected.

4.4.4: Excessive: The answering system considers the query volume of
the client to be excessive, indicating that it is the volume and not
the content of the queries being refused and that it may be willing to
answer if volume is reduced. This could provide hints to Network
Operators or poorly configured client systems that they need to add
additional endpoints or reduce their request volume to restore service.

4.4.5: Go Away: The answering system considers further queries from the
client/network to have exceeded thresholds by large margins or
excessive durations, and further queries are likely to be dropped. This
message is an attempt to limit the continued use of resources
terminating queries which will not be answered.
This may simply be a sub-case of Abusive/Excessive, but also is not
intended to be sent for each query, but instead only intermittently,
and to bypass the need for lengthy troubleshooting efforts when drop
rules cause a recursor to seem to have vanished.

4.5.1: The R flag being set here implies that there are potentially
multiple policies in use and that a retry might receive an answer -
which should not be the case with a single intermediate recursive
service. A client, knowing that it has multiple recursive services with
differing policies might retry against a different recursive service
(ex: 8.8.8.8 instead of 9.9.9.9), but this effectively defeats the
policies of the initial recursor, rendering it ineffective. The use of
a specific server as a delineation is also confusing - it should
instead specify that the answering entity, be it a single server or
larger entity, has blocked this response. Also, blocked should be
further defined to avoid collision with the definition of the Censored
response code. Blocked in this case would be used as a catch-all for
anything not otherwise categorized.

4.5.2: See 4.5.1. Censoring is inherently a governmental action and
this should be reserved for that due to the severity and legal
repercussions of attempts to bypass. R bits should not be set. Censored
should be defined in the document to avoid confusion.

4.5.3: Filtered: Differentiated from Blocked/Censored in that this
content has been specifically redacted at the perceived behest of the
client - may include ad-blockers, dnsbl, or other specific cases -
intended to be used by those systems. Would potentially include
corporate IT policies.

4.5.4: Malicious: Differentiated from Blocked and Filtered in that the
answering server believes the response to be actively malicious and
harmful to the requesting systems or applications, and not merely
undesired or offensive. R bits should not be set.

4.5.5: Malicious Upstream - The upstream entity is considered malicious
by the answering server and thus a refusal to respond has been
returned. Details should be included within the INFO-CODE and
potentially EXTRA-TEXT. This is differentiated from Malicious in that
in this case, it is the actual upstream server that is having all
responses blocked, not the content itself - for instance a revoked or
unexpected certificate (such as due to a CAA record) - from which no
responses will be accepted. The R bit being set here depends on whether
the server believes that the specific path is compromised - if all
authoritatives are failed, then a retry will not help. If only one is,
then it will help to get to the non-compromised server. In the absence
of data, the R bit should be set. It may make sense to create an
extension of the R bit, via additional flag or other field which adds
additional context to the retry declaration, such as that the request
should retry the same recursor, or should instead immediately move to
and try the next available.

11.6 TODO synthesized == forged
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1.6: Synthesized Answer: This response could be considered a sub-case
of forged. An example of this would be the id.server or version.bind
queries: they cannot be considered forged, but also no authority truly
holds them.

+ Response: I think this is worthy of further thought and I'd love to
hear opinions from others. IMHO, I'm not sure we should get into
micro-error coding. I would say forged, in your examples, still fits.
But there are other cases where I think synthesized may make sense.
Anyone else have thoughts?

11.7 NOCHANGE finish categorizing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Other Notes:

INFO-CODE: It would seem that it would be best to include a basic
recommendation for a standard DNS-specific RWhois/CRL-like endpoint
which could provide local (non-IANA) information about returned codes,
potentially at a well-known URI, or even within the DNS itself via TXT
records or even within the EXTRA-TEXT field itself.

+ Response: per discussions with others too, which you've hopefully
read, there is a lot of desire for ways to potentially standardize
supplemental information within the EXTRA-TEXT field. However, for the
time being the goal is to get this out and get experience with how it
is used and potentially standardize on the addition of machine readable
supplemental information (URLs being the other common suggestion).
Publishing this first (as is) doesn't get in the way of future RFCs
extending this specification.

--
Wes Hardaker
USC/ISI
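
To make the EXTRA-TEXT discussion in 11.3 and 11.7 concrete, here is a
minimal Python sketch of how a client might decode one EDE option and
then pick out "key=value" hints such as the "risk=100,
risktype=phishing" example above. It rests on assumptions that are not
in this thread: the option layout (a 16-bit INFO-CODE followed by UTF-8
EXTRA-TEXT, option code 15) is the one in the later published RFC 8914
and may differ from draft -07, the INFO-CODE value used in the usage
note is only a sample, and the comma-separated "key=value" convention
and the function names are purely hypothetical, since no structure for
EXTRA-TEXT has been standardized.

```python
import struct
from typing import Dict, Tuple

# Assumption: EDNS0 option code for EDE as assigned in the published RFC 8914.
EDE_OPTION_CODE = 15


def pack_ede(info_code: int, extra_text: str = "") -> bytes:
    """Encode OPTION-CODE, OPTION-LENGTH, INFO-CODE and EXTRA-TEXT."""
    payload = struct.pack("!H", info_code) + extra_text.encode("utf-8")
    return struct.pack("!HH", EDE_OPTION_CODE, len(payload)) + payload


def unpack_ede(option: bytes) -> Tuple[int, str]:
    """Decode one EDE option into (info_code, extra_text)."""
    if len(option) < 4:
        raise ValueError("option too short")
    code, length = struct.unpack("!HH", option[:4])
    if code != EDE_OPTION_CODE or length < 2 or len(option) < 4 + length:
        raise ValueError("not a well-formed EDE option")
    (info_code,) = struct.unpack("!H", option[4:6])
    return info_code, option[6:4 + length].decode("utf-8")


def parse_hints(extra_text: str) -> Dict[str, str]:
    """Hypothetical convention: split 'k=v, k=v' text into a dict.

    Fragments without '=' are ignored, so free-form text yields {}.
    """
    hints: Dict[str, str] = {}
    for part in extra_text.split(","):
        key, sep, value = part.partition("=")
        if sep:
            hints[key.strip()] = value.strip()
    return hints
```

For example, unpack_ede(pack_ede(15, "risk=100, risktype=phishing"))
gives back the code and text unchanged, and parse_hints() on that text
yields the two hints as a dictionary. A "dumb" stack can ignore the
option entirely, which matches the point made in 11.3 that a single
response should remain usable either way.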