Re: [DNSOP] Working Group Last call for draft-ietf-dnsop-dns-error-reporting
Benno Overeinder <benno@NLnetLabs.nl> Mon, 17 July 2023 13:21 UTC
Message-ID: <2587d696-bd1a-6c25-6837-0f6269bdb813@NLnetLabs.nl>
Date: Mon, 17 Jul 2023 15:21:01 +0200
MIME-Version: 1.0
Content-Language: en-GB
To: DNSOP Working Group <dnsop@ietf.org>
References: <ZJn_cwWWOKIn1wbq@straasha.imrryr.org> <76E9FBC8-9F6D-4050-9C6F-E92A2CBEB326@dnss.ec> <ZKw40DEHBUfBEoUI@straasha.imrryr.org> <1583409F-8F04-4172-B9A1-94D9900402AB@dnss.ec> <ZKyHyo4Mb8I34rZI@straasha.imrryr.org>
Cc: DNSOP Chairs <dnsop-chairs@ietf.org>
X-Soverin-Authenticated: true
From: Benno Overeinder <benno@NLnetLabs.nl>
In-Reply-To: <ZKyHyo4Mb8I34rZI@straasha.imrryr.org>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/xGX9lFIQQr0hjrubUMimrSfhJyg>
Subject: Re: [DNSOP] Working Group Last call for draft-ietf-dnsop-dns-error-reporting
Dear WG,

This ends the WGLC for draft-ietf-dnsop-dns-error-reporting.

The last call has been extended a bit longer than initially planned, but
valuable feedback has been received from the WG on the draft. Thank you
very much.

The authors published a -05 revision a week ago that incorporates the
feedback. Some issues may need to be addressed in a subsequent revision
before the document is sent to the IESG. We are coordinating this
further with the authors.

Best regards,

-- Benno

On 11/07/2023 00:35, Viktor Dukhovni wrote:
> On Mon, Jul 10, 2023 at 10:27:45PM +0100, Roy Arends wrote:
>
>>> Right, but surely the monitoring agent can decide whether to solicit
>>> such a prefix label or not. That is, whether an "_er" prefix label is
>>> signalled is a *local matter* between the authoritative server
>>> signalling the option and the monitoring agent.
>>
>> I agree that a monitoring agent can specify a domain that may include
>> a separator as the least significant label. However, it also requires
>> the monitoring agent to understand that it should sometimes include
>> this separator, and that it may be redundant at other times.
>
> If all the monitoring agent's "customers" (authoritative servers that
> return its "suffix" in the new option) are informed to signal an
> "_er.agent.example" name, there's no "sometimes". The agent, by mutual
> agreement with the nameservers it supports can choose whatever suffix
> format meets its needs, fixed across all customers, or customer-specific.
>
> I haven't yet seen a reason to insist on a fixed suffix pattern. The
> resolver just stutters back the suffix it was handed by the
> authoritative server's extension payload. What problem does mandating
> the least significant label of the suffix solve, that can't be solved by
> just signalling the desired suffix, special label and all?
>
>> It assumes that those running the authoritative server that returns
>> the agent domain and those that run the reporting agent are in sync.
>> Those are a lot of assumptions.
>
> If they're not in sync, surely reporting will be broken, whether or not
> an "_er" suffix label is used.
>
>>> Why should resolvers have to be responsible for this?
>>
>> Because this separating label is trivial to include and avoids a lot
>> of hassle.
>
> The hassle in question remains unclear. I see two relevant/likely
> deployment models:
>
>   * Self-hosted reporting, directly by the authoritative server:
>
>     - Error reports are special by virtue of a dedicated qname
>       suffix and perhaps qtype.
>
>     - No special coördination required, the server both publishes
>       and consumes the error reporting suffix.
>
>   * Outsourced/centralised reporting, via server IPs dedicated to
>     error report processing.
>
>     - Here again no need for "_er", because all queries are
>       presumptively error reports, and if the signal from the
>       "customer" auth server was wrong (whether or not an "_er"
>       label is included) the error report will not be handled
>       correctly.
>
>     - If the signal has the correct (mutually agreed) suffix,
>       again no problem.
>
>     - And of course the monitoring agent can specify the use
>       of "_er" (or whatever) if that's convenient.
>
> What use-case actually benefits from the "_er" LSL (least-significant
> label) in the signal? How is this benefit not obtained by mutual
> agreement between the monitoring agent and its customers?
>
>>>> The sole purpose of the leading “least-significant” “_er” is to
>>>> distinguish between qname-minimized queries (for lack of a better
>>>> term) and “full” queries. I understand that you argue that a
>>>> monitoring agent can determine this without the _er labels (as
>>>> described below), but that seems suboptimal to me.
>>>
>>> The qname minimised query (whether or not a dedicated qtype is used)
>>> will be for "A" or "NS" records, not TXT or the dedicated qtype, so
>>> there's no need for "_er" in the first label, the qtype is sufficient.
>>
>> RFC9156 contains no hard requirement to use A/NS.
>> So I’m not confident
>> that all current and future qname-minimisation implementations use
>> A/NS.
>
> This is where this document can specify that qname minimised error
> reports MUST use a qtype other than the qtype for the final error
> report.
>
>>> However, to avoid forwarding junk reports to the monitoring agent, a
>>> resolver may well sensibly choose to not forward such queries, and
>>> only source them internally.
>>
>> I’m not following.
>
> If the qtype is "TXT", then an open resolver is easily subject to
> proxying forged error reports purporting errors that the resolver did
> not observe. Some client of the open resolver sends an explicit query
> for:
>
>     <error-reporting-qname>. IN TXT ?
>
> which then looks like an error report from *that* resolver to the
> monitoring agent. If instead we have a dedicated qtype for error
> reports, it becomes a simple matter of refusing to iterate queries for
>
>     <whatever>. IN <ERTYPE> ?
>
> Any resolver wanting to report an error must do so directly, not via a
> forwarder. Especially because the forwarders won't be passing the
> agent extension through to their clients!
>
>>> The specification might also recommend that "stub" resolvers that
>>> forward most queries to a "full service" resolver, should send error
>>> reports *directly* to the monitoring agent. And, of course, "full
>>> service" resolvers MUST NOT *forward* the monitoring agent OPTION to
>>> clients, if they send such an option, it should be locally generated
>>> to signal the monitoring agent for the resolver itself.
>>
>> I’m not following.
>
> In a forwarder chain:
>
>     stub resolver <-> full-service resolver <-> auth server
>
> When the stub resolver wants to report an error, it must contact the
> monitoring agent directly, rather than pass it to the full-service
> resolver.
> Any agent suffix it receives from the full-service resolver
> will be the monitoring agent for **that** resolver, not the auth server,
> and the reports need to go directly to the endpoint specified by the
> authoritative server!
>
> [ Admittedly, in practice stub resolvers are not likely to make
>   error reports, and forwarders are unlikely to solicit them. ]
>
>>>> Allocating a new QTYPE for this purpose just seems redundant.
>>>
>>> It is not. This is not a normal query, it is an error report.
>>
>> However, it is a normal query though. All the intermediates
>> (forwarders, caches, authoritative servers) have no idea that this
>> query is any different than others. There is nothing special in this
>> query. I really want to avoid OPCODE subtyping by qtype.
>
> But that's a problem, because forwarding of error reports masks the
> origin IP, with problem reports then misattributed to the edge resolver,
> that may have had no problems resolving the reported name, and may be
> misused by its clients to forge such reports.
>
>>> I would strongly prefer a dedicated qtype (with support from Puneet
>>> Sood). However, if the WG consensus is TXT, we'll grudgingly cope.
>>> Would it make sense to raise this narrow question by the chairs as a
>>> consensus call?
>>
>> To me, a dedicated qtype vs TXT seems like bike-shedding.
>
> I disagree. We're not disagreeing on cosmetic details of the name of a
> new qtype, rather we're disagreeing on whether to overload TXT, which
> is a substantive difference.
>
>>> I did not see a response to the point about moving the info code to the
>>> least-significant label in the query (first or right after the leading
>>> "_er" if despite my exhortations that's retained).
>>
>> The purpose of keeping the info code right before the separating _er
>> label is that it helps to separate incoming reports by “severeness”,
>> as in “lame delegation” reports go here, “expired RRSIG” reports go
>> there.
>> This can all be delegated nicely by the monitoring agent.
>
> Though lexically last, THIS is the point I want to most strongly
> emphasise. Putting the info code in the MSL (most significant label) of
> the error qname prefixed to the agent suffix breaks NXDOMAIN caching,
> because we now have 65536 parent info codes for each domain that the
> agent does not serve:
>
>     *.ru.0._er.agent.example.     ; signal == _er.agent.example.
>     *.ru.1._er.agent.example.     ; signal == _er.agent.example.
>     *.ru.2._er.agent.example.     ; signal == _er.agent.example.
>     ....
>     *.ru.65535._er.agent.example. ; signal == _er.agent.example.
>
> Whereas, instead and with no loss of ability to group errors by severity
> (indeed the LSL is parsed first!) the agent could return NXDOMAIN for:
>
>     *.ru._er.agent.example.       ; signal == _er.agent.example.
>
> and be rid of all "*.ru" reports.
>
>>>> Viktor, your optimisations (removing the _er labels) are premature as
>>>> it turns a deterministic process at the monitoring agent into a
>>>> heuristic process.
>>>
>>> I don't see how it becomes heuristic. The dedicated qtype signals a
>>> complete error reporting query, other qtypes are minimised variants.
>
> There's no heuristic. The agent knows what suffix(es) it serves, and
> strips that suffix to recover the error report.
>
>> Again, there is no guarantee that a minimised variant does not use the
>> dedicated qtype. It is simply easier to recognise a minimised variant
>> by checking if the QNAME starts with _er. This is far more reliable
>> than assuming a dedicated QTYPE is not minimised.
>
> Though I think the leading "_er" is redundant, it is mostly harmless,
> I'd prefer to see it go, but will grudgingly accept it staying.
>
> The main thing is to move the info code to the LSL (least significant
> label), modulo any final (redundant) "_er" prefix (the complete query
> should be distinguished by its qtype).
>
> Also, resolvers SHOULD NOT do query minimisation below the signalled
> error reporting suffix in the first place. Save everyone needless
> latency and potential ENT issues. Let's specify that too.
>
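
The label-ordering argument in the quoted message (info code next to the agent suffix versus info code as the leftmost label) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the query name is simplified to reported name, info code and agent suffix (the leading "_er" and any qtype label are left out), and "_er.agent.example." is the example signal from the thread, not a real deployment.

```python
# Hypothetical sketch of the two QNAME layouts discussed in the thread.
# Simplification (assumption): only <reported name>, <info code> and the
# signalled agent suffix are modelled; no leading "_er" or qtype label.

INFO_CODE_COUNT = 65536  # size of the info-code space cited in the thread


def labels(name):
    """Split a domain name into labels, ignoring the trailing root dot."""
    return [label for label in name.rstrip(".").split(".") if label]


def qname_info_next_to_suffix(reported, info_code, suffix):
    """Layout criticised in the thread: <reported>.<info-code>.<suffix>"""
    parts = labels(reported) + [str(info_code)] + labels(suffix)
    return ".".join(parts) + "."


def qname_info_leftmost(reported, info_code, suffix):
    """Layout proposed in the thread: <info-code>.<reported>.<suffix>"""
    parts = [str(info_code)] + labels(reported) + labels(suffix)
    return ".".join(parts) + "."


def nxdomain_cut_points(tld, suffix, info_leftmost):
    """Names at which the agent must answer NXDOMAIN to shed all reports
    for names under `tld`, for each of the two layouts."""
    suffix = ".".join(labels(suffix)) + "."
    if info_leftmost:
        return [f"{tld}.{suffix}"]
    return [f"{tld}.{code}.{suffix}" for code in range(INFO_CODE_COUNT)]


def strip_suffix(qname, suffix):
    """Recover the report labels by removing the agent's own suffix.
    Returns None when the query is not below that suffix."""
    q, s = labels(qname.lower()), labels(suffix.lower())
    if len(q) <= len(s) or q[-len(s):] != s:
        return None
    return q[:-len(s)]


if __name__ == "__main__":
    signal = "_er.agent.example."
    print(qname_info_next_to_suffix("example.ru", 7, signal))
    print(qname_info_leftmost("example.ru", 7, signal))
    print(len(nxdomain_cut_points("ru", signal, info_leftmost=False)))
    print(nxdomain_cut_points("ru", signal, info_leftmost=True))
```

With the info code next to the suffix, one negative answer per info code is needed to cover "*.ru"; with the info code leftmost, a single name covers them all, and grouping by info code is still possible because the leftmost label is available as soon as the suffix has been stripped.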