[DNSOP] Benjamin Kaduk's Discuss on draft-ietf-dnsop-dns-capture-format-08: (with DISCUSS and COMMENT)
Benjamin Kaduk <kaduk@mit.edu> Mon, 19 November 2018 00:28 UTC
From: Benjamin Kaduk <kaduk@mit.edu>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-dnsop-dns-capture-format@ietf.org, Tim Wicinski <tjw.ietf@gmail.com>, dnsop-chairs@ietf.org, tjw.ietf@gmail.com, dnsop@ietf.org
Date: Sun, 18 Nov 2018 16:28:19 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/ingHrzoPcbMx6E5atAEJ5_bq9Qg>

Benjamin Kaduk has entered the following ballot position for draft-ietf-dnsop-dns-capture-format-08: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dnsop-dns-capture-format/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

It is pretty shocking to not see any discussion of the privacy considerations of storing data including client addresses (and ports) alongside DNS transactions, given how central DNS resolution is to user behavior on the web. (Note that there are mentions of potentially anonymized data in Sections 6.2 and 6.2.3 which would presumably forward-reference the privacy considerations.) Data normalization would probably also be mentioned in this section, since (e.g.) the case used for a query/response could be used in fingerprinting an implementation.

I'm also concerned about the policy/procedure for allocating/extending the various bitfields and similar potential extension points in the data structures. Section 8 covers the major/minor versioning semantics with respect to new map keys and new maps, but not addition of new bits within existing (uint) bitmaps. Given the usage of the CDDL .bits constraint, it's not really clear that an IANA registry is the right tool to use, but I think some indication of the expected way to allocate new bits is in order, whether it's "a future standards-track document that updates this document" or otherwise. (I've noted many, but not all, instances of such bitmaps in my COMMENT section.)

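To make the bitmap concern concrete, here is a minimal Python sketch of what a reader faces when a uint bitmap carries a bit it does not recognise. It is an illustration only and is not taken from the draft: the bit positions and the names "field-a", "field-b" and "field-c" are hypothetical placeholders, not allocations from the document.

```python
# Hypothetical bit assignments known to one reader implementation.
# These names and positions are placeholders, not taken from the draft.
KNOWN_BITS = {0: "field-a", 1: "field-b", 2: "field-c"}


def split_bitmap(value):
    """Return (names of known set bits, positions of unknown set bits)."""
    if value < 0:
        raise ValueError("bitmap must be an unsigned integer")
    known, unknown = [], []
    bit = 0
    while value >> bit:
        if (value >> bit) & 1:
            if bit in KNOWN_BITS:
                known.append(KNOWN_BITS[bit])
            else:
                unknown.append(bit)
        bit += 1
    return known, unknown


# Bit 3 is set but has no meaning for this reader.
known, unknown = split_bitmap(0b1011)
print(known, unknown)
```

Without a stated allocation policy, the reader cannot tell whether an unknown bit such as bit 3 here can be safely ignored or signals data it must understand.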
There are also a couple of fields whose semantics don't seem to be sufficiently well specified for a proposed-standard document, such as vlan-ids, generator-id, name-rdata, and ae-code. (I understand that some of them are probably only going to have locally relevant semantics, but we should be explicit about when that's the case.)

If I'm reading things correctly that the IP address type is inferred from the bytestring length, then I think we need to enforce a restriction on the address prefix length(s) to allow for that inference to be unambiguous (noting that we only have the *byte* length of the address fields at our disposal for disambiguation, and not the more precise bit-length).

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 2

Please consider using the RFC 8174 version of the BCP 14 boilerplate.

Section 3

   Because of these considerations, a major factor in the design of the format is minimal storage size of the capture files.

maybe "storage and transmission"?

Section 6

In Figure 2, the Query name is marked as "(q)" (only present if there is a query), but the running text in Section 4 (bullet 1) says that the Question section from the response can be used as an identifying QNAME if there is a response with no corresponding query. Am I misexpanding QNAME here, or is there a disagreement between these two parts of the text? In particular, I do not see a part of Figure 2 that would correspond to a Question section in the response, given the various "(q)"/"(r)" markings.

Section 6.2.2

   Messages with OPCODES known to the recording application but not listed in the Storage Parameters are discarded (regardless of whether they are malformed or not).

(Do we need to say anything that the "discarded" is only w.r.t. the capture process, and not meant to imply that DNS queries would not get a normal response?)

Section 6.2.4

Please consider using IPv6 examples, per https://www.iab.org/2016/11/07/iab-statement-on-ipv6/ .

Section 7.2

   o  The column T gives the CBOR data type of the item.

      *  U - Unsigned integer

      *  I - Signed integer

This is venturing a bit far from my normal area of expertise, but my understanding is that CBOR native major types are only provided for unsigned integer and negative integer, with "signed integer" being an abstraction at a slightly higher layer that needs to be managed in the application. Do we need to add any clarifying text here or will the meaning be clear to the reader?

Section 7.4

Should probably forward-reference section 8 for the format version numbers' semantics.

Section 7.4.1.1

Should we reference the IANA registries by name for any of these fields (e.g., opcodes, rr-types, etc.)? (Also in Section 7.5.3.1, etc.)

Are the storage flags going to be allocated in sequence by updating standards-track documents, or some other mechanism? (Is a registry necessary?)

For the various address prefix fields, do we need to specify that the full addresses are stored when the corresponding prefix field is absent?

Section 7.4.1.1.1

Am I parsing the "query-response-hints" text correctly to say that a bit is set in the bitmap if the corresponding field is recorded (if present) by the collecting implementation? The causality of "if the field is omitted the bit is unset" goes in a direction that is not what I expected. (Similarly for the other fields in this table.)

Section 7.4.2

Do we need a reference for "promiscuous mode"?

Just to check: in "server-addresses", I just infer the IP version from the length of the byte string?

Do we need to say more about where the vlan-ids identifiers are taken from?

Is the "generator-id" string intended to only be human readable? Only within a specific (administrative) context?

Section 7.5.1

Does "earliest-time" include leap seconds?

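For readers less familiar with CBOR, the Section 7.2 point about integer types can be sketched in a few lines of Python. This is a hand-rolled illustration of the integer encoding in RFC 7049 (major type 0 for unsigned integers, major type 1 for negative integers, where a negative value n is carried as the unsigned argument -1 - n); the function names are hypothetical and the sketch is not taken from the draft.

```python
def _encode_head(major, arg):
    """Encode a CBOR initial byte plus argument for the given major type."""
    if arg < 24:
        # Small arguments fit directly in the low 5 bits of the initial byte.
        return bytes([(major << 5) | arg])
    # Otherwise the low 5 bits say how many argument bytes follow.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise OverflowError("value does not fit in a 64-bit CBOR integer argument")


def encode_cbor_int(n):
    """Encode a Python int: major type 0 if n >= 0, else major type 1."""
    if n >= 0:
        return _encode_head(0, n)
    return _encode_head(1, -1 - n)


print(encode_cbor_int(10).hex())   # major type 0
print(encode_cbor_int(-10).hex())  # major type 1
```

On the wire there is no single "signed integer" type: the decoder sees one of two major types and the application layer presents the result as a signed value.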
Section 7.5.3

The "ip-address" description seems to imply that very short ipv6 prefix lengths could cause confusion as to the address type being indicated (e.g., setting to 32 when no ipv4 prefix length is set, or setting to the same value as the ipv4 prefix length). Do we need to restrict the ipv6 prefix lengths to being 33 or larger?

Are the "name-rdata" contents in wire format or presentation format?

Section 7.5.3.2

What's the allocation policy/procedure for the remaining qr-transport-flags transport values? For additional bits in any/all of the flags fields listed here?

Something of a side note, what's the mnemonic for the "sig" in "qr-sig-flags"? That is, what is it a signature of or over (it doesn't seem like it's a cryptographic signature, which may be what is confusing me)?

For "query-rcode"/"response-rcode", should there be a reference for "OPT", and/or for any of the EDNS stuff in here? (The Terminology section only mentions using the naming from RFC 1035, that I can see.)

The "mm-transport-flags" here bear a striking resemblance to the "qr-transport-flags" from Section 7.5.3.2; should there be a shared registry for their contents? (I guess the TransportFlags CDDL to some extent serves this function.)

Section 7.7

How is the value of the "ae-code" determined?

Appendix A

We could perhaps apply some constraints on (e.g.) the address-prefix length fields to be .le the relevant lengths.

Appendix C.6

   Using a strong compression, block sizes over 10,000 query/response pairs would seem to offer limited improvements.

nit: Using a strong compression scheme
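
The address-type ambiguity raised in the DISCUSS and under Section 7.5.3 can be illustrated with a short Python sketch. It assumes, as the comments above read the draft, that a truncated address is stored in the minimum number of whole bytes covering the prefix and that a reader has only the byte-string length to go on; the function names are hypothetical and this is not the draft's algorithm.

```python
def stored_length(prefix_bits):
    """Bytes needed to hold an address truncated to prefix_bits bits
    (assumption: stored in the minimum number of whole bytes)."""
    return (prefix_bits + 7) // 8


def infer_family(stored, ipv4_prefix=32, ipv6_prefix=128):
    """Return every address family consistent with the stored length."""
    candidates = set()
    if len(stored) == stored_length(ipv4_prefix):
        candidates.add("IPv4")
    if len(stored) == stored_length(ipv6_prefix):
        candidates.add("IPv6")
    return candidates


# Full-length addresses are unambiguous.
print(infer_family(bytes(4)))
# An IPv6 prefix length of 32 yields 4 bytes, the same as a full IPv4 address.
print(infer_family(bytes(4), ipv6_prefix=32))
```

Under these assumptions, any IPv6 prefix length of 33 or more occupies at least 5 bytes and so can never collide with an IPv4 address of at most 4 bytes, which is the restriction the comment asks about.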