[regext] Benjamin Kaduk's Discuss on draft-ietf-regext-dnrd-objects-mapping-09: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 27 August 2020 06:30 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: regext@ietf.org
Delivered-To: regext@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 1C1583A0DD7; Wed, 26 Aug 2020 23:30:09 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-regext-dnrd-objects-mapping@ietf.org, regext-chairs@ietf.org, regext@ietf.org, Scott Hollenbeck <shollenbeck@verisign.com>, shollenbeck@verisign.com
X-Test-IDTracker: no
X-IETF-IDTracker: 7.14.1
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <159850980859.9278.16659516718065092581@ietfa.amsl.com>
Date: Wed, 26 Aug 2020 23:30:09 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/regext/dbuyW5YTYj4VcFHUQYC-D8OMv_g>
Subject: [regext] Benjamin Kaduk's Discuss on draft-ietf-regext-dnrd-objects-mapping-09: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-regext-dnrd-objects-mapping-09: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-regext-dnrd-objects-mapping/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

It's possible that I just misunderstand what is required to go where,
but for several (possibly only CSV?) elements, the body text claims that
"[t]he attribute "isRequired" MUST equal "true"." but the corresponding
examples do not consistently list the "isRequired" attribute.
(Sometimes they do, but not always.)  Shouldn't the examples be
consistent with the protocol requirements?  I note some examples in my
COMMENT section but this should not be treated as an exhaustive list.

A similar property (again, if I understand correctly) holds for the
"parent" attribute of various elements (which is definitely only a thing
for the CSV objects).
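
To illustrate the kind of mechanical check that could catch such
mismatches between prose and examples, here is a minimal sketch.  The
namespace URIs and the rule table are placeholders invented for this
illustration (they are not the values from the draft), and the example
fragment is made up rather than copied from the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: field element -> attributes the prose requires.
# The "urn:example:..." namespaces are placeholders, not the draft's URNs.
RULES = {
    "{urn:example:csvHost}fAddr": {"isRequired": "true"},
    "{urn:example:rdeCsv}fRoid": {"isRequired": "true"},
}

# Made-up <rdeCsv:fields> fragment in which fRoid lacks isRequired.
EXAMPLE = """
<rdeCsv:fields xmlns:rdeCsv="urn:example:rdeCsv"
               xmlns:csvHost="urn:example:csvHost">
  <rdeCsv:fRoid/>
  <csvHost:fAddr isRequired="true"/>
</rdeCsv:fields>
"""

def lint_fields(xml_text, rules=RULES):
    """Return (tag, attribute, expected) for every rule the fragment violates."""
    problems = []
    for elem in ET.fromstring(xml_text):
        for attr, expected in rules.get(elem.tag, {}).items():
            if elem.get(attr) != expected:
                problems.append((elem.tag, attr, expected))
    return problems

print(lint_fields(EXAMPLE))
```

The same table could carry "parent" requirements by adding that
attribute to the relevant entries.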

The claim in the IANA Considerations regarding "URI assignments have
been registered by the IANA", accompanied by specific URN
namespace/schema values, is codepoint squatting, in the absence of a
disclaimer about being "requested values".  The registration policy is
only Specification Required, so there is no formal guarantee that we can
actually get these values.

At least one of the examples shows RSA/MD5 DNSSEC key records.  RSA/MD5
usage is specifically disallowed (see RFC 6944 and RFC 8624);
please replace with a more modern algorithm.  (One location noted in the
COMMENT, along with some SHA-1 usage that should probably go as well.)


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

We have at least one example that shows a gurid of 123, which is an
actual value allocated to a real registrar by ICANN (details in the
section-by-section comments).  Please consider using a different value
for the example.

Section 1

   Registry Data Escrow is the process by which a registry periodically

nit: maybe include "(RDE)", since we use the acronym later on before the
definitions section.

   This document defines the following pseudo-objects:
   [...]
   o  EPP parameters: Definition of the specific EPP parameters
      supported by the Registry Operator.

nit: is the pseudo-object containing the "definition of" the specific
parameters or something related, like the values of those parameters?

Section 4.1

   Numerous fields indicate "dates", such as the creation and expiry
   dates for domain names.  These fields SHALL contain timestamps
   indicating the date and time in UTC as specified in [RFC3339], with
   no offset from the zero meridian.

Should we mention anything about leap seconds?
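
For concreteness, a small sketch of how a consumer might notice a
leap-second value.  It assumes the "Z" form only and checks just the
time-of-day ranges (not month or day validity); the function name is
invented for this illustration.  RFC 3339 itself permits a seconds
value of 60.

```python
import re

# RFC 3339 date-time restricted to the uppercase "T"/"Z" form.
UTC_TIMESTAMP = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?Z"
)

def classify(ts):
    """Return 'invalid', 'leap-second', or 'ok' for an escrow timestamp."""
    m = UTC_TIMESTAMP.fullmatch(ts)
    if not m:
        return "invalid"
    hour, minute, second = int(m.group(4)), int(m.group(5)), int(m.group(6))
    if hour > 23 or minute > 59 or second > 60:
        return "invalid"
    # A seconds value of 60 is only meaningful during a leap second.
    if second == 60:
        return "leap-second"
    return "ok"

print(classify("2009-04-03T22:00:00.0Z"))
print(classify("2016-12-31T23:59:60Z"))
```

Many date libraries reject a seconds value of 60 outright, which is one
reason a sentence on the topic could be useful.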

Section 4.4

If I understand correctly from the examples, the actual checksum value
of the CSV file is encoded as an attribute of the corresponding
<rdeCsv:file> element in the XML description/schema/thing that
accompanies the CSV files.  Is my understanding correct?

Why is CRC32 the default?  In my opinion SHA-256 is better as a default,
based on the "fail safe" principle.
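
To make the comparison concrete, a minimal sketch of computing both
checksums over the bytes of a CSV file.  The algorithm labels and the
uppercase-hex output format are assumptions of this sketch, not taken
from the draft.

```python
import hashlib
import zlib

def crc32_hex(data: bytes) -> str:
    """CRC32 as 8 uppercase hex digits; detects accidental corruption only."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")

def sha256_hex(data: bytes) -> str:
    """SHA-256 as 64 uppercase hex digits; also resists deliberate tampering."""
    return hashlib.sha256(data).hexdigest().upper()

def verify(data: bytes, expected: str, algorithm: str = "SHA-256") -> bool:
    """Compare a file's checksum with the value carried in the XML."""
    actual = crc32_hex(data) if algorithm == "CRC32" else sha256_hex(data)
    return actual == expected.upper()

print(crc32_hex(b"123456789"))
```

Defaulting the hypothetical verify() helper to SHA-256 reflects the
"fail safe" argument: a caller that forgets to pick an algorithm gets
the stronger one.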

Section 4.5

   The syntax of IP addresses MUST conform to the text representation of
   either Internet Protocol Version 4 [RFC0791] or Internet Protocol
   Version 6 [RFC4291].

Where exactly in RFC 791 is the text representation discussed?  "text"
only appears three times, none of which are relevant, and
"representation" not at all.  I do note that traditional/historic APIs
involving IPv4 addresses have been quite lenient, allowing even such
things as 18.1179790 (18/8 is a class A network), and I expect the
intent is to limit to "dotted quad" decimal values.
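
As an illustration of the stricter reading, Python's standard ipaddress
module accepts only four-part dotted-decimal IPv4 text (and standard
IPv6 text).  This is a sketch of one possible validation, not something
the draft specifies; the function name is invented here.

```python
import ipaddress

def is_strict_address(text: str) -> bool:
    """True only for dotted-quad IPv4 or standard IPv6 text forms."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False

print(is_strict_address("192.0.2.1"))    # dotted quad
print(is_strict_address("18.1179790"))   # legacy two-part form
print(is_strict_address("2001:db8::1"))  # IPv6
```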

Section 4.6.1

nit: we jump pretty quickly into the example without much introduction
for the concepts and abbreviations it uses; a note about where it will
be explained further might be helpful.

Section 4.6.2

(editorial) I feel like these CSV elements would be more understandable
if they appeared after the XML model (which presents descriptions for
the XML elements that are used in the example CSV elements), though I
don't have a specific proposal for section reorderings that would make
that happen.

Section 4.6.2.1

   The following is example of the "domain-YYYYMMDD.csv" file with one
   record matching the <rdeCsv:fields> definition.

   domain1.example,Ddomain2-TEST,,,registrantid,registrarX,registrarX,
   clientY,2009-04-03T22:00:00.0Z,registrarX,clientY,
   2009-12-03T09:05:00.0Z,2015-04-03T22:00:00.0Z

Is that last 2015-04-03T22:00:00.0Z the expiration date (now five years
in the past)?  It may be worth making a pass through the examples and
thinking about which dates might be updated to more-current values.

Section 4.6.2.2

   <rdeCsv:fRegistrant>  Registrant contact identifier with
      type="eppcom:clIDType".

So the client ID type is used for both clients (registrars, per below)
and registrants (users, as used here)?

   <rdeCsv:fCrRr>  Identifier of the registrar, defined in Section 5.4,
      of the client that created the object with type="eppcom:clIDType".

   <rdeCsv:fCrID>  Identifier of the client that created the object with
      type="eppcom:clIDType".

Or am I wrong to equate clients and registrars?

   <rdeCsv:fTrStatus>  State of the most recent transfer request with
      type="eppcom:trStatusType" and isRequired="true".

We assume that there has always been a most recent transfer request?  I
note that <trnData> (and the <trStatus> it contains) "MUST NOT be
present if a transfer request for the domain name has never been
created".

Section 4.6.3

         <csvRegistrar:fGurid/>

It looks like we don't actually expand this to "globally unique
registrar identifier" anywhere; we just say it's "the ID assigned by
ICANN" (in §5.1.2.1.1)?

Section 5.1.1.1

   o  An OPTIONAL <uName> element that contains the fully-qualified
      domain name in Unicode character set.  It MUST be provided if
      available.

Who is tasked with verifying that the <uName> is consistent with the
<name>?  (Similarly for other uName elements, later.)
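
A rough sketch of what such a consistency check could look like.  Note
the assumption: Python's built-in "idna" codec implements IDNA 2003,
which differs from IDNA2008 for some code points, so this is
illustrative only.  The function name is invented for this sketch.

```python
def names_consistent(a_name: str, u_name: str) -> bool:
    """Check that the Unicode name converts to the given A-label form."""
    try:
        converted = u_name.encode("idna").decode("ascii")
    except UnicodeError:
        # The Unicode form could not be converted at all.
        return False
    return converted.lower() == a_name.lower()

print(names_consistent("xn--bcher-kva.example", "b\u00fccher.example"))
```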

Section 5.1.2

   include a <csvDomain:fName parent="true"> field.  All the child CSV
   file definition data for the domain name objects in the parent
   "domain" CSV File Definition (Section 5.1.2.1.1) MUST first be
   deleted and then set using the data in the child CSV files.  The
   deleted domain name object data under the <csvDomain:deletes> element
   is a cascade delete starting from the "domain" Deletes CSV File
   Definition (Section 5.1.2.2.1).

I feel like I want to check my understanding here.  When we say "All the
child CSV file definition data [...] MUST first be deleted and then set
[...]", this is not talking about the <csvDomain:deletes> stuff, right?
It's just "anything that's listed in <csvDomain:contents> is wiped
clean, so the contents of the <csvDomain:contents> reflect the full and
exact state after the updates"?

Section 5.1.2.1.1

   <rdeCsv:fUpRr>  Identifier of the registrar, defined in Section 5.4,
      of the client that updated the object.

   <rdeCsv:fUpID>  Identifier of the client that last updated the domain
      name object.

nit: we could probably normalize around "object" vs "domain name object"
and "updated" vs "last updated".  (This seems to occur multiple times in
the document, at least with respect to "updated"/"last updated".)

   <rdeCsv:fUName>  UTF8 encoded domain name for the <csvDomain:fName>
      field element.

[same question as for the XML <uName> about who does the consistency check]

   <rdeCsv:fUpDate>  Date and time of the last update to the domain name
      object.

Should we say anything about the case where the domain object has never
been modified?

   <rdeCsv:fTrDate>  Date and time of the last transfer for the domain
      name object.

[likewise.  I will probably not mention this for subsequent
transfer/update-related elements, though they would presumably be
similarly affected]

Section 5.1.2.1.5

   The "domainNameServersAddresses" CSV File Definition defines the
   fields and CSV file references used for supporting the host as domain
   attributes model.

Is there a typo in here?  Google didn't find much for "host as domain
attributes model".

Section 5.1.2.1.6

   Example of the corresponding dnssec-ds-YYYYMMDD.csv file.  The file
   contains two DS records for domain1.example.

   domain1.example,604800,12345,3,1,49FD46E6C4B45C55D4AC
   domain1.example,604800,12346,3,1,38EC35D5B3A34B44C39B

It would be nice to use SHA-256 rather than SHA-1 for the examples,
given SHA-1's weaknesses.
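
A small sketch of how the example rows could be checked mechanically.
The column order (name, maxSigLife, keyTag, algorithm, digestType,
digest) is my reading of the example and should be treated as an
assumption; the digest type numbers are those of the IANA DS digest
type registry.

```python
# Digest type -> (label, digest length in bytes).
DIGEST_TYPES = {1: ("SHA-1", 20), 2: ("SHA-256", 32), 4: ("SHA-384", 48)}

def check_ds_row(row: str):
    """Return a list of findings for one dnssec-ds CSV row."""
    # Assumed column order; see the note above.
    name, _max_sig_life, _key_tag, _alg, digest_type, digest = row.split(",")
    info = DIGEST_TYPES.get(int(digest_type))
    if info is None:
        return [f"{name}: unknown digest type {digest_type}"]
    label, length = info
    findings = []
    if label == "SHA-1":
        findings.append(f"{name}: digest type 1 (SHA-1) is weak")
    if len(digest) != 2 * length:
        findings.append(
            f"{name}: {label} digest should be {2 * length} hex chars, "
            f"got {len(digest)}"
        )
    return findings

for finding in check_ds_row(
    "domain1.example,604800,12345,3,1,49FD46E6C4B45C55D4AC"
):
    print(finding)
```

Run against the first example row, this also reports that the digest
is shorter than a real SHA-1 value would be.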

   Example of the corresponding dnssec-key-YYYYMMDD.csv file.  The file
   contains two key records for domain1.example.

   domain1.example,604800,257,3,1,AQPJ////4Q==
   domain1.example,604800,257,3,1,AQPJ////4QQQ

RSA/MD5, on the other hand, is specifically disallowed (see
RFC 6944 and RFC 8624).
Please replace with a more modern algorithm.
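
For illustration, a sketch that flags the algorithm field in such rows.
The column order (name, maxSigLife, flags, protocol, algorithm, public
key) is assumed from the example, and the disallowed set contains only
the one algorithm discussed here; other entries could be added from
RFC 8624.

```python
# DNSSEC algorithm numbers flagged by this sketch.
DISALLOWED = {1: "RSAMD5"}

def flag_key_row(row: str):
    """Return a warning string if the row uses a disallowed algorithm."""
    # Assumed column order; see the note above.
    name, _max_sig_life, _flags, _protocol, alg, _pub_key = row.split(",")
    alg_number = int(alg)
    if alg_number in DISALLOWED:
        return f"{name}: algorithm {alg_number} ({DISALLOWED[alg_number]}) is disallowed"
    return None

print(flag_key_row("domain1.example,604800,257,3,1,AQPJ////4Q=="))
```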

Section 5.2.2.1.1

   Example of the corresponding host-YYYYMMDD.csv file.  The file
   contains six host records with four being internal hosts and two
   being external hosts.

What is an internal (or external) host?

Section 5.2.2.1.2

   The following "rdeCsv" fields, defined in section CSV common field
   elements (Section 4.6.2.2), MAY be used in the "hostStatuses"
   <rdeCsv:csv> <rdeCsv:fields> element:

   <rdeCsv:fStatusDescription>  Host object status description which is
      free form text describing the rationale for the status.  The
      attribute "isRequired" MUST equal "true".

I feel like the "MAY" and "MUST" must be targeted at different parties,
but am not entirely sure I understand the distinction.

And if "isRequired" must be true, why is it not listed as such in the
example that follows or present in the corresponding example csv file?

Section 5.2.2.1.3

   <csvHost:fAddr>  IP addresses associated with the host object with
      type="host:addrStringType".  The attribute "isRequired" MUST equal
      "true".

   <csvHost:fAddrVersion>  IP addresses version associated with the host
      object with type="host:ipType".  "host:ipType" has the enumerated
      values of "v4" or "v6".  The attribute "isRequired" MUST equal
      "true".

Is anyone supposed to do consistency checks that the string
representation of the address matches the claimed address type?
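
Such a check is cheap to perform.  A minimal sketch, assuming the
version field carries exactly the strings "v4" or "v6" as described
above (the function name is invented for this illustration):

```python
import ipaddress

def addr_matches_version(addr: str, version: str) -> bool:
    """True if the address parses and its family matches 'v4' or 'v6'."""
    try:
        parsed = ipaddress.ip_address(addr)
    except ValueError:
        # Not a valid textual IPv4 or IPv6 address at all.
        return False
    return f"v{parsed.version}" == version

print(addr_matches_version("192.0.2.2", "v4"))
print(addr_matches_version("192.0.2.2", "v6"))
```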

   <rdeCsv:fRoid>  Host object Registry Object IDentifier (ROID)
      assigned to the host object with isRequired="true".

(I don't see the "isRequired" in the example that follows, though it is
present for fAddr and fAddrVersion.)

Section 5.3.2.1.2

   <csvContact:fId>  Server-unique contact identifier of status with
      isRequired="true".

It needs to have parent="true", too, right?

Section 5.3.2.1.3

   <csvContact:fStreet>  Contains the contact's street address line with
      type="contact:fPostalLineType".  An index attribute is required to
      indicate which street address line the field represents with index
      "0" for the first line and index "2" for the last line.  An
      OPTIONAL "isLoc" attribute to used to indicate the localized or
      internationalized form as defined in section Section 4.6.3.

nit(?): pedantically, this would seem to say that if I only have two
lines, I use indices 0 and 2 (not 0 and 1), and leave me in an awkward
situation if there is only one line.  But I don't think anyone will
actually get confused...

   <csvContact:fCc>  Contains the contact's country code with
      type="contact:ccType" and isRequired="true".  An OPTIONAL "isLoc"
      attribute to used to indicate the localized or internationalized
      form as defined in section Section 4.6.3.

Is there really localization available for ISO-3166-1 country codes?
(I didn't follow the reference.)

   <csvContact:fId>  Server-unique contact identifier for the contact
      object with isRequired="true".

"parent", too, right?  I will stop reporting additional cases where
"parent" seems to be needed.

Section 5.3.2.1.5

   <csvContact:fDiscloseNameLoc>  Exceptional disclosure preference flag
      for the localized form of the contact name with type="boolean".

   <csvContact:fDiscloseNameInt>  Exceptional disclosure preference flag
      for the internationalized form of the contact name with
      type="boolean".

What happens if both 'Loc' and 'Int' are set?
(Similarly for the other fields.)

Section 5.4.1.1

   o  An OPTIONAL <upDate> element that contains the date and time of
      the most recent RDE registrar-object modification.  This element

Is the "RDE" part here correct?  It feels weird to me for the escrow
data format contents to include information that is more properly
metadata about the escrow data; we do not seem to mention "RDE" when
talking about other modification times.

   Example of <registrar> object:

   ...
     <rdeRegistrar:gurid>123</rdeRegistrar:gurid>

"123" seems to be a real registrar ID, listed at
https://www.iana.org/assignments/registrar-ids/registrar-ids.xhtml as
corresponding to "The Registry at Info Avenue, LLC d/b/a Spirit
Communications".  Perhaps a reserved value (e.g., 1) could be used
instead?

Section 5.4.2

   Definition (Section 5.4.2.1.1).  The child CSV file definitions
   include a <csvRegistrar:fId parent="true"> field.  All the child CSV

(this is probably applicable to previous models as well, but) we seem to
allow either fId or fGurid, so this text is inconsistent with the actual
practice.

Section 5.4.2.1.1

We list <csvRegistrar:fGurid> in both the "MUST" (as part of the choice)
and "MAY" sections.  My reading of the "choice" part was that you had to
pick one or the other, so it's not entirely clear to me how internally
consistent this treatment is.

Section 5.4.2.2.1

      <csvRegistrar:fId>  Contains the server-unique registrar
         identifier with type="eppcom:clIDType" and isRequired="true".

      <csvRegistrar:fGurid>  Contains the ID assigned by ICANN with
         type="positiveInteger".  The attribute "isRequired" MUST equal
         "true".

nit: we could use the same sentence structure for these two elements'
descriptions.

Section 5.5.2.1.1

   <rdeCsv:fUrl>  URL that defines the character code points that can be
      used for the language defined by the <rdeCsv:fLang> field element.

A pointer to which <rdeCsv:fLang> element(s) this is might be helpful,
since it's not local to this section.

Section 5.6

   A domain name can only exist as a domain name object or an NNDN
   object, but not both.

What entities are charged with enforcing this invariant?

Section 5.6.1.1

      *  If an NNDN is considered a mirrored IDN variant of a domain
         object, then the NNDN will be tagged as "mirrored".  A

Where is "NS mirroring" defined?  This seems particularly poignant since
it defaults to "true".

Section 5.7

   registry supports EPP.  Only one EPP Parameters Object MUST exist at
   a certain point in time (watermark).

nit: Is "only one" a maximum or a synonym for "exactly one"?

Section 5.9.1

   o  A <count> element that contains the number of objects in the SRS
      at a specific point in time (watermark) regardless of the type of
      deposit: Differential, Full or Incremental.  The <count> element
      supports the following attributes:

If the count is supposed to be for objects in the SRS, why do we have
the "uri" that distinguishes between the XML and CSV models?  What model
is used for escrow seems independent of what the state of the SRS is.

Section 5.10

I feel like I'm left without a proper understanding of what the DNRD
Common Objects Collection actually does.

Section 8

I had asked in a few places previously what entity is tasked with
verifying some property (usually that an A-name and U-name are actually
equivalent); would that be something that could/should be done by a data
escrow agent during this sort of validation process?

Section 9

(I am only spot-checking that min/maxOccurs in the schema matches with
the prose descriptions, etc., not doing an in-depth review.)

Section 9.1

Some of the indentation seems inconsistent, e.g., around:

    </complexType>
     <complexType name="fRequiredDateTimeType">
     <complexContent>

Section 9.2

             <element name="status"
               type="domain:statusType" maxOccurs="11"/>

11 is an interesting number (§5.1.1.1 has in prose just "one or more
<status> elements"); is there a story behind it?

Section 9.3

    <!-- Variant group / tag field -->
    <element name="fVariantGroup"
      type="rdeCsv:fTokenType"
      substitutionGroup="rdeCsv:field"/>

Where is the "variant group" usage specified?

Section 9.4

             <element name="status"
               type="host:statusType" maxOccurs="7"/>

And here we have maxOccurs of 7; interesting.

Section 9.7

    <!-- Disclosure of localized name
         based on fDiscloseFlag? -->

nit: why do all these comments have a question mark?

Section 9.8

     <simpleType name="postalLineType">
       <restriction base="normalizedString">
         <minLength value="1" />
         <maxLength value="255" />

Where does the line-length limit of 255 come from?

Section 9.10

Some whitespace niggles here as well, e.g.:

           </sequence>
             <attribute name="id" type="rdeIDN:idType" use="required"/>
         </extension>

Section 9.14

    <!-- ASCII Compatible Encoding (ACE) name field -->
    <element name="fAName" type="rdeCsv:fNameRequiredType"
     substitutionGroup="rdeCsv:field"/>

How common is the ACE abbreviation/term?  It does not seem to be used
elsewhere in this document (and it is a little distracting to me, as the
name of a security-area WG).

    <!-- Variant group / tag field -->
    <element name="fVariantGroup"
      type="rdeCsv:fTokenType"
      substitutionGroup="rdeCsv:field"/>

Where is the "variant group" usage specified?

Section 11

   This document uses URNs to describe XML namespaces and XML schemas
   conforming to a registry mechanism described in [RFC3688].  Fourteen
   URI assignments have been registered by the IANA.

"Fourteen" does not seem accurate any more.

(Also, it is maybe a bit overzealous to use the generic "csv" prefix
for the RDE CSV usage, though I do not specifically object to doing so.)

Section 13

Thanks for what's here already; it covers a lot of good stuff.

We should also say something about the level of trust placed in the
escrow agent.  One might imagine a scenario where the data placed in
escrow is signed by the registry in a way that can be verified
post-facto by not only the escrow agent but other entities as well; in
this case the escrow agent is trusted only to preserve an accurate copy
and provide "new enough" data.  That does not seem to be the trust model
used by this scheme, since we also must trust the escrow agent to
provide a faithful copy of what the registry gave it, in the case when a
recovery is needed.  Whether or not this presents significant risk is a
policy and perhaps contractual matter, so it's not problematic from the
specification point of view, but we should document that we make this
strong assumption of trust of the escrow agent.  (This is particularly
true due to the number of things that are negotiated out of band between
registry and escrow agent, which would of course not be covered by any
in-protocol indication of source authenticity.)

In a similar vein, we might mention that there's no protocol mechanism
to detect a registry providing bogus data to an escrow agent.  (I'm sure
there are legal/contractual mechanisms in place to do so, though!)

It's also important to discuss the vagaries of quoting/escaping in the
CSV format, especially so since we allow for the separator to be
specified manually.  The parser needs to ensure that only the intended
separators are treated as separators and ignore that character when it
is quoted or escaped (as appropriate) for appearance in the body of a
given field.
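
A short sketch of the hazard, using Python's standard csv module with a
caller-chosen separator.  The sample record and the choice of "|" as
separator are made up for this illustration, and the quoting rules
shown are those of Python's default CSV dialect, which may not match
what the draft intends.

```python
import csv
import io

def parse_rows(text: str, separator: str = ","):
    """Parse CSV text, honoring double-quoted fields that contain the separator."""
    reader = csv.reader(io.StringIO(text), delimiter=separator, quotechar='"')
    return [row for row in reader]

line = 'contact1|"Example | Inc."|US\n'

# A naive split treats the quoted separator as a field boundary.
print(line.strip().split("|"))

# A quoting-aware parser keeps the field intact.
print(parse_rows(line, separator="|"))
```

The naive split yields four fields where three were intended, which is
the sort of mismatch that could shift every later column in a record.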

I would suggest some mention that the various datastructures include a
few places that have some internal redundancy, and that if the values
become inconsistent there can be harmful consequences (e.g., if
different entities use different fields as their reference).

   Note: if Transport Layer Security (TLS) is used when providing an
   escrow services, the recommendations in [RFC7525] MUST be
   implemented.

Please cite this as BCP 195.

Section 19

Perhaps the YYYYMMDD could be replaced with non-placeholder values to
make the example more realistic?

Should there also be examples for the CSV contents of the indicated
files?  (This would also give the opportunity to show that the checksum
of the content matches the value indicated in the example XML.)

Section 21.1

If it is not needed for the textual representation of an IPv4 address,
RFC 791 seems to otherwise not be referenced.

Section 21.2

Since SHA-256 is a "MAY", that might qualify RFC 6234 as a normative
reference (per
https://www.ietf.org/about/groups/iesg/statements/normative-informative-references/)

Likewise for RFC 7525, which is a MUST when TLS is used.