[alto] Benjamin Kaduk's Discuss on draft-ietf-alto-unified-props-new-21: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 01 December 2021 23:00 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-alto-unified-props-new@ietf.org, alto-chairs@ietf.org, alto@ietf.org, Vijay Gurbani <vijay.gurbani@gmail.com>, vijay.gurbani@gmail.com
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <163839962921.6785.17983443500874892419@ietfa.amsl.com>
Date: Wed, 01 Dec 2021 15:00:29 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/7nP0BzW7_edJq_BZfVx4P4p5yPc>
Subject: [alto] Benjamin Kaduk's Discuss on draft-ietf-alto-unified-props-new-21: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-alto-unified-props-new-21: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-alto-unified-props-new/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

(1) Section 8.6 seems to have some conflicting requirements.  The
filtered property map response "MUST include all the inherited property
values for the requested entities and all the entities which are able to
inherit property values from the requested entities."  We then go on to
say that to do this, the server MAY follow three rules, that themselves
include SHOULD-level guidance, but don't say how the MUST is achieved if
the SHOULDs or MAY are ignored.  I was expecting to see a construction
of the form "SHOULD do X, but if not, MUST do Y".

(2) Many of the examples in Sections 10.X do not seem to match up with
the prose that describes them and the previous data tables that they are
intended to illustrate (see COMMENT).  We should make sure that the
examples are internally consistent.

(3) Section 4.6.2 says:

   *  Last, the entity domain types "asn" and "countrycode" defined in
      [I-D.ietf-alto-cdni-request-routing-alto] do not have a defining
      information resource.  Indeed, the entity identifiers in these two
      entity domain types are already standardized in documents that the
      Client can use.

But earlier we said that "the defining information resource of a
resource-specific entity domain D is unique", but this seems to be
saying that the defining information resource of domains of the "asn"
and "countrycode" type are *not* unique, by virtue of not existing at
all.  How can we reconcile these two statements?


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I suggest noting somewhere early-ish that the (semi-)formal notation
defined in Section 8.2 of RFC 7285 will be used.

Section 1

   properties.  Furthermore, recent ALTO use cases show that properties
   of entities such as network flows [RFC7011] and routing elements
   [RFC7921] are also useful.  Such cases are documented in
   [I-D.gao-alto-fcs].  The current EPS however is restricted to

This is probably more relevant as a comment on draft-gao-alto-fcs than
this document, but putting the ALTO server in a position to know about
individual flows seems like a big privacy risk, especially in the face
of pervasive monitoring (per RFC 7258).  It's not really clear that this
is actually a good idea to do, and thus whether we want to mention it
here.

Section 3.2.2

There seems to be an unfortunate risk of conflation of parsing as
((entity domain) name) vs (entity (domain name)), with domain name being
the widely-used term (see, e.g., RFC 8499).  Could we find some
alternate terminology that doesn't suffer from this potential confusion?

Section 4.4

   For some domain types, entities can be grouped in a set and be
   defined by the identifier of this set.  This is the case for domain

From a mathematical/set-theoretic perspective, this statement is
trivially true for all domain types; that's just how sets work.  I think
what we want to say here is that they can be efficiently grouped by
utilizing an underlying structure for the entities in the given domain
type.  That might become, for example, "For some domain types, there is
an underlying structure that allows entities to efficiently be grouped
into a set and be defined by the identifier of this set".

Section 4.6

   Besides, it is also necessary to inform a Client about which
   associations of specific resources and entity domain types are
   allowed, because it is not possible to prevent a Server from exposing
   inappropriate associations.  [...]

This reasoning is a bit hard for me to follow.  It's not possible to
prevent a server from exposing nonsensical things, sure.  But often we
would just define the correct operation of the protocol to be only
exposing things that make sense, and if the server is noncompliant to
the spec, things break accordingly.

Section 4.6.1

   The defining information resource of a resource-specific entity
   domain D is unique and has the following specificities:
   [...]
   *  its media type is unique and equal to the one that is specified
      for the defining information resource of an entity domain type.

I find this definition worrisome, as it seems to imply that the given
resource only has one media type, implicitly precluding the server from
ever exposing the representation of that resource via a different media
type.  This does not mesh well with my understanding of the HTTP
ecosystem and the guidelines for using HTTP as a substrate for building
other protocols.  For example, in draft-ietf-httpbis-semantics, we see
that the notion of an HTTP resource is in some sense an abstract
resource, and HTTP only conveys representations of that resource.  There
can inherently be multiple representations of a given resource, and
there is a media-type negotiation to determine what representation is
returned.  Likewise, in
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-bcp56bis#section-4.16
we see that it is an expected part of protocol evolution to define new
media types that enable new functionality by virtue of the new format,
even if the abstract resource being provided remains the same.

   A fundamental attribute of a defining information resource is its
   media type.  There is a unique association between an entity domain

Similarly here -- the media type is not a property of the resource, but
rather of a representation that is conveyed over HTTP.
[For brevity, I will try to not make further comments on this topic,
though I think that the theme continues in other locations in the
document.]

I think that what ALTO wants is a new conceptual "ALTO resource type"
that is distinct from an (HTTP) media type.  Unfortunately, I see that
RFC 7285 has already placed a strong emphasis on media types
specifically, so addressing this topic might best be done as a separate
work item.

Section 5.1.1

   [RFC8126] without a need to register with IANA.  All other entity
   domain types appearing in an HTTP request or response with an
   "application/alto-*" media type MUST be registered with the IANA,

That's a rather unusually specific requirement to have it conditioned
solely on the application/alto-* media type.  Often when we have such a
requirement we would consider adding a note to the registry to help
remind reviewers that the requirement exists, but I would rather
advocate for removing the media-type specificity of the requirement,
here.
(Also in §5.2.1.)

   A Private Use entity domain type identifier and its associated
   internal specification MUST apply to all the property maps of an IRD.

I don't think I understand what this requirement is trying to say.  The
best reading I can come up with so far is that it says "if there is a
private use type identifier presented in an IRD, that entity domain type
must be present in all of the property maps in the IRD".  On the face of
it, that seems like an absurd requirement to meet, though, since even
the primary types we're defining in this document are not going to all
apply to all property maps.
(Also in §5.2.1.)

   identifier) to reduce potential collisions.  The format of the entity
   identifiers (see Section 5.1.3) in that type of entity domain, as
   well as any hierarchical or inheritance rules (see Section 5.1.4) for
   those entities, MUST be specified at the same time.

What does "specified at the same time" mean?  In the same document?

Section 5.1.2.1

   A resource-specific entity domain is identified by an entity domain
   name constructed as follows.  It MUST start with a resource ID using
   the ResourceID type defined in Section 10.2 of [RFC7285], followed by
   the '.' separator (U+002E), followed by a string of the type
   EntityDomainType specified in Section 5.1.1.

This seems to disallow using the "priv:" form for (the "type" part of)
resource-specific entity domain names.  Is that really intended?
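
To make the question concrete, here is a minimal Python sketch of how a
client might split an entity domain name into its two parts.  It assumes
the split happens at the last '.' (so the type part contains no '.'),
which is an assumption and not something the quoted text states; the
function names and the example names "netmap1" and "priv:example" are
hypothetical.

```python
def split_entity_domain_name(name):
    """Split "<ResourceID>.<EntityDomainType>" into (resource_id, domain_type).

    Assumption: the type part never contains '.', so the last '.' is the
    separator.  A name with no '.' is treated as having no resource ID.
    """
    if "." not in name:
        return None, name
    resource_id, _, domain_type = name.rpartition(".")
    return resource_id, domain_type


def is_private_use(domain_type):
    # Private Use types are written with a leading "priv:".
    return domain_type.startswith("priv:")
```

With this split, "netmap1.priv:example" yields the type "priv:example";
whether the grammar is meant to accept that is the question raised
above.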

Section 5.1.2.3

   A self-defined entity domain can be viewed as a particular case of
   resource-specific entity domain, where the specific resource is the
   current resource that uses this entity domain.  In that case, for the
   sake of simplification, the component "ResourceID" SHOULD be omitted
   in its entity domain name.

Why do we need the flexibility of allowing both X.X and .X to represent
the same information?  Wouldn't it be simpler to only allow one form?
Having a single well-specified procedure tends to result in more secure
implementations.

Section 5.2.1

      However, when applied to an entity in a "pid" domain type, the
      property would indicate the location of the center of all hosts in
      this "pid" entity.

I'd consider saying something less specific, like "would indicate a
location representative of all hosts in this "pid" entity", avoiding the
term "center" that invites questions of how it is computed.  (Similarly,
§9.3 mentions the "barycenter" of a set of addresses.)

Section 5.2.2

   EntityPropertyName ::= [ResourceID]'.'[priv:]EntityPropertyType

Should the "priv:" component be quoted here as a literal?

Section 6.1.1.2

   Individual addresses are strings as specified by the IPv4Addresses
   rule of Section 3.2.2 of [RFC3986]; Hierarchical addresses are
   prefix-match strings as specified in Section 3.1 of [RFC4632].  To

The referenced section of RFC 4632 does not refer to "prefix-match
strings" at all (and only once to "match" at all, not directly related
to prefixes).  Please use the terminology of the referenced document to
indicate the functionality that is being integrated.

Section 6.1.2.2

   Individual addresses are strings as specified by Section 4 of
   [RFC5952]; Hierarchical addresses are prefix-match strings as
   specified in Section 7 of [RFC5952].  To define properties, an

Is this the right reference for prefix-matching IPv6 addresses?  Section
7 of RFC 5952 is quite short...

Section 6.1.3

   Both Internet address domains allow property values to be inherited.
   Specifically, if a property P is not defined for a specific Internet
   address I, but P is defined for a hierarchical Internet address C
   which prefix-matches I, then the address I inherits the value of P
   defined for the hierarchical address C.  If more than one such

I think there's some room for tightening up the terminology around
"hierarchical" addresses.  While we do use the term in some earlier
parts of the document, it's not specifically defined anywhere that I
found.  The usage in this section, on the other hand, seems like it
would be easy to replace with discussion of "prefix"es and avoid the
need for a new term.  If we do want to keep the "hierarchical" concept,
I strongly suggest adding some terminology section that specifically
defines what we use it to mean.
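
As an illustration of how the quoted rule reads when phrased purely in
terms of prefixes, here is a minimal Python sketch.  The quoted sentence
is cut off at "If more than one such", so the tie-break used here (the
longest matching prefix wins) is an assumption, as are the function name
and the placeholder property values.

```python
import ipaddress


def inherited_value(address, prop, prop_map):
    """Return `prop` for `address`, taken from the longest covering prefix.

    `prop_map` maps prefix strings to {property: value} dicts.
    Assumption: when several prefixes match, the longest one wins.
    """
    addr = ipaddress.ip_address(address)
    best_net, best_value = None, None
    for prefix, props in prop_map.items():
        net = ipaddress.ip_network(prefix)
        # Skip prefixes of the other address family or without the property.
        if prop not in props or net.version != addr.version:
            continue
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_value = net, props[prop]
    return best_value


# Hypothetical data; the values are placeholders, not the draft's table.
example_map = {
    "192.0.2.0/24": {"state": "A"},
    "192.0.2.0/27": {"state": "B"},
    "192.0.2.1/32": {"state": "C"},
}
```

Nothing in this sketch needs the word "hierarchical"; "prefix" and
"covers" are enough to state the rule.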

Section 7.5

   The "uses" field of a property map resource in an IRD entry specifies
   dependent resources of this property map.  It is an array of the
   resource ID(s) of the resource(s).

This doesn't seem to add anything above the definition of the "uses"
field from RFC 7285; shouldn't we be saying something about the defining
information resource for resource-specific domains/properties (in
addition to any other dependent resources)?

Section 8.3

                          If it is absent, the Server returns an empty
      property value '{}' for all the entity IDs of the "entities" field
      on which at least one property is defined.

Is that the literal string "{}" or an empty JSON object?
Given the SHOULD-level guidance we give elsewhere to assume that the
response is a string (or JSON null value, which both {} and "{}" are
not), it seems important to provide clarity on this matter.
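
The three candidates are distinct JSON values, which a short Python
sketch using the standard json module can show.  The variable names are
only illustrative.

```python
import json

# An empty JSON object, the two-character JSON string "{}", and JSON null
# decode to three different Python values.
empty_object = json.loads('{}')
literal_string = json.loads('"{}"')
null_value = json.loads('null')
```

A client that expects a string or null would see a dict in the first
case, so the text needs to say which one is meant.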

Section 8.6

Given that there are identifiers that can be interpreted as both an
entity name and a property name, and we have the same error code for
invalid entity identifier and invalid property name (with guidance to
return the invalid identifier in the error message), are we setting up
for a situation where the error message is ambiguous about which
interpretation of the string was invalid?

   *  The response only includes the entities and properties requested
      by the client.  If an entity in the request is identified by a
      hierarchical identifier (e.g., a "ipv4" or "ipv6" prefix), the
      response MUST cover properties for all identifiers in this
      hierarchical identifier.

Just to check: the intent here is that we return all properties that are
present on any address covered by the prefix, even though some of those
properties may not be present on all addresses covered by the prefix?

Section 10.x

I am not really an HTTP expert, but the content-lengths in these
examples seem to be based on byte counts with Unix-style line
separators, whereas (per draft-ietf-httpbis-messaging) my understanding
is that the values should be computed with CRLF as the line separator.
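
To show the size of the discrepancy, here is a minimal Python sketch
that counts the bytes of the same body under both line-separator
conventions.  The three-line body is hypothetical and not taken from the
draft, and the sketch takes no position on which convention the examples
ought to use.

```python
def content_length(lines, newline):
    """Byte length of a body whose lines are joined by `newline`."""
    return len(newline.join(lines).encode("utf-8"))


# Hypothetical three-line JSON body, not taken from the draft.
body_lines = ['{', '  "example": 1', '}']
lf_length = content_length(body_lines, "\n")
crlf_length = content_length(body_lines, "\r\n")
```

The two counts differ by one byte per line break, so a multi-line
example can be off by dozens of bytes depending on the convention.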

Section 10.2

   Beyond "pid", the examples in this section use four additional
   properties for Internet address domains, "ISP", "ASN", "country" and
   "state", with the following values:

Are these property names, types, or both?
Should we use "countrycode" that is defined by
draft-ietf-alto-cdni-request-routing-alto, rather than the very similar
sounding "country"?

Section 10.3

   The following IRD defines ALTO Server information resources that are
   relevant to the Entity Property Service.  It provides two property
   maps: one for the "ISP" and "ASN" properties, and another one for the
   "country" and "state" properties.  [...]

I may be misreading things, but I could only find the former of these
two.  I should be looking for resources that have the
"application/alto-propmap+json" media-type and do not accept parameters,
right?

   The server provides several filtered property maps.  The first
   returns all four properties, and the second returns only the "pid"
   property for the default network map.

Does it also return the "pid" property for the alt-network-map?

   The filtered property maps for the "ISP", "ASN", "country" and
   "state" properties do not depend on the default network map (it does
   not have a "uses" capability), because the definitions of those

I only see "ISP" showing up in the ia-property-map and the
iacs-property-map, both of which list "uses" for the
default-network-map.

   Note that for legacy clients, the ALTO server provides an Endpoint
   Property Service for the "pid" property defined on the endpoints of
   the default network map.

Also the alt-network-map?

I think there are a couple other property maps in the returned IRD that
are not mentioned in the prose at all (not sure if they need to be).

Section 10.4

   Note that, to be compact, the response does not include the entity
   "ipv4:192.0.2.0", because values of all those properties for this
   entity are inherited from other entities.

Is this really the single IP address, equivalent to 192.0.2.0/32?  I
don't see why it's special enough to get called out, as opposed to the
other addresses in 192.0.2.0/27.
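
For reference, a minimal Python sketch with the standard ipaddress
module shows how a bare address relates to the /27 mentioned above.  The
variable names are only illustrative.

```python
import ipaddress

# A bare IPv4 address parsed as a network is a single-address /32.
single = ipaddress.ip_network("192.0.2.0")
block = ipaddress.ip_network("192.0.2.0/27")

is_host_prefix = single.prefixlen == 32
is_inside_block = single.subnet_of(block)
addresses_in_block = block.num_addresses
```

The /27 contains 32 addresses, and 192.0.2.0 is just one of them.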

                                                    The same rule
   applies to the entities "ipv4:192.0.3.0/28" and "ipv4:192.0.3.0/28".

Should one of these be 192.0.3.16/28?

Section 10.5

   Note that the value of "state" for "ipv4:192.0.2.0" is the only
   explicitly defined property; the other values are all derived by the
   inheritance rules for Internet address entities.

I think the .2.1 is explicitly defined and .2.0 is inherited...

     "property-map": {
       "ipv4:192.0.2.0":
              {".ISP": "BitsRus", ".ASN": "65543", ".state": "PA"},
       "ipv4:192.0.2.1":
              {".ISP": "BitsRus", ".ASN": "65543", ".state": "NJ"},
       "ipv4:192.0.2.17":
              {".ISP": "BitsRus", ".ASN": "65543", ".state": "CT"}

...and my reading of the table in §10.2 would have .2.0 as NJ and .2.1
as PA.

Section 10.6

       "ipv4:192.0.2.0":     {".state": "PA"},

As above, I think this has to be .2.1.

       "ipv4:192.0.3.0/28":  {".ASN": "65543",
                              ".state": "TX"},
       "ipv4:192.0.3.16/28": {".ASN": "65543",
                              ".state": "MN"}

These ASNs should be 65544.

Section 11

   endpoint properties conveyed by using [RFC7285].  Client requests may
   reveal details on their activity or plans thereof, that a malicious
   user may monetize or use for attacks or undesired surveillance.

This would be a malicious Server that's in a position to do so, right
(vs. "user")?

Section 12.1

   Security considerations:
      Security considerations related to the generation and consumption
      of ALTO Protocol messages are discussed in Section 15 of
      [RFC7285].

I think we should also reference Section 11 of this document as having
relevant considerations.

Section 12.2, 12.3

Should we write "priv:*" or some other wildcard to indicate that this
entry is for the class of identifiers beginning with that prefix, and
not the literal identifier "priv:"?

Section 14.1

The RFC 5246 reference is unused (in favor of RFC 8446, thanks!).

Section 14.2

If it's RECOMMENDED to use the RFC 8895 mechanisms, that seems to
promote 8895 to be a normative reference, per
https://www.ietf.org/about/groups/iesg/statements/normative-informative-references/

NITS

Section 1

   and IPv6 addresses.  It is reasonable to think that collections of
   endpoints, as defined by CIDRs [RFC4632] or PIDs, may also have

We haven't defined PIDs yet.

   At first, a map of endpoint properties might seem impractical,
   because it could require enumerating the property value for every
   possible endpoint.  However, in practice, the number of endpoint
   addresses involved by an ALTO server can be quite large.  To avoid

This doesn't seem like a "however" that contrasts the previous point;
rather, it seems like an "in particular" that expounds on the scale of
impracticality.

      and Filtered Property Map, detailed in Section 8.  The former is a
      GET-mode resource that returns the property values for all
      entities in one or more entity domains, and is analogous to a
      network map or a cost map in [RFC7285].  The latter is a POST-mode

The terms "GET-mode" and "POST-mode" don't seem to be defined or used in
RFC 7285, so we probably need to introduce them here if we're going to
use them.

Section 4.4

   grouped in blocks.  When a same property value applies to a whole
   set, a Server can define a property for the identifier of this set
   instead of enumerating all the entities and their properties.  This

s/a same/the same/

Section 4.4.1

   An entity domain may allow using a single identifier to identify a
   set of individual entities.  For example, a CIDR block can be used to

I suggest "set of related individual entities".

Section 5.2.2


   The specific information resource of an entity property may be the
   current information resource itself, that is, the property map
   defining the property.  In that case, the ResourceID in the property
   name SHOULD be ignored.  For example, the property name ".asn"

I think s/ignored/omitted/.

Section 6.1.3

   Hierarchical addresses can also inherit properties: if a property P
   is not defined for the hierarchical address C, but is defined for a
   set of hierarchical addresses, where each address C' in the set
   covers all IP addresses in C, and C' has a shorter prefix length than

I think the usage of "set" and "covers" is unclear here.
(Also, pedantically, the empty set is a set, and any statement of the
form "<X> holds for each element of the set" is true for the empty set,
but there is no C' in the empty set to have a shorter prefix length than
C.)
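
The empty-set point can be seen in a minimal Python sketch of the
quoted condition.  The function name is hypothetical, and the sketch
reads "covers" as "the target is a subnet of the candidate", which is an
interpretation of the quoted text.

```python
import ipaddress


def every_candidate_covers(candidates, target):
    """True if each candidate prefix covers `target` with a shorter length.

    All prefixes are assumed to be of the same address family.
    """
    t = ipaddress.ip_network(target)
    return all(
        t.subnet_of(ipaddress.ip_network(c))
        and ipaddress.ip_network(c).prefixlen < t.prefixlen
        for c in candidates
    )
```

Called with an empty list of candidates, the function returns True,
because all() over no elements is vacuously true.  The text probably
wants to require a non-empty set.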

   *  If that entity would inherit a value for that property, then the
      ALTO server MUST return a "null" value for that property.  In this

This is the JSON "null" value, at least for the currently defined media
types, right?  That might be worth clarifying (while retaining the
generic nature not tied to a JSON representation).

   *  If the entity would not inherit a value, then the ALTO server MAY
      return "null" or just omit the property.  In this case, the ALTO
      client cannot infer the value for this property of this entity
      from the Inheritance rules.  So the client MUST interpret that
      this property has no value.

This probably doesn't need to be a BCP 14 keyword, as the behavior
follows from the other required parts of the spec.

Section 7.6

      entity does.  The ALTO client MUST ignore any resource-specific
      property for this entity if its mapping is not indicated, in the
      IRD, in the "mappings" capability of the property map resource.

The pronoun "its" might be anti-helpful here, as (if I understand
correctly) we mean to say that the entity domain that's the defining
information resource for this resource-specific property is what's
listed in the capabilities map, but "its" leaves the exact relation a
bit under-specified.

Section 8

   query.  To support such a case, the filtered property map provides a
   light weight response, with empty property values.

This might be (mis?)read as saying that the filtered property map
*always* provides a response with empty property values.  So I'd suggest
adding a qualifier, like "provides a facility for" or "supports a
lightweight response".

Section 8.1

   The media type of a property map resource is "application/alto-
   propmap+json".

Do we want "filtered" here?

Section 8.3

   ReqFilteredPropertyMap.  The design of object ReqFilteredPropertyMap
   supports the following cases of client requests:

The grammar seems off, here -- maybe "the object" or just
"ReqFilteredPropertyMap is designed to support"?

Section 8.6

   *  When the input member "properties" is absent from the client
      request, the Server returns a property map containaing all the

s/containaing/containing/