[alto] Benjamin Kaduk's Discuss on draft-ietf-alto-unified-props-new-21: (with DISCUSS and COMMENT)
Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 01 December 2021 23:00 UTC
Return-Path: <noreply@ietf.org>
X-Original-To: alto@ietf.org
Delivered-To: alto@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 3A2993A0CBE; Wed, 1 Dec 2021 15:00:29 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-alto-unified-props-new@ietf.org, alto-chairs@ietf.org, alto@ietf.org, Vijay Gurbani <vijay.gurbani@gmail.com>, vijay.gurbani@gmail.com
X-Test-IDTracker: no
X-IETF-IDTracker: 7.40.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <163839962921.6785.17983443500874892419@ietfa.amsl.com>
Date: Wed, 01 Dec 2021 15:00:29 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/7nP0BzW7_edJq_BZfVx4P4p5yPc>
Subject: [alto] Benjamin Kaduk's Discuss on draft-ietf-alto-unified-props-new-21: (with DISCUSS and COMMENT)
X-BeenThere: alto@ietf.org
X-Mailman-Version: 2.1.29
List-Id: "Application-Layer Traffic Optimization \(alto\) WG mailing list" <alto.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/alto>, <mailto:alto-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/alto/>
List-Post: <mailto:alto@ietf.org>
List-Help: <mailto:alto-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/alto>, <mailto:alto-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 01 Dec 2021 23:00:29 -0000
Benjamin Kaduk has entered the following ballot position for
draft-ietf-alto-unified-props-new-21: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-alto-unified-props-new/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

(1) Section 8.6 seems to have some conflicting requirements. The filtered property map response "MUST include all the inherited property values for the requested entities and all the entities which are able to inherit property values from the requested entities." We then go on to say that to do this, the server MAY follow three rules, that themselves include SHOULD-level guidance, but don't say how the MUST is achieved if the SHOULDs or MAY are ignored. I was expecting to see a construction of the form "SHOULD do X, but if not, MUST do Y".

(2) Many of the examples in Sections 10.X do not seem to match up with the prose that describes them and the previous data tables that they are intended to illustrate (see COMMENT). We should make sure that the examples are internally consistent.

(3) Section 4.6.2 says:

   * Last, the entity domain types "asn" and "countrycode" defined in [I-D.ietf-alto-cdni-request-routing-alto] do not have a defining information resource. Indeed, the entity identifiers in these two entity domain types are already standardized in documents that the Client can use.
But earlier we said that "the defining information resource of a resource-specific entity domain D is unique", but this seems to be saying that the defining information resource of domains of the "asn" and "countrycode" type are *not* unique, by virtue of not existing at all. How can we rectify these two statements?

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I suggest noting somewhere early-ish that the (semi-)formal notation defined in Section 8.2 of RFC 7285 will be used.

Section 1

   properties. Furthermore, recent ALTO use cases show that properties of entities such as network flows [RFC7011] and routing elements [RFC7921] are also useful. Such cases are documented in [I-D.gao-alto-fcs]. The current EPS however is restricted to

This is probably more relevant as a comment on draft-gao-alto-fcs than this document, but putting the ALTO server in a position to know about individual flows seems like a big privacy risk, especially in the face of pervasive monitoring (per RFC 7258). It's not really clear that this is actually a good idea to do, and thus whether we want to mention it here.

Section 3.2.2

There seems to be an unfortunate risk of conflation of parsing as ((entity domain) name) vs (entity (domain name)), with domain name being the widely-used term (see, e.g., RFC 8499). Could we find some alternate terminology that doesn't suffer from this potential confusion?

Section 4.4

   For some domain types, entities can be grouped in a set and be defined by the identifier of this set. This is the case for domain

From a mathematical/set-theoretic perspective, this statement is trivially true for all domain types; that's just how sets work. I think what we want to say here is that they can be efficiently grouped by utilizing an underlying structure for the entities in the given domain type.
That might become, for example, "For some domain types, there is an underlying structure that allows entities to efficiently be grouped into a set and be defined by the identifier of this set".

Section 4.6

   Besides, it is also necessary to inform a Client about which associations of specific resources and entity domain types are allowed, because it is not possible to prevent a Server from exposing inappropriate associations. [...]

This reasoning is a bit hard for me to follow. It's not possible to prevent a server from exposing nonsensical things, sure. But often we would just define the correct operation of the protocol to be only exposing things that make sense, and if the server is noncompliant to the spec, things break accordingly.

Section 4.6.1

   The defining information resource of a resource-specific entity domain D is unique and has the following specificities:
   [...]
   * its media type is unique and equal to the one that is specified for the defining information resource of an entity domain type.

I find this definition worrisome, as it seems to imply that the given resource only has one media type, implicitly precluding the server from ever exposing the representation of that resource via a different media type. This does not mesh well with my understanding of the HTTP ecosystem and the guidelines for using HTTP as a substrate for building other protocols. For example, in draft-ietf-httpbis-semantics, we see that the notion of an HTTP resource is in some sense an abstract resource, and HTTP only conveys representations of that resource. There can inherently be multiple representations of a given resource, and there is a media-type negotiation to determine what representation is returned.
Likewise, in https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-bcp56bis#section-4.16 we see that it is an expected part of protocol evolution to define new media types that enable new functionality by virtue of the new format, even if the abstract resource being provided remains the same.

   A fundamental attribute of a defining information resource is its media type. There is a unique association between an entity domain

Similarly here -- the media type is not a property of the resource, but rather of a representation that is conveyed over HTTP.

[For brevity, I will try to not make further comments on this topic, though I think that the theme continues in other locations in the document.]

I think that what ALTO wants is a new conceptual "ALTO resource type" that is distinct from an (HTTP) media type. Unfortunately, I see that RFC 7285 has already placed a strong emphasis on media types specifically, so addressing this topic might best be done as a separate work item.

Section 5.1.1

   [RFC8126] without a need to register with IANA. All other entity domain types appearing in an HTTP request or response with an "application/alto-*" media type MUST be registered with the IANA,

That's a rather unusually specific requirement to have it conditioned solely on the application/alto-* media type. Often when we have such a requirement we would consider adding a note to the registry to help remind reviewers that the requirement exists, but I would rather advocate for removing the media-type specificity of the requirement, here. (Also in §5.2.1.)

   A Private Use entity domain type identifier and its associated internal specification MUST apply to all the property maps of an IRD.

I don't think I understand what this requirement is trying to say. The best reading I can come up with so far is that it says "if there is a private use type identifier presented in an IRD, that entity domain type must be present in all of the property maps in the IRD."
On the face of it, that seems like an absurd requirement to meet, though, since even the primary types we're defining in this document are not going to all apply to all property maps. (Also in §5.2.1.)

   identifier) to reduce potential collisions. The format of the entity identifiers (see Section 5.1.3) in that type of entity domain, as well as any hierarchical or inheritance rules (see Section 5.1.4) for those entities, MUST be specified at the same time.

What does "specified at the same time" mean? In the same document?

Section 5.1.2.1

   A resource-specific entity domain is identified by an entity domain name constructed as follows. It MUST start with a resource ID using the ResourceID type defined in Section 10.2 of [RFC7285], followed by the '.' separator (U+002E), followed by a string of the type EntityDomainType specified in Section 5.1.1.

This seems to disallow using the "priv:" form for (the "type" part of) resource-specific entity domain names. Is that really intended?

Section 5.1.2.3

   A self-defined entity domain can be viewed as a particular case of resource-specific entity domain, where the specific resource is the current resource that uses this entity domain. In that case, for the sake of simplification, the component "ResourceID" SHOULD be omitted in its entity domain name.

Why do we need the flexibility of allowing both X.X and .X to represent the same information? Wouldn't it be simpler to only allow one form? Having a single well-specified procedure tends to result in more secure implementations.

Section 5.2.1

   However, when applied to an entity in a "pid" domain type, the property would indicate the location of the center of all hosts in this "pid" entity.

I'd consider saying something less specific, like "would indicate a location representative of all hosts in this "pid" entry", avoiding the term "center" that invites questions of how it is computed. (Similarly, §9.3 mentions the "barycenter" of a set of addresses.)
Section 5.2.2

   EntityPropertyName ::= [ResourceID]'.'[priv:]EntityPropertyType

Should the "priv:" component be quoted here as a literal?

Section 6.1.1.2

   Individual addresses are strings as specified by the IPv4Addresses rule of Section 3.2.2 of [RFC3986]; Hierarchical addresses are prefix-match strings as specified in Section 3.1 of [RFC4632]. To

The referenced section of RFC 4632 does not refer to "prefix-match strings" at all (and only once to "match" at all, not directly related to prefixes). Please use the terminology of the referenced document to indicate the functionality that is being integrated.

Section 6.1.2.2

   Individual addresses are strings as specified by Section 4 of [RFC5952]; Hierarchical addresses are prefix-match strings as specified in Section 7 of [RFC5952]. To define properties, an

Is this the right reference for prefix-matching IPv6 addresses? Section 7 of RFC 5952 is quite short...

Section 6.1.3

   Both Internet address domains allow property values to be inherited. Specifically, if a property P is not defined for a specific Internet address I, but P is defined for a hierarchical Internet address C which prefix-matches I, then the address I inherits the value of P defined for the hierarchical address C. If more than one such

I think there's some room for tightening up the terminology around "hierarchical" addresses. While we do use the term in some earlier parts of the document, it's not specifically defined anywhere that I found. The usage in this section, on the other hand, seems like it would be easy to replace with discussion of "prefix"es and avoid the need for a new term. If we do want to keep the "hierarchical" concept, I strongly suggest adding some terminology section that specifically defines what we use it to mean.

Section 7.5

   The "uses" field of a property map resource in an IRD entry specifies dependent resources of this property map. It is an array of the resource ID(s) of the resource(s).
This doesn't seem to add anything above the definition of the "uses" field from RFC 7285; shouldn't we be saying something about the defining information resource for resource-specific domains/properties (in addition to any other dependent resources)?

Section 8.3

   If it is absent, the Server returns an empty property value '{}' for all the entity IDs of the "entities" field on which at least one property is defined.

Is that the literal string "{}" or an empty JSON object? Given the SHOULD-level guidance we give elsewhere to assume that the response is a string (or JSON null value, which both {} and "{}" are not), it seems important to provide clarity on this matter.

Section 8.6

Given that there are identifiers that can be interpreted as both an entity name and a property name, and we have the same error code for invalid entity identifier and invalid property name (with guidance to return the invalid identifier in the error message), are we setting up for a situation where the error message is ambiguous about which interpretation of the string was invalid?

   * The response only includes the entities and properties requested by the client. If an entity in the request is identified by a hierarchical identifier (e.g., a "ipv4" or "ipv6" prefix), the response MUST cover properties for all identifiers in this hierarchical identifier.

Just to check: the intent here is that we return all properties that are present on any address covered by the prefix, even though some of those properties may not be present on all addresses covered by the prefix?

Section 10.x

I am not really an HTTP expert, but the content-lengths in these examples seem to be based on byte counts with Unix-style line separators, whereas (per draft-ietf-httpbis-messaging) my understanding is that the values should be computed with CRLF as the line separator.
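To make the size of that discrepancy concrete, here is a minimal Python sketch of how the two counting conventions diverge. The JSON body below is a made-up placeholder, not one of the draft's examples, and the helper name content_length is hypothetical. The sketch only shows that the two counts differ by one byte per line; it does not settle which convention the draft's examples ought to use.

```python
# Hypothetical request body, NOT copied from the draft's Section 10 examples.
body_lines = [
    "{",
    '  "entities": ["ipv4:192.0.2.0/27"],',
    '  "properties": [".ISP"]',
    "}",
]


def content_length(lines, newline):
    """Return the UTF-8 byte length of the body when every line ends with `newline`."""
    return len("".join(line + newline for line in lines).encode("utf-8"))


# Byte count with Unix-style (LF) line separators.
lf_length = content_length(body_lines, "\n")

# Byte count with CRLF line separators.
crlf_length = content_length(body_lines, "\r\n")

# CRLF adds exactly one extra byte per line of the body.
print(lf_length, crlf_length, crlf_length - lf_length)
```

If the examples were counted with LF, a body of N lines would be N bytes short of the CRLF-based value, which may be a quick way to check each example.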
Section 10.2

   Beyond "pid", the examples in this section use four additional properties for Internet address domains, "ISP", "ASN", "country" and "state", with the following values:

Are these property names, types, or both?

Should we use "countrycode" that is defined by draft-ietf-alto-cdni-request-routing-alto, rather than the very similar sounding "country"?

Section 10.3

   The following IRD defines ALTO Server information resources that are relevant to the Entity Property Service. It provides two property maps: one for the "ISP" and "ASN" properties, and another one for the "country" and "state" properties. [...]

I may be misreading things, but I could only find the former of these two. I should be looking for resources that have the "application/alto-propmap+json" media-type and do not accept parameters, right?

   The server provides several filtered property maps. The first returns all four properties, and the second returns only the "pid" property for the default network map.

Does it also return the "pid" property for the alt-network-map?

   The filtered property maps for the "ISP", "ASN", "country" and "state" properties do not depend on the default network map (it does not have a "uses" capability), because the definitions of those

I only see "ISP" showing up in the ia-property-map and the iacs-property-map, both of which list "uses" for the default-network-map.

   Note that for legacy clients, the ALTO server provides an Endpoint Property Service for the "pid" property defined on the endpoints of the default network map.

Also the alt-network-map?

I think there are a couple other property maps in the returned IRD that are not mentioned in the prose at all (not sure if they need to be).

Section 10.4

   Note that, to be compact, the response does not include the entity "ipv4:192.0.2.0", because values of all those properties for this entity are inherited from other entities.

Is this really the single IP address, equivalent to 192.0.2.0/32?
I don't see why it's special enough to get called out, as opposed to the other addresses in 192.0.2.0/27.

   The same rule applies to the entities "ipv4:192.0.3.0/28" and "ipv4:192.0.3.0/28".

Should one of these be 192.0.3.16/28?

Section 10.5

   Note that the value of "state" for "ipv4:192.0.2.0" is the only explicitly defined property; the other values are all derived by the inheritance rules for Internet address entities.

I think the .2.1 is explicitly defined and .2.0 is inherited...

   "property-map": {
     "ipv4:192.0.2.0": {".ISP": "BitsRus", ".ASN": "65543", ".state": "PA"},
     "ipv4:192.0.2.1": {".ISP": "BitsRus", ".ASN": "65543", ".state": "NJ"},
     "ipv4:192.0.2.17": {".ISP": "BitsRus", ".ASN": "65543", ".state": "CT"}

...and my reading of the table in §10.2 would have .2.0 as NJ and .2.1 as PA.

Section 10.6

   "ipv4:192.0.2.0": {".state": "PA"},

As above, I think this has to be .2.1.

   "ipv4:192.0.3.0/28": {".ASN": "65543", ".state": "TX"},
   "ipv4:192.0.3.16/28": {".ASN": "65543", ".state": "MN"}

These ASNs should be 65544.

Section 11

   endpoint properties conveyed by using [RFC7285]. Client requests may reveal details on their activity or plans thereof, that a malicious user may monetize or use for attacks or undesired surveillance.

This would be a malicious Server that's in a position to do so, right (vs. "user")?

Section 12.1

   Security considerations: Security considerations related to the generation and consumption of ALTO Protocol messages are discussed in Section 15 of [RFC7285].

I think we should also reference Section 11 of this document as having relevant considerations.

Section 12.2, 12.3

Should we write "priv:*" or some other wildcard to indicate that this entry is for the class of identifiers beginning with that prefix, and not the literal identifier "priv:"?

Section 14.1

The RFC 5246 is unused (in favor of RFC 8446, thanks!).
Section 14.2

If it's RECOMMENDED to use the RFC 8895 mechanisms, that seems to promote 8895 to be a normative reference, per https://www.ietf.org/about/groups/iesg/statements/normative-informative-references/

NITS

Section 1

   and IPv6 addresses. It is reasonable to think that collections of endpoints, as defined by CIDRs [RFC4632] or PIDs, may also have

We haven't defined PIDs yet.

   At first, a map of endpoint properties might seem impractical, because it could require enumerating the property value for every possible endpoint. However, in practice, the number of endpoint addresses involved by an ALTO server can be quite large. To avoid

This doesn't seem like a "however" that contrasts the previous point; rather, it seems like an "in particular" that expounds on the scale of impracticality.

   and Filtered Property Map, detailed in Section 8. The former is a GET-mode resource that returns the property values for all entities in one or more entity domains, and is analogous to a network map or a cost map in [RFC7285]. The latter is a POST-mode

The terms "GET-mode" and "POST-mode" don't seem to be defined or used in RFC 7285, so we probably need to introduce them here if we're going to use them.

Section 4.4

   grouped in blocks. When a same property value applies to a whole set, a Server can define a property for the identifier of this set instead of enumerating all the entities and their properties. This

s/a same/the same/

Section 4.4.1

   An entity domain may allow using a single identifier to identify a set of individual entities. For example, a CIDR block can be used to

I suggest "set of related individual entities".

Section 5.2.2

   The specific information resource of an entity property may be the current information resource itself, that is, the property map defining the property. In that case, the ResourceID in the property name SHOULD be ignored. For example, the property name ".asn"

I think s/ignored/omitted/.
Section 6.1.3

   Hierarchical addresses can also inherit properties: if a property P is not defined for the hierarchical address C, but is defined for a set of hierarchical addresses, where each address C' in the set covers all IP addresses in C, and C' has a shorter prefix length than

I think the usage of "set" and "covers" is unclear here. (Also, pedantically, the empty set is a set, and any statement of the form "<X> holds for each element of the set" is true for the empty set, but there is no C' in the empty set to have a shorter prefix length than C.)

   * If that entity would inherit a value for that property, then the ALTO server MUST return a "null" value for that property. In this

This is the JSON "null" value, at least for the currently defined media types, right? That might be worth clarifying (while retaining the generic nature not tied to a JSON representation).

   * If the entity would not inherit a value, then the ALTO server MAY return "null" or just omit the property. In this case, the ALTO client cannot infer the value for this property of this entity from the Inheritance rules. So the client MUST interpret that this property has no value.

This probably doesn't need to be a BCP 14 keyword, as the behavior follows from the other required parts of the spec.

Section 7.6

   entity does. The ALTO client MUST ignore any resource-specific property for this entity if its mapping is not indicated, in the IRD, in the "mappings" capability of the property map resource.

The pronoun "its" might be anti-helpful here, as (if I understand correctly) we mean to say that the entity domain that's the defining information resource for this resource-specific property is what's listed in the capabilities map, but "its" leaves the exact relation a bit under-specified.

Section 8

   query. To support such a case, the filtered property map provides a light weight response, with empty property values.
This might be (mis?)read as saying that the filtered property map *always* provides a response with empty property values. So I'd suggest adding a qualifier, like "provides a facility for" or "supports a lightweight response".

Section 8.1

   The media type of a property map resource is "application/alto-propmap+json".

Do we want "filtered" here?

Section 8.3

   ReqFilteredPropertyMap. The design of object ReqFilteredPropertyMap supports the following cases of client requests:

The grammar seems off, here -- maybe "the object" or just "ReqFilteredPropertyMap is designed to support"?

Section 8.6

   * When the input member "properties" is absent from the client request, the Server returns a property map containaing all the

s/containaing/containing/