Re: [regext] Benjamin Kaduk's Discuss on draft-ietf-regext-rdap-sorting-and-paging-17: (with DISCUSS and COMMENT)

Mario Loffredo <mario.loffredo@iit.cnr.it> Fri, 02 October 2020 06:41 UTC

To: Benjamin Kaduk <kaduk@mit.edu>
Cc: regext-chairs@ietf.org, draft-ietf-regext-rdap-sorting-and-paging@ietf.org, The IESG <iesg@ietf.org>, Tom Harrison <tomh@apnic.net>, regext@ietf.org
References: <160089722480.18312.1611285341459635513@ietfa.amsl.com> <78b84143-eecd-ea03-a4db-077dd9920dc4@iit.cnr.it> <20200929001425.GN89563@kduck.mit.edu> <c76db7f1-8c88-142b-5242-49e0a62e9b38@iit.cnr.it> <20201002022020.GR89563@kduck.mit.edu>
From: Mario Loffredo <mario.loffredo@iit.cnr.it>
Message-ID: <69b025f9-89c7-932f-5707-b6b31e492565@iit.cnr.it>
Date: Fri, 02 Oct 2020 08:37:42 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0
MIME-Version: 1.0
In-Reply-To: <20201002022020.GR89563@kduck.mit.edu>
Content-Type: text/plain; charset="iso-8859-15"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: it
Archived-At: <https://mailarchive.ietf.org/arch/msg/regext/nuRZMusTGON1CDYUFMX-ekQxZ9A>
Subject: Re: [regext] Benjamin Kaduk's Discuss on draft-ietf-regext-rdap-sorting-and-paging-17: (with DISCUSS and COMMENT)

Hi Ben,

Thanks again for your careful feedback. I'll publish -18 as soon as possible.

Best,

Mario

On 02/10/2020 04:20, Benjamin Kaduk wrote:
> Hi Mario,
>
> Not a whole lot left to say, but what there is is inline.
>
> On Wed, Sep 30, 2020 at 01:28:53PM +0200, Mario Loffredo wrote:
>> Hi Ben,
>>
>> thanks a lot for your quick reply to my responses. My comments are inline.
>>
>> On 29/09/2020 02:14, Benjamin Kaduk wrote:
>>> Hi Mario,
>>>
>>> Also inline.
>>>
>>> On Sat, Sep 26, 2020 at 03:56:03PM +0200, Mario Loffredo wrote:
>>>> Hi Benjamin,
>>>>
>>>> thanks a lot for your extensive review. I apologize for the delay in
>>>> replying but I have been very busy the last two days and your feedback
>>>> is very detailed.
>>>>
>>>> Please find my comments inline.
>>>>
>>>> On 23/09/2020 23:40, Benjamin Kaduk via Datatracker wrote:
>>>>> Benjamin Kaduk has entered the following ballot position for
>>>>> draft-ietf-regext-rdap-sorting-and-paging-17: Discuss
>>>>>
>>>>> When responding, please keep the subject line intact and reply to all
>>>>> email addresses included in the To and CC lines. (Feel free to cut this
>>>>> introductory paragraph, however.)
>>>>>
>>>>>
>>>>> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>>>
>>>>>
>>>>> The document, along with other ballot positions, can be found here:
>>>>> https://datatracker.ietf.org/doc/draft-ietf-regext-rdap-sorting-and-paging/
>>>>>
>>>>>
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>> DISCUSS:
>>>>> ----------------------------------------------------------------------
>>>>>
>>>>> Should we say something about which order the sorting criteria are
>>>>> applied (first to last vs last to first) when multiple sortItems are
>>>>> specified in a query?
>>>> [ML] The common interpretation is from left to right so I don't think we
>>>> need to clarify this concept.
>>> I think I can accept not saying more on this subject, but I am curious:
>>> when you say left to right, that means that the leftmost parameter is
>>> higher priority?  So that, to give a totally contrived example, if I had
>>> pairs of (name, id), a query with &sort=name;&sort=id; would give:
>>>
>>> ("alpha", 10)
>>> ("alpha", 20)
>>> ("beta", 10)
>>> ("beta", 20)
>> [ML]  Exactly.
>>
>> One minor comment: the correct notation would be &sort=name,id
> This is what I get for writing email without consulting the document;
> thanks for spotting it.
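
A minimal sketch of the left-to-right priority confirmed above, assuming the contrived (name, id) pairs from this thread. The helper sort_by and the index mapping are hypothetical names used only for illustration; they are not part of the draft.

```python
# Contrived (name, id) records, as in the example discussed in this thread.
records = [("beta", 20), ("alpha", 20), ("beta", 10), ("alpha", 10)]

# Hypothetical mapping from sorting property name to tuple position.
INDEX = {"name": 0, "id": 1}

def sort_by(rows, properties):
    """Sort rows so that the leftmost property has the highest priority."""
    return sorted(rows, key=lambda r: tuple(r[INDEX[p]] for p in properties))

# Equivalent of the query parameter sort=name,id
by_name_then_id = sort_by(records, "name,id".split(","))
# Equivalent of sort=id,name, shown only for contrast
by_id_then_name = sort_by(records, "id,name".split(","))
print(by_name_then_id)
print(by_id_then_name)
```

Building one composite key per row makes the priority explicit: a later property only breaks ties left by the earlier ones.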
>
>>>>> I recognize that in the HATEOAS model, the actual JSONPaths reported by
>>>>> the server should be used by the client to determine what a given sort
>>>>> property does, but it also seems like it would be confusing for this
>>>>> document to specify (e.g.) an "email" property with specific JSONPath,
>>>>> and then have a server go off and use "email" to mean something else,
>>>>> even if that is just the addition of "pref" as discussed at the end of
>>>>> Section 2.3.1.  Do we want to try to have the properties defined by this
>>>>> document be universally defined and encourage the use of new/different
>>>>> property names for variations on them?  (The answer may well be "no",
>>>>> but the answer is not intuitively clear to me.)  To put it another way,
>>>>> is the list in Section 2.3.1 normative, or just an example?
>>>> [ML] I would say "normative" just to facilitate interoperability and
>>>> avoid ambiguities. Maybe it could be enough to say that the sorting
>>>> properties defined in the document are considered reserved so an RDAP
>>>> server MUST NOT map them onto other RDAP response values.
>>>>
>>>> Does it work for you?
>>> Yes, that would work for me.  Thanks!
>> [ML] Perfect.
>>>>> ----------------------------------------------------------------------
>>>>> COMMENT:
>>>>> ----------------------------------------------------------------------
>>>>>
>>>>> Section 1
>>>>>
>>>>>       However, there are some drawbacks associated with the use of the HTTP
>>>>>       header.  First, the header properties cannot be set directly from a
>>>>>       web browser.  Moreover, in an HTTP session, the information on the
>>>>>       status (i.e. the session identifier) is usually inserted in the
>>>>>       header or a cookie, while the information on the resource
>>>>>       identification or the search type is included in the query string.
>>>>>       The second approach is therefore not compliant with the HTTP standard
>>>>>       [RFC7230].  As a result, this document describes a specification
>>>>>       based on the use of query parameters.
>>>>>
>>>>> A few more words (section number from 7230?) on why the second approach
>>>>> is not compliant with HTTP might help the reader, though it isn't
>>>>> strictly necessary (we're not using it, after all).
>>>> [ML] Would it be better to replace RFC7230 with RFC7231 and add a
>>>> reference to Section 8.3.1
>>>> (https://tools.ietf.org/html/rfc7231#section-8.3.1)?
>>> It might be, but I think I still don't understand why using the HTTP header
>>> field for sorting and paging information is not compliant with HTTP -- the
>>> linked section says that header fields can be used to communicate
>>> information about the target resource, which IIUC includes the resource as
>>> qualified by the query string.  (But note that I am not an HTTP expert...)
>> [ML]  Would it be more appropriate to update the sentence as in the
>> following?
>>
>> OLD
>>
>> The second approach is therefore not compliant with the HTTP standard
>> [RFC7230]
>>
>> NEW
>>
>> The second approach is therefore not compliant with the most common
>> practices about the usage of the HTTP headers [RFC7231]
> That's better, though it does not help the reader find out what those
> "common practices" are.  (It suffices to justify the choice, though, so I
> won't press the point further.)
>
>>>>> Section 2.1
>>>>>
>>>>>          *  "jsonPath": "String" (OPTIONAL) the JSONPath of the RDAP field
>>>>>             corresponding to the property;
>>>>>
>>>>> What is this path relative to?  (Does the client have to know from the
>>>>> other context what type of object it refers to?)
>>>> [ML]  All the JSONPath expressions defined in the document are relative
>>>> to the root of an RDAP response.  The sorting_metadata object is
>>>> included in the same response so I think that the context is clear and
>>>> no further clarification is needed.
>>> Now that you mention it, I do recall other discussion of paths being
>>> relative to the root; my apologies for the noise.
>> [ML] You are welcome.
>>>>>          *  "links": "Link[]" (OPTIONAL) an array of links as described in
>>>>>             [RFC8288] containing the query string that applies the sort
>>>>>             criterion.
>>>>>
>>>>> Just to check: this is going to have the same structure for a Link
>>>>> object that draft-ietf-regext-rdap-partial-response does?  (I am not
>>>>> coming up with a great way to deduplicate the definitions, off the top
>>>>> of my head.)
>>>> [ML] Yes. The sorting links have the same structure as the subsetting
>>>> links (see Section 2.3.2.).
>>>>>       o  "pageSize": "Numeric" (OPTIONAL) a numeric value representing the
>>>>>          number of objects returned in the current page.  It MUST be
>>>>>          provided if and only if the total number of objects exceeds the
>>>>>          page size.  This property is redundant for RDAP clients because
>>>>>          the page size can be derived from the length of the search results
>>>>>          array but, it can be helpful if the end user interacts with the
>>>>>          server through a web browser;
>>>>>
>>>>> If it's redundant, we should probably say something about error handling
>>>>> for when the things that are supposed to be identical have different
>>>>> values.
>>>> [ML]  I think this situation is very unlikely.  Anyway, in this case,
>>>> the length of the results array is what counts. Obviously, it is a bit
>>>> more likely that the totalCount value differs from the sum of the
>>>> number of results in each page. In fact, even though registration
>>>> data can't be considered real-time data, the count parameter might be
>>>> present in the initial query and scrolling the result set might take
>>>> time, so there is a small likelihood that the initial totalCount value
>>>> becomes obsolete because the result set has changed in the meantime.
>>>> In this case too, the sum of the result array lengths is what counts.
>>>>
>>>> Should I write something about this?
>>> Thanks for the additional explanation.  I think that if we were to write
>>> anything more, it would just be a few words in the description here to
>>> indicate that it's just for convenience, e.g., "representing the number of
>>> objects that should have been returned in the current page".  But it is
>>> probably okay to leave it unchanged, too.
>> [ML] Changed.
>>>>> Section 2.3
>>>>>
>>>>>       Except for sorting IP addresses, servers MUST implement sorting
>>>>>       according to the JSON value type of the RDAP field the sorting
>>>>>       property refers to.  That is, JSON strings MUST be sorted
>>>>>       lexicographically and JSON numbers MUST be sorted numerically.  If IP
>>>>>       addresses are represented as JSON strings, they MUST be sorted based
>>>>>       on their numeric conversion.
>>>>>
>>>>> There are more JSON types than string and number; are those other types
>>>>> guaranteed to not appear in sortable RDAP fields?  (I can't see how such
>>>>> a guarantee could be made, given that servers can define their own
>>>>> sorting properties.)
>>>> [ML] The other primitive JSON type remaining is boolean but I don't
>>>> think it makes sense to sort by a boolean property. Instead, I missed
>>> I think I also had JSON maps in mind, but I guess it is not exactly defined
>>> to sort by a map itself, only a primitive type, so my question was a bit
>>> silly.
>> [ML] Yes. Only values with primitive types.
>>>> that those values denoting dates and times MUST be sorted in
>>>> chronological order even if they are strings. I'll update the sentence
>>>> as in the following:
>>>>
>>>> Except for sorting IP addresses and values denoting dates and times, servers MUST implement sorting
>>>>       according to the JSON value type of the RDAP field the sorting
>>>>       property refers to.  That is, JSON strings MUST be sorted
>>>>       lexicographically and JSON numbers MUST be sorted numerically.
>>>>       Values denoting dates and times MUST be sorted in chronological order.  If IP
>>>>       addresses are represented as JSON strings, they MUST be sorted based
>>>>       on their numeric conversion.
>>>>
>>>> Does it work for you?
>>> I think so; thanks.
>> [ML] Good.
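
A minimal sketch of the type-aware ordering agreed above, assuming IP addresses and timestamps arrive as JSON strings. The function sort_key and its "kind" argument are hypothetical; a real server would derive the kind from the sorting property rather than take it as a parameter.

```python
import ipaddress
from datetime import datetime

def sort_key(value, kind):
    """Return a comparable key for a value of the given (assumed) kind."""
    if kind == "ip":
        # Numeric conversion, so "10.0.0.2" sorts before "10.0.0.10".
        return int(ipaddress.ip_address(value))
    if kind == "datetime":
        # RFC 3339 timestamp; "Z" is rewritten for older fromisoformat versions.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    # JSON strings compare lexicographically, JSON numbers numerically.
    return value

ips = ["10.0.0.10", "10.0.0.2", "9.255.255.255"]
dates = ["2020-10-01T23:30:00Z", "2020-10-02T01:00:00+02:00"]

sorted_ips = sorted(ips, key=lambda v: sort_key(v, "ip"))
sorted_dates = sorted(dates, key=lambda v: sort_key(v, "datetime"))
print(sorted_ips)
print(sorted_dates)
```

Both sample lists would come out in a different order under a plain string sort, which is why the exceptions for IP addresses and dates matter.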
>>>>>       If the "sort" parameter reports an allowed sorting property, it MUST
>>>>>       be provided in the "currentSort" field of the "sorting_metadata"
>>>>>       element.
>>>>>
>>>>> nit: is "reports" the best word to describe this behavior (which, IIUC,
>>>>> is "present in the query component of the request URL"?
>>>> [ML] Sounds better.
>>>>> Section 2.3.1
>>>>>
>>>>>       In the "sort" parameter ABNF syntax, property-ref represents a
>>>>>       reference to a property of an RDAP object.  Such a reference could be
>>>>>       expressed by using a JSONPath.  The JSONPath in a JSON document
>>>>>
>>>>> nit: is there a missing word here ("a JSONPath expression")?
>>>> [ML] Just for conciseness, may I use "jsonpath" to mean "JSONPath
>>>> expression" and keep "JSONPath" to refer to the specification?
>>>>
>>>> I could write something like: "JSONPath expression (named "jsonpath" in
>>>> the following)"
>>> That's fine from my perspective, sure.
>> [ML] Perfect.
>>>>>       o  Note that some of the object specific properties are also defined
>>>>>          as query paths.  The object specific properties include:
>>>>>
>>>>> nit: the list structure in this item does not seem parallel to the
>>>>> structure of the first item.
>>>> [ML] OK. I'll change the sentence as in the following:
>>>>
>>>> Object specific properties.  Note that some of these properties
>>>>          are also defined as query paths.  These properties include:
>>>>
>>>>>          as two representations of the same value.  By default, the
>>>>>          unicodeName value MUST be used while sorting.  When the
>>>>>          unicodeName is unavailable, the value of the ldhName MUST be used
>>>>>          instead;
>>>>>
>>>>> I'm not entirely sure how much value "by default" adds here.  Would the
>>>>> meaning be different if we said "The unicodeName value MUST be used
>>>>> while sorting if it is present; when the unicodeName is unavailable, the
>>>>> value of the ldhName is used instead"?
>>>> [ML] No, it wouldn't. I'll change the sentence as you suggest.
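
A minimal sketch of the fallback rule just agreed, assuming each search result is a dict that always carries ldhName and may carry unicodeName. The function name_sort_key and the sample domains are illustrative only.

```python
def name_sort_key(obj):
    """Use unicodeName when present, otherwise fall back to ldhName."""
    return obj.get("unicodeName") or obj["ldhName"]

domains = [
    {"ldhName": "xn--bcher-kva.example", "unicodeName": "bücher.example"},
    {"ldhName": "cat.example"},
    {"ldhName": "apple.example"},
]

ordered = [d["ldhName"] for d in sorted(domains, key=name_sort_key)]
print(ordered)
```

The IDN sorts between "apple" and "cat" because its Unicode form starts with "b", whereas sorting on ldhName alone would push the "xn--" label to the end.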
>>>>>       o  The jCard "sort-as" parameter MUST be ignored for the sorting
>>>>>          capability described in this document;
>>>>>
>>>>> It's a little bit of a juxtaposition to refer to jCard here in the prose
>>>>> but vcard in the table.
>>>> [ML] I would keep it as is. Instead, I would replace all the "vcard"
>>>> occurrences with "jCard". Since jCard is a transliteration of vCard
>>>> into JSON, it seems appropriate to me to keep the references to RFC6350
>>>> sections and to use the corresponding jCard elements for the mapping
>>>> between the sorting properties and the RDAP response elements. Besides,
>>>> I would write a sentence about the fact that jCard is the JSON format of
>>>> vCard, add a link to RFC7095 and insert RFC7095 among the Normative
>>>> References.
>>>>
>>>> Do you agree?
>>> Yes, thanks.
>> [ML] OK.
>>>>>       o  Even if a nameserver can have multiple IPv4 and IPv6 addresses,
>>>>>          the most common configuration includes one address for each IP
>>>>>          version.  Therefore, the assumption of having a single IPv4 and/or
>>>>>          IPv6 value for a nameserver cannot be considered too stringent.
>>>>>
>>>>> I disagree with the flat assertion that it "cannot be considered too
>>>>> stringent".  It can be so considered, as a matter of difference of
>>>>> opinion; what is appropriate to do here is to say that this
>>>>> document/protocol makes the assumption (especially since we go on to
>>>>> describe the exception-handling procedure when the assumption is
>>>>> violated).
>>>> [ML] May I update that sentence as in the following?
>>>>
>>>> OLD
>>>>
>>>> Therefore, the assumption of having a single IPv4 and/or
>>>>          IPv6 value for a nameserver cannot be considered too stringent.
>>>>
>>>> NEW
>>>>
>>>> Therefore, this specification makes the assumption that nameservers have a single IPv4 and/or
>>>>          IPv6 value.
>>> Yes, please!
>> [ML] Done.
>>>>>       o  Multiple events with a given action on an object might be
>>>>>          returned.  If this occurs, sorting MUST be applied to the most
>>>>>          recent event;
>>>>>
>>>>> This makes a lot of sense as the default and I don't propose changing it
>>>>> now, but I do wonder how hard it would be to add support later for
>>>>> sorting on (say) the oldest event instead.
>>>> [ML] Well, I wrote that sentence because some RDAP events can appear
>>>> multiple times. For example, a domain might be locked and unlocked
>>>> repeatedly. The purpose of that sentence is just to avoid ambiguities
>>>> and implicitly suggest that RDAP providers arrange events of the same
>>>> type in descending chronological order.
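
A minimal sketch of selecting the most recent event for a given action, assuming RDAP-style event objects with eventAction and eventDate members and assuming all timestamps are UTC in one fixed layout. The helper latest_event_date and the sample data are hypothetical.

```python
def latest_event_date(obj, action):
    """Return the most recent eventDate for the given eventAction, or None."""
    dates = [e["eventDate"] for e in obj.get("events", [])
             if e["eventAction"] == action]
    # Assumption: every timestamp uses the same UTC ("Z") layout, so plain
    # string comparison matches chronological order.
    return max(dates, default=None)

domain = {
    "events": [
        {"eventAction": "locked", "eventDate": "2019-01-01T00:00:00Z"},
        {"eventAction": "unlocked", "eventDate": "2019-06-01T00:00:00Z"},
        {"eventAction": "locked", "eventDate": "2020-03-15T00:00:00Z"},
    ]
}

print(latest_event_date(domain, "locked"))
```

The returned value would then serve as the sort key for that object when the client sorts on the corresponding event property.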
>>>>>       The "jsonPath" field in the "sorting_metadata" element is used to
>>>>>       clarify the RDAP field the sorting property refers to.  The mapping
>>>>>       between the sorting properties and the JSONPaths of the RDAP fields
>>>>>       is shown below:
>>>>>       [...]
>>>>>          name
>>>>>
>>>>>             $.domainSearchResults[*].unicodeName
>>>>>
>>>>> This seems to ignore the subtlety regarding unicodeName vs ldhName.  Is
>>>>> there a way it could be expressed in JSONPath?
>>>> [ML] If unicodeName and ldhName were alternatives, the JSONPath union
>>>> operator would fit (i.e.
>>>> $.domainSearchResults[*].[unicodeName,ldhName]). Currently, RFC7483
>>>> contains no assumption about when they should/must be present, but the
>>>> examples seem to recommend presenting unicodeName only for IDNs. When
>>>> both properties are present, the union operator doesn't fit exactly
>>>> and I still haven't found the right JSONPath expression based only on
>>>> the basic operators. However, since the "jsonPath" member is only for
>>>> documentation, the aforesaid JSONPath expression could be the most
>>>> suitable for conveying that sorting is applied on a kind of
>>>> <unicodeName, ldhName> combination.
>>> I have to defer to your expertise here; thank you for thinking about it.
>> [ML] Thanks. Maybe this is the only case where the JSONPath WG outcomes
>> might be helpful :-)
>>>>>       o  Nameserver
>>>>>
>>>>>          name
>>>>>
>>>>>             $.domainSearchResults[*].unicodeName
>>>>>
>>>>> Presumably this is supposed to be nameserverSearchResults?
>>>> [ML] Absolutely. It's a cut-and-paste typo :-)
>>>>> Section 2.4
>>>>>
>>>>> I think we want another introductory paragraph like:
>>>>>
>>>>> % The cursor parameter is used by the server to preserve information
>>>>> % about the pagination state of a given query's results across calls to
>>>>> % the search API, so that successive requests by the client can return
>>>>> % page N, N+1, N+2, etc.  Its value is only required to be interpretable
>>>>> % by the server and could be implemented, for example, as an opaque
>>>>> % database lookup key.  If a server does use a method for generating
>>>>> % cursor values that involves internal structure, such as the one
>>>>> % described below, the server needs to recognize that the value supplied
>>>>> % by a client could have been modified (maliciously), and implement
>>>>> % appropriate bounds-checking and similar measures when parsing received
>>>>> % values.
>>>>>
>>>>> The current wording strongly suggests that base64-encoding a meaningful
>>>>> value that the client could inspect or even construct is required, and I
>>>>> do not think that is very maintainable or what was intended, given the
>>>>> current second paragraph ("servers can change the method over time
>>>>> without announcing anything to clients").
>>>>>
>>>>> (side note) I'm also pretty partial to the way JMAP discusses returning
>>>>> (paginated, but non-uniformly) changes to a given data stream, e.g., at
>>>>> https://www.rfc-editor.org/rfc/rfc8620.html#section-5.2 -- any given
>>>>> state is named, and you can get "stuff starting at <named state>" and
>>>>> the name to use for the state as of the current reply.
>>>> [ML] Maybe I didn't make myself clear.
>>>>
>>>> The Base64 encoding is a simple (not recommended) transformation to make
>>>> the cursor value opaque to the client. It just seemed suitable to me for
>>>> use in some examples. But if you take a look at the example in
>>>> Figure 6, you may note that you can't obtain a meaningful result by
>>>> simply Base64-decoding the cursor value. Definitely, the method to
>>>> encrypt the cursor value must be more complex than a mere Base64 encoding.
>>>>
>>>> Regarding the sentence in parentheses, it means that servers can
>>>> change the underlying pagination strategy without having an impact on
>>>> clients. A server can initially implement offset pagination and then
>>>> switch to keyset pagination, but this has no effect on clients' features.
>>>>
>>>> The same concepts about the checks that servers should make in order to
>>>> validate the cursor value are reported both in the "Negative Answer"
>>>> section and in Appendix C.3. "Paging".
>>>>
>>>> Anyway, I'll try to integrate your text in the current document and add
>>>> a sentence with the purpose of discouraging the use of the
>>>> Base64-encoding in the cursor implementations.
>>> Thank you; I did not know that you wanted to discourage the use of plain
>>> base64 encoding, but that is reassuring to know.
>>> (I did notice that the example in Figure 6 did not decode to a meaningful
>>> result, but did not make much of a conclusion from that.)
>> [ML] OK. I will add some text to clarify that a mere Base64 encoding is
>> not recommended to encrypt the cursor value.
> Okay, thank you.
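
A minimal sketch of a cursor that the server can verify was not modified, assuming a server-side secret and a JSON pagination state. This is not the method used in the draft's examples: it swaps in an HMAC tag, which gives integrity only (the payload stays readable after Base64 decoding), whereas the thread talks about encrypting the value. SECRET, encode_cursor and decode_cursor are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side secret"  # hypothetical key, never sent to clients

def encode_cursor(state):
    """Serialize the pagination state and prepend an HMAC-SHA256 tag."""
    payload = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode()

def decode_cursor(cursor):
    """Return the state, or raise ValueError if the cursor was altered."""
    raw = base64.urlsafe_b64decode(cursor.encode())
    tag, payload = raw[:32], raw[32:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cursor was modified")
    return json.loads(payload)

cursor = encode_cursor({"offset": 100, "limit": 50})
print(decode_cursor(cursor))
```

A client that rewrites the payload cannot produce a matching tag without the secret, so the server rejects the cursor before acting on its contents.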
>
>>>>> Section 4
>>>>>
>>>>> If the server doesn't have access to an efficient (e.g.) counting
>>>>> operation on the backend, would we recommend that the server not support
>>>>> sorting/pagination, since there's not much benefit from having the
>>>>> server pull up all the results and count them just to be able to return
>>>>> the total count value back to the client, and then go do the same work again
>>>>> when the client asks for the next page of results?
>>>> [ML] In my implementation the RDAP server doesn't include the count
>>>> parameter in the sorting and paging links. The number of results doesn't
>>>> change at all if the result set is sorted by one property rather than
>>>> another. The same generally holds (as I wrote above) if the client is
>>>> scrolling the result set pages. So why repeat the count parameter in
>>>> the links? The totalCount value is returned in the response to the
>>>> initial query and, as it is not repeated in the links, the counting
>>>> operation is not executed again. Therefore, we don't need to make
>>>> particular assumptions about the performance of the counting operation.
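
A minimal sketch of building a paging link that drops the count parameter, as described in the reply above. The URL, the cursor value and the helper link_without_count are hypothetical; only the idea of omitting count from follow-up links comes from the thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def link_without_count(url, **overrides):
    """Copy the query string, drop 'count', and apply the given parameters."""
    parts = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(parts.query)
              if k not in ("count", *overrides)]
    params += list(overrides.items())
    return urlunsplit(parts._replace(query=urlencode(params)))

initial = "https://example.com/rdap/domains?name=example*.com&count=true"
next_link = link_without_count(initial, cursor="abc123")
print(next_link)
```

Because the follow-up link carries no count parameter, a server following this approach would not recompute the total when the client requests the next page.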
>>>>> Section 7
>>>>>
>>>>> I suggest noting that (encoded) structured "cursor" values present a new
>>>>> attack surface on the server that needs to be protected.
>>>> [ML] Sorry, could you explain this concept further? AFAIK, it is
>>>> possible to protect REST API endpoints but not query parameters.
>>> I think this was assuming that the server was going to just base64-encode
>>> something like "offset=100,limit=50" -- in that case a client could pass in
>>> the base64'd version of "offset=1000000000,limit=50".  The server would
>>> need to sanity-check the results of base64 decoding and reject the
>>> too-large offset.  If the server is expecting to do a fancier
>>> self-encrypted-token scheme for the cursor, the integrity check associated
>>> with the encryption takes care of this protection inherently, and we may
>>> not need to mention anything about sanitizing these values.
>> [ML] OK.
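
A minimal sketch of the sanity check described above for a plain Base64 cursor of the form "offset=100,limit=50". The limits MAX_OFFSET and MAX_LIMIT and the function parse_cursor are assumptions made for illustration, not values from the draft.

```python
import base64

MAX_OFFSET = 10_000  # assumed server policy
MAX_LIMIT = 100      # assumed server policy

def parse_cursor(cursor):
    """Decode 'offset=N,limit=M' and reject out-of-range values."""
    text = base64.b64decode(cursor, validate=True).decode("ascii")
    fields = dict(item.split("=", 1) for item in text.split(","))
    offset, limit = int(fields["offset"]), int(fields["limit"])
    if not (0 <= offset <= MAX_OFFSET and 1 <= limit <= MAX_LIMIT):
        raise ValueError("cursor out of range")
    return offset, limit

good = base64.b64encode(b"offset=100,limit=50")
bad = base64.b64encode(b"offset=1000000000,limit=50")
print(parse_cursor(good))
```

Passing the second value to parse_cursor raises ValueError, which a server could map to its negative answer for an invalid cursor.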
>>>>>       results in a response.  However, this last security policy can result
>>>>>       in a higher inefficiency if the RDAP server does not provide any
>>>>>       functionality to return the truncated results.
>>>>>
>>>>> I'm not sure I understand (or agree with) this last sentence -- it seems
>>>>> that unilateral silent truncation of results by the server leads to not
>>>>> just inefficiency but also potential security considerations in its own
>>>>> right, with the client not knowing that it has incomplete results.
>>>>> Also, if the server is truncating the results, by definition it "has
>>>>> functionality to return the truncated results" -- that's what it's
>>>>> doing!  So I assume the intent was to say something about negotiating or
>>>>> indicating that the results are truncated, not actually doing the
>>>>> truncation.
>>>> [ML] I think that servers legitimately truncate the result sets to
>>>> mitigate the risk of resource exhaustion and consequent denial of
>>>> service. Implementing the capabilities described in this document
>>>> lets servers keep managing sustainable result sets and, at the same
>>>> time, increases clients' chances of avoiding truncation and finding
>>>> relevant results.
>>> I agree with the paragraph you just wrote.  However, I think that the state
>>> of affairs prior to this document, with unilateral truncation by the
>>> server, can lead not just to "inefficiency" but also to security risks.  So
>>> I was hoping to see something like "can result in higher inefficiency or
>>> risk due to acting on incomplete information".
>> [ML] OK. I will change the sentence as you suggest.
>>> My second point ("Also, if the server [...]") was intending to suggest that
>>> the last sentence say something like "if the RDAP Server does not provide
>>> any functionality to return sorted results or iterate through the full
>>> result set".
>> [ML] You are right. I worded the sentence badly. Is it fine for you if I
>> change the sentence as follows?
>>
>> OLD
>>
>> if the RDAP server does not provide any
>>       functionality to return the truncated results
>>
>> NEW
>>
>> if the RDAP server does not provide any
>>       functionality to return results removed by truncation
> Yes, perfect!
>
>>>>>       The new parameters presented in this document provide RDAP operators
>>>>>       with a way to implement a server that reduces inefficiency risks.
>>>>>
>>>>> [same question about "inefficiency" being the right word]
>>>> [ML] Maybe I can replace the phrase "that reduces inefficiency risks."
>>>> with the phrase "that reduces the risk of resource exhaustion and
>>>> consequent denial of service".
>>>>
>>>> Are you ok with it?
>>> Denial of service is only one of the risks I have in mind; another is that
>>> if a server silently truncates, a client will have incomplete data and
>>> might derive a conclusion ("domain X does not satisfy property Y") that is
>>> false.
>> [ML] OK. As I wrote above, I'll change the sentence to outline the risk
>> due to acting on incomplete information.
>>>
>>>>> Appendix B
>>>>>
>>>>>       o  It does not allow direct navigation to arbitrary pages because the
>>>>>          result set must be scrolled in sequential order starting from the
>>>>>          initial page;
>>>>>
>>>>> (side note) I didn't follow the references, so maybe this was covered
>>>>> there, but I don't quite follow why direct navigation is impossible.  If
>>>>> you use a key field for seeking, can't you just start in the middle from
>>>>> some known value for that key field?
>>>> [ML]  Especially when you know the total count of a result set, you can
>>>> directly jump to a specific point in the result set through offset
>>>> pagination but you can't do the same through keyset pagination because
>>>> you don't know the key value at that point in advance. One may wonder:
>>>> what is jumping within the result set useful for? Well, for example, if
>>>> you are looking for a specific item in an ordered collection of items,
>>>> you could find it through a binary search.
>>> I don't want to press this topic very much, so let me just try a brief
>>> example.  Suppose I have a sorted set of ASCII strings, and I want to see if the
>>> string "koala" is present.  I could start at the beginning and look at each
>>> one in turn until I get to something that sorts after "koala", or I could
>>> ask the database "give me the first thing you have that is after 'k'", which
>>> gets me some of the way there.  Is the problem that you need an actual
>>> value in the dataset to start from, and since "k" isn't guaranteed to be in
>>> the set you are forced to start from the beginning?
>> [ML] Offset pagination is based on the positions of the results within
>> the result set while keyset pagination is based on a unique combination
>> of values of the results. You can always skip the first K results of a
>> result set by specifying offset=K but you can't do the same through
>> keyset pagination because you don't know what is the combination of
>> values placed at position K.
>>
>> Let me give you an example that can clarify.
>>
>> Let's suppose that the query "k*" returns N=1000 results and the page
>> length is 100. The fastest method to find whether "koala" is present
>> is to jump to the middle (i.e. offset=500) of the result set and look at
>> the first result. If it is lexicographically lower than "koala" then
>> "koala" might be in the second half; otherwise, "koala" might be in
>> the first half. Let's suppose that the first item is greater than
>> "koala"; then you can jump to the middle of the first half (offset=250)
>> and apply the above dichotomy in turn until you find out whether "koala"
>> is present or not. This process takes at most about log2(N) steps.
>>
>> You can't do the same through keyset pagination because you don't know
>> how the values are distributed in the result set. You can only scroll
>> the result set from the first page to the last, and this process takes
>> at most N steps.
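
A minimal sketch of the offset-based dichotomy described above, assuming a sorted in-memory list stands in for the server's result set. The function fetch_first_at simulates a request with offset=<offset>, and both it and contains are hypothetical names.

```python
def fetch_first_at(results, offset):
    """Stand-in for a query with the given offset; returns that page's first item."""
    return results[offset]

def contains(results, target):
    """Binary search by offset; returns (found, number of requests made)."""
    lo, hi, steps = 0, len(results), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        item = fetch_first_at(results, mid)
        if item == target:
            return True, steps
        if item < target:
            lo = mid + 1
        else:
            hi = mid
    return False, steps

# 1000 sorted strings starting with "k", mimicking the "k*" query in the text.
results = sorted(f"k{n:04d}" for n in range(1000))
print(contains(results, "k0123"))
print(contains(results, "koala"))
```

With N=1000 no lookup needs more than 10 simulated requests, in line with the log2(N) bound mentioned in the text.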
>>
>> However, in general, one is interested in all the results (or in a
>> subset) returned by a query rather than a single result. In this case,
>> provided that the results can always be sorted according to a unique
>> index, keyset pagination is more efficient than offset pagination. Let's
>> take the example above. With offset pagination, the underlying DBMS
>> always selects 1000 results and then returns the current page, so
>> in the worst case the DBMS will select 1000 results 10
>> times (from offset=0 to offset=900). With keyset pagination, the number of
>> results the underlying DBMS selects decreases by 100 at each step:
>> at the beginning it selects 1000 results and returns the first page,
>> then it selects 900 results and returns the second page, and so on.
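
The arithmetic behind this comparison can be worked through as follows. This is only the simplified cost model used in the paragraph above (rows selected per request), not a measurement of any real DBMS; the variable names are illustrative.

```python
total, page = 1000, 100
pages = total // page

# Offset pagination: every request selects all 1000 rows, then skips to the page.
offset_rows = total * pages

# Keyset pagination: each request selects only the rows after the last key seen,
# i.e. 1000, then 900, ... down to 100.
keyset_rows = sum(total - i * page for i in range(pages))

print(offset_rows, keyset_rows)
```

Under this model, scrolling the whole result set touches 10000 rows with offset pagination and 5500 with keyset pagination.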
>>
>> Note also two facts:
>>
>> - the time needed to scroll the result set could be significant when the
>> result set is huge
>>
>> - for all the RDAP searchable objects, it is always possible to build
>> more or less easily a combination of properties acting as unique index
>> and this is true regardless of whether the search includes the sort
>> parameter or not. For example, for the entity object class "handle" is a
>> unique property but if the search includes "sort=registrationDate", the
>> combination <registrationDate, handle> is unique.
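
A minimal sketch of keyset pagination over the <registrationDate, handle> combination mentioned above, assuming an in-memory list of entities. The functions row_key and keyset_page and the sample data are hypothetical; a real server would push the comparison into the database query.

```python
def row_key(row):
    """Unique ordering key: registrationDate first, handle as tie-breaker."""
    return (row["registrationDate"], row["handle"])

def keyset_page(rows, after=None, size=2):
    """Return (page, key of the last row) for rows ordered after the given key."""
    ordered = sorted(rows, key=row_key)
    if after is not None:
        ordered = [r for r in ordered if row_key(r) > after]
    page = ordered[:size]
    next_after = row_key(page[-1]) if len(ordered) > size else None
    return page, next_after

entities = [
    {"handle": "E3", "registrationDate": "2020-01-01"},
    {"handle": "E1", "registrationDate": "2020-01-01"},
    {"handle": "E2", "registrationDate": "2019-05-05"},
    {"handle": "E4", "registrationDate": "2020-02-02"},
    {"handle": "E5", "registrationDate": "2020-01-01"},
]

seen, after = [], None
while True:
    page, after = keyset_page(entities, after)
    seen += [r["handle"] for r in page]
    if after is None:
        break
print(seen)
```

Three entities share a registration date here; the handle keeps the key unique, so no row is skipped or repeated at a page boundary.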
> Thank you for the additional explanation; I don't think we need to spend
> more time on this topic.
>
> I'm looking forward to the -18!
>
> Thanks again,
>
> Ben
>
>>>>> Appendix C.2
>>>>>
>>>>>       total count.  Therefore, as "totalCount" is an optional response
>>>>>       information, fetching always the total number of rows has been
>>>>>
>>>>> I'm not entirely sure in what sense "optional response information" is
>>>>> intended -- my reading of Section 2.1 is that it's mandatory to return
>>>>> totalCount if the client included the 'count' query parameter.
>>>> [ML] Exactly but it isn't returned always. For this reason, it is an
>>>> optional member of the paging_metadata object.
>>> Okay.  (I think your reply to another ballot comment also helped clarify
>>> this for me.)
>> [ML] Good.
>>>> Looking forward to your reply to my questions/comments.
>>> Thanks a lot for the explanations and updates; hopefully I have clarified
>>> anything that was unclear.
>>>
>>> -Ben
>>>
>>> _______________________________________________
>>> regext mailing list
>>> regext@ietf.org
>>> https://www.ietf.org/mailman/listinfo/regext
>> -- 
>> Dr. Mario Loffredo
>> Systems and Technological Development Unit
>> Institute of Informatics and Telematics (IIT)
>> National Research Council (CNR)
>> via G. Moruzzi 1, I-56124 PISA, Italy
>> Phone: +39.0503153497
>> Mobile: +39.3462122240
>> Web: http://www.iit.cnr.it/mario.loffredo
>>
