Re: [regext] Review of draft-ietf-regext-rdap-redacted-09

Pawel Kowalik <kowalik@denic.de> Wed, 16 November 2022 17:14 UTC

To: "Gould, James" <jgould@verisign.com>, "draft-ietf-regext-rdap-redacted@ietf.org" <draft-ietf-regext-rdap-redacted@ietf.org>
Cc: "regext@ietf.org" <regext@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/regext/9r6CZuyLImJC8dlcdbIu1lugt_w>

Hi James,

See my feedback below.

Am 15.11.22 um 21:15 schrieb Gould, James:
> [...]
>
> JG - The use of placeholder values for redaction is prohibited based on the second normative sentence.  The normative language is needed to implement the redaction defined by the RDAP extension.  I agree that stating "A placeholder text value will not match" is too strong and needs to be changed to "A placeholder text may not match".  This will be changed in the next version of the draft.
[PK] Yes, the changed text is way better.
>
>      My proposal would be to use SHOULD NOT instead of MUST NOT. During
>      the transition period, servers that still want to support clients
>      not implementing this extension may find it useful to populate
>      fields with placeholders instead of just leaving them empty or
>      missing.
>
> JG - I understand that there may be a transition period to fully implement the RDAP extension as will be the case of any RDAP extension that standardizes a crosscutting feature like redaction, but weakening the normative language from MUST NOT to SHOULD NOT misses the purpose of the RDAP extension and will result in potentially long running interoperability issues.

[PK] Looking at the Abstract and the Introduction of the draft, it reads
"This document describes an RDAP extension for explicitly identifying
redacted RDAP response fields, using JSONPath as the default expression
language.". I see a clear focus on "identifying", whereas such strong
normative language actually dictates how the redaction is done. I
don't think this should be the main focus of this draft. A slightly
softer SHOULD NOT would steer servers toward the best practice without
creating a dilemma over whether the rdapConformance "redacted" can be
used while placeholders are still in use for whatever reason.
Placeholders fit the "Redaction by Replacement Value Method" perfectly,
so they also fit the draft.

> [...]
>
> JG - jCard is currently used in RFC 9083 and the specifics of handling redaction of jCard is an important concept that needs to be covered.  The "Redaction by Empty Value Method" is needed because of jCard.  I believe the combination of jCard being used in RFC 9083 and the definition of a redaction method to support jCard warrants providing the detail.

[PK] Fine

> [...]
> JG - The path refers to the object before the redaction for the Redaction by Removal Method, since the RDAP field was removed.  In this case the path points to where the RDAP field would have been.  The path refers to the object before and after the redaction for the Redaction by Empty Value Method, since the RDAP field exists but has been set to an empty value.  The path refers to the object before the redaction for the Redaction by Replacement Value Method, while the replacementPath refers to the object after the redaction, since it points to the replacement JSON field.  Right, requiring the path to a non-existent JSON field in the case of the Redaction by Removal Method and the Redaction by Replacement Value Method consequently requires the client to validate the JSONPath expression using a non-redacted RDAP response, which the client does not have.  Having the JSONPath for where the field would have existed may be useful for a client, but they would need to generate the non-redacted response.

[PK] It would be of great value to state which object the path refers
to, so that it is clear for implementers.
It may even be worth considering two distinct members, "originalPath"
and "responsePath", each clearly defined as referring to the source
object or the result object respectively.
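
To make the two-member idea concrete, here is a minimal Python sketch.
The member names "originalPath" and "responsePath" are only the
suggestion above and are not in the draft. The entry layout ("name",
"method"), the method strings, the example paths and the helper
resolvable_in_response are assumptions made for illustration.

```python
# Hypothetical "redacted" entries using the two proposed members.
# "originalPath" points into the object before redaction,
# "responsePath" points into the response the client actually receives.
redacted = [
    {
        # Field removed: nothing in the response to point at.
        "name": {"type": "Registry Domain ID"},
        "method": "removal",
        "originalPath": "$.handle",
    },
    {
        # Field kept but emptied: both paths name the same location.
        "name": {"type": "Example Emptied Field"},
        "method": "emptyValue",
        "originalPath": "$.example.field",
        "responsePath": "$.example.field",
    },
]


def resolvable_in_response(entry):
    """True if the client can evaluate a path against the received response."""
    return "responsePath" in entry


for entry in redacted:
    print(entry["name"]["type"], resolvable_in_response(entry))
```

With this split a client never has to guess whether a path is expected
to match something in the response it holds.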

> I don't oppose setting the "path" to OPTIONAL but with normative language specifying that it MUST be included when the JSON field does exist in the redacted response.  Do you agree with this?
[PK] Yes, I would support that.
>    
>
>
>      With Section 5 being non-normative and about processing by the
>      server, the draft should similarly include considerations for the
>      processing of responses by clients.
>
> JG - Yes, Section 5 provides non-normative considerations for the server.  Offhand I believe consideration #2 "Validate a JSONPath expression using a non-redacted RDAP response" could be applicable when a client receives a response that includes the "path" expression for the Redaction by Removal Method and the Redaction by Replacement Value Method, per the feedback above, but the client would need to generate a template non-redacted RDAP response based on the "path" expressions.  Do you agree with this possible consideration, and do you have any others that should be included?  If there are useful client considerations, they certainly can be added.

[PK] The approach with a "template non-redacted RDAP response" may be
worth a deeper look.
My first impression is that such a one-size-fits-all template may be
difficult to build, given that an entity may have multiple roles
assigned and that vCard also allows custom fields the client would
not be aware of.

Another approach would be to define a way of interpreting the JSONPath 
so that it is reversible or even defining a subset of JSONPath which is 
reversible in the narrower RDAP context.

In the end, when implementing a client, I would rather rely on the
"redacted name" from the "JSON Values Registry" for paths that have
been deleted, and treat the path member as informative only.

If you agree with such processing by the client, I suggest documenting
it in Section 5 (maybe splitting it into server-side and client-side
parts).
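
The client processing described above could be sketched roughly as
follows. This is only an illustration: the function name
summarize_redactions, the sample response and the assumption that
"name" carries either a registered "type" or a free-text "description"
are mine, not text from the draft.

```python
def summarize_redactions(response):
    """Return (label, method) pairs for every entry in "redacted".

    The label comes from the entry's "name" (registered "type" first,
    free-text "description" as a fallback). The "path" member is never
    evaluated here; it is treated as informative only.
    """
    summary = []
    for entry in response.get("redacted", []):
        name = entry.get("name", {})
        label = name.get("type") or name.get("description") or "unknown field"
        summary.append((label, entry.get("method")))
    return summary


# Hypothetical redacted response fragment.
sample = {
    "objectClassName": "domain",
    "redacted": [
        {"name": {"type": "Registry Domain ID"},
         "path": "$.handle", "method": "removal"},
        {"name": {"description": "Some server-specific field"},
         "method": "emptyValue"},
    ],
}

for label, method in summarize_redactions(sample):
    print(f"{label}: {method}")
```

The point of the design is that the client still works when "path" is
absent or points at something that no longer exists in the response.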

> JG - The expression language being stuck with JSONPath was brought up on the list, which resulted in adding the OPTIONAL "pathLang" field.  A new JSON Values Registry field value could be added, like "redacted name" and "redacted reason".  How about the "redacted expression language" Type with a pre-registration of the Value "jsonpath" and the description "JSON path expression language, as defined in draft-ietf-jsonpath-base"?  We would replace draft-ietf-jsonpath-base with the published RFC, which is a normative reference dependency.

[PK] If the WG feels we need to provide for more expression languages,
then the approach with the JSON Values Registry is a good proposal.

>      Is the first normative sentence meant to apply to every case, or just
>      the case where removal of such a field would render the jCard invalid?
>      I think there is a quite legitimate way of removing whole objects by
>      position in a fixed length array, like "path": "$.entities[0]", which
>      should still be allowed.
>
> JG - It's really meant to directly address the jCard use case, but I believe it would apply to all fixed length arrays.  Would it help to clarify this more by updating the language to be "using the fixed field position of a fixed length array"?

[PK] Does this mean you intend to forbid fixed field positions in all
cases, so that $.entities[0] would not be allowed?
In the case of jCard I understand this is to prevent breaking the
format, where the position plays a distinct role. What is the rationale
for blocking it in all cases?
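
The difference between the two cases can be shown with a small Python
sketch. It assumes the usual jCard property shape of [name, parameters,
type, value]; the helper names and the deliberately rough shape check
are hypothetical and only serve the illustration.

```python
import copy

# A jCard property: position carries meaning (name, parameters, type, value).
fn_property = ["fn", {}, "text", "Joe User"]

# A plain RDAP array: position carries no meaning of its own.
entities = [{"handle": "E1"}, {"handle": "E2"}]


def remove_at(array, index):
    """Redact by removing the element at a fixed position."""
    out = copy.deepcopy(array)
    del out[index]
    return out


def empty_at(array, index):
    """Redact by setting the element at a fixed position to an empty value."""
    out = copy.deepcopy(array)
    out[index] = ""
    return out


def looks_like_jcard_property(prop):
    """Rough shape check only: name, parameters object, type, then value(s)."""
    return (len(prop) >= 4 and isinstance(prop[0], str)
            and isinstance(prop[1], dict) and isinstance(prop[2], str))


print(looks_like_jcard_property(remove_at(fn_property, 3)))  # False
print(looks_like_jcard_property(empty_at(fn_property, 3)))   # True
print(remove_at(entities, 0))  # [{'handle': 'E2'}]
```

Removing by position breaks the jCard property, while removing
$.entities[0] just leaves a shorter but still well-formed array.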

> [...]
>
> JG - Agreed that registering a new JSON Values Registry Type and pre-registering the "jsonpath" expression language can help as stated above.

[PK] Fine

>      However, semantically, isn't this registry rather defining the members
>      and objects of RDAP response in a general sense? I mean "Registry Domain
>      ID" means the same no matter if in context of redaction or in context of
>      RDAP response and actually IMHO we should make sure it means the same in
>      any other RDAP context it would be used in the future. IMHO this IANA
>      registry shall be a generic one defining labels to the information
>      pieces in RDAP responses.
>
> JG - No, the JSON Values Registry defines the values that can be used for fields, which is applicable here for the valid set of "type" field values used in the redacted "name" and "reason" fields.  The same can be done to define the valid field values for the "pathLang" field.  The registered Type values of "redacted name", "redacted reason", and potentially the "redacted expression language" will provide the needed isolation with the other typed values in the registry.

[PK] I get that. This is just an observation that "reverse search"
will also define a "JSON Values Registry" with a mapping of labels to
JSONPaths. The risk is ending up with different labels for the same
thing in the context of redaction and of reverse search.

It would be good to have some voices from the WG if we care about it.

>      ---
>
>      Examples with entity role indexing.
>
>      The draft has many examples of the kind
>      $.entities[?(@.roles[0]=='registrant')] which might be incorrect if an
>      entity holds multiple roles, as RFC 9083 does not specify any particular
>      order of roles in the array and, depending on the roles held by the
>      entity, the array might have a different length. This means that
>      'registrant' can be at index 0, but also at index 1. The correct
>      expression would rather use a wildcard index as per 3.5.2. of
>      JSONPath, like this: $.entities[?(@.roles[*]=='registrant')].
>      The example of multiple roles also shows a tricky corner-case, as the
>      server may want to redact a contact in one role but at the same time
>      not redact it due to another role attached to the same entity. Not sure
>      if JSONPath allows expressing this, but this might remain an
>      implementation detail of the server and not particularly a concern of
>      this draft (apart from the questions to 4.2 above).
>
> JG - Interesting tricky corner-case.  I do believe it's an implementation detail, where the draft provides a redaction example based on the unredacted response that doesn't require the wildcard.
[PK] I wanted to bring this up just in case (which obviously never 
happens) that the implementers of the server just copy-paste the examples...
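
For anyone tempted to copy the examples, the difference can be seen in
a few lines of Python. This does not use a JSONPath library; the two
helper functions merely mimic the intent of the roles[0] and roles[*]
filters, and the sample entities are made up.

```python
# Made-up entities; E2 holds two roles with 'registrant' at index 1.
entities = [
    {"handle": "E1", "roles": ["registrant"]},
    {"handle": "E2", "roles": ["technical", "registrant"]},
    {"handle": "E3", "roles": ["administrative"]},
]


def match_first_role(entities, role):
    """Mimics $.entities[?(@.roles[0]=='<role>')]: only index 0 is checked."""
    return [e["handle"] for e in entities
            if e.get("roles") and e["roles"][0] == role]


def match_any_role(entities, role):
    """Mimics the intent of $.entities[?(@.roles[*]=='<role>')]."""
    return [e["handle"] for e in entities if role in e.get("roles", [])]


print(match_first_role(entities, "registrant"))  # ['E1']
print(match_any_role(entities, "registrant"))    # ['E1', 'E2']
```

The index-0 variant silently misses E2, which is exactly the case where
an entity holds more than one role.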
> [...]

Kind Regards

Pawel