[regext] Review of draft-ietf-regext-rdap-redacted-09

Pawel Kowalik <pawel.kowalik@denic.de> Mon, 14 November 2022 17:03 UTC

Date: Mon, 14 Nov 2022 18:03:35 +0100
To: draft-ietf-regext-rdap-redacted@ietf.org
From: Pawel Kowalik <pawel.kowalik@denic.de>
Organization: Denic
Cc: regext@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/regext/elOStVaDS322KPYbpghsWoFj1JY>

Hi,


As an action item after IETF 115 I reviewed 
draft-ietf-regext-rdap-redacted-09 before WG LC and have the following 
questions / remarks:

Section 3, first paragraph:

 > Redaction in RDAP can be handled in multiple ways.  The use of
 > placeholder text for the values of the RDAP fields, such as the
 > placeholder text "XXXX", MUST NOT be used for redaction.  A
 > placeholder text value will not match the format requirements of each
 > of the RDAP fields and provides an inconsistent and unreliable
 > redaction signal.

Is the normative language adequate here? Firstly, I am not sure the 
text is easy to understand.
Does it mean that use of a fixed placeholder is prohibited, or of any 
placeholder?
I can imagine ways of using placeholders that do not necessarily 
break the format requirements,
so the last sentence should not state categorically "will not 
match", as that may not even be true.

My proposal would be to use SHOULD NOT instead of MUST NOT. During the 
transition period, servers that still want to support clients not 
implementing this extension may find it useful to populate fields with 
placeholders instead of just leaving them empty or missing.

---

Section 3, second paragraph:

The whole paragraph deals with the particularities of redacting jCard, 
with some very specific guidelines and examples.
Wouldn't it be more generic and future-proof to have a general statement 
that redacting the data MUST NOT break the underlying data format?
The same remark then applies to 3.2, with its normative language for 
jCard redaction.

---

4.2.  "redacted" Member

The whole "redacted" member, and especially the "path" and 
"replacementPath" members,
are meant to allow the client to recognise the parts of the JSON 
response that have been transformed
in the process of redaction.

This raises some interesting questions:
- Do the paths refer to the object before or after the redaction?
- If the paths refer to the object before the redaction, how can the 
client interpret a path without having access to that object? A JSONPath 
expression may even match multiple locations, so from the result and the 
path alone the client cannot tell which locations have really been 
removed. Moreover, each redaction step actually transforms the object, 
so the order of redaction is also relevant in this case. Normalized 
Paths from JSONPath allow "reversing" the process from the result object 
if the order of redaction is preserved.
- If the paths refer to the object after the redaction, the "path" 
member should likely be OPTIONAL, not required, for redaction by removal 
or replacement, because in this case the redacted object likely does not 
exist in the result object.
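
To illustrate the ambiguity, here is a minimal Python sketch. It 
assumes a made-up two-entity response and resolves the path 
$.entities[0] by hand rather than with a JSONPath library; the handles 
and the helper name resolve_entities_0 are invented for illustration 
and are not taken from the draft.

```python
import copy

# Hypothetical pre-redaction object; all values are invented.
original = {
    "entities": [
        {"handle": "REG-1", "roles": ["registrant"]},
        {"handle": "TECH-1", "roles": ["technical"]},
    ]
}

# Server-side redaction by removal of $.entities[0] (resolved by hand).
redacted = copy.deepcopy(original)
del redacted["entities"][0]


def resolve_entities_0(obj):
    """Hand-written resolution of the path $.entities[0]."""
    entities = obj.get("entities", [])
    return entities[0] if entities else None


before = resolve_entities_0(original)
after = resolve_entities_0(redacted)

print(before["handle"])  # REG-1
print(after["handle"])   # TECH-1
```

The same path selects the removed entity in the pre-redaction object 
but a different, unredacted entity in the response the client actually 
receives, so the client needs to know which of the two objects the 
path is meant to address.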

With Section 5 being non-normative and about processing by the server, 
the draft should similarly include considerations for processing of the 
responses by clients.

---

"pathLang" member

What is the rationale behind introducing this extension point? Right now 
the draft specifies only one language, which is the default one. If an 
additional one is ever added, I would expect that to be done by an RFC 
that also answers some questions about how to transition from one path 
language to another. 
Without any way for the client to express which "pathLang" it would like 
the response to use, the server would only be able to migrate after all 
clients support both formats, otherwise risking breaking some clients.
And if this "pathLang" member is to persist, shouldn't there be an IANA 
registry for the allowed values, so that clients have an idea what to 
implement?

---

3.1.  Redaction by Removal Method

 > The Redaction by Removal Method MUST NOT be used to remove a field
 > using the position in a fixed length array to signal the redacted
 > field.  For example, removal of an individual data field in jCard
 > [RFC7095] will result in a non-conformant jCard [RFC7095] array
 > definition.

Is the first normative sentence meant to apply to every case, or just 
to the case where removal of such a field would render the jCard invalid?
I think there is a quite legitimate way of removing whole objects by 
position in an array, like "path": "$.entities[0]", which should still 
be allowed.
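
The difference between the two cases can be sketched in Python as 
follows. The shape check is a deliberately minimal assumption based on 
the jCard property layout of RFC 7095 (name, parameters, type, then at 
least one value); the function name and the sample values are invented 
for illustration.

```python
def is_wellformed_jcard_property(prop):
    """Minimal shape check: [name, params, type, value, ...]."""
    return (
        isinstance(prop, list)
        and len(prop) >= 4
        and isinstance(prop[0], str)
        and isinstance(prop[1], dict)
        and isinstance(prop[2], str)
    )


fn_prop = ["fn", {}, "text", "Example Registrant"]  # invented value

# Removing the value by position leaves a three-element array,
# which no longer has the shape of a jCard property.
broken = fn_prop[:3]

# Removing a whole entity by position leaves a shorter array whose
# remaining elements are still complete objects.
entities = [{"handle": "E1"}, {"handle": "E2"}]
remaining = entities[1:]

print(is_wellformed_jcard_property(fn_prop))  # True
print(is_wellformed_jcard_property(broken))   # False
print(remaining)                              # [{'handle': 'E2'}]
```

Only the first kind of removal damages the underlying format, which is 
why a blanket prohibition on positional removal looks too broad to me.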

---

6.2.  JSON Values Registry

The registry for "redacted name" allows identifying what has been 
removed without having to infer it from the "path" member. This is of 
great benefit, given the difficulties of interpreting the "path" member 
I mentioned above.
However, semantically, isn't this registry rather defining the members 
and objects of an RDAP response in a general sense? I mean "Registry 
Domain ID" means the same no matter whether in the context of redaction 
or in the context of an RDAP response, and IMHO we should make sure it 
means the same in any other RDAP context it might be used in in the 
future. IMHO this IANA registry should be a generic one defining labels 
for the pieces of information in RDAP responses.

---

Examples with entity role indexing.

The draft has many examples of the kind 
$.entities[?(@.roles[0]=='registrant')], which might be incorrect if an 
entity holds multiple roles, as RFC9083 does not specify any particular 
order of roles in the array, and the array may also have a different 
length depending on the roles held by the entity. This means that 
'registrant' can be at index 0, but also at index 1. The correct 
expression would rather use a wildcard index as per 3.5.2. of 
JSONPath, like this: $.entities[?(@.roles[*]=='registrant')].
The example of multiple roles also shows a tricky corner case, as the 
server may want to redact a contact in one role but at the same time not 
redact it because of another role attached to the same entity. I am not 
sure whether JSONPath allows expressing this, but it might remain an 
implementation detail of the server and not particularly a concern of 
this draft (apart from the questions on 4.2 above).
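
A minimal Python sketch of the difference, emulating the two filter 
expressions with plain list operations instead of a JSONPath library. 
The entities, handles and function names are invented for illustration.

```python
# Hypothetical RDAP-like response; the entity data is invented.
response = {
    "entities": [
        {"handle": "E1", "roles": ["registrant"]},
        {"handle": "E2", "roles": ["technical", "registrant"]},
        {"handle": "E3", "roles": ["administrative"]},
    ]
}


def match_first_role(entities, role):
    """Emulates $.entities[?(@.roles[0]=='<role>')]: only index 0 counts."""
    return [e["handle"] for e in entities if e["roles"][:1] == [role]]


def match_any_role(entities, role):
    """Emulates $.entities[?(@.roles[*]=='<role>')]: any index may match."""
    return [e["handle"] for e in entities if role in e["roles"]]


first_only = match_first_role(response["entities"], "registrant")
any_index = match_any_role(response["entities"], "registrant")

print(first_only)  # ['E1']
print(any_index)   # ['E1', 'E2']
```

With the index-0 form, entity E2 is silently missed because 
'registrant' happens to be its second role.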

---

Re previous discussion on multiple email addresses.

Just a remark that, after the last discussion on 
draft-ietf-regext-epp-eai, it may become more common to have more than 
one email address. RDAP likely needs to become EAI-aware as well on the 
response side.


Kind Regards,

Pawel