Re: [art] Artart last call review of draft-ietf-core-problem-details-05

Carsten Bormann <cabo@tzi.org> Wed, 15 June 2022 16:23 UTC

Return-Path: <cabo@tzi.org>
X-Original-To: art@ietfa.amsl.com
Delivered-To: art@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BC4F9C157B33; Wed, 15 Jun 2022 09:23:22 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.908
X-Spam-Level:
X-Spam-Status: No, score=-1.908 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lxlHnuHpB4Ah; Wed, 15 Jun 2022 09:23:18 -0700 (PDT)
Received: from gabriel-smtp.zfn.uni-bremen.de (gabriel-smtp.zfn.uni-bremen.de [IPv6:2001:638:708:32::15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A641CC14F74F; Wed, 15 Jun 2022 09:23:15 -0700 (PDT)
Received: from [192.168.217.118] (p5089ad4f.dip0.t-ipconnect.de [80.137.173.79]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by gabriel-smtp.zfn.uni-bremen.de (Postfix) with ESMTPSA id 4LNVwH34mTzDCfc; Wed, 15 Jun 2022 18:23:11 +0200 (CEST)
Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.7\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <165511479760.19573.12671700576299137749@ietfa.amsl.com>
Date: Wed, 15 Jun 2022 18:23:10 +0200
Cc: art@ietf.org, core@ietf.org, draft-ietf-core-problem-details.all@ietf.org, last-call@ietf.org
X-Mao-Original-Outgoing-Id: 677002990.746657-24bd34f0a31949d3517271e7385bc4dc
Content-Transfer-Encoding: quoted-printable
Message-Id: <63D13796-758D-469B-AFA8-3050C9F87819@tzi.org>
References: <165511479760.19573.12671700576299137749@ietfa.amsl.com>
To: Harald Alvestrand <harald@alvestrand.no>
X-Mailer: Apple Mail (2.3608.120.23.2.7)
Archived-At: <https://mailarchive.ietf.org/arch/msg/art/mJtkVVNM1IXi-54HcnnxdZnEg8c>
Subject: Re: [art] Artart last call review of draft-ietf-core-problem-details-05
X-BeenThere: art@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Applications and Real-Time Area Discussion <art.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/art>, <mailto:art-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/art/>
List-Post: <mailto:art@ietf.org>
List-Help: <mailto:art-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/art>, <mailto:art-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Jun 2022 16:23:22 -0000

Hi Harald,

thank you for this thoughtful review.

A number of the considerations listed are already solved by the underlying standards CoAP, CBOR, and CDDL; I’ll try to explain this in context, and get to the other observations.

> Summary of conclusions
> I recommend that this document be broken into two documents: An error format
> and an internationalized string format.

(I’ll get to this below.)

> Summary of content
> 
> This document defines an error reporting format for use with REST APIs using
> COAP as their response format. Note: This reviewer does not know COAP, so the
> appropriate usage of COAP is not checked in this review.
> 
> The standardized form of the response consists of:
> * A “problem shape” - human readable

The “title” field is the human-readable part; the problem shape is the shape of the problem details as presented that can be recognized by an automated mechanism.

> * An “explanation” - human readable

This (“detail” field) goes into instance-specific details.

> * An “instance” - an URI

E.g., a shopping basket etc.

> * A “response code” (ref RFC 7252), a 2-level code mimicking the 2xx/4xx/5xx
> system (IANA registry) * A “base URI” that is only for internal usage * (added
> in -05): base-lang and base-rtl It is not clear to this reviewer from the
> document whether these fields are mandatory or optional;

The fields are all optional, hence the non-empty<…> to say at least one of them needs to be there (or the whole thing is pointless).  They are optional because the context (specifically, the URI in the request providing the base URI, and the response code in the response providing the response-code) often provides them, and this format is intended to be concise.

> if the question mark
> in front of the fields is “optional”, then there is no mandatory field in the
> details at all; rather, the “non-empty” notation seems to mean that there
> should be *something* in it. So an error report consisting of just base-rtl is
> syntactically legal.

(But not very useful.)

It is usually hard to capture semantic constraints in a structural language like CDDL (or ABNF), so we didn’t try.

> The report may be extended with standard or app-specific details - identified
> with negative numbers (standard), positive numbers (registered) or URIs.
> 
> Appendix A defines the “Tag 38” string format, which decorates a string with
> language and direction attributes.
> 
> Usability as an error format
> The general treatment of an error consists of two processes:
> * The client receiving the error takes an automatic action based on the error
> content. For this to work well, the error needs to be machine parsable and
> clearly indicate the appropriate action to be taken (in the context of the
> application).

Right, this is where the standard or custom problem details come in and provide the problem shape that the application can react on.

> * The user reports an error that was unexpected and not handled
> by the client to the developer or support personnel that can use other
> processes to figure out why there was a problem and give (human) advice. For
> this to work well, the error has to be capable of being picked up by the human
> and transmitted to the issue investigator.

CoAP already has the concept of diagnostic payloads (specified to be UTF-8 strings), so this is already covered; the present draft provides a structure that can be used instead of these unstructured payloads.

> This error format is somewhat capable of doing both, but its great
> extensibility and lack of required parts serves to make it difficult to know
> whether it’s actually useful for both.

It is hard to define “useful” exactly.  We expect common usages of this format will emerge, but it would be too early to standardize these.  Instead, we provide a framework inside which these usages can evolve in an interoperable way.

> The document does not contain any advice on how to make the format useful for
> these two purposes when used with a specific application. This is a weakness.

The document would need to list all those applications, and more specific advice might also be too early to standardize.

> Specific guidance that could be included:
> 
> * When an error can be handled automatically, always include a COAP response
> code, even if it is only “5.00 internal error”.

The response code is usually provided in the CoAP response already, so there is no need to deliver it in the payload.  Only when problem details are stored/forwarded is it a good idea to set response-code, base-uri etc. 
to save the (implicit) transaction context that would otherwise be lost.

> If an application specific
> action is desired (like “refresh your credentials” in an authorized action),
> use a registered custom problem detail number or a custom problem detail URI.

The general action may also be implied by the response code, e.g. 4.01 Unauthorized (really: Unauthenticated, but we inherit this misnomer from HTTP), or 4.03 Forbidden (really: Unauthorized).

> *
> When an error may need to be reported, make sure that it can be fully
> represented in a text format that can be copy/pasted into an error report

As Section 3.1.1 demonstrates, human-readable text form is often not needed, so I’m not sure we should give this advice.

> *
> When an error may need to be reported, makes ure that any personally
> identifiable information (PII) is either obvious to the casual viewer or
> omitted from the report. In particular, do NOT include PII in encoded formats
> like Base64.

Good point!

We added a brief Privacy Considerations section and expanded the Security Considerations section in https://github.com/core-wg/core-problem-details/pull/34 — editor’s copy at https://core-wg.github.io/core-problem-details/privcons-seccons/draft-ietf-core-problem-details.html#name-privacy-considerations).

> Unless some guidance of this kind is included, I fear that the applications of
> the concise error format will be so diverse as to make the “commonality” less
> useful.
> 
> The “Tag 38 internationalized string”
> This document adds an appendix defining an “internationalized string” format
> that adds a BCP 47 language tag and an Unicode-based direction indicator to an
> UTF-8 string. This is laudable; RFC 2277 section 4 pointed out the need for
> this ability 24 years ago.
> 
> Unfortunately neither definition is problem-free.
> 
> First of all, this tag, if useful at all, is of far greater utility than the
> error format. Burying it in an appendix of a document whose stated purpose is
> something else makes it far more difficult to refer to than it needs to be.

That is usually not a problem.  The focal point for finding a CBOR tag for a specific application is the CBOR tag registry; this then points to the places where the specifications for the tags can be found (which in this case is easily expressed as “Appendix A of RFC XXXX”).

> Second, the “detailed semantics” has chosen to include the quite complex BNF of
> RFC 5646 translated into CDDL; this may have some use, but BCP 47 is a moving
> target;

We intend tag38 to be useful for the current form of BCP 47, so it is hard to plan for the future.  If BCP 47 needs to be considered unstable, we could of course define a “bcp47-extension” alternative with a CDDL feature control operator.

> having CDDL parsers try to validate tags according to this grammar is
> not going to be useful. If included at all, this needs to be clearly marked
> with text saying that BCP 47 is normative for this grammar, and that language
> tag parsers should NOT try to reject tags based on this grammar; instead, they
> should be treated as strings, and looked up against relevant language handling
> APIs. (“zh-ZZ” is perfectly valid according to the grammar, but is semantically
> invalid according to BCP 47).

Here again, it is hard to capture semantics in a structural definition.
Our document is going to reference RFC 5646 (including its ABNF), as that is the current definition; if BCP 47 is updated, the effect of that update on this document will need new consideration.

> Note also that the sentence “Data items with tag
> 38 that do not meet the criteria above are invalid (see Section 5.3.2 of
> [STD94]).” is really hard to parse semantically, given that section 5.3.2 of
> RFC 8949 doesn’t use the word “invalid”, it uses “inadmissible value”. I do not
> recommend rejecting unknown language tags.

They may not be rejected, they are just not “valid” in RFC 8949 sense (they are still well-formed).  I would expect language tags to evolve within the grammar defined by RFC 5646 (which does have an extension point); it that is a mistaken assumption, please let us know.

> Thirdly, the definition of the tri-state direction attribute can be made
> clearer; in particular, the Unicode Bidirectional Algorithm (UAX#9) should be
> referenced, with particular reference to
> https://www.unicode.org/reports/tr9/tr9-44.html#Markup_And_Formatting - the
> important property here is that the desired semantic is isolation - the markup
> is intended to have zero influence on strings outside the embedded string - the
> semantics of embedding in RLI…PDI is the desired effect.

Tag38 does not provide a way to handle embedding, so we are not trying to boil that ocean yet.

> Issues with referencing Appendix A from the main body
> 
> The reference from the main body of the document in the context of the (version
> 5 added) language and bidi attributes is also not clear; it is probably
> intended to say “if the default attributes are given, then each attribute with
> the grammar of “oltext” is to be interpreted as a Tag 32 string with the
> corresponding attributes set”.

  oltext = text / tag38

So oltext is the alternative between (unadorned) text (to which the base-xxx fields might apply) and tag38 text, which does provide its own settings.

> This creates an issue with strings defined in custom-problem-detail-entries:
> one has to know whether a particular attribute is “oltext” in order to know
> whether the default lang and rtl attributes apply or not.

The CBOR encoding of unadorned text and of tag38 text is clearly distinguishable, so there never is an ambiguity.

Grüße, Carsten