Re: [Last-Call] Genart last call review of draft-crocker-inreply-react-07

worley@ariadne.com Sun, 31 January 2021 22:17 UTC

From: worley@ariadne.com
To: Dave Crocker <dcrocker@bbiw.net>
Cc: gen-art@ietf.org, draft-crocker-inreply-react.all@ietf.org, last-call@ietf.org
In-Reply-To: <d5a06ce9-31e7-ecd8-084b-a4ad92add744@bbiw.net> (dcrocker@bbiw.net)
Sender: worley@ariadne.com
Date: Sun, 31 Jan 2021 17:16:48 -0500
Message-ID: <87sg6g4ur3.fsf@hobgoblin.ariadne.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/last-call/94e8yZNix8tmT7aYgq-H7tPYod8>
Subject: Re: [Last-Call] Genart last call review of draft-crocker-inreply-react-07

Dave Crocker <dcrocker@bbiw.net> writes:
> On 1/27/2021 6:32 PM, Dale Worley via Datatracker wrote:
>> Reviewer: Dale Worley
>> Review result: Ready with Nits

First, to deal with the straightforward points:

>>     The emoji(s) express a recipient's summary reaction to the specific
>>     message referenced by the accompanying In-Reply-To header field.
>>     [Mail-Fmt].
>>
>> This is not specific as to where the In-Reply-To header is.  I assume
>> you want to say that it is a header of the parent multipart component
>> of "Reaction" part.  Or perhaps this should be forward-referenced to
>> the discussion in section 3.
>
> I don't understand the concern.  An In-Reply-To header field is part of 
> the message header.  That is, it will be in the header of the response 
> message.

Given that we're dealing with multipart messages, an In-Reply-To header
could be stuck in the message header but it could also be stuck in the
headers of any part.  I don't know if it's ever done, but certainly,
it's plausible that if I include a reply which I had received as an
attachment to another email I send, the In-Reply-To header in the
received e-mail would show up as a header to the attachment part, not
my message as a whole.

In general, the situation is one of unlimited complexity.

I'm not particular about what rules you want to specify, only that when
I'm looking at a part with this Content-Disposition that is somewhere in
a multipart structure (possibly without parts), it's clear which sets
of headers I need to examine to find the In-Reply-To header.

Now I think in reality, it either has to be in the headers of the part
with disposition "reaction", or in the multipart containing that part.
But whatever the rule is, it should be stated.
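
To illustrate the ambiguity, here is a minimal Python sketch using the
standard "email" package.  The sample message, the addresses, and the
lookup rule (the reaction part's own headers first, then each enclosing
container, nearest first) are assumptions made up for this example; the
draft does not state such a rule.

```python
from email import message_from_string

# Hypothetical message: a reaction part plus an attached reply that
# carries its own, unrelated In-Reply-To header.
RAW = """\
From: bob@example.com
To: alice@example.com
In-Reply-To: <original@example.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="b1"

--b1
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: reaction

=F0=9F=91=8D
--b1
Content-Type: message/rfc822
Content-Disposition: attachment

From: carol@example.com
In-Reply-To: <unrelated@example.com>
Content-Type: text/plain

A forwarded reply.
--b1--
"""


def reaction_reply_targets(msg):
    """Return (levels_up, In-Reply-To value) for each reaction part.

    Assumed rule: look at the part's own headers, then at each enclosing
    container, nearest first, and stop at the first In-Reply-To found.
    """
    results = []

    def visit(node, ancestors):
        if node.get_content_disposition() == "reaction":
            for levels_up, holder in enumerate([node] + ancestors[::-1]):
                value = holder.get("In-Reply-To")
                if value is not None:
                    results.append((levels_up, value))
                    break
        if node.is_multipart():
            for child in node.get_payload():
                visit(child, ancestors + [node])

    visit(msg, [])
    return results


msg = message_from_string(RAW)
# Every In-Reply-To header anywhere in the structure.
all_in_reply_to = [p["In-Reply-To"] for p in msg.walk() if p["In-Reply-To"]]
targets = reaction_reply_targets(msg)
```

Here all_in_reply_to holds two values, one from the message header and
one from the attached message, which is why a stated rule is needed to
pick the one that applies to the reaction part.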

>>     Reference to unallocated code points SHOULD NOT be treated as an
>>     error; associated bytes SHOULD be processed using the system default
>>     method for denoting an unallocated or undisplayable code point.
>>
>> Code points that do not have the requisite attributes to qualify as
>> part of an emoji_sequence should also not be treated as an error,
>> although you probably want to allow the system to alternatively
>> display them normally (rather than as an unallocated or undisplayable
>> code point).
>
> I think your comment addresses a different issue than the cited text is 
> meant for, but I also might be misunderstanding.
>
> For whatever reasons, including not having been allocated by the Unicode 
> folks, or possibly by running an older system that thinks a code point 
> is not allocated, there is an issue of how the system should deal with 
> encountering such a code point.  The text here is merely trying to say 
> "do whatever you do".

The text is a constraint, though.  It *requires* (sort of) that if the
bytes in the part form a character which the receiver considers
unallocated, it *should not* reject the whole message as being
ill-formed.  The implementation has great freedom in how to display the
character, but the message as a whole "SHOULD NOT be treated as an
error".

> A different issue is encountering a code-point, here, that is outside of 
> the emoji-sequence set. The text doesn't try to tell the receiver how to 
> process bytes that are illegal here.

Perhaps that is what you intend, and if so, the text is correct.  But it
seems to me that if the bytes form a code point that the receiver
considers to be allocated but not an emoji, it should be under the same
constraint that it should not reject the message as a whole as erroneous.
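
A minimal Python sketch of the tolerant behavior discussed above,
assuming a receiver that never rejects the part: code points its own
Unicode tables consider unassigned (general category "Cn") are shown as
U+FFFD, and everything else, emoji or not, is displayed normally.  The
function name and the choice of U+FFFD are illustrative assumptions.
The standard "unicodedata" module does not expose the Emoji property,
so this sketch does not check membership in emoji_sequence.

```python
import unicodedata

REPLACEMENT = "\uFFFD"  # assumed "system default" marker for this sketch


def render_reaction(payload: bytes) -> str:
    """Render a reaction part without treating odd content as an error."""
    # Malformed UTF-8 is replaced rather than raising an exception.
    text = payload.decode("utf-8", errors="replace")
    shown = []
    for ch in text:
        if unicodedata.category(ch) == "Cn":
            # Unassigned according to this Python build's Unicode tables.
            shown.append(REPLACEMENT)
        else:
            # Allocated code point, emoji or not: display it as-is.
            shown.append(ch)
    return "".join(shown)
```

Which code points count as unassigned depends on the Unicode version
the receiver was built with, which matches the "older system" case in
the quoted reply.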

Now for the messy part:

>     The rule emoji_sequence is inherited from [Emoji-Seq].  It permits
>     one or more bytes to form a single presentation image.

First, let me say I keep a rigid category distinction between
bytes/octets and characters.  And in this situation, it seems like there
are *three* layers of composition between bytes and displayed items:

- The UTF-8 encoding groups bytes into code points, which are generally
  Unicode "characters".

- The code points can be composed (by Unicode rules) into characters.
  As Barry explains, "as creating “á” from “a” plus combining acute
  accent".  But I'm not so familiar with how that is done and how that
  affects exactly what the word "character" means.  (I also do not know
  whether any emoji code point participates in Unicode composition, but
  a sender can certainly compose reactions containing code points that
  participate in composition, and there probably is no guarantee that
  Unicode will never do such a thing with emoji.)

- Groups of characters may be displayed as single images.  As Barry
  explains, "the sort of thing that’s unique to emoji, wherein the
  emojis for man followed by woman followed by boy, each of which is a
  separate emoji character that would be displayed as it seems, will
  often be rendered as a single image of a family".

Composed, these processes turn bytes/octets (the encoded form of the
"reaction" part) into a sequence of displayed images.
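
The first two layers can be counted with a short Python sketch.  The
sample strings are assumptions chosen for illustration; in particular,
the family example joins the man, woman, and boy emoji with U+200D
(zero width joiner), which is how such sequences are commonly encoded.
The third layer, grouping characters into one displayed image, depends
on the renderer and its fonts and is not computed here.

```python
import unicodedata

# "a" followed by U+0301 COMBINING ACUTE ACCENT.
accented = "a\u0301"
# Man, ZWJ, woman, ZWJ, boy (assumed encoding of the family example).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F466"


def layer_counts(text: str) -> dict:
    """Count units at the byte, code point, and NFC-composed layers."""
    return {
        "bytes": len(text.encode("utf-8")),
        "code_points": len(text),
        "nfc_code_points": len(unicodedata.normalize("NFC", text)),
    }


accented_counts = layer_counts(accented)
family_counts = layer_counts(family)
```

For the accented letter, three bytes form two code points, which NFC
composes into one.  For the family sequence, eighteen bytes form five
code points, none of which compose under NFC; whether they appear as
one image or three is decided only at display time.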

When I wrote my review, I was aware only of the first composition layer.
But now, it's not clear to me what the sentence "It permits one or more
bytes to form a single presentation image." is intended to say.  The
combining of bytes to form an image may happen at any of the three
layers, and it seems to me that the entire process would be better
described as "It permits one or more bytes to form one or more
presentation images."  But maybe you're trying to say something more
specific.

Dale