Re: [Gen-art] [Last-Call] Genart last call review of draft-crocker-inreply-react-07

Kjetil Torgrim Homme <kjetilho@ifi.uio.no> Thu, 28 January 2021 08:21 UTC

Message-ID: <2937937dc6fdd1efdefea923519b278743c3905d.camel@ifi.uio.no>
From: Kjetil Torgrim Homme <kjetilho@ifi.uio.no>
To: Dave Crocker <dcrocker@bbiw.net>, Dale Worley <worley@ariadne.com>, gen-art@ietf.org
Cc: last-call@ietf.org, draft-crocker-inreply-react.all@ietf.org
Date: Thu, 28 Jan 2021 09:21:23 +0100
In-Reply-To: <d5a06ce9-31e7-ecd8-084b-a4ad92add744@bbiw.net>
References: <161180116375.12497.14607504902519078003@ietfa.amsl.com> <d5a06ce9-31e7-ecd8-084b-a4ad92add744@bbiw.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/gen-art/wRDrwOTGJ5Kr514AmbdVtcyCh0Y>
Subject: Re: [Gen-art] [Last-Call] Genart last call review of draft-crocker-inreply-react-07

On Wed, 2021-01-27 at 19:35 -0800, Dave Crocker wrote:
> On 1/27/2021 6:32 PM, Dale Worley via Datatracker wrote:
> >     The rule emoji_sequence is inherited from [Emoji-Seq].  It permits
> >     one or more bytes to form a single presentation image.
> > 
> > I haven't traced the definition of emoji_sequence, but it seems to be
> > essentially a set of Unicode characters that have one or another of
> > certain attributes.  That is perfectly sensible.  But if I understand
> > correctly, "emoji_sequence" is a sequence of characters, and you want
> > to say "In the UTF-8 encoding, some of these characters may be encoded
> > as multiple bytes." or something like that.
> 
> Sorry but I'm not understanding what clarity this provides, over the 
> existing text.
> 
> To the extent that your intent is to say that a) this is a subset of 
> UTF-8, and b) multiple bytes can be used, I think that's built into the 
> definition of emoji-sequence.
> 
> In fact, I had added the 'one or more' text mostly to highlight that the 
> 'sequence' can be only one byte, since 'sequence' would be expected to 
> be read as meaning multiple.

One small change that would reduce confusion here is to avoid the word
"byte".  Indeed, it is *not* possible for the sequence to be only one
byte, since there are no Unicode code points in the range U+0000 to
U+007F with the Emoji property set.

So, use "emoji characters" or "code points" instead?
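
To illustrate why "byte" and "code point" are not interchangeable here,
a small Python sketch follows.  The sample characters are my own picks
for illustration, not taken from the draft, and the helper name
describe() is hypothetical.  It only counts code points and UTF-8
octets; it does not check the Emoji property.

```python
from typing import Tuple


def describe(text: str) -> Tuple[int, int]:
    """Return (number of code points, number of UTF-8 octets) for text."""
    return len(text), len(text.encode("utf-8"))


# Sample inputs chosen for illustration only (assumption, not from the draft).
thumbs_up = "\U0001F44D"                  # U+1F44D, one code point
thumbs_up_toned = "\U0001F44D\U0001F3FD"  # U+1F44D followed by U+1F3FD
smiley = "\u263A"                         # U+263A, one code point

for label, text in [
    ("thumbs up", thumbs_up),
    ("thumbs up + skin tone", thumbs_up_toned),
    ("smiley", smiley),
]:
    points, octets = describe(text)
    print(f"{label}: {points} code point(s), {octets} octet(s)")
```

Even a single code point in these samples takes three or four octets in
UTF-8, so wording in terms of code points says what is meant without
depending on the encoding.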

(I tend to avoid the use of "byte" in favour of "octet" to forestall
complaints from the old DEC-10, DEC-20 and Cray users anyway ☺)

> >     Reference to unallocated code points SHOULD NOT be treated as an
> >     error; associated bytes SHOULD be processed using the system default
> >     method for denoting an unallocated or undisplayable code point.
> > 
> > Code points that do not have the requisite attributes to qualify as
> > part of an emoji_sequence should also not be treated as an error,
> > although you probably want to allow the system to alternatively
> > display them normally (rather than as an unallocated or undisplayable
> > code point).
> 
> I think your comment addresses a different issue than the cited text is 
> meant for, but I also might be misunderstanding.

Probably, but I think it is worth saying something about how to handle
code points without the Emoji property set.  IMHO they should be
handled as undisplayable.

> For whatever reasons, including not having been allocated by the Unicode 
> folks, or possibly by running an older system that thinks a code point 
> is not allocated, there is an issue of how the system should deal with 
> encountering such a code point.  The text here is merely trying to say 
> "do whatever you do".
> 
> A different issue is encountering a code-point, here, that is outside of 
> the emoji-sequence set. The text doesn't try to tell the receiver how to 
> process bytes that are illegal here.

The above suggestion would still allow the implementer sufficient
leeway.

-- 
venleg helsing,
Kjetil T.