Re: [Extra] Barry Leiba's Discuss on draft-ietf-extra-imap-fetch-preview-03: (with DISCUSS and COMMENT)

Michael Slusarz <michael.slusarz@open-xchange.com> Thu, 11 April 2019 03:31 UTC

Date: Wed, 10 Apr 2019 21:31:43 -0600 (MDT)
From: Michael Slusarz <michael.slusarz@open-xchange.com>
To: Barry Leiba <barryleiba@computer.org>, Barry Leiba via Datatracker <noreply@ietf.org>, The IESG <iesg@ietf.org>
Cc: extra@ietf.org, Bron Gondwana <brong@fastmailteam.com>, draft-ietf-extra-imap-fetch-preview@ietf.org
Message-ID: <1478535427.18024.1554953503900@appsuite.open-xchange.com>
In-Reply-To: <155469393077.18315.15660535375707491655.idtracker@ietfa.amsl.com>
References: <155469393077.18315.15660535375707491655.idtracker@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/extra/3yZNpSAMHEFym2pz0pVj5rqbiSM>

Barry,

Thanks for your detailed comments.  Discussion below, which hopefully incorporates the important parts of comments from Alexey, Chris, Adam, and Benjamin.

> On April 7, 2019 at 9:25 PM Barry Leiba via Datatracker <noreply@ietf.org> wrote:
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> — Section 3.1 —
> 
> I don’t understand “the client’s priority decision”: what decision is that? 
> And what’s the point of giving the server a list of algorithms here, given that
> they all have to be ones that are supported by the server?  Won’t the server
> always have to use the first one in the list?  If not, please add some text
> explaining what the server does.

Consensus seems to be that this section is confusing, unclear, or unneeded, or some combination of these three.

I'll start by providing a (future) real-world example and explaining why this behavior exists in its present form.  Given a hypothetical future algorithm "IMAGE" (generate an image/jpeg preview if image data exists in the message), I issue this preview command:

a FETCH 1 (PREVIEW (LAZY=IMAGE FUZZY))

This is asking the server: provide me an image preview of the message, but ONLY if 1) the message actually contains image information and 2) that image information is already generated or can be generated very quickly.  Otherwise, fall back to FUZZY generation.

I find that use case plausible, although it is not useful today, when only a single algorithm exists.  The alternative would be to send PREVIEW algorithms one at a time, non-pipelined, which is precisely the kind of inefficient client/server interaction this extension is trying to minimize.
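
As a rough sketch of the fallback I have in mind (not draft text), the server could walk the client's list in order and return the first algorithm that yields a preview.  The names `parse_algorithm`, `select_preview`, and the generator signature are made up for illustration, and skipping unknown algorithm names is an assumption, since error handling is exactly what still needs to be written down:

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical generator signature: (message, lazy) -> preview, or None when
# the algorithm cannot produce a preview under the given constraint.
Generator = Callable[[str, bool], Optional[str]]


def parse_algorithm(token: str) -> Tuple[str, bool]:
    """Split 'LAZY=IMAGE' into ('IMAGE', True) and 'FUZZY' into ('FUZZY', False)."""
    if token.upper().startswith("LAZY="):
        return token[5:].upper(), True
    return token.upper(), False


def select_preview(
    message: str,
    requested: List[str],
    generators: Dict[str, Generator],
) -> Optional[Tuple[str, str]]:
    """Return (algorithm, preview) for the first entry that produces a preview."""
    for token in requested:
        name, lazy = parse_algorithm(token)
        generator = generators.get(name)
        if generator is None:
            continue  # assumption: unknown algorithms are skipped, not errors
        preview = generator(message, lazy)
        if preview is not None:
            return name, preview
    return None  # nothing available; the server would answer NIL
```

With `requested = ["LAZY=IMAGE", "FUZZY"]`, an IMAGE generator that has nothing cached returns `None` under the lazy constraint and the loop moves on to FUZZY, all within one FETCH.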

I do agree that there is language that can surely be cleaned up here, especially regarding error handling.  But I will hold off on rewriting the section until we can have some discussion/consensus as to whether the proposed usage discussed above is something that others believe we should be supporting.


> — Section 3.2 —
> 
>    If the preview is not available, the server MUST return NIL as the
>    PREVIEW response.  A NIL response indicates to the client that
>    preview information MAY become available in a future PREVIEW FETCH
>    request.  Note that this is semantically different than returning a
>    zero-length string, which indicates an empty preview.
> 
> I think the MUST here is hard to follow, because the text doesn’t make a clear
> enough distinction between “preview is not available” and “an empty preview”. 
> Can you expand the text a bit to explain the distinction more clearly, as this
> is a protocol requirement?

Examples of "preview not available":
  * Preview generation isn't available at the moment (previews are generated in a separate thread, and that pool is saturated)
  * Body text is not available at the moment, so preview can't be generated.
  * The preview text has not been generated within the allotted time (e.g. LAZY modifier)
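
To make the NIL versus empty-string distinction concrete, here is a minimal sketch of how a server might serialize the preview value.  The function name `preview_value_token` is hypothetical, and the sketch only handles ASCII text without line breaks; a real server would fall back to an IMAP literal for anything else:

```python
from typing import Optional


def preview_value_token(preview: Optional[str]) -> str:
    """Serialize a preview value as an IMAP NIL or quoted string.

    None  -> NIL   : no preview available right now (it may appear later)
    ""    -> '""'  : a definitive, empty preview
    """
    if preview is None:
        return "NIL"
    # Escape the two characters that are special inside an IMAP quoted string.
    escaped = preview.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'
```

A client seeing NIL knows it can ask again later; a client seeing `""` knows there is nothing more to ask for.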

> Also, as I noted in response to Meral’s Gen-ART
> review it would be good to be clear how encrypted messages should be handled in
> this regard.

Do we want to be specific to the encrypted message use case?  What about a single MIME part containing application/octet-stream data which the preview parser knows nothing about?

I think the encrypted message use case can be handled by a better description of how to handle any data that the preview generator does not know how to meaningfully parse.  Something like "If the message contains no body information that the FUZZY parser can meaningfully display to the user, an empty preview should be returned."  or "An empty preview means that the FUZZY algorithm has made a definitive decision that no meaningful preview text can be generated for the message."
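
A small sketch of that rule, using Python's standard `email` package.  This is not the FUZZY algorithm itself, just a stand-in that shows where the "definitive empty preview" decision would be made; the function name `fuzzy_preview` and the choice to look only at text/plain parts are assumptions for illustration:

```python
import email


def fuzzy_preview(raw_message: str, limit: int = 200) -> str:
    """Return preview text, or "" when no part can be meaningfully displayed."""
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes for a leaf part
            charset = part.get_content_charset() or "utf-8"
            text = payload.decode(charset, errors="replace")
            # Collapse whitespace runs and cut to the character limit.
            return " ".join(text.split())[:limit]
    # Encrypted bodies, application/octet-stream, and so on all end up here:
    # the parser has decided that no meaningful preview exists.
    return ""
```

The encrypted case and the opaque application/octet-stream case take the same path, which is why I would rather describe the general rule than call out encryption specifically.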


> — Section 4.1 —
> 
>    The preview text MUST be treated as text/plain MIME data by the
>    client.
> 
> I think this requires a normative reference to RFC 2046.

Ack.


> — Section 5.1 —
> 
> The way you have LAZY working isn’t really consistent with the IMAP protocol
> model.  In that model, the client would not have to ask for the preview twice,
> one with LAZY and one without.  Instead, with LAZY, the server would return
> FETCH PREVIEW responses when it could — perhaps some in the first set of FETCH
> responses, and some, where the PREVIEW part was missing before, in unsolicited
> FETCH responses when the preview became available.  That way, the server has
> the responsibility of setting off a separate task to generate the previews, and
> to send them to the client when it has them (at which point it either saves them
> for future FETCHes or doesn’t).
> 
> As it’s written here, the client has to open a separate IMAP session with the
> server and ask a second time for the previews it’s missing — a separate session
> to avoid blocking other action on the main session.  And if the server has spun
> off a task to preemptively generate them because the client asked once (a good
> practice, given the description here) it has to retain them for some indefinite
> period waiting for the client to ask again.
> 
> Why was this not done with the first mechanism?

I believe Chris' discussion handled Barry's concerns here.

I'll add from real-world experience that LAZY is the one thing my client developers would absolutely yell and scream at me about if I removed it from this proposal.  We have been using this paradigm for several years now; it has been very successful for us at high usage rates, and it works in a variety of client types.
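
For readers unfamiliar with the paradigm, the client side looks roughly like the sketch below.  The `fetch` callable is a hypothetical stand-in for issuing a PREVIEW FETCH with or without LAZY; it is not a real IMAP library call, and a real client would render the mailbox list between the two passes:

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical transport hook: fetch(uid, lazy) -> preview text, or None for NIL.
Fetch = Callable[[int, bool], Optional[str]]


def load_previews(uids: Iterable[int], fetch: Fetch) -> Dict[int, Optional[str]]:
    """Two-pass preview loading: a cheap LAZY pass, then fill in the gaps."""
    # Pass 1: LAZY returns immediately with whatever the server already has.
    previews = {uid: fetch(uid, True) for uid in uids}
    # Pass 2: only the messages that came back NIL are requested without LAZY.
    missing = [uid for uid, preview in previews.items() if preview is None]
    for uid in missing:
        previews[uid] = fetch(uid, False)
    return previews
```

The first pass never blocks on preview generation, and the second pass is limited to the messages that actually need it.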

 
> — Section 7 —
> 
> As was mentioned in Ben’s review, either the ABNF for “capability” is in error
> (it should not include “preview-mod-ext”) or the description needs to be
> significantly beefed up.  I’m guessing that the intent is that PREVIEW=
> capabilities include both algorithms and modifiers, that PREVIEW=FUZZY is
> required, that the presence of any preview algorithm implies PREVIEW=LAZY such
> that the latter not only need not be specified, but is not permitted to be.  So
> we might have “PREVIEW=FUZZY PREVIEW=FURRY PREVIEW=SLEEPY”, which would mean we
> support the algorithms FUZZY and FURRY, and the modifiers LAZY and SLEEPY.  Is
> that correct?
> 
> That seems somewhat obtuse to me, overloading the PREVIEW= capability and
> inviting confusion.

See discussion on Benjamin's DISCUSS (although he withdrew in favor of this DISCUSS point).

In short, I propose that we remove "priority modifiers" as a category that can be extended, so that "PREVIEW=" is solely intended to list algorithm types.
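
Under that proposal, client-side capability handling becomes trivial.  A minimal sketch, with a hypothetical helper name and the capability list passed in as a plain space-separated string:

```python
from typing import List


def preview_algorithms(capabilities: str) -> List[str]:
    """Extract the advertised preview algorithms from a CAPABILITY string."""
    prefix = "PREVIEW="
    return [
        token[len(prefix):].upper()
        for token in capabilities.split()
        if token.upper().startswith(prefix)
    ]
```

Every `PREVIEW=` entry is an algorithm name, so there is no need to cross-check a second list of modifiers.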


> — Section 8 —
> 
> It seems like a bad idea to have to keep the IMAP Capabilities registry in sync
> with the two new registries: as it stands, when you add a new algorithm you
> have to add it to the Preview Algorithms registry, and also add a corresponding
> entry in the Capabilities registry... and similarly for a modifier, if I have
> that right above.
> 
> Why not follow the model of AUTH= and RIGHTS=, and just reserve the PREVIEW=
> capability in the registry, allowing it to apply to entries from the two new
> registries?  That avoids inconsistencies in registrations if we later add
> algorithms or modifiers.

See above.  With my proposal to remove priority modifiers, I believe this discussion point becomes moot.


> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> — Section 3.2 —
> 
>    This relaxed requirement permits a
>    server to offer previews as an option without requiring potentially
>    burdensome storage and/or processing requirements to guarantee
>    immutability for a use case that does not require this strictness.
> 
> That’s sensible, but can you include some text giving an example of a situation
> where the preview might change?  Given that the messages themselves are
> immutable, why would applying the same algorithm to the same text give
> different results?

We discussed this on the list in the past.  One example would be the loss of cached preview data on the server, followed by regeneration with a newer algorithm version that produces slightly different text.

The consensus was that we should not incur extraordinary effort or cost to ensure this text never changes, given that it is not being held out as the canonical view of the message contents in the first place.

 
> — Section 4.1 —
> 
>    The server SHOULD limit the length of the preview text to 200 preview
>    characters.  This length should provide sufficient data to generally
>    support both various languages (and their different average word
>    lengths) and different client display size requirements.
> 
>    The server MUST NOT output preview text longer than 256 preview
>    characters.
> 
> The text here should make it clear, because many implementers do not understand
> the difference, that these refer to *characters*, not *bytes*, and that 200 or
> 256 characters can possibly be much longer than 256 bytes.  I worry that an
> implementer might allocate a buffer of 256 bytes, thinking that’s enough, and
> have it overflowed.

I feel that this can be accomplished with a sentence after the definition of "preview character", something like: "Note: a single preview character may comprise multiple octets, so any buffers implemented to conform to the string limitations identified in this document should be sized to prevent possible overflow errors."
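
To illustrate the size difference: UTF-8 as defined in RFC 3629 uses at most 4 octets per character, so a 256-character preview can need up to 1024 octets.  The helper names below are hypothetical; Python string slicing counts code points, which matches the "preview character" definition:

```python
MAX_PREVIEW_CHARS = 256  # hard limit from the draft (MUST NOT exceed)


def truncate_preview(text: str, limit: int = 200) -> str:
    """Cut a preview to at most `limit` characters (not octets)."""
    return text[: min(limit, MAX_PREVIEW_CHARS)]


def max_utf8_octets(chars: int) -> int:
    """Worst-case UTF-8 size for `chars` characters (4 octets each)."""
    return chars * 4


preview = truncate_preview("\U0001F600" * 300, limit=256)
octets = preview.encode("utf-8")
# len(preview) counts characters; len(octets) counts bytes on the wire.
```

A buffer sized at 256 bytes would overflow here; sizing for `max_utf8_octets(MAX_PREVIEW_CHARS)` avoids that.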

>    The server SHOULD remove any formatting markup that exists in the
>    original text.
> 
> This is OK as it is, but perhaps a bit more specific than necessary.  I think
> the sense is that the server is meant to do its best to render the preview as
> plain text, because that’s what the client will treat it as.  As such, I would
> fold this into the earlier paragraph that talks about no transfer encoding, and
> maybe say it something like this:
> 
>    The generated string will be treated by the client as plain text, so
>    the server should do its best to provide a meaningful plain text string.
>    The generated string MUST NOT be content transfer encoded and MUST be
>    encoded in UTF-8 [RFC3629].  For purposes of this section, a "preview
>    character" is defined as a single UCS character encoded in UTF-8.  The
>    server SHOULD also remove any formatting markup, and do what other
>    processing might be useful in rendering the preview as plain text.

I'm fine with this.

michael