Re: [Extra] Barry Leiba's Discuss on draft-ietf-extra-imap-fetch-preview-03: (with DISCUSS and COMMENT)

Michael Slusarz <michael.slusarz@open-xchange.com> Sat, 01 June 2019 15:46 UTC

Date: Sat, 01 Jun 2019 09:46:17 -0600
From: Michael Slusarz <michael.slusarz@open-xchange.com>
Reply-To: Michael Slusarz <michael.slusarz@open-xchange.com>
To: Barry Leiba <barryleiba@computer.org>
Cc: The IESG <iesg@ietf.org>, extra@ietf.org, Bron Gondwana <brong@fastmailteam.com>, draft-ietf-extra-imap-fetch-preview@ietf.org
Message-ID: <1755746109.39421.1559403977713@appsuite-gw1.open-xchange.com>
In-Reply-To: <CALaySJKJxBTw6ptgDNvDjaXKDA4_ZFrb5b2gcUbSqsv1zwZSJw@mail.gmail.com>
References: <155469393077.18315.15660535375707491655.idtracker@ietfa.amsl.com> <1478535427.18024.1554953503900@appsuite.open-xchange.com> <CALaySJKJxBTw6ptgDNvDjaXKDA4_ZFrb5b2gcUbSqsv1zwZSJw@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/extra/b6CeKXybtih6l8Gxphz-3lti4-4>
Subject: Re: [Extra] Barry Leiba's Discuss on draft-ietf-extra-imap-fetch-preview-03: (with DISCUSS and COMMENT)
Precedence: list

My comments below.

> On May 13, 2019 8:22 PM Barry Leiba <barryleiba@computer.org> wrote:
> 
>  
> Hi, Michael,
> I'm sorry it took me so long to get back to this, and I won't let that
> happen again.
> 
> > > ----------------------------------------------------------------------
> > > DISCUSS:
> > > ----------------------------------------------------------------------
> > >
> > > — Section 3.1 —
> > >
> > > I don’t understand “the client’s priority decision”: what decision is that?
> > > And what’s the point of giving the server a list of algorithms here, given that
> > > they all have to be ones that are supported by the server?  Won’t the server
> > > always have to use the first one in the list?  If not, please add some text
> > > explaining what the server does.
> >
> > Consensus seems to be this section is confusing, or unclear, or unneeded, or some combination of these three.
> 
> But it's still there and still unexplained in version -05.  Let's see
> where we can go with it:
> 
> > I'll start with providing a (future) real-world example, and why this
> > behavior exists in its present form.  Given a hypothetical future
> > algorithm "IMAGE" (generate image/jpeg preview if image data exists in
> > message), I issue this preview command:
> >
> > a FETCH 1 (PREVIEW (LAZY=IMAGE FUZZY))
> >
> > This is asking the server: provide me an image preview of the message,
> > but ONLY if 1) the message actually contains image information and 2)
> > that image information is already generated or can be generated very
> > quickly.  Otherwise, fall back to FUZZY generation.
> 
> Hm.  That really seems like a contrived example.  How would the client
> software possibly know whether there's useful information in the
> image, or whether the information in the accompanying text is more or
> less important?  Suppose I have two messages:
> 
> 1. text/plain that basically says nothing; image/jpeg that has a
> diagram that explains everything
> 
> 2. text/plain that has the entire real content of the message;
> image/jpeg of the sender's company logo
> 
> It's pretty clear from that explanation that for message 1 I want some
> sort of description of the image and for message 2 I want a preview of
> the text.  But how would the client software know that, and,
> therefore, know what to ask for?

Is that what a typical received message looks like?  To me, that looks like the edge case.

Seems that a server could figure out (through experience) when image preview data should be calculated or ignored.  Image size, MIME structure information, MIME order, etc. could all be used to reach some sort of 95% accuracy threshold.  Maybe that doesn't happen day 1 of implementation, but a competitive, reactive server implementation will work on this.

The same issue exists for text previews, FWIW.  Consider a text/plain + text/html alternative message where the text/plain part is "Please click on this URL to view the text part" and the text/html part contains no text (it's a single image link).  A server can be programmed to be smart enough to ignore that part when generating the preview.

As written, the draft gives the server latitude to figure out these kinds of issues.  Closing down that opportunity because it might be a difficult logical problem isn't the right approach, in my view.

 
> Apart from that, if we're going to do any sort of text-based preview
> of an image, I would think we'd just say that when FUZZY is applied to
> an image, that's what it does.

FUZZY is explicitly defined as text-only output though, since that generally maps to what clients already do.  Defining an algorithm that can potentially take any data input and produce any media type output (haha! I've now learned not to use MIME type anymore!) seems too broad when trying to match the default behavior to what the vast majority of clients currently do.


> > I find that use-case plausible, albeit not something useful given the
> > current landscape of a single algorithm.  The alternative would be to
> > have to send PREVIEW algorithms one at a time, non-pipelined, which is
> > precisely the kind of inefficient client/server interaction this
> > extension is trying to minimize.
> 
> We disagree on the plausibility.  Realistically, I can't think of
> *any* other plausible preview algorithms that a *client* might ask for
> without user interaction.  The most I can see is something about how
> to apply FUZZY to non-text media types.

Our UI client offers a "virtual attachment view" that shows graphical representations of all attachment data within messages within a mailbox, so PREVIEW is perfectly suited for that task.  (We currently have to do the preview conversion via a proprietary plugin.)

I'll agree that it may not be a common use-case, but it is entirely plausible.
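
To make the fallback in the `a FETCH 1 (PREVIEW (LAZY=IMAGE FUZZY))` example concrete, here is a minimal Python sketch of how a server might walk the requested list in priority order.  This is not text from the draft: the names `parse_algorithm`, `select_preview`, and the generator callables are hypothetical, IMAGE is the hypothetical future algorithm from the example, and skipping unknown algorithms is an assumption, since error handling for this section is still under discussion.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical generator signature: (message_id, lazy) -> preview data,
# or None when nothing can be produced (e.g. LAZY and nothing is cached).
Generator = Callable[[str, bool], Optional[str]]


def parse_algorithm(token: str) -> Tuple[str, bool]:
    """Split 'LAZY=IMAGE' into ('IMAGE', True); 'FUZZY' into ('FUZZY', False)."""
    modifier, sep, name = token.partition("=")
    if sep and modifier.upper() == "LAZY":
        return name.upper(), True
    return token.upper(), False


def select_preview(
    message_id: str,
    requested: List[str],
    generators: Dict[str, Generator],
) -> Optional[Tuple[str, str]]:
    """Return (algorithm, preview) for the first requested algorithm that
    yields a result, or None if none of them does."""
    for token in requested:
        name, lazy = parse_algorithm(token)
        generator = generators.get(name)
        if generator is None:
            # Assumption: unknown algorithms are skipped rather than rejected.
            continue
        result = generator(message_id, lazy)
        if result is not None:
            return name, result
    return None
```

With a single round trip the client gets the image preview when it is cheap to produce, and the FUZZY text otherwise, which is the pipelining benefit argued for above.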


> > I do agree that there is language that can surely be cleaned up here,
> > especially regarding error handling.  But I will hold off on rewriting
> > the section until we can have some discussion/consensus as to whether
> > the proposed usage discussed above is something that others believe we
> > should be supporting.
> 
> *If* we're going to keep the multi-algorithm syntax, I think the new
> paragraph you added has it covered, and the paragraph with the phrase
> "priority decision" is now unnecessary.
> 
> But I'd still like to see someone give a good argument for a case
> where having a client ask for multiple algorithms to be applied could
> really be useful and practical.
> 
> > > — Section 3.2 —
> > >
> > >    If the preview is not available, the server MUST return NIL as the
> > >    PREVIEW response.  A NIL response indicates to the client that
> > >    preview information MAY become available in a future PREVIEW FETCH
> > >    request.  Note that this is semantically different than returning a
> > >    zero-length string, which indicates an empty preview.
> > >
> > > I think the MUST here is hard to follow, because the text doesn’t make a clear
> > > enough distinction between “preview is not available” and “an empty preview”.
> > > Can you expand the text a bit to explain the distinction more clearly, as this
> > > is a protocol requirement?
> >
> > "Preview not available" examples =
> >   * Preview generation isn't available at the moment (previews are generated in a separate thread, and that pool is saturated)
> >   * Body text is not available at the moment, so preview can't be generated.
> >   * The preview text has not been generated within the allotted time (e.g. LAZY modifier)
> 
> OK, so you're saying that NIL means that there's some transient issue,
> and that a non-empty preview might be available later... and an empty
> string means that there's no preview text available for this message
> and never will be.  Yes?

Correct.

NIL = I asked for a preview, I don't have that preview cached, and the message blob storage is acting up and taking 5 seconds to return data.  As a server, that's beyond my configured limit of requiring a message listing within 2 seconds, so the preview is "not available", but not necessarily empty.

Empty String = As a server, I have analyzed the body and there is no data that I find worthy to display as "preview text" to the user, and this is a determinative evaluation.

Those are two entirely different scenarios (and the former scenario is expected, especially in a distributed storage system).  So they absolutely need to be handled separately.
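
As a rough illustration of the two cases, the following Python sketch returns `None` (standing in for NIL) on a transient failure and an empty string when generation completed but found nothing worth showing.  The function names, the cache dictionary, and the use of `TimeoutError` to signal slow storage are all assumptions made for the sketch, not anything the draft specifies.

```python
from typing import Callable, Dict, Optional


def preview_value(
    msg_id: str,
    cache: Dict[str, str],
    load_body: Callable[[str], str],
    generate: Callable[[str], str],
) -> Optional[str]:
    """Return the preview text, "" for a definitive empty preview,
    or None (NIL) when the preview is not available right now."""
    if msg_id in cache:
        return cache[msg_id]
    try:
        body = load_body(msg_id)
    except TimeoutError:
        # Transient problem: do not cache, so a later FETCH can succeed.
        return None
    text = generate(body)
    # A definitive result, even an empty one, is cached.
    cache[msg_id] = text
    return text


def client_should_retry(value: Optional[str]) -> bool:
    """Only NIL suggests that asking again later might help."""
    return value is None
```

The point of the design is that only the transient case is left uncached, so a client that retries after NIL can get a real answer, while an empty string is a settled result.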


> If that's right (and actually a necessary distinction), I suggest this:
> 
> OLD
>    If the preview is not available, the server MUST return NIL as the
>    PREVIEW response.  A NIL response indicates to the client that
>    preview information MAY become available in a future PREVIEW FETCH
>    request.
> 
>    Examples why a preview may not be available include: the preview
>    generation process is not available due to transient server resource
>    limitations, the message body text is unavailable, or a server-
>    imposed timeout was reached during generation.
> 
>    A NIL response is semantically different than returning a zero-length
>    string, which indicates that no meaningful preview text can be
>    generated for the message.
> 
> NEW
>    It is possible that preview text is not available now, but might be
>    available later -- perhaps the server's preview-generating resources
>    are overloaded, there is a server-imposed timeout during preview
>    generation, or there is some transient issue with fetching the
>    message body.  In such cases, the server will return NIL as the
>    preview response, and the client might possibly try again later.
> 
>    On the other hand, it's possible that the server has determined that
>    no meaningful preview text can be generated for a particular
>    message, and that won't change later.  Examples of this involve
>    encrypted messages, content types the server does not support
>    previews of, and other situations where the server is not able to
>    extract information for a preview.  In such cases, the server will
>    return a zero-length string.  Clients should not send another FETCH
>    for a preview for such messages.
> 
> END
> 
> But let's be sure we really think the distinction is important.

Thank you for these clarification suggestions.  Will adapt them in the next draft.


> > > — Section 8 —
> > >
> > > It seems like a bad idea to have to keep the IMAP Capabilities registry in sync
> > > with the two new registries: as it stands, when you add a new algorithm you
> > > have to add it to the Preview Algorithms registry, and also add a corresponding
> > > entry in the Capabilities registry... and similarly for a modifier, if I have
> > > that right above.
> > >
> > > Why not follow the model of AUTH= and RIGHTS=, and just reserve the PREVIEW=
> > > capability in the registry, allowing it to apply to entries from the two new
> > > registries?  That avoids inconsistencies in registrations if we later add
> > > algorithms or modifiers.
> >
> > See above.  With my proposal to remove priority modifiers, I believe this discussion
> > point becomes moot.
> 
> It does not; it's still an issue with algorithms.  If you register
> "PREVIEW=FUZZY" as a capability string and also register "FUZZY" as an
> algorithm, you're doing double registration and are prone to getting
> inconsistencies.
> 
> But if, instead, you just register "PREVIEW=" as a capability string
> and let that stand for any "PREVIEW=X", where X is a registered
> preview algorithm, you don't get into that sort of trouble.  That's
> what AUTH= and RIGHTS= did.  Is there a reason not to do that?

User error.  Combination of reading previous specs wrong, and mistaken ideas about how this should work.

100% agree with your reasoning.  Will clean this up for the next draft.
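
For what it's worth, with a single reserved "PREVIEW=" prefix a client can discover algorithms the same way it already handles AUTH=.  A small Python sketch, assuming the capability tokens have already been split out of the CAPABILITY response (the function name `preview_algorithms` is made up for illustration):

```python
from typing import Iterable, List


def preview_algorithms(capabilities: Iterable[str]) -> List[str]:
    """Collect X from every 'PREVIEW=X' capability token.
    Capability names are compared case-insensitively."""
    prefix = "PREVIEW="
    found: List[str] = []
    for cap in capabilities:
        # Ignore a bare "PREVIEW=" with no algorithm name after it.
        if cap.upper().startswith(prefix) and len(cap) > len(prefix):
            found.append(cap[len(prefix):].upper())
    return found
```

For example, `preview_algorithms(["IMAP4rev1", "AUTH=PLAIN", "PREVIEW=FUZZY"])` yields `["FUZZY"]`, and a newly registered algorithm needs no additional entry in the capabilities registry.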


> 
> > > ----------------------------------------------------------------------
> > > COMMENT:
> > > ----------------------------------------------------------------------
> > >
> > > — Section 3.2 —
> > >
> > >    This relaxed requirement permits a
> > >    server to offer previews as an option without requiring potentially
> > >    burdensome storage and/or processing requirements to guarantee
> > >    immutability for a use case that does not require this strictness.
> > >
> > > That’s sensible, but can you include some text giving an example of a situation
> > > where the preview might change?  Given that the messages themselves are
> > > immutable, why would applying the same algorithm to the same text give
> > > different results?
> >
> > We discussed this on the list in the past, but one example would be
> > loss of cached preview data on the server and re-generation uses a
> > newer algorithm version which produces slightly different text.
> >
> > The consensus was that we should not be making extraordinary
> > efforts/costs to ensure this text never changes, where this text is not
> > being held out as being the canonical view of the message contents in
> > the first place.
> 
> That's fine.  So, as the first sentence of my comment says, can you
> include some text to explain the situation?  I'm not looking for a
> lot, just a sentence.  (If you think it's best not to, that's OK too:
> this is a comment, not a discuss point).

Fair point.  I will address this suggestion.  Maybe: "For example, the underlying IMAP server may change for an account due to a software upgrade; account state information may be retained in the migration, but the new server may support a different PREVIEW generation algorithm.  Thus, message state may remain the same but the PREVIEW FETCH response may change."

Traveling, so give me a day or two and I will incorporate the above edits, and the discussion points previously raised by Alexey, in the next draft.

michael
