Re: [imapext] IMAP SNIPPET extension (initial draft)

Chris Newman <chris.newman@oracle.com> Tue, 12 May 2015 01:36 UTC

Date: Mon, 11 May 2015 18:36:19 -0700
From: Chris Newman <chris.newman@oracle.com>
To: Michael M Slusarz <slusarz@horde.org>, IMAPEXT <imapext@ietf.org>
Message-id: <EFC728E3C34D5D1149633067@96B2F16665FF96BAE59E9B90>
In-reply-to: <20150503235625.Horde.pquZ06cW6a2mn1lEJlQhL4J@bigworm.curecanti.org>
References: <20150503235625.Horde.pquZ06cW6a2mn1lEJlQhL4J@bigworm.curecanti.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/imapext/s0LaGjEM9DqrxNv9sZ5Ct09v-N0>
Subject: Re: [imapext] IMAP SNIPPET extension (initial draft)

--On May 3, 2015 23:56:25 -0600 Michael M Slusarz <slusarz@horde.org> wrote:
> A little late, but here are my efforts on an initial draft regarding SNIPPET
> so that it can be added to the WG.
> 
> Based on the conversation that occurred on this list a few months back, I
> made these editorial decisions:
> 
>    - Extend FETCH: several people said this would not be implemented if it
> used ANNOTATE/CONVERT
>    - Allow multiple different algorithms
> 
> I think the main point of discussion going forward is whether to explicitly
> define an algorithm for snippet generation (as was done for threading) or to
> allow the default algorithm to be entirely server-defined (as with FUZZY
> searching).  For now, I went with the latter, and even named it FUZZY.
> 
> Another issue regards the size of the returned snippet.  In this draft, this
> is not client configurable.  For the FUZZY algorithm, the server SHOULD limit
> the snippet to 150 octets and MUST NOT ever exceed 300 octets.  (Snippet data
> must be returned as UTF-8).
> 
> But hopefully this is a decent enough starting point where it can be used to
> continue the discussion going forward.

Good starting point. Thanks!

I strongly prefer expert review over standards track for registering new
SNIPPET algorithms. The expert reviewer should simply make sure that a
published algorithm is deterministic and that private vendor algorithms carry
a version number (e.g., VENDORv1, VENDORv2, ...). I see no reason to require
standards track processing, IMHO.

I also suggest a simpler syntax. Have the client use

>            C: A2 FETCH 1 (RFC822.SIZE SNIPPET=FUZZY)

or

>            C: A2 FETCH 1 (RFC822.SIZE SNIPPET)

and have the server always include an algorithm name in the response, such as:

>            S: * 1 FETCH (RFC822.SIZE 20000 SNIPPET=FUZZY {61}

or

>            S: * 1 FETCH (RFC822.SIZE 20000 SNIPPET=BLUDYBLOOP {61}

Don't use "X-" in examples per RFC 6648. If the client does not include an
algorithm name, then the server picks the algorithm: it returns the name of
its deterministic algorithm if it has one, and FUZZY otherwise. This syntax is
simpler, and it lets the client discover the server's preferred deterministic
algorithm if there is one (which can be better for an offline client).
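
To make the suggested syntax concrete, here is a rough client-side sketch in
Python. The helper names are hypothetical and not part of any draft, and the
parser assumes the snippet arrives as a literal ("{n}") as in the examples
above; a quoted string or NIL would need extra handling.

```python
import re


def snippet_fetch_item(algorithm=None):
    """Build the FETCH data item: bare SNIPPET, or SNIPPET=<algorithm>."""
    return "SNIPPET" if algorithm is None else f"SNIPPET={algorithm}"


# Matches "SNIPPET=<name> {<n>}" in an untagged FETCH response line.
# The character set allowed in <name> is an assumption for this sketch.
_RESPONSE_RE = re.compile(r"SNIPPET=([A-Za-z0-9._-]+) \{(\d+)\}")


def parse_snippet_header(line):
    """Return (algorithm, literal_octets), or None if no SNIPPET item is found."""
    match = _RESPONSE_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Because the server always echoes the algorithm name, a client that sent a bare
SNIPPET can learn which algorithm was used from parse_snippet_header() alone.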

		- Chris

> ----------------
> 
> 
> Internet Engineering Task Force                          M. Slusarz, Ed.
> Internet-Draft                                                   Dovecot
> Intended status: Standards Track                                May 2015
> Expires: October 31, 2015
> 
>                    IMAP4 Extension: Snippet Generation
>                           draft-imap-snippet-00
> 
> Abstract
> 
>     This document specifies an IMAP protocol extension which allows a
>     client to request that a server provide an abbreviated representation
>     of a message (a snippet of text) that can be used by a client to
>     provide a useful contextual preview of the message contents.
> 
> Status of this Memo
> 
>     This Internet-Draft is submitted in full conformance with the
>     provisions of BCP 78 and BCP 79.
> 
>     Internet-Drafts are working documents of the Internet Engineering
>     Task Force (IETF).  Note that other groups may also distribute
>     working documents as Internet-Drafts.  The list of current Internet-
>     Drafts is at http://datatracker.ietf.org/drafts/current/.
> 
>     Internet-Drafts are draft documents valid for a maximum of six months
>     and may be updated, replaced, or obsoleted by other documents at any
>     time.  It is inappropriate to use Internet-Drafts as reference
>     material or to cite them other than as "work in progress."
> 
>     This Internet-Draft will expire on October 31, 2015.
> 
> Copyright Notice
> 
>     Copyright (c) 2015 IETF Trust and the persons identified as the
>     document authors.  All rights reserved.
> 
>     This document is subject to BCP 78 and the IETF Trust's Legal
>     Provisions Relating to IETF Documents (http://trustee.ietf.org/
>     license-info) in effect on the date of publication of this document.
>     Please review these documents carefully, as they describe your rights
>     and restrictions with respect to this document.  Code Components
>     extracted from this document must include Simplified BSD License text
>     as described in Section 4.e of the Trust Legal Provisions and are
>     provided without warranty as described in the Simplified BSD License.
> 
> Table of Contents
> 
>     1.  Introduction
>     2.  Conventions Used In This Document
>     3.  FETCH Data Item
>       3.1.  Command
>       3.2.  Response
>     4.  SNIPPET Algorithms
>       4.1.  FUZZY
>     5.  Examples
>     6.  Formal Syntax
>     7.  TODO
>     8.  Acknowledgements
>     9.  IANA Considerations
>     10. Security Considerations
>     11. References
>       11.1.  Normative References
>       11.2.  Informative References
>     Author's Address
> 
> 1.  Introduction
> 
>     Many modern mail clients display small extracts of the body text as
>     an aid to allow a user to quickly decide whether they are interested
>     in viewing the full message contents.  Mail clients implementing the
>     Internet Message Access Protocol (IMAP; RFC 3501 [RFC3501]) would
>     benefit from a standardized, consistent way to generate these brief
>     previews of messages (a "snippet").
> 
>     Generation of snippets on the server has several additional benefits.
>     First, it allows caching of these snippets for use with both multiple
>     mail clients and with clients that don't support client-side caching.
> 
>     Second, generation on the server is more efficient.  A client-based
>     algorithm needs to issue, at a minimum, a FETCH BODYSTRUCTURE command
>     in order to determine which MIME [RFC2045] body part should be
>     displayed.
> 
>     Finally, server generation allows caching in a centralized location.
>     Mail accounts are often accessed by multiple different clients, and
>     many of these clients lack support for client-side caching.  Using
>     server generated snippets allows snippets to be generated once and
>     then cached indefinitely.
> 
>     A server that supports the SNIPPET extension indicates this with one
>     or more capability names consisting of "SNIPPET=" followed by a
>     supported snippet algorithm name as described in this document.  This
>     provides for future upwards-compatible extensions and/or the ability
>     to use locally-defined snippet algorithms.
> 
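
A client could collect the advertised algorithms from the capability list
along these lines. This is only a sketch: the function name is hypothetical,
and treating algorithm names as case-insensitive is an assumption, since the
draft does not say.

```python
def snippet_algorithms(capability_line):
    """Extract snippet algorithm names from a CAPABILITY response line."""
    algorithms = []
    for token in capability_line.split():
        name, sep, value = token.partition("=")
        # Keep only tokens of the form SNIPPET=<algorithm>.
        if name.upper() == "SNIPPET" and sep and value:
            algorithms.append(value.upper())  # assumption: case-insensitive
    return algorithms
```

An empty result means the server does not advertise the extension.
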
> 2.  Conventions Used In This Document
> 
>     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
>     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
>     document are to be interpreted as described in RFC 2119 [RFC2119].
> 
>     "User" is used to refer to a human user, whereas "client" refers to
>     the software being run by the user.
> 
>     In examples, "C:" and "S:" indicate lines sent by the client and
>     server respectively.  If a single "C:" or "S:" label applies to
>     multiple lines, then the line breaks between those lines are for
>     editorial clarity only and are not part of the actual protocol
>     exchange.
> 
> 3.  FETCH Data Item
> 
> 3.1.  Command
> 
>     To retrieve a snippet for a message, the "SNIPPET" FETCH attribute is
>     used when issuing a FETCH command.
> 
>     If no algorithm identifier is provided, the server decides which of
>     its built-in algorithms to use to generate the snippet text.
> 
>     Alternatively, the client may explicitly indicate which algorithm
>     should be used in a parenthesized list after the SNIPPET attribute
>     containing the name of the algorithm.  This algorithm MUST be one of
>     the algorithms identified as supported in the SNIPPET capability
>     responses.  The server SHOULD use this algorithm to generate the
>     snippet.  If a client requests an algorithm that is unsupported,
>     the server MUST return a tagged BAD response.
> 
>     A client SHOULD NOT issue more than one SNIPPET attribute per FETCH
>     command.  If more than one SNIPPET attribute is present in a FETCH
>     command, the server SHOULD use the last attribute seen in the
>     command.
> 
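
The server-side rules in 3.1 might be sketched as follows. The names here are
hypothetical. The sketch also assumes that every explicitly named algorithm is
validated, even when a later SNIPPET attribute overrides it, and that the
server's own default is FUZZY; the draft leaves both points open.

```python
class BadCommand(Exception):
    """Hypothetical signal that the caller should send a tagged BAD response."""


def resolve_algorithm(requested, supported, default="FUZZY"):
    """Pick the algorithm for one FETCH command.

    requested: one entry per SNIPPET attribute in the command, in order;
               None means the attribute carried no algorithm name.
    supported: set of upper-case algorithm names the server advertises.
    """
    if not requested:
        raise ValueError("FETCH command carries no SNIPPET attribute")
    names = [None if r is None else r.upper() for r in requested]
    for name in names:
        if name is not None and name not in supported:
            raise BadCommand(f"unsupported snippet algorithm: {name}")
    chosen = names[-1]  # the last attribute seen wins
    return default if chosen is None else chosen
```
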
> 3.2.  Response
> 
>     The server returns a variable-length string that is the generated
>     snippet for that message.  This string MUST NOT be content transfer
>     encoded and MUST be encoded in UTF-8.  The snippet text MUST be
>     treated as text/plain MIME data by the client.
> 
> 4.  SNIPPET Algorithms
> 
> 4.1.  FUZZY
> 
>     The FUZZY algorithm directs the server to use any internal algorithm
>     it desires, subject to the below limitations, to generate the snippet
>     for a message.
> 
>     The server SHOULD limit the length of the snippet text to 150 octets.
>     The server MUST NOT output snippet text longer than 250 octets.
> 
>     The server SHOULD remove any formatting markup that exists in the
>     original text.
> 
>     The FUZZY algorithm MUST be implemented by any server that supports
>     the SNIPPET extension.
> 
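
One way a server might honor the octet limits in 4.1 without splitting a
multi-byte UTF-8 character is sketched below. The function names are
hypothetical, and the regex-based markup removal is a deliberately naive
stand-in for whatever text extraction a real server would use.

```python
import re

SOFT_LIMIT = 150  # SHOULD limit from section 4.1
HARD_LIMIT = 250  # MUST NOT exceed, from section 4.1


def truncate_utf8(text, max_octets=SOFT_LIMIT):
    """Cut text so its UTF-8 encoding fits in max_octets on a character boundary."""
    data = text.encode("utf-8")
    if len(data) <= max_octets:
        return text
    # errors="ignore" drops a trailing, incomplete multi-byte sequence.
    return data[:max_octets].decode("utf-8", errors="ignore")


def fuzzy_snippet(body_text):
    """Naive FUZZY sketch: drop tag-like markup, collapse whitespace, truncate."""
    plain = re.sub(r"<[^>]+>", " ", body_text)
    plain = " ".join(plain.split())
    snippet = truncate_utf8(plain, SOFT_LIMIT)
    assert len(snippet.encode("utf-8")) <= HARD_LIMIT
    return snippet
```

Truncating by octets rather than characters matters because the limits are
stated in octets while the snippet itself must be valid UTF-8.
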
> 5.  Examples
> 
>     Example 1: Requesting FETCH without explicit algorithm selection
> 
>          C: A1 CAPABILITY
>          S: * CAPABILITY IMAP4rev1 SNIPPET=FUZZY
>          S: A1 OK Capability command completed.
>          C: A2 FETCH 1 (RFC822.SIZE SNIPPET)
>          S: * 1 FETCH (RFC822.SIZE 20000 SNIPPET {61}
>          S: This is the first line of text from the first text part.
>          S: )
>          S: A2 OK FETCH complete.
> 
>     Example 2: Requesting FETCH with explicit algorithm selection
> 
>            C: A1 CAPABILITY
>            S: * CAPABILITY IMAP4rev1 SNIPPET=FUZZY
>            S: A1 OK Capability command completed.
>            C: A2 FETCH 1 (RFC822.SIZE SNIPPET (FUZZY))
>            S: * 1 FETCH (RFC822.SIZE 20000 SNIPPET {61}
>            S: This is the first line of text from the first text part.
>            S: )
>            S: A2 OK FETCH complete.
> 
>     Example 3: Requesting FETCH with invalid explicit algorithm selection
> 
>            C: A1 CAPABILITY
>            S: * CAPABILITY IMAP4rev1 SNIPPET=FUZZY
>            S: A1 OK Capability command completed.
>            C: A2 FETCH 1 (RFC822.SIZE SNIPPET (X-SNIPPET-ALGO))
>            S: A2 BAD FETCH contains invalid snippet algorithm name.
> 
> 6.  Formal Syntax
> 
>     The following syntax specification uses the augmented Backus-Naur
>     Form (BNF) as described in ABNF [RFC5234].  It includes definitions
>     from IMAP [RFC3501].
> 
>         capability      =/ "SNIPPET=FUZZY"
> 
>         fetch-att       =/ "SNIPPET" *(SP "(" snippet-alg ")")
> 
>         msg-att-static  =/ "SNIPPET" SP nstring
> 
>         snippet-alg     =  "FUZZY" / snippet-alg-ext
> 
>         snippet-alg-ext =  atom
>                            ; New algorithms MUST be registered with IANA
> 
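
For checking request syntax against the fetch-att rule as written above, a
regular expression is enough. This is a sketch with hypothetical names; the
atom character set follows the RFC 3501 definition (any CHAR except
atom-specials).

```python
import re

# RFC 3501 atom-specials: "(" ")" "{" SP CTL "%" "*" DQUOTE "\" "]"
ATOM_SPECIALS = set('(){ %*"\\]')
ATOM_CHARS = "".join(
    chr(c) for c in range(0x21, 0x7F) if chr(c) not in ATOM_SPECIALS
)
ATOM = "[" + re.escape(ATOM_CHARS) + "]+"

# fetch-att =/ "SNIPPET" *(SP "(" snippet-alg ")")
# ABNF string literals are case-insensitive, hence re.IGNORECASE.
FETCH_ATT_RE = re.compile(r"SNIPPET(?: \(" + ATOM + r"\))*", re.IGNORECASE)


def is_valid_snippet_fetch_att(text):
    """Return True if text matches the SNIPPET fetch-att production."""
    return FETCH_ATT_RE.fullmatch(text) is not None
```

Note that, as written, the "*" repetition in the rule also accepts several
parenthesized groups in a row, for example "SNIPPET (FUZZY) (FUZZY)".
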
> 7.  TODO
> 
>     1.  More explicit algorithm for text/plain processing?
> 
>     2.  Interaction with CONDSTORE (MODSEQs)?
> 
>     3.  Allow algorithms to return non-text/plain data?
> 
> 8.  Acknowledgements
> 
>     TODO
> 
> 9.  IANA Considerations
> 
>     IMAP4 [RFC3501] capabilities are registered by publishing a standards
>     track or IESG-approved experimental RFC.  The registry is currently
>     located at:
> 
>        http://www.iana.org/assignments/imap-capabilities
> 
>     This document requests that IANA add the "SNIPPET" capability to the
>     IMAP4 [RFC3501] capabilities registry.
> 
>     This document also requests that IANA create a new IMAP4 [RFC3501]
>     snippet algorithms registry, in which snippet algorithms are
>     registered by publishing a standards track or IESG-approved
>     experimental RFC.  This document constitutes registration of the
>     FUZZY algorithm in that registry.
> 
> 10.  Security Considerations
> 
>     TODO; See RFC 3552
> 
> 11.  References
> 
> 11.1.  Normative References
> 
>     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
>                Requirement Levels", BCP 14, RFC 2119, March 1997.
> 
>     [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
>                4rev1", RFC 3501, March 2003.
> 
>     [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
>                Specifications: ABNF", STD 68, RFC 5234, January 2008.
> 
> 11.2.  Informative References
> 
>     [RFC2045]  Freed, N. and N.S. Borenstein, "Multipurpose Internet Mail
>                Extensions (MIME) Part One: Format of Internet Message
>                Bodies", RFC 2045, November 1996.
> 
> Author's Address
> 
>     Michael Slusarz, editor
>     Dovecot
>     Denver, Colorado
>     US
> 
>     Email: michael.slusarz@dovecot.fi
> 
> 
> 
> ___________________________________
> Michael Slusarz [slusarz@horde.org]