[imapext] IMAP SNIPPET extension (initial draft)

Michael M Slusarz <slusarz@horde.org> Mon, 04 May 2015 05:56 UTC


A little late, but here is my initial draft for SNIPPET, so that it
can be added to the WG.

Based on the conversation that occurred on this list a few months  
back, I made these editorial decisions:

   - Extend FETCH: several people said this would not be implemented  
if it used ANNOTATE/CONVERT
   - Allow multiple different algorithms

I think the main point of discussion going forward is whether to
explicitly define an algorithm for snippet generation (as is done for
threading) or to allow the default algorithm to be entirely
server-defined (as with FUZZY searching).  For now, I went with the
latter, and even named it FUZZY.

Another issue is the size of the returned snippet.  In this draft, it
is not client-configurable.  For the FUZZY algorithm, I used 150
octets as the SHOULD limit, and the snippet MUST NOT ever exceed 250
octets.  (Snippet data must be returned as UTF-8.)

Hopefully this is a decent enough starting point for continuing the
discussion.

michael


----------------


Internet Engineering Task Force                          M. Slusarz, Ed.
Internet-Draft                                                   Dovecot
Intended status: Standards Track                                May 2015
Expires: October 31, 2015

                   IMAP4 Extension: Snippet Generation
                          draft-imap-snippet-00

Abstract

    This document specifies an IMAP protocol extension which allows a
    client to request that a server provide an abbreviated representation
    of a message (a snippet of text) that can be used by a client to
    provide a useful contextual preview of the message contents.

Status of this Memo

    This Internet-Draft is submitted in full conformance with the
    provisions of BCP 78 and BCP 79.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF).  Note that other groups may also distribute
    working documents as Internet-Drafts.  The list of current Internet-
    Drafts is at http://datatracker.ietf.org/drafts/current/.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time.  It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    This Internet-Draft will expire on October 31, 2015.

Copyright Notice

    Copyright (c) 2015 IETF Trust and the persons identified as the
    document authors.  All rights reserved.

    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents (http://trustee.ietf.org/
    license-info) in effect on the date of publication of this document.
    Please review these documents carefully, as they describe your rights
    and restrictions with respect to this document.  Code Components
    extracted from this document must include Simplified BSD License text
    as described in Section 4.e of the Trust Legal Provisions and are
    provided without warranty as described in the Simplified BSD License.

Table of Contents

    1.  Introduction
    2.  Conventions Used In This Document
    3.  FETCH Data Item
      3.1.  Command
      3.2.  Response
    4.  SNIPPET Algorithms
      4.1.  FUZZY
    5.  Examples
    6.  Formal Syntax
    7.  TODO
    8.  Acknowledgements
    9.  IANA Considerations
    10. Security Considerations
    11. References
      11.1.  Normative References
      11.2.  Informative References
    Author's Address

1.  Introduction

    Many modern mail clients display small extracts of the body text as
    an aid to allow a user to quickly decide whether they are interested
    in viewing the full message contents.  Mail clients implementing the
    Internet Message Access Protocol (IMAP; RFC 3501 [RFC3501]) would
    benefit from a standardized, consistent way to generate these brief
    previews of messages (a "snippet").

    Generation of snippets on the server has additional benefits.
    First, it allows snippets to be cached in a centralized location.
    Mail accounts are often accessed by multiple different clients, and
    many of these clients lack support for client-side caching.  With
    server-generated snippets, a snippet can be generated once and then
    cached indefinitely for use by all of those clients.

    Second, generation on the server is more efficient.  A client-based
    algorithm needs to issue, at a minimum, a FETCH BODYSTRUCTURE command
    in order to determine which MIME [RFC2045] body part should be
    displayed, followed by a further FETCH to retrieve the text of that
    part.

    A server that supports the SNIPPET extension indicates this with one
    or more capability names consisting of "SNIPPET=" followed by a
    supported snippet algorithm name as described in this document.  This
    provides for future upwards-compatible extensions and/or the ability
    to use locally-defined snippet algorithms.
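
    As an illustration only, a client might collect the advertised
    algorithms from a CAPABILITY response as sketched below.  The
    function name and the "X-LOCAL" algorithm are hypothetical and are
    not defined by this document; the comparison is case-insensitive
    because IMAP [RFC3501] treats capability names that way.

```python
def snippet_algorithms(capability_line):
    """Return the snippet algorithm names advertised in a CAPABILITY line.

    Hypothetical helper: it looks for tokens of the form "SNIPPET=<name>"
    and returns the uppercased names in the order they appear.
    """
    prefix = "SNIPPET="
    algorithms = []
    for token in capability_line.split():
        # Capability names are compared case-insensitively.
        if token.upper().startswith(prefix):
            algorithms.append(token[len(prefix):].upper())
    return algorithms


# "X-LOCAL" stands in for a locally-defined algorithm (an assumption).
advertised = snippet_algorithms(
    "* CAPABILITY IMAP4rev1 SNIPPET=FUZZY SNIPPET=X-LOCAL"
)
supports_snippet = bool(advertised)
```

    A server that advertises no "SNIPPET=" capability yields an empty
    list, which a client can treat as "extension not supported".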

2.  Conventions Used In This Document

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
    document are to be interpreted as described in RFC 2119 [RFC2119].

    "User" is used to refer to a human user, whereas "client" refers to
    the software being run by the user.

    In examples, "C:" and "S:" indicate lines sent by the client and
    server respectively.  If a single "C:" or "S:" label applies to
    multiple lines, then the line breaks between those lines are for
    editorial clarity only and are not part of the actual protocol
    exchange.

3.  FETCH Data Item

3.1.  Command

    To retrieve a snippet for a message, the "SNIPPET" FETCH attribute is
    used when issuing a FETCH command.

    If no algorithm identifier is provided, the server decides which of
    its built-in algorithms to use to generate the snippet text.

    Alternatively, the client may explicitly indicate which algorithm
    should be used by providing the name of the algorithm in a
    parenthesized list after the SNIPPET attribute.  This algorithm MUST
    be one of the algorithms identified as supported in the SNIPPET
    capability responses, and the server SHOULD use it to generate the
    snippet.  If a client requests an algorithm that is unsupported,
    the server MUST return a tagged BAD response.

    A client SHOULD NOT issue more than one SNIPPET attribute per FETCH
    command.  If more than one SNIPPET attribute is present in a FETCH
    command, the server SHOULD use the last attribute seen in the
    command.
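
    A minimal client-side sketch of the rules above, assuming the
    client already holds the list of advertised algorithm names.  The
    function name and its parameters are illustrative and not part of
    this specification.

```python
def build_snippet_fetch(tag, sequence_set, advertised, algorithm=None):
    """Build a FETCH command line that requests a single SNIPPET attribute.

    Hypothetical helper.  With algorithm=None the server chooses the
    algorithm; otherwise the name must be one the server advertised.
    """
    if algorithm is None:
        attribute = "SNIPPET"
    else:
        name = algorithm.upper()
        if name not in {a.upper() for a in advertised}:
            # Refuse locally rather than provoke a tagged BAD response.
            raise ValueError("algorithm not advertised: %s" % name)
        attribute = "SNIPPET (%s)" % name
    # Exactly one SNIPPET attribute is emitted per command.
    return "%s FETCH %s (%s)\r\n" % (tag, sequence_set, attribute)


implicit = build_snippet_fetch("A2", "1", ["FUZZY"])
explicit = build_snippet_fetch("A3", "1:5", ["FUZZY"], algorithm="fuzzy")
```

    Checking the algorithm before sending is a design choice of this
    sketch; a client could equally send the command and handle the BAD
    response.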

3.2.  Response

    The server returns a variable-length string that is the generated
    snippet for that message.  This string MUST NOT be content transfer
    encoded and MUST be encoded in UTF-8.  The snippet text MUST be
    treated as text/plain MIME data by the client.
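
    The following sketch shows one way a client could pull the snippet
    out of a raw FETCH response and decode it as UTF-8.  It is an
    assumption-laden illustration: it only handles the literal form
    ("{n}" followed by n octets), not quoted strings or NIL, and the
    function name is hypothetical.

```python
import re

# Matches the literal prefix of a SNIPPET item, e.g. b"SNIPPET {7}\r\n".
SNIPPET_LITERAL = re.compile(rb"SNIPPET \{(\d+)\}\r\n", re.IGNORECASE)


def extract_snippet(response):
    """Return the snippet text from raw FETCH response bytes, or None."""
    match = SNIPPET_LITERAL.search(response)
    if match is None:
        return None
    size = int(match.group(1))
    start = match.end()
    octets = response[start:start + size]
    if len(octets) < size:
        raise ValueError("response ends before the literal is complete")
    # Snippet data is UTF-8 and is not content transfer encoded.
    return octets.decode("utf-8")


raw = b"* 1 FETCH (SNIPPET {7}\r\nh\xc3\xa9llo!)\r\n"
text = extract_snippet(raw)
```

    Note that the literal size counts octets, not characters: the
    seven octets above decode to six characters.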

4.  SNIPPET Algorithms

4.1.  FUZZY

    The FUZZY algorithm directs the server to use any internal algorithm
    it desires to generate the snippet for a message, subject to the
    limitations below.

    The server SHOULD limit the length of the snippet text to 150 octets.
    The server MUST NOT output snippet text longer than 250 octets.

    The server SHOULD remove any formatting markup that exists in the
    original text.

    The FUZZY algorithm MUST be implemented by any server that supports
    the SNIPPET extension.
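
    Because FUZZY is server-defined, no particular implementation is
    implied.  Purely as a sketch, a server could strip markup and then
    truncate on a UTF-8 character boundary so that the octet limits in
    this section are respected.  The regular expression below is a
    crude, hypothetical stand-in for real markup removal, and all names
    are illustrative.

```python
import re

SHOULD_LIMIT = 150  # octets; the SHOULD limit for FUZZY
MUST_LIMIT = 250    # octets; FUZZY output MUST NOT be longer than this

# Crude tag matcher; a real server would likely use a proper HTML parser.
TAG = re.compile(r"<[^>]*>")


def fuzzy_snippet(text, limit=SHOULD_LIMIT):
    """Return UTF-8 snippet octets of at most `limit` octets."""
    if limit > MUST_LIMIT:
        raise ValueError("limit exceeds the FUZZY maximum")
    plain = TAG.sub(" ", text)
    plain = " ".join(plain.split())  # collapse runs of whitespace
    encoded = plain.encode("utf-8")
    if len(encoded) <= limit:
        return encoded
    # Slicing octets may split a multi-octet character; decoding with
    # errors="ignore" drops the incomplete trailing sequence.
    truncated = encoded[:limit].decode("utf-8", errors="ignore")
    return truncated.encode("utf-8")


short = fuzzy_snippet("<p>Hello <b>world</b></p>")
long_text = fuzzy_snippet("\u00e9" * 200, limit=151)
```

    Truncating by octets rather than characters keeps the result
    within the limits regardless of how many multi-octet characters the
    message contains.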

5.  Examples

    Example 1: Requesting FETCH without explicit algorithm selection

         C: A1 CAPABILITY
         S: * CAPABILITY IMAP4rev1 SNIPPET=FUZZY
         S: A1 OK Capability command completed.
         C: A2 FETCH 1 (RFC822.SIZE SNIPPET)
         S: * 1 FETCH (RFC822.SIZE 20000 SNIPPET {58}
         S: This is the first line of text from the first text part.
         S: )
         S: A2 OK FETCH complete.

    Example 2: Requesting FETCH with explicit algorithm selection

         C: A1 CAPABILITY
         S: * CAPABILITY IMAP4rev1 SNIPPET=FUZZY
         S: A1 OK Capability command completed.
         C: A2 FETCH 1 (RFC822.SIZE SNIPPET (FUZZY))
         S: * 1 FETCH (RFC822.SIZE 20000 SNIPPET {58}
         S: This is the first line of text from the first text part.
         S: )
         S: A2 OK FETCH complete.

    Example 3: Requesting FETCH with invalid explicit algorithm selection

         C: A1 CAPABILITY
         S: * CAPABILITY IMAP4rev1 SNIPPET=FUZZY
         S: A1 OK Capability command completed.
         C: A2 FETCH 1 (RFC822.SIZE SNIPPET (X-SNIPPET-ALGO))
         S: A2 BAD FETCH contains invalid snippet algorithm name.
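
    The server-side decision behind Examples 1 through 3 could be
    sketched as follows.  This is one possible reading of Section 3.1,
    not a required implementation: it rejects the command if any
    requested algorithm is unsupported and otherwise honors the last
    SNIPPET attribute.  The function name, the tuple return value, and
    the default choice are assumptions.

```python
SUPPORTED = ("FUZZY",)


def select_algorithm(requested, supported=SUPPORTED, default="FUZZY"):
    """Pick the algorithm for a FETCH command.

    `requested` holds one entry per SNIPPET attribute in the command,
    in order; None means the attribute named no algorithm.
    Returns ("OK", algorithm) or ("BAD", human-readable text).
    """
    if not requested:
        raise ValueError("no SNIPPET attribute in command")
    for name in requested:
        if name is not None and name.upper() not in supported:
            return ("BAD", "FETCH contains invalid snippet algorithm name.")
    last = requested[-1]  # the last attribute seen wins
    return ("OK", default if last is None else last.upper())


example_1 = select_algorithm([None])
example_2 = select_algorithm(["FUZZY"])
example_3 = select_algorithm(["X-SNIPPET-ALGO"])
```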

6.  Formal Syntax

    The following syntax specification uses the Augmented Backus-Naur
    Form (ABNF) as described in [RFC5234].  It includes definitions
    from IMAP [RFC3501].

        capability      =/ "SNIPPET=FUZZY"

        fetch-att       =/ "SNIPPET" [SP "(" snippet-alg ")"]

        msg-att-static  =/ "SNIPPET" SP nstring

        snippet-alg     =  "FUZZY" / snippet-alg-ext

        snippet-alg-ext =  atom
                           ; New algorithms MUST be registered with IANA
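
    Since snippet-alg-ext is an IMAP atom, an implementation may want
    to validate algorithm names before using them.  The check below is
    a sketch based on the atom definition in IMAP [RFC3501] (one or
    more characters in the range 0x01-0x7F, excluding controls, space,
    and the atom-specials); the function name is hypothetical.

```python
# atom-specials from RFC 3501: "(" ")" "{" SP CTL "%" "*" DQUOTE "\" "]"
ATOM_SPECIALS = set('(){ %*"\\]')


def is_atom(name):
    """Return True if `name` is a syntactically valid IMAP atom."""
    return bool(name) and all(
        # 0x21-0x7E excludes CTL (0x00-0x1F, 0x7F), SP, and non-ASCII.
        0x20 < ord(ch) < 0x7F and ch not in ATOM_SPECIALS
        for ch in name
    )


valid = [n for n in ("FUZZY", "X-SNIPPET-ALGO", "BAD NAME", "A(B)") if is_atom(n)]
```

    Syntactic validity is separate from support: a well-formed name
    that the server does not advertise is still rejected with BAD.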

7.  TODO

    1.  More explicit algorithm for text/plain processing?

    2.  Interaction with CONDSTORE (MODSEQs)?

    3.  Allow algorithms to return non-text/plain data?

8.  Acknowledgements

    TODO

9.  IANA Considerations

    IMAP4 [RFC3501] capabilities are registered by publishing a standards
    track or IESG-approved experimental RFC.  The registry is currently
    located at:

       http://www.iana.org/assignments/imap-capabilities

    This document requests that IANA add the "SNIPPET" capability to the
    IMAP4 [RFC3501] capabilities registry.

    This document also requests that IANA create a new IMAP4 [RFC3501]
    snippet algorithms registry, in which snippet algorithms are
    registered by publishing a standards track or IESG-approved
    experimental RFC.  This document constitutes registration of the
    FUZZY algorithm in that registry.

10.  Security Considerations

    TODO; See RFC 3552

11.  References

11.1.  Normative References

    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

    [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
               4rev1", RFC 3501, March 2003.

    [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
               Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informative References

    [RFC2045]  Freed, N. and N.S. Borenstein, "Multipurpose Internet Mail
               Extensions (MIME) Part One: Format of Internet Message
               Bodies", RFC 2045, November 1996.

Author's Address

    Michael Slusarz, editor
    Dovecot
    Denver, Colorado
    US

    Email: michael.slusarz@dovecot.fi



___________________________________
Michael Slusarz [slusarz@horde.org]