Re: [imapext] draft-ietf-appsawg-multimailbox-search

Bron Gondwana <brong@fastmail.fm> Fri, 07 March 2014 00:37 UTC

From: Bron Gondwana <brong@fastmail.fm>
To: imapext@ietf.org
Date: Fri, 07 Mar 2014 11:37:49 +1100
Archived-At: http://mailarchive.ietf.org/arch/msg/imapext/WeOxVHf5szGvl-yTfe-JrmTglFE

On Fri, Mar 7, 2014, at 03:29 AM, Chris Newman wrote:
> --On March 6, 2014 9:30:17 -0500 Cyrus Daboo <cyrus@daboo.name> wrote:
> > --On March 6, 2014 at 9:57:53 PM +1100 Bron Gondwana <brong@fastmail.fm> wrote:
> >> Anyway - my main problem with it is that users don't just want search -
> >> they want sort.  Running the search across multiple mailboxes means you
> >> still need to fetch all the data for all the matches before you can get a
> >> meaningful sort by any parameter other than mailbox name as the first
> >> sort index.  And that can be a lot of download to the client if you
> >> searched a common term.]
> >
> > On Bron's point - yes what is really needed is an extension that does
> > away with the single selected mailbox model and allows searching,
> > sorting, sync'ing, message access across all mailboxes all the time.
> 
> I have no interest in implementing such an extension at this time. I'm 
> going to assert there is not enough interest in such a proposal to get a 
> specification written, thus making it not relevant to this discussion. If 
> someone were to prove my assertion wrong, then we could have an interesting 
> discussion about whether one or both proposals should go to standards track.

You're asserting this based on a customer you can't name - as opposed
to my customers (well, I can't really name the actual customers either)... but
the ability to sort by date overall rather than within each mailbox was a commonly
requested search feature at FastMail - and we now support it.

We definitely don't have anyone asking us to go back to sorting by mailbox
only - particularly since mailbox name is still a sort key that you can use
in your sort if you want to.

> For just searching, the current MULTISEARCH design has significant benefits 
> for both client and server. It produces concise results (a UID set for each 
> mailbox rather than list of UIDs), and the straightforward implementation 
> will return results as they are found on a per-mailbox basis in a way that 
> clients can display the results as they are found. The straightforward 
> implementation of MULTISORT/ALLMSGS has to compute all the results before 
> sending anything to the client. There's value to having these be separate 
> extensions even if sufficient interest to write a MULTISORT/ALLMSGS 
> extension materializes in the future.

Our result set is still quite concise - though it's tuples:

3 xconvmultisort (reverse arrival) (conversations position (1 30)) utf-8 FUZZY TEXT "chris.newman" not folder "Inbox.Trash" not folder "Inbox.Junk Mail"
* XCONVMULTI (("INBOX" 1148523981) ("INBOX.Archive.Inbox-2011" 1298369152) ("INBOX.Drafts" 1148523983) ("INBOX.Lists.Cyrus" 1193627117) ("INBOX.Lists.Imap5" 1217730348) ("INBOX.Lists.ImapExt" 1198790887) ("INBOX.Lists.fm-cvs" 1148523992) ("INBOX.Sent Items" 1148523995)) ((75e35b068901c51b (2 20195) (5 5210) (5 5209) (5 5207) (5 5206) (5 5205) (5 5203) (5 5202) (5 5201) (5 5199) (5 5198) (7 13690) (5 5197)) (22e8c6bc0c999471 (5 5106) (0 946541) (5 5104) (7 13234) (5 5102)) (35f42610ba6324f3 (5 5079)) (42599c237f434f9f (5 5078) (5 5077) (5 5076) (5 5074) (5 5073) (5 5072) (5 5071)) (94643209768ba85c (5 5070) (5 5069) (5 5068)) (a85bf44cea5ba267 (6 33423)) (8980e46f8cc9617e (6 33248)) (7bec218f48a35592 (6 30446)) (f6375654cf52cb5e (6 28692)) (0cc65f89dc003eb7 (7 9754)) (df7c52435658163e (1 5246) (1 4990) (1 4987) (1 4967)) (14744ad604794672 (1 3190) (4 201) (7 8057) (1 3183)) (77cb8893eeb61bb4 (3 13491)) (c613c22cd60fc734 (4 54) (4 53) (4 51) (4 47)) (cd0f24d8cd2da0e3 (4 50) (4 49) (4 48) (4 46)) (8c1c6d50a25d7d75 (4 41)) (aa0a6113c53456ea (5 4315)) (da7343dea9a2dd92 (5 4308) (5 4303)) (7f0aa97dba576376 (5 4286)) (86631b9c7fd03ba8 (5 4269)) (1dc2b83458b9edb4 (5 4267)) (764d53b27438df5d (5 4261)) (4657718a320253b3 (5 1023)) (8216063d1d06d1a2 (5 3801) (5 1857) (5 1649) (5 3605) (5 987) (5 1453) (5 2782) (5 3152) (5 622) (5 707) (5 2587) (5 2670) (5 427) (5 2393) (5 3273) (5 1109) (5 511) (5 1395) (5 3359) (5 1201) (5 3065) (5 897)) (885c99dc1324a315 (5 3977) (5 1218) (5 3172) (5 1006) (5 2960)) (86e821cd553f0665 (3 2228)) (40dec021413028d4 (5 4046)) (d2a6ca5cfc0cd33a (5 4098) (5 1939) (5 1280) (5 2153) (5 3885) (5 1726) (5 3542) (5 1389)) (5e0e810635ad3ebf (5 793) (5 1666) (5 3624) (5 1470)) (51cbbb00a51db66d (5 1383) (5 3348) (5 2384) (5 225) (5 1102) (5 3057) (5 672) (5 1558)))
* OK [POSITION 1]
* OK [HIGHESTMODSEQ 295339132831582]
* OK [TOTAL 109]
3 OK Completed (in 0.110 secs)

Which breaks down to:

a list of folder names and uidvalidities.

a list of the first 30 matching conversations, each of which is (CID (tuple) (tuple) ...)

each tuple is a folder index (zero-based, into the list above) and a UID within that folder.  This is used to highlight the matching messages within the conversation when it is opened.
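
To make the breakdown concrete, here is a minimal Python sketch of how a client might decode one XCONVMULTI line into that structure.  The function names (parse_sexp, decode_xconvmulti) are made up for illustration and are not part of any real client or of Cyrus; the sketch assumes the whole response arrives on one line, that folder names are quoted strings or atoms (no IMAP literals, no escape handling), and that folder indexes are zero-based as they appear to be in the transcript.

```python
import re

# Tokens: "(", ")", a double-quoted string, or a bare atom.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()]+')


def parse_sexp(text):
    """Parse parenthesised IMAP-style data into nested Python lists."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            finished = stack.pop()
            stack[-1].append(finished)
        elif tok.startswith('"'):
            stack[-1].append(tok[1:-1])  # strip quotes; escapes not handled
        else:
            stack[-1].append(tok)
    return stack[0]


def decode_xconvmulti(line):
    """Return (folders, conversations) from one '* XCONVMULTI ...' line.

    folders:       [(name, uidvalidity), ...]
    conversations: [(cid, [(folder_name, uid), ...]), ...]
    """
    prefix = "* XCONVMULTI "
    folders_raw, convs_raw = parse_sexp(line[len(prefix):])
    folders = [(name, int(uidvalidity)) for name, uidvalidity in folders_raw]
    conversations = []
    for cid, *hits in convs_raw:
        # Each hit is (folder index, UID); resolve the index to a name.
        matches = [(folders[int(idx)][0], int(uid)) for idx, uid in hits]
        conversations.append((cid, matches))
    return folders, conversations


# Trimmed excerpt of the response above: all folders, two conversations.
sample = (
    '* XCONVMULTI (("INBOX" 1148523981) ("INBOX.Archive.Inbox-2011" 1298369152) '
    '("INBOX.Drafts" 1148523983) ("INBOX.Lists.Cyrus" 1193627117) '
    '("INBOX.Lists.Imap5" 1217730348) ("INBOX.Lists.ImapExt" 1198790887) '
    '("INBOX.Lists.fm-cvs" 1148523992) ("INBOX.Sent Items" 1148523995)) '
    '((22e8c6bc0c999471 (5 5106) (0 946541) (5 5104) (7 13234) (5 5102)) '
    '(35f42610ba6324f3 (5 5079)))'
)
folders, conversations = decode_xconvmulti(sample)
for cid, matches in conversations:
    print(cid, matches)
```

Resolving the index at decode time keeps the wire format compact (folder names are sent once) while giving the UI something it can use directly.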

... and if you scroll past that first 30 item window, you can fetch more.  Obviously, there's no need to work in small windows - it's just an option that we use in our API to avoid getting unbounded amounts of data returned to the client.

4 xconvmultisort (reverse arrival) (conversations multianchor (1383 Inbox.Lists.ImapExt 1 30)) utf-8 FUZZY TEXT "chris.newman" not folder "Inbox.Trash" not folder "Inbox.Junk Mail"
* XCONVMULTI (("INBOX" 1148523981) ("INBOX.Archive.Inbox-2011" 1298369152) ("INBOX.Drafts" 1148523983) ("INBOX.Lists.Cyrus" 1193627117) ("INBOX.Lists.Imap5" 1217730348) ("INBOX.Lists.ImapExt" 1198790887) ("INBOX.Lists.fm-cvs" 1148523992) ("INBOX.Sent Items" 1148523995)) ((a7c22d265a01ae47 (5 1808) (5 2682)) (0e7f42859d14064b (5 522)) (18683122949a755f (5 2482)) (839b47557488a2f0 (5 2008) (5 2282) (5 111)) (17d7c11d5c313214 (5 3578) (5 650)) (ddf8dad81171c58f (5 4005)) (dd20935a4d7b2534 (5 1476)) (b4bbc805e5d88997 (5 811)) (d4b09f52f1551e51 (5 1955)) (9d1382e572868684 (5 4025) (5 1865)) (0285ab40d0125d6e (5 3534)) (b84d7c33ee0ce08b (5 2772) (5 612)) (2faf629f598823c9 (5 2873)) (16c0037968ce0d5d (5 1393) (5 3357)) (320995b4ec6d6c99 (5 1741)) (52f526d95c7a85f5 (5 1459) (5 3419)) (a45377efe133440c (5 3500)) (682cd6d68615f357 (5 1997)) (cd77fa3872d98167 (5 501) (5 2661) (5 2463) (5 302)) (f6f2cd14f1e19e6f (5 2926) (5 1190) (5 3141) (5 975) (5 753) (5 2121)) (f7ff660331ebbe9d (5 3053)) (d01b067ac95c0f9d (5 1696)) (24ce7fe1b6913094 (5 1621) (5 3577)) (5b5bbd4ce7ce019a (5 3497)) (472840f7906f56a2 (5 1136) (5 1797)) (aa51277478e5d2a2 (5 3002) (5 3866)) (dfcd789dbf52d20d (5 3745) (5 2070) (5 2728) (5 570) (5 2530)) (780c5432ead295aa (5 1710) (5 3665) (5 1512) (5 3469) (5 239)) (d9a39a69640a0151 (5 873)) (d70e71aab74630dd (5 2821)))
* OK [POSITION 31]
* OK [HIGHESTMODSEQ 295339132831585]
* OK [TOTAL 109]
4 OK Completed (in 0.070 secs)
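
Comparing the two transcripts, the anchor in command 4 (UID 1383 in Inbox.Lists.ImapExt) is the first tuple of the last conversation returned by command 3, and the reply starts at POSITION 31.  The Python sketch below shows how a client could derive the next window's anchor on that basis.  It rests on an inferred reading of the multianchor arguments as (uid folder offset limit); that reading comes only from this transcript, not from any specification, and next_window_anchor is a hypothetical helper name.

```python
def next_window_anchor(folder_names, last_conversation, limit=30):
    """Build the multianchor argument list for the following window.

    ASSUMPTION (inferred from the transcript, not from a spec): the
    argument layout is (uid folder offset limit), anchored on the first
    tuple of the last conversation in the previous window, with offset 1
    meaning "start just after the anchor".
    """
    _cid, hits = last_conversation
    folder_index, uid = hits[0]
    name = folder_names[folder_index]
    if any(ch.isspace() for ch in name):
        name = f'"{name}"'  # quote names such as "INBOX.Sent Items"
    return f"({uid} {name} 1 {limit})"


# Folder list and last conversation taken from the response to command 3.
folder_names = [
    "INBOX", "INBOX.Archive.Inbox-2011", "INBOX.Drafts", "INBOX.Lists.Cyrus",
    "INBOX.Lists.Imap5", "INBOX.Lists.ImapExt", "INBOX.Lists.fm-cvs",
    "INBOX.Sent Items",
]
last = ("51cbbb00a51db66d", [(5, 1383), (5, 3348), (5, 2384)])
print(next_window_anchor(folder_names, last))
```

Anchoring on a message rather than a numeric position means the next window stays aligned even if new mail arrives between requests, which is presumably why both forms exist.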


... this is a search run on our production system against my folders, using the Xapian FUZZY indexed full-message search over what expands to about 6 different backend indexes spread over slow disk (compressed long-term index), slow disk (weekly compressed index), SSD (daily index updates) and finally tmpfs (real-time indexing of new messages).

So this is working now... it's insanely non-standard, but it's a much nicer user experience than the alternatives - including multisearch.

It gives me a list of conversations ordered by most recent message.

If you weren't doing conversations, the syntax would simplify to just a list of folder names that matched (with uidvalidity) and then a list of tuples of (index uid) mapping into that folder listing.
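
As a rough illustration of that simplified shape, the Python sketch below resolves a flat list of (index uid) tuples against the folder listing.  No such non-conversation response appears in the transcript, so the layout here is hypothetical; MessageRef and resolve are made-up names, and the sample values are borrowed from the first conversation above.

```python
from typing import NamedTuple


class MessageRef(NamedTuple):
    folder: str
    uidvalidity: int
    uid: int


def resolve(folders, hits):
    """Map (folder index, uid) tuples onto the folder listing.

    folders: [(name, uidvalidity), ...] in the order the server sent them
    hits:    [(index, uid), ...] in sort order, index being zero-based
    """
    return [MessageRef(*folders[index], uid) for index, uid in hits]


# Hypothetical flat result reusing values from the transcript.
folders = [
    ("INBOX", 1148523981),
    ("INBOX.Archive.Inbox-2011", 1298369152),
    ("INBOX.Drafts", 1148523983),
]
hits = [(2, 20195), (0, 946541)]
for ref in resolve(folders, hits):
    print(ref)
```

The sort order lives entirely in the order of the hit list, so the folder listing can stay a plain lookup table.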

Bron.

-- 
  Bron Gondwana
  brong@fastmail.fm