Re: [MORG] Review: draft-ietf-morg-fuzzy-search-02

Dave Cridland <dave.cridland@isode.com> Tue, 16 November 2010 16:20 UTC

References: <A199F09978CEADEC3697D0DA@dhcp-63f1.meeting.ietf.org> <AANLkTikpxj788+WaFstG_-9+Uu4rmuhzQSfFqUnR_psg@mail.gmail.com> <C80B2463-5A2E-465D-A800-79EA009131AB@iki.fi> <22FDB22B5203DF761DFD4F61@dhcp-63f1.meeting.ietf.org> <1289922768.1764.168.camel@kurkku.sapo.corppt.com>
In-Reply-To: <1289922768.1764.168.camel@kurkku.sapo.corppt.com>
Message-Id: <2850.1289924444.405478@puncture>
Date: Tue, 16 Nov 2010 16:20:44 +0000
From: Dave Cridland <dave.cridland@isode.com>
To: Timo Sirainen <tss@iki.fi>, Messaging Organization <morg@ietf.org>, Cyrus Daboo <cyrus@daboo.name>

On Tue Nov 16 15:52:48 2010, Timo Sirainen wrote:
> On Sun, 2010-08-01 at 14:58 +0200, Cyrus Daboo wrote:
> > > Yes, I think SORT (RELEVANCY) should be required for that.  RFC 5267
> > > isn't super clear on if SORT supports PARTIAL, but I guess it should
> > > according to ABNF.
> >
> > Can we get clarification on that? Even add an example to fuzzy search
> > showing a client finding the top-10 relevant items.
> 
> In my understanding it does, and Dovecot supports it, but I don't know
> if Dave or other implementers were thinking the same.. Dave?
> 
> 
That's basically what CONTEXT=SORT is for:

   Servers advertising CONTEXT=SORT also advertise the SORT capability,
   as described in [SORT], support the extended SORT command syntax
   described in Section 3, and accept three additional return options
   for this extended SORT.

   These additional return options allow for notifications of changes to
   the results of SEARCH or SORT commands, and also allow for access to
   partial results.

I don't think that's anything less than super-clear.

> I was planning on using this example:
> 
> C: D02 SORT RETURN (PARTIAL 1:10) (RELEVANCY) UTF-8 FUZZY TEXT "World"
> S: * ESEARCH (TAG "D02") PARTIAL (1:10 19,80,85:86,95,102:103,111,113:114)
> S: D02 OK Sort completed.

That's somewhat deceptive: the results happen to be in UID order, so it
could be misread.

What that should provide is the 10 most relevant matches, of course. In
this example those are also the first ten matching messages, so a reader
might assume there's some implicit "order by UID" going on. A
random.shuffle makes it clearer:

C: D02 SORT RETURN (PARTIAL 1:10) (RELEVANCY) UTF-8 FUZZY TEXT "World"
S: * ESEARCH (TAG "D02") PARTIAL (1:10 102,95,114,103,113,111,86,19,80,85)
S: D02 OK Sort completed.
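
For anyone who wants to check the ordering point mechanically, here is a
minimal Python sketch. It is only an illustration: expand_sequence_set
and partial are hypothetical helper names, not part of any IMAP library,
and the shuffle is a stand-in for a real relevancy ranking, which only
the server can compute.

```python
import random

def expand_sequence_set(spec):
    """Expand an IMAP-style sequence set such as "85:86,95" into a list of ints."""
    numbers = []
    for part in spec.split(","):
        if ":" in part:
            lo, hi = (int(x) for x in part.split(":"))
            numbers.extend(range(min(lo, hi), max(lo, hi) + 1))
        else:
            numbers.append(int(part))
    return numbers

def partial(results, first, last):
    """Return the 1-based, inclusive window first:last of an ordered result list."""
    return results[first - 1:last]

# Result list from the first example, in the order the server sent it.
uid_ordered = expand_sequence_set("19,80,85:86,95,102:103,111,113:114")

# Ascending order is what makes that example look like "order by UID".
looks_uid_sorted = uid_ordered == sorted(uid_ordered)

# Stand-in for a relevancy ranking (an assumption for this sketch):
# shuffle until the order is visibly not ascending.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
by_relevancy = uid_ordered[:]
while by_relevancy == sorted(by_relevancy):
    rng.shuffle(by_relevancy)

# PARTIAL 1:10 is a window over the sorted list, not over the mailbox.
top_ten = partial(by_relevancy, 1, 10)
```

The same ten messages come back either way; only the order tells the
client that the window was taken after sorting by relevancy.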

Dave.