Re: [MORG] Review: draft-ietf-morg-fuzzy-search-02

Timo Sirainen <tss@iki.fi> Fri, 20 August 2010 18:32 UTC


On Sun, 2010-08-01 at 14:56 +0200, Cyrus Daboo wrote:

> >> 6) Do clients need to care that the FUZZY search key is not
> >> distributive, e.g.:
> >>
> >> FUZZY OR SUBJECT x SUBJECT y != OR FUZZY SUBJECT x FUZZY SUBJECT y
> >
> > I agree with Barry, those are the same. Added example:
> 
> I am not totally convinced. My point is how a server deals with two FUZZYs:
> does it somehow treat them as one, or does it do the two separately (as
> per the right-hand side of the OR example) and then combine relevance in
> some fashion for the aggregate?

"Implementation-determined"? :)

> How does your server handle multiple FUZZYs?

I haven't actually implemented FUZZY yet, only the relevancy parts of it
as part of other non-standard search keys. Looking at the code, it
doesn't handle scores for multiple fuzzy searches well at all: it simply
drops the earlier searches' scores.
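
As a rough illustration of what "combine relevance in some fashion" could
mean, here is a minimal Python sketch. It is not taken from any server's
code. The function names, the max/min rules, and the score range (assumed
here to be integers from 1 to 100) are all assumptions made for the
example. Each sub-search is modelled as a dict mapping message number to
score.

```python
def combine_last_wins(score_maps):
    """Hypothetical 'drop the earlier scores' behaviour: a later
    sub-search's score overwrites an earlier one for the same message."""
    result = {}
    for scores in score_maps:
        result.update(scores)
    return result


def combine_or(score_maps):
    """OR: a message matches if any sub-search matched it.
    Assumed rule: keep the best score seen for each message."""
    result = {}
    for scores in score_maps:
        for msg, score in scores.items():
            result[msg] = max(result.get(msg, 0), score)
    return result


def combine_and(score_maps):
    """AND: a message must match every sub-search.
    Assumed rule: keep the weakest score for each surviving message."""
    if not score_maps:
        return {}
    common = set(score_maps[0])
    for scores in score_maps[1:]:
        common &= set(scores)
    return {msg: min(s[msg] for s in score_maps) for msg in common}


# Made-up scores for two fuzzy sub-searches.
subject_x = {2: 80, 5: 40}
subject_y = {5: 30, 7: 10}

print(combine_last_wins([subject_x, subject_y]))
print(combine_or([subject_x, subject_y]))
print(combine_and([subject_x, subject_y]))
```

With these inputs, message 5 ends up with a different score under the
"last wins" rule than under the OR rule, which is the kind of difference
a client sorting by relevancy would notice.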

> > FUZZY is applied to all search keys when parentheses or OR are used. For
> > example, in this search query everything is matched fuzzily:
> >
> >  C: A03 SEARCH FUZZY OR (SUBJECT work FROM user@example.com) SUBJECT home
> >  S: * SEARCH 2 5
> >  S: A03 OK Search completed.
> 
> Well, that automatically falls out of the syntax, since "OR" is itself a 
> search-key that is the argument to FUZZY.

So no point in having that example?
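
To make the "falls out from the syntax" point concrete, here is a small
Python sketch of a parser in which FUZZY takes exactly one search-key and
marks everything inside it. It is a toy under stated assumptions: it only
knows SUBJECT, FROM, OR, FUZZY and parentheses, it does not handle quoted
strings or literals, and all function names are made up for the example.

```python
import re

ONE_ARG_KEYS = {"SUBJECT", "FROM"}  # only the keys used in this thread


def tokenize(query):
    # Parentheses become their own tokens; everything else splits on whitespace.
    return re.findall(r"[()]|[^\s()]+", query)


def parse_key(tokens, pos, fuzzy=False):
    """Parse one search-key at tokens[pos]; return (tree, next_pos)."""
    tok = tokens[pos].upper()
    if tok == "FUZZY":
        # FUZZY takes a single search-key; the flag is passed down into it.
        return parse_key(tokens, pos + 1, fuzzy=True)
    if tok == "OR":
        left, pos = parse_key(tokens, pos + 1, fuzzy)
        right, pos = parse_key(tokens, pos, fuzzy)
        return ("OR", left, right), pos
    if tok == "(":
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = parse_key(tokens, pos, fuzzy)
            items.append(item)
        return ("AND", *items), pos + 1
    if tok in ONE_ARG_KEYS:
        return (tok, tokens[pos + 1], fuzzy), pos + 2
    raise ValueError(f"unsupported search key: {tokens[pos]}")


def parse(query):
    tokens = tokenize(query)
    tree, pos = parse_key(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after search key")
    return tree


a = parse("FUZZY OR SUBJECT x SUBJECT y")
b = parse("OR FUZZY SUBJECT x FUZZY SUBJECT y")
print(a == b)
print(parse("FUZZY OR (SUBJECT work FROM user@example.com) SUBJECT home"))
```

In this model both spellings produce the same tree of fuzzy leaf keys, so
they select the same messages. The sketch says nothing about how relevancy
scores for the two forms are aggregated, which is the part that remains
implementation-determined.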