[Extra] Search in IMAP4rev2 (was Re: draft-ietf-extra-imap4rev2-04 review)

"Chris Newman" <chris.newman@oracle.com> Tue, 26 March 2019 08:54 UTC

From: "Chris Newman" <chris.newman@oracle.com>
To: "Bron Gondwana" <brong@fastmailteam.com>
Cc: extra@ietf.org
Date: Tue, 26 Mar 2019 09:54:43 +0100
Message-ID: <FD8F3284-0072-4244-AC4D-53D60C867159@oracle.com>
In-Reply-To: <aa824b61-e137-491e-b3ee-6ff8b92f35a0@www.fastmail.com>
References: <44234C00-7A5D-4B41-9E85-4CF839B48214@iki.fi> <aa824b61-e137-491e-b3ee-6ff8b92f35a0@www.fastmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/extra/ZGTij0q6vbTrFbdK9ZaO20E61rs>

On 25 Mar 2019, at 18:14, Bron Gondwana wrote:
>>> In all search keys that use strings, a message matches the key if the
>>> string is a substring of the associated text.
>>> 11. Allow word-based searching (as per Chris Newman)?
>
> There's already FUZZY.

The issue is that when a server deployment becomes large and uses an
indexed search technology (e.g., Solr, Elasticsearch, Lucene), the server
cannot (and does not) perform body searches that comply with the IMAP
specification. So I'd like the IMAP4rev2 search specification to be
changed to reflect the reality that searches do not always match on pure
substrings, especially body searches. These technologies generally do
word-based searches and can do word-prefix searches reasonably
efficiently, but can't do pure substring searches efficiently at scale.
Also, these technologies are often configured so that stop-words (e.g.,
and, or, he, she) don't match.
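
To make the difference concrete, here is a minimal Python sketch. It is an illustration only and does not reflect how Solr, Elasticsearch, or Lucene are implemented; the stop-word list, the tokenizer, and all function names are hypothetical. It contrasts the substring semantics quoted above with a word-prefix lookup against an index that drops stop-words.

```python
import re

# Hypothetical stop-word list; real deployments configure their own.
STOP_WORDS = {"and", "or", "he", "she"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def substring_match(key, text):
    """Substring semantics: the key may occur anywhere in the text."""
    return key.lower() in text.lower()


def build_index(text):
    """Return the set of indexed words, with stop-words dropped."""
    return {w for w in tokenize(text) if w not in STOP_WORDS}


def word_prefix_match(key, index):
    """Every non-stop word of the key must be a prefix of an indexed word."""
    terms = [w for w in tokenize(key) if w not in STOP_WORDS]
    # A key made only of stop-words has nothing to look up, so it cannot match.
    return bool(terms) and all(
        any(word.startswith(term) for word in index) for term in terms
    )


body = "He forwarded the attachment and she replied"
index = build_index(body)

for key in ("attach", "tachment", "she"):
    print(key, substring_match(key, body), word_prefix_match(key, index))
```

In this sketch all three keys match under substring semantics, but only "attach" matches against the index: "tachment" is not a word prefix, and "she" is a stop-word that was never indexed.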

The function FUZZY performs is different and is an orthogonal issue.

		- Chris