a plea for syntactic cleanliness

Lawrence Greenfield <leg+@andrew.cmu.edu> Mon, 02 December 2002 22:13 UTC

Date: Mon, 02 Dec 2002 17:13:06 -0500
Message-Id: <200212022213.gB2MD6Q9014762@smtp6.andrew.cmu.edu>
From: Lawrence Greenfield <leg+@andrew.cmu.edu>
X-Mailer: BatIMail version 3.3
To: ietf-imapext@imc.org
Subject: a plea for syntactic cleanliness
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (Unebigoryōmae) Emacs/21.2 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)
MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset="US-ASCII"
Sender: owner-ietf-imapext@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-imapext/mail-archive/>
List-ID: <ietf-imapext.imc.org>
List-Unsubscribe: <mailto:ietf-imapext-request@imc.org?body=unsubscribe>

Ok, I was reviewing some parsing stuff in Cyrus, which has code to
support some of the ANNOTATE suggestions.

I'm getting increasingly frustrated by the extensions that are
defining strange strings. For instance, there is never any reason to
allow a quoted string but not allow a literal string. We have
LITERAL+; all servers that care about performance should implement it,
and it frees clients from having to figure out whether there are any
characters that need to be quoted.
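
To make the client-side difference concrete, here is a minimal sketch in
Python. The function names are hypothetical, and the quoting rules are my
reading of the base spec's 'quoted' production (no CR, LF, NUL, or 8-bit
bytes; backslash-escape DQUOTE and backslash). With a LITERAL+ literal the
client only has to count bytes.

```python
def needs_literal(data: bytes) -> bool:
    """True if data cannot be sent as a base-spec quoted string."""
    # Assumption: quoted strings may not carry NUL, CR, LF, or 8-bit bytes.
    return any(b in (0x00, 0x0A, 0x0D) or b > 0x7F for b in data)


def encode_quoted(data: bytes) -> bytes:
    """Encode as a quoted string; the client must inspect every byte."""
    if needs_literal(data):
        raise ValueError("data cannot be represented as a quoted string")
    # Escape backslash first, then DQUOTE.
    escaped = data.replace(b"\\", b"\\\\").replace(b'"', b'\\"')
    return b'"' + escaped + b'"'


def encode_literal_plus(data: bytes) -> bytes:
    """Encode as a non-synchronizing literal: {n+} CRLF, then n raw bytes."""
    if b"\x00" in data:
        raise ValueError("literals may not contain NUL")
    return b"{%d+}\r\n" % len(data) + data
```

For example, encode_literal_plus(b'a"b') sends the three bytes untouched
after a "{3+}" prefix, while encode_quoted(b'a"b') has to escape the DQUOTE.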

Looking at draft-ietf-imapext-annotate-05.txt, it defines some
protocol elements (like 'attrib' or 'entry') not in terms of base IMAP
strings, but rolls its own syntax.

In doing so, it does weird things:

   entry             = DQUOTE 1*atom-slash *("/" 1*atom-slash) DQUOTE
   atom-slash        = any utf8-char except "/"
   utf8-char         =  %x01-FF
                       ; any character, excluding NUL

Ok, so an entry could be

"\001\002////""/"\045"

Is this really what's intended?
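
A rough way to poke at the rule is to transcribe it into a byte-level
regular expression. This is only a sketch: the pattern below is my literal
transcription of the three ABNF lines quoted above, and the names ENTRY_05
and accepts_draft05_entry are hypothetical.

```python
import re

# entry      = DQUOTE 1*atom-slash *("/" 1*atom-slash) DQUOTE
# atom-slash = any utf8-char except "/"
# utf8-char  = %x01-FF
ENTRY_05 = re.compile(rb'"[^\x00/]+(?:/[^\x00/]+)*"')


def accepts_draft05_entry(raw: bytes) -> bool:
    """True if raw matches the 'entry' rule as transcribed above."""
    return ENTRY_05.fullmatch(raw) is not None


# Control characters are accepted inside the DQUOTEs.
print(accepts_draft05_entry(b'"\x01\x02"'))
# So is an unescaped DQUOTE, since DQUOTE is itself an atom-slash.
print(accepts_draft05_entry(b'"a"b/c"'))
```

Because DQUOTE is a legal atom-slash, a lexer cannot tell where the entry
ends without backtracking, which a base-spec quoted string never requires.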

Instead, 'entry = string', where string is well defined in the IMAP
base spec, would suffice. It is a semantic constraint that should
prevent a leading slash, NOT a syntactic one. Let's not try to force
things into the ABNF that are really about the semantics.
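
A sketch of that split, in Python: lex the entry with the ordinary 'string'
rule (quoted or literal), then apply the constraints as a separate check.
Everything here is illustrative. parse_string is a simplified lexer, not
Cyrus code, and the rules in validate_entry (no leading slash, no empty
components) are my assumption about what the ABNF was trying to say.

```python
from typing import Tuple


def parse_string(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Parse one IMAP 'string' at buf[pos]; return (value, next position).

    Simplified: accepts any escaped byte in a quoted string and does not
    reject CR, LF, or 8-bit bytes there.
    """
    if buf[pos:pos + 1] == b'"':
        out = bytearray()
        i = pos + 1
        while i < len(buf):
            c = buf[i]
            if c == 0x5C:  # backslash: take the next byte verbatim
                if i + 1 >= len(buf):
                    break
                out.append(buf[i + 1])
                i += 2
            elif c == 0x22:  # closing DQUOTE
                return bytes(out), i + 1
            else:
                out.append(c)
                i += 1
        raise ValueError("unterminated quoted string")
    if buf[pos:pos + 1] == b"{":
        end = buf.index(b"}\r\n", pos)  # raises ValueError if absent
        count = buf[pos + 1:end]
        if count.endswith(b"+"):  # LITERAL+ non-synchronizing form
            count = count[:-1]
        n = int(count.decode("ascii"))
        start = end + 3
        if start + n > len(buf):
            raise ValueError("literal is truncated")
        return buf[start:start + n], start + n
    raise ValueError("expected a quoted string or a literal")


def validate_entry(name: bytes) -> None:
    """Semantic check, applied after lexing (assumed rules, see above)."""
    if name.startswith(b"/"):
        raise ValueError("entry must not start with a slash")
    if any(not part for part in name.split(b"/")):
        raise ValueError("entry has an empty component")
```

With this split the lexer stays the stock one, and a bad entry can be
answered with a meaningful error instead of a generic parse failure.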

Now, looking at draft-crispin-imapv-20, 'quoted' isn't allowed to
contain UTF-8 characters, meaning that literals are going to be used
more heavily. That really isn't a big deal.

ACL2 makes the same mistake. Please, use "string" or "nstring" in
extensions. Don't roll your own. Don't make us have to rewrite lexers
to deal with strange semantic constraints.

Larry