Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-sieve)

Ned Freed <ned.freed@mrochek.com> Fri, 23 May 2008 22:16 UTC

Date: Fri, 23 May 2008 08:40:57 -0700
From: Ned Freed <ned.freed@mrochek.com>
Subject: Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-sieve)
In-reply-to: "Your message dated Fri, 23 May 2008 17:14:51 +0400" <4836C34B.1030700@isode.com>
To: Alexey Melnikov <alexey.melnikov@isode.com>
Cc: ietf-mta-filters@imc.org
Message-id: <01MV4RRQJI7Q00007A@mauve.mrochek.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset="ISO-8859-1"; format="flowed"
Content-transfer-encoding: 7bit
References: <480F17C6.6040404@isode.com> <4836C34B.1030700@isode.com>
Sender: owner-ietf-mta-filters@mail.imc.org
Precedence: bulk

> Alexey Melnikov wrote:
> > While this is a Lemonade WG document, this document is of relevance to
> > the Sieve WG mailing list and should be reviewed here.
> >
> > So please review
> > <http://www.ietf.org/internet-drafts/draft-ietf-lemonade-imap-sieve-05.txt>
> > before May 5th.

> I haven't seen any reviews of this. Can you please send me a quick "I've
> reviewed it and it looks fine" or "this needs more work" to me, if
> you've done a review.

I finally had a chance to look at this. Having done so, the question I really
want to ask is has anyone implemented this, and if they  did, how well did it
work? And if the answer to that is "no implementations yet", I'd then like to
hear if anyone is planning to implement this, and if so, when and for what
puropse?

The general problem I see here is that Sieve is specified to operate in a
fairly narrow context: At or around the time of final delivery. This has
influenced Sieve design choices in subtle ways and, now that we have such an
extensive body of work on Sieve-related stuff, tracking down all of the
implicit assumptions of context and reworking them so that Sieve can operate in
the context of the message store is very difficult. There are almost certainly
going to be things we miss or get wrong, and I think implementation experience
is about the only way we're going to find all the impedance mismatches.

Having said all this, I don't know how it translates into a plan of action for
the document. The obvious thing is to do is approve it as experimental, but
this may have the opposite of the indended effect - it may cause the
specification to be ignored because it is "not ready' for standardization.
OTOH, if it approved as proposed but then requires substantive changes, we
could easily end up in a situation with large differences between
implementations. That's not good either.

Now, obviously if someone has implemented this without difficulty then these
concerns are baseless. But if they haven't, I'd like to know to what extent
there are plans to code this, so we have some idea of how to proceed.

Moving on to specifics... Overall the document seems basically OK. I don't see
fundamental problems that would invalidate the entire approach. That said,
there are quite a few details I think need a bit of work.

First up is the interactions with annotations. I can see the value of being
able to get at annotations through Sieve. The trouble is there seems to be a
missing piece: I don't believe a Sieve extension has been defined that get at
annotation data. GIven that that's the case I'm not sure I see the point of the
changedannotation environment item. All this is going to tell you is what
annotations changed - it doesn't tell you whether they were added, or deleted,
or modified, or what's in them if they are still there. Unless there's either
an extension in the works for getting at annotation information (in which case
it needs to be referenced) or there's a use-case for knowing what annotations
have changed (in which case there should be an exmaple of that usage
somewhere), I have to recommend removing the changedannotations environment
item. If need be it can be defined in a subsequent Sieve annotate extension
specification.

Next up is the imap4flags extension. Flag handling is such a key part of IMAP I
really have to wonder how useful this extension is without imap4flags support.
I wonder if it doesn't make more sense to simply require imap4flags support -
given the way imap4flags can be used without variables it isn't especially
difficult to implement. It would certainly simplify the document by removing
lots of qualifiers all over the place.

The imap4flags extension essentially provides two different ways to manipulate
flags: With variables or with the so-called default flag internal variable.
Given some of the examples it appears that the default flag internal variable
is supposed to be set to the current flags on the emssage before the script is
run. I don't believe this is explicitly stated in the document, and it needs to
be. (This document should also reiterate that changes to the default flags
attach to the messages produced by the actions the sieve performs and the final
state of the default flags apply to the implicit keep.)

I note in passing that RFC 5232 doesn't actually say what the initial setting
of the default flags is supposed to be. Empty seems like the only choice for
regular Sieve use and we should spell that out in the next update.

And now for one of those subtle issues: Flag value inspection and manipulation.
A fairly common thing for scripts to do is make copies of their inputs so they
can refer back to the original values independent of any modifications the
script makes. (This is especially common practice in large scripts built up
from various component pieces from different sources.) This is a nonissue
for the imap4flags extension in normal usage: The initial setting of the
default flag internal value is empty. But it isn't empty at startup in this
new usage context. And unless I've missed something, there's no way to capture
the state of this internal variable. The best you can do is grab the first
flag in the set:

    if hasflag :matches "*" {set "doof" "${0}";}

Nowhere near good enough. Or, if you know the names of all the flags you're
ever going to be interested in, you can do:

    setflag "initialflags" "";
    if hasflag :is "flag1" {addflag "initialflags" "flag1";}
    if hasflag :is "flag2" {addflag "initialflags" "flag1";}
    ...
    if hasflag :is "flagn" {addflag "initialflags" "flagn";}

But this is extremely fragile, not to mention verbose and tacky.

The obvious way to address this is to add an additional environment item:
initialflags. This makes it easy to, say, reset the default flags to their
initial settings:

    if environment :matches "initialflags" "*" {setflag "${0}";}

Moving on... redirect. Redirect assumes that the message in hand is suitable
for submission. This is a reasonable assumption in normal Sieve use because the
message is either in transit or has just been in transit. However, I don't
believe, say, IMAP APPEND imposes sufficient requirements on messages for them
to be considered ready for submission. The draft needs to deal with this issue,
but I'm not exactly sure how - perhaps an explicit reference to the SMTP submit
specification plus some statement that if the message is unsuitable for
submission a DSN SHOULD be generated.

Next up is editheader and another subtle but extremely important issue. The
draft quite correctly bans editheader modifications to the message at hand
because such modifications  would violate IMAP semantics. However, this ignores
the extremely important usage of editheader as a meanss of annotating
redirected messages, where no such semantics problem exists. For example, it is
quite common to use editheader to prevent redirection loops:

        /* Don't redirect if we already redirected */
        if not header :contains "X-Sieve-Filtered"
                ["<kim@job.example.com>", "<kim@home.example.com>"]
        {
                addheader "X-Sieve-Filtered" "<kim@job.example.com>";
                redirect "kim@home.example.com";
        }

(This example is straigth from the editheader draft itself.) Given the added
risks of loops created by this document I really don't want to lose this
capability.

For that matter I'm not sure I see the harm in allowing:

        keep;
        addheader "comment" "extra copy for posterity"
	fileinto "archive";

Unless I'm missing something there should be no reason why an additional
copy of the message created with fileinto has to be identical to the
original.

And there's also the use of editheader as a kind of variable surrogate:

	if <something nasty and complex>
	{ addheader "flag" "whatever";}
	...
	if header :is "flag" "whatever" ...
	...
	deleteheader "flag";

I think this is a bit screwy and of course unnecessary in any implementation
that support variables, but I've seen several customer scripts do it. (And yes,
we support variables.)

I therefore think the outright ban on editheader needs to be lifted and
replaced with someething more nuanced, e.g., editheader, if used, has no
effect on implicit or expliccit keep, but can affect other actions.

Next up is spamtest and virustest. Neither is mentioned in the draft, and IMO
this is a SERIOUS omission. I can say with absolute certainty that one of the
main things, if not the main thing, our customers are going to want to do in
sieves attached to mailboxes: Check appended messages for spam and viruses. And
since headers cannot be used to communicate spam status because of message
immuatability, that leves spamtest and virustest as the only games in town.

And that segres into the next to last item on my list: reject. I think I
understand the reasons for wanting to ban it outright, but this is another one
that just isn't going to play in Peoria. Like it or not, people are going to
want to use this facility to impose access restrictions on IMAP APPEND, and
they are going to be pissed if they can't do it. In fact this is one that
unless there's a fundamental reason we cannot implement this in our store by
havingt it turn a reject into a NO response to APPEND I will recommend that we
simply ignore this restriciton and allow reject.

One final subtle issue. Another thing I'm fairly confident people will want to
do with Sieve in IMAP is implement finer grained access control. (When you
think about it, access control is one of Sieve's main purposes.) But unlike
final delivery, where there's no identity associated with the entity performing
the delivery, there's always an identity associated with someone doing IMAP
stuff. Accordingly, Sievce in IMAP needs to have a way to access this
identity so it can use it as part of its access control checks. THe obvious
way to do it is through another environment item, e.g., user-identifier. So 
you could say stuff like:

    if allof(environment :is "cause" ["append", "copy"],
             not environment :is "user-identifier" "ned",
	     size :over 64K {
      reject "Sorry, I don't allow random bozos to append big messages to my mailbox";
    }

That's it for now, but if nothing else I hope this review has driven home how
easy it is for these impedance mismatch issues to creep in unobserved.

				Ned

WGLC on IMAP Sieve (draft-ietf-lemonade-imap-siev… Alexey Melnikov
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Alexey Melnikov
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Ned Freed
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Dave Cridland
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Alexey Melnikov
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Arnt Gulbrandsen
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Alexey Melnikov
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Arnt Gulbrandsen
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Ned Freed
Re: WGLC on IMAP Sieve (draft-ietf-lemonade-imap-… Alexey Melnikov