Re: draft-freed-sieve-in-xml status?

"Robert Burrell Donkin" <robertburrelldonkin@gmail.com> Mon, 15 December 2008 13:11 UTC

Message-ID: <f470f68e0812150500r5d1916f4obbd941434295fe07@mail.gmail.com>
Date: Mon, 15 Dec 2008 13:00:12 +0000
From: Robert Burrell Donkin <robertburrelldonkin@gmail.com>
To: Ned Freed <ned.freed@mrochek.com>
Subject: Re: draft-freed-sieve-in-xml status?
Cc: ietf-mta-filters@imc.org
In-Reply-To: <01N335CY3MAQ00SE3A@mauve.mrochek.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <f470f68e0812041225x318bfdccg1bf9201b53ce8c2e@mail.gmail.com> <493E908E.70504@isode.com> <f470f68e0812090956j56c29f17s77fc554adaab1350@mail.gmail.com> <f470f68e0812140301r7ef04460t24f2ea9e6d2ff7a0@mail.gmail.com> <01N32VHWP1EK00SE3A@mauve.mrochek.com> <f470f68e0812141304i228b5a03s890e0f101b76b07e@mail.gmail.com> <01N335CY3MAQ00SE3A@mauve.mrochek.com>
Sender: owner-ietf-mta-filters@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-mta-filters/mail-archive/>
List-ID: <ietf-mta-filters.imc.org>
List-Unsubscribe: <mailto:ietf-mta-filters-request@imc.org?body=unsubscribe>

On Sun, Dec 14, 2008 at 9:59 PM, Ned Freed <ned.freed@mrochek.com> wrote:
>> On Sun, Dec 14, 2008 at 6:06 PM, Ned Freed <ned.freed@mrochek.com> wrote:
>> >> i note that the draft describes the infoset rather than defining it in
>> >> the standard way. is there a reason for this decision?
>> >
>> > I don't know what "the standard way" is you're referring to. Perhaps you
>> > could provide a reference to an RFC where this has been used?
>
>> AIUI XML is maintained by w3c (rather than IETF) so is a
>> recommendation. http://www.w3.org/TR/xml-infoset/ is the current
>> document.
>
> Quite true, however, the IETF has its own specification for how XML is supposed to
> be used in RFCs: RFC 3470. And while infosets are mentioned as one approach to
> specifying things about an XML format, there's no recommendation, let alone
> requirement, that they be used.
>
>> > This document is a little unusual in that it's defining a mapping of, if you
>> > will, a non-XML infoset onto XML. As such, the natural approach seemed to be to
>> > first discuss the structure of the language being mapped, then explain the
>> > mapping, and finish up with additional unique-to-XML semantics.
>
>> i agree that most of this arrangement is natural. it's just that jumping to
>> a schema seems - to me - a little premature and inflexible.
>
> First of all, the use of XML Schema is in fact too inflexible to be allowed
> to continue. The next revision will use Relax instead.

XML schema is flexible but the flexibility comes at the price of
readability. one of the relax variants would be a better choice.

however (in my experience) the generative tools commonly used for XML
and web service binding, and for editor generation, tend not to offer
good relax support. IMO the draft should offer a secondary informative
XML Schema (or Schemata) to assist developers using these tools.

> But I'm still a little confused as to what you're asking for here. If you're
> asking for removal of the explicit inline XML syntax examples in favor of a
> more abstract approach, I'd be fine with that if there's a WG consensus to make
> such a change.

no - i'm very happy with the syntax examples

i would like to see the approach used in RFC 5023 (and others)
adopted, adding a normative description of the XML and making the
schema only informative.
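
a rough sketch of why this matters to tool builders: code that works from
a description of the elements rather than from one particular schema can
simply ignore annotations in foreign vocabularies. the namespace URI, the
element names (sieve, control, test, action, tag, str) and the note:
annotation below are assumptions made for illustration and should be
checked against the draft.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and element names; illustrative only, not taken from the draft.
NS = "urn:ietf:params:xml:ns:sieve"

# Hypothetical script with one annotation attribute from a foreign vocabulary.
SAMPLE = f"""
<sieve xmlns="{NS}" xmlns:note="http://example.com/notes">
  <control name="if">
    <test name="header">
      <tag>is</tag>
      <str>subject</str>
      <str>status</str>
    </test>
    <action name="fileinto" note:reviewed="yes">
      <str>drafts</str>
    </action>
  </control>
</sieve>
""".strip()


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def commands(root):
    """Yield (element kind, command name) pairs in document order.

    Only elements in the assumed Sieve namespace that carry an unqualified
    'name' attribute are reported; foreign attributes are ignored.
    """
    prefix = "{" + NS + "}"
    for elem in root.iter():
        if elem.tag.startswith(prefix) and "name" in elem.attrib:
            yield local(elem.tag), elem.attrib["name"]


root = ET.fromstring(SAMPLE)
found = list(commands(root))
```

no schema is consulted at any point: the code relies only on element and
attribute names, which is the part a normative description would pin down.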

> Beyond that, however, lies a slippery slope. If what you're after is a
> restatement of Sieve elements and semantics in infoset form, that is not going
> to happen on my watch. RFC 5228 is the definitive source of information about
> such things. It may be a little awkward for implementors to have to interpolate
> back through the specifications to get at the meaning of things they have in
> their XML, but the alternative of having two separate specifications that are
> bound to be inconsistent in some way or other is much worse.

no, not restatement

>> > This approach is perhaps not the best choice for someone coming at this trying
>> > to get at Sieve semantics starting with XML, but I believe consumers of the
>> > document with that mindset will be distinctly in the minority. The main focus
>> > here is to provide people familiar with Sieve a means of mapping Sieve to XML
>> > so that XML tools can be applied.
>
>> my experience is entirely opposite
>
>> developers that use the java libraries i work on have good XML but
>> lack a good understanding of underlying mail technologies (for example
>> sieve). there is a large and growing requirement for integration
>> between mail and enterprise systems (typically coding in Java and .NET
>> but also ruby and python). developers from enterprise backgrounds are
>> typically strong on web+xml but very weak on mail.
>
> Yep, I've seen a lot of this as well. And the problem encompasses far more than
> Sieve: For example, a lot of people who are unfamiliar with email don't
> understand very basic concepts such as the separation between envelope and
> message content. (This particular issue actually pokes through into Sieve in
> the form of whether an envelope or header test is appropriate.)

i beg to differ slightly on this one

some enterprise mail processing may happen during the SMTP transaction
but it is more typical for the mail processing to happen after storage. not all
mail stored arrives through SMTP and so it is typical for any envelope
information to be reduced to simple MIME headers. most developers in
these mail processing environments do not need to understand the
difference between envelope and message content because - for them -
there is no difference.
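
a minimal sketch of what that looks like in practice. the sample message,
and the choice of Return-Path and Delivered-To as the surviving traces of
the envelope, are assumptions for illustration rather than a description
of any particular system.

```python
from email import message_from_string

# Hypothetical stored message: the SMTP envelope (MAIL FROM / RCPT TO) is gone
# and only header traces such as Return-Path and Delivered-To remain.
STORED = (
    "Return-Path: <sender@example.org>\n"
    "Delivered-To: archive@example.com\n"
    "From: Alice <alice@example.org>\n"
    "To: list@example.com\n"
    "Subject: status?\n"
    "\n"
    "body text\n"
)


def envelope_like_fields(raw):
    """Read what is left of the envelope exactly like any other header."""
    msg = message_from_string(raw)
    return {
        "sender": msg.get("Return-Path"),
        "recipient": msg.get("Delivered-To"),
        "header_from": msg.get("From"),
    }


fields = envelope_like_fields(STORED)
```

from the application developer's side, the "envelope" sender and the
header From are fetched by the same call; nothing in the stored document
marks one as different in kind from the other.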

Sieve works very well as a general MIME document processing language.
the envelope tests are - in many ways - peculiar since the rest of the
specification really isn't mail specific. there are potentially some
very interesting applications in this area so it would be a shame - i
think - for the expert group to focus too strongly on SMTP at the
expense of other IMHO equally valid Sieve use cases.

> But here's the dilemma: This stuff is complicated and in some cases fairly
> subtle. This in turn means that the reiteration of even a subset of the
> underlying design principles that implementors need to know takes up a lot of
> space and will still fall short of the mark of giving the necessary guidance.
> But it may lead to the belief that reading this specification (or for that
> matter this one and RFC 5228) is in fact sufficient to understand how to use
> Sieve. It quite simply isn't.

again, i beg to differ

sieve is very similar structurally to the guerrilla standards used in
enterprise mail systems for more than 5 years now. for most mail
processing applications, only the container builders need to have a
good understanding of the protocols. application developers are
offered a safe environment and an OOP interface. i see no reason why
sieve should be any different.

> IMO what's needed is a proper architectural specification for email. We're
> trying to get one of those done, but progress has been very slow.

provided that mail is interpreted sufficiently broadly, i agree

<snip>

>> > Had this been the more usual case of simply defining an XML format, I have to
>> > admit that I would have gone with the informal approach used in, say, RFC 2629.
>> > I'm not all that keen on lots of formalism  - IMO it often hinders
>> > understanding more than it helps.
>
>> IMHO the problem is getting the right level of formalism
>
>> more modern approaches to specification (eg Atom
>> http://www.rfc-editor.org/rfc/rfc5023.txt etc) tend to make the schema
>> only informative and the description of the infoset normative. this
>> would be more flexible, for example by allowing different schema
>> languages to be used, alterations in namespace or additional
>> annotations in foreign vocabularies.
>
> But doesn't this fly in the face of your earlier suggestion of doing this
> by annotating the schema?

i'll explain a little more what i meant by that

i was suggesting that it might be worthwhile creating an independent,
clean room schema (based on the RFC), documenting it, then releasing it
under a suitable FOSS license (MIT). similar - in spirit - to the
Annotated XML Specification
http://www.xml.com/pub/a/axml/axmlintro.html.

- robert