[Jmap] Benjamin Kaduk's Discuss on draft-ietf-jmap-mail-15: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Tue, 05 March 2019 02:40 UTC

Return-Path: <kaduk@mit.edu>
X-Original-To: jmap@ietf.org
Delivered-To: jmap@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id A88A0130EBA; Mon, 4 Mar 2019 18:40:37 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk <kaduk@mit.edu>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-jmap-mail@ietf.org, Bron Gondwana <brong@fastmailteam.com>, jmap-chairs@ietf.org, brong@fastmailteam.com, jmap@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.92.1
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <155175363767.5305.14440255640742603646.idtracker@ietfa.amsl.com>
Date: Mon, 04 Mar 2019 18:40:37 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/jmap/0sOX753lzQjbP_Q8oV4accFkAb0>
Subject: [Jmap] Benjamin Kaduk's Discuss on draft-ietf-jmap-mail-15: (with DISCUSS and COMMENT)
X-BeenThere: jmap@ietf.org
X-Mailman-Version: 2.1.29
List-Id: JSON Message Access Protocol <jmap.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/jmap>, <mailto:jmap-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/jmap/>
List-Post: <mailto:jmap@ietf.org>
List-Help: <mailto:jmap-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/jmap>, <mailto:jmap-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Mar 2019 02:40:45 -0000

Benjamin Kaduk has entered the following ballot position for
draft-ietf-jmap-mail-15: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-jmap-mail/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Length of review comments aside, this is actually a quite nice document
-- thank you to the authors for making it so clear and well-organized.
The only readability complaint I have is that I don't get a great
picture of how all the bits of the Email object fit together, but it's
complicated enough that maybe a generic schema wouldn't actually be
helpful.

Section 2

I think we need more precise language than "corresponds to" for the
relationship between JMAP MailboxRights and IMAP ACLs, specifically
because JMAP distinguishes mayRename and mayDelete but IMAP just has the
single 'x' ACL.  (More in the COMMENT section, but the non-isomporphic
mapping of 'x' is the only DISCUSS-worthy part.)

Section 4.1.1

We only describe the "\" to "$" translation for the four supported
system keywords, but it seems that it should be more generic (not that
we expect more IMAP system keywords to appear anytime soon)?

Section 4.1.2.1

                                       A server SHOULD replace any octet
   or octet run with the high bit set that violates UTF-8 syntax with
   the unicode replacement character (U+FFFD).  [...]

This seems problematic, given that this is supposed to be the "Raw"
format.  I guess the justification for the replacement is that we use
JSON and JSON requires UTF-8, but if that's the case then shouldn't this
be a MUST and not a SHOULD?  In particular, a client can't rely on the
server providing the SHOULD, so it doesn't seem to provide much value.

Section 4.7

   The server MAY forbid two email objects with the same exact [RFC5322]
   content, or even just with the same [RFC5322] Message-ID, to coexist
   within an account; if the target account already has the email the
   copy will be rejected with a standard "alreadyExists" error.

This has some security considerations that should probably be mentioned
in Section 9.4: when a user only has read privileges to a
subset of the folders in an account, this behavior can be abused as an
oracle to determine whether a given message exists in the inaccessible
portions of that account.  (Similarly for /import.)

Section 4.9

   The following metadata properties on the Email objects will be "null"
   if requested:
   [...]
   o  mailboxIds

This seems in conflict with the Section 4.1.1 text that every Email "MUST
belong to one or more mailboxes at all times (until it is deleted)."
Presumably we want a broader disclaimer in 4.1.1 rather than any changes
here...

There may also be a related condition wherein an EmailSubmission object
refers to an Email after the Email is deleted -- I didn't (yet) see text
to indicate whether the emailId in the EmailSubmission is expected to
still be resolvable, in which case there would potentially not be an
associated Mailbox.

Section 9.3

It's 2019.  Why are we still recommending SASL PLAIN?
We have better options even if we are resigned to passwords, like SCRAM.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1.3.1

   o  *maxMailboxesPerEmail*: "UnsignedInt|null" The maximum number of
      mailboxes that can be can assigned to a single email.  [...]

nit: My understanding (shaped solely by experience and inference) is that
"email" is generally used to refer to a message, whereas "email account"
or "email address" would be used to refer to or associate with a thing
that can be a container for folders.

   o  *maxSizeMailboxName*: "UnsignedInt" The maximum length, in (UTF-8)
      octets, allowed for the name of a mailbox.  This MUST be at least
      100, although it is recommended servers allow more.

The Unicode normalization form used by client/server could cause
disagreement about whether a given name is permitted, though for the
type of use this will get it's not clear that we need to be more
rigorous.

Section 1.3.2

   o  *submissionExtensions*: "String[String[]]" A JMAP implementation
      that talks to a Submission [RFC6409] server SHOULD have a
      configuration setting that allows an administrator to expose a new
      submission EHLO capability in this field.  This allows a JMAP

I think I'm confused by the workflow here.  Suppose we have a JMAP
client A talking to a JMAP server B, and B is also an SMTP client that
talks to MTA C.  This capability is supposed to be about letting B
expose a new EHLO capability to A, but only when C supports it?
We probably need some more text here to explain the scenario, even if my
guess is correct.

      server to gain access to a new submission extension without code
      changes.  By default, the JMAP server should hide EHLO
      capabilities that are to do with the transport mechanism and thus
      are only relevant to the JMAP server (for example PIPELINING,
      CHUNKING, or STARTTLS).  Each key in the object is the _ehlo-
      name_, and the value is a list of _ehlo-args_.  Examples of
      Submission extensions to include:

The description here could probably be reworded/reorderd for clarity,
since we don't start talking about what the actual map keys/values are
until the penultimate sentence, at present.

Section 1.5

   In addition, servers MUST support pushing state changes for a type
   called "EmailDelivery".  [...]

Er, is this only servers that are implementing
urn:ietf:params:jmap:mail?  Or is the bit about "Servers MUST support
all properties specified for the new data types defined in this
document"  supposed to imply that I have to implement all three as a
unit, even if not all of them are going to be available for every
account/credentials?

Section 2

   o  *id*: "Id" (immutable; server-set) The id of the mailbox.

Since jmap-core now has all Ids as immutable and server-set, is this
sort of notation considered redundant in jmap-mail?  (Throughout; this
is just the first instance.)

   o  *role*: "String|null" (default: null) Identifies mailboxes that
      [...]
      same role.  Servers providing IMAP access to the same data are
      encouraged to enforce these extra restrictions in IMAP as well.
      Otherwise, it is implementation dependent how to modify the IMAP
      attributes to ensure compliance when exposing the data over JMAP.

If we had a "see also" document relation, this might justify one from IMAP
to here.  It probably doesn't qualify for Updates: though.
Nonetheless, I'd consider having a top-level "Updates Affecting IMAP"
section like RFC 8446 did for TLS 1.2, to collect all the bits that
someone deploying IMAP and JMAP together might need to think about for
their IMAP implementation.

   o  *sortOrder*: "UnsignedInt" (default: 0) Defines the sort order of
      [...]
      equal order SHOULD be sorted in alphabetical order by name.  The
      sorting SHOULD take into account locale-specific character order
      convention.

I think this last SHOULD is probably not a 2119 SHOULD, and therefore
should just be "should".

   o  *unreadThreads*: "UnsignedInt" (server-set) An indication of the
      number of "unread" threads in the mailbox.  For compatibility with
      existing implementations, the way "unread threads" is determined
      is not mandated in this document.  The simplest solution to

Do we really have competing *J*MAP implementations that disagree on
this?  Or is the idea to preserve compatibility with IMAP
clients or other current implementations that attempt to determine
threading relationships (in which case, that should probably be mentioned)?

   o  *myRights*: "MailboxRights" (server-set) The set of rights (ACLs)
      the user has in relation to this mailbox.  These are backwards
      compatible with IMAP ACLs, as defined in [RFC4314].  A
      _MailboxRights_ object has the following properties:

I read this as saying that all of the properties must be present in the
object, such that omitting a property is not a permitted synonym for it
having a value of false.  Is this reading correct?
If so, what should a client do if the server misbehaves?

      *  *mayReadItems*: "Boolean" If true, the user may use this
         [...]
         but not the parent mailbox, this may be "false".  Corresponds
         to IMAP ACLs "lr".

"Corresponds to" is perhaps imprecise, if one is thinking about the IMAP
ACL as being the authoritative source of information (which not all
readers will!).  The point being that just 'l' or just 'r' would not be
enough to get this, and since JMAP is specifying a slightly more
abstract mapping than standard IMAP rights, we should be precise about
the mapping, and arguably in both directions.  (Not just here, but for
all the compound ACLs, of course.  Also for IMAP ACL 'x', which has two
corresponding JMAP permissions.)

      *  *maySetSeen*: "Boolean" The user may add or remove the "$seen"
         keyword to/from an email.  If an email belongs to multiple
         mailboxes, the user may only modify "$seen" if *all* of the
         mailboxes have this permission.  Corresponds to IMAP ACL "s".

nit: in the intro, we talk of "rights the user has in relation to this
mailbox", so it's a bit disjoint to talk of mailboxes having permissons.
(Here and below.)

   o  *isSubscribed*: "Boolean" Has the user indicated they wish to see
      [...]
      choose to ignore this property, either entirely for ease of
      implementation, or just for the primary account (which is normally

nit: We don't really provide a formal definition of "primary account"
either here or in jmap-core.

Section 2.1

   Standard "/get" method.  The _ids_ argument may be "null" to fetch

I thought I had said something on jmap-core but maybe I only thought
about it: my personal preference would be for section references into
jmap-core to provide a foundation for what the "standard method" is,
though I will not insist upon it.

   all at once.

(Also, this behavior is part of the standard method now.)

Section 2.2

I might say explicitly that "a non-null updatedProperties response
argument indicates that the mailbox contents are unchanged from the old
state, with only the counts having changed", since these semantics are
potentially unexpected.

Section 2.3

   Standard "/query" method, but with the following additional argument:

(Generic comment, not scoped to just this section): are we going to need
to draw a distinction between input and ouput arguments for any of the
additions we make to standard methods?

   o  *role*: "String|null" The Mailbox _role_ property must match the
      given value exactly.

So the client is responsible for lower-casing values from the IMAP
Mailbox Name Attributes registry and the server must not do a
case-insensitive comparison?

   The following properties MUST be supported for sorting:

Just to be pedantic, we're talking about the values of the "property"
property within the Comparator object, right?  (This is one of those
annoying things at the intersection of technical specifications and
natural English language.)

Section 4

   Due to the number of properties involved, the set of _Email_
   properties is specified over the following three sub-sections.

It's probably worth a note that the subsections are for purposes of
document organization and are not reflected in the wire protocol
structure -- the properties involved are all top-level peers, across the
three subsections.  (Assuming that's correct, of course.)

Section 4.1

I assume it was a conscious WG decision to not present a full
consolidated schema for Email.  But I want to check, since I think (not
having one) that it would have helped my understanding, and I try to be
open to discovering the errors of my ways...

   o  _textBody_/_htmlBody_: These provide a list of parts that should
      be rendered sequentially as the "body" of the message.  This is a
      list rather than a single part as messages may have headers and/or
      footers appended/prepended as separate parts as they are
      transmitted, and some clients send text and images intended to be
      displayed inline in the body (or even videos and sound clips) as
      multiple parts rather than a single HTML part with referenced
      images.

Some guidance related to interpreting these lists and avoiding the eFail
(efail.de) class of attacks is probably in order -- the HTML parts
should get some different containers around their processing.  This
could potentially go here, or in Section 9.2.

   Because MIME allows for multiple representations of the same data
   (using "multipart/alternative"), there is a textBody property (which
   prefers a plain text representation) and an htmlBody property (which
   prefers an HTML representation) to accommodate the two most common
   client requirements.  The same part may appear in both lists where
   there is no alternative between the two.

(soapbox) It's annoying when I get mime/multipart with HTML in the
text/plain section.  Is this clause going to allow for that same sort of
misclassification?

   Due to the number of properties involved, the set of _Email_
   properties is specified over the following three sub-sections.

nit: are we at four sub-sections now?

Section 4.1.1

      The IMAP "\Recent" keyword is not exposed via JMAP.  The IMAP
      "\Deleted" keyword is also not present: IMAP uses a delete+expunge
      model, which JMAP does not.  Any message with the "\Deleted"
      keyword MUST NOT be visible via JMAP (including as part of any
      mailbox counts).  Users may add arbitrary keywords to an email.

IIRC, Trash gets special handling with respect to deletion in JMAP
commands; does it also get special treatment w.r.t \Deleted translation
from IMAP to JMAP?

Section 4.1.2.2

This escape valve for "Any header not defined in [RFC5322] or [RFC2369]"
(here, et seq) seems like it might benefit from some general guidance
about implementations applying common sense, i.e., allowing servers the
ability to deny requests for a given form for such new headers in order
to prevent nonsense behavior.

Section 4.1.4

   o  *partId*: "String|null" Identifies this part uniquely within the
      Email.  This is scoped to the _emailId_ and has no meaning outside
      of the JMAP Email object representation.  This is "null" if, and
      only if, the part is of type "multipart/*".

The prose isn't the super-best indicator that the asterisk is
intended to have wildcarding behavior.

Section 4.4.1

Just to double-check: the different fencepost behavior for
minSize/maxSize is intentional?

   o  *text*: "String" Looks for the text in emails.  The server SHOULD
      look up text in the _from_, _to_, _cc_, _bcc_, _subject_ header
      fields of the message, and inside any "text/*" or other body parts
      that may be converted to text by the server.  The server MAY
      extend the search to any additional textual property.

side note: as a mail user, I like to be able to use text search to
conclusively determine that a given message/topic is *not* in a given
mailbox.  This weak "SHOULD" language does not provide me that
guarantee, though I can see how it makes sense in the protocol design to
leave the flexibility for implementors, here.

   o  When searching inside a "text/html" body part, any text considered
      markup rather than content SHOULD be ignored, including HTML tags
      and most attributes, anything inside the "<head>" tag, CSS and
      JavaScript.  Attribute content intended for presentation to the
      user such as "alt" and "title" SHOULD be considered in the search.

This would seem to leave no reliable way for a security researcher to
(e.g.) search for snippets of attack javascript in received mails
without downloading all of them.

   o  Text SHOULD be matched in a case-insensitive manner.

Is the server going to have a sense of the user's locale as needed for
fully generic case-insensitive comparison?

   o  Tokens MAY be matched on a whole-word basis using stemming (so for
      example a text search for "bus" would match "buses" but not
      "business").

I think this is supposed to not apply to the "phrase search" two bullets
above, but greater clarity would be appreciated.

   o  *hasKeyword* - This value MUST be considered "true" if the email
      has the keyword given as an additional _keyword_ property on the
      _Comparator_ object, or "false" otherwise.

I strongly suggest an explicit listing of the "additional properties as
reuqired for specific sort operations" of the _Comparator_ type when
used for Emails.

Section 4.6

   When emptying the trash, clients SHOULD NOT destroy emails which are
   also in a mailbox other than trash.  For those emails, they SHOULD
   just remove the Trash mailbox from the email.

This last SHOULD seems to be duplicated in Section 2.

   For successfully created Email objects, the _created_ response
   contains the _id_, _blobId_, _threadId_ and _size_ properties of the
   object.

For partial drafts, the behavior where a threadId always gets assigned
on first touch is perhaps interesting, as subsequent edits might cause
the effective threading to change, which would necessitate a new Id as
well (IIUC).  I'm not sure if there's anything weird there that a client
would need to be prepared for, though.  (Maybe not, given that it always
is going to get back an _id_ from /set, and should be using that.)

Section 4.8

   If the blob referenced is not a valid [RFC5322] message, the server
   MAY modify the message to fix errors (such as removing NUL octets or
   fixing invalid headers).  If it does this, the _blobId_ on the
   response MUST represent the new representation and therefore be
   different to the _blobId_ on the EmailImport object.  Alternatively,
   the server MAY reject the import with an "invalidEmail" SetError.

In general, having more options like this can increase the fragility of
the ecosystem, as a client might be written to assume one behvaior and
then be not as portable to a different server.  I'm not sure that
there's a clear way to mandate a single type of behavior here, but want
to be sure that the topic was discussed.

Section 6

   o  *email*: "String" (immutable) The "From" email address the client
      MUST use when creating a new message from this identity.  The
      value MAY alternatively be of the form "*@example.com", in which
      case the client may use any valid email address ending in
      "@example.com".

I mostly assume this is supposed to be for a generic domain and not
special-casing example.com literally.  Some extra text/formatting would
help with that.

Section 7.3

The associated identityId is not a queriable property?

Section 7.5

   o  *onSuccessUpdateEmail*: "Id[Email]|null" A map of _EmailSubmission
      id_ to an object containing properties to update on the Email
      object referenced by the EmailSubmission if the create/update/
      destroy succeeds.  (For references to EmailSubmission creations,
      this is equivalent to a back-reference so the id will be the
      creation id prefixed with a "#".)

I'm confused by the "Id[Email]" part -- we describe it as a map to "an
object containing properties to update", but that's not exactly what an
Email object is.  Is this more of a PatchObject than an Email per se?
nit: I'd also tweak the wording of the parenthetical a bit (here and
below), to something like "when applying to EmailSubmissions created in
the same "/set" invocation, ..."

Section 8

By a literal reading, fromDate and toDate are in conflict with each
other (when non-null).  That is, the fromDate text does not admit the
possibility of an end to the vacation response period, and vice versa.

Section 9

I'd consider adding another sentence like "Additional considerations
specific to the data types and functionality introduced by this document
are described in the following subsections."

Section 9.3

I know we don't want this to devolve into a generic discussion of the
flaws of email, but perhaps the envelope-from/body-from distinction is
worth repeating, with a note that JMAP has provisions for ACLs on
submission that check both.