Re: [apps-discuss] Alissa Cooper's Discuss on draft-ietf-appsawg-sieve-duplicate-07: (with DISCUSS and COMMENT)

Stephan Bosch <stephan@rename-it.nl> Fri, 20 June 2014 07:52 UTC

Message-ID: <53A3E7EB.1030604@rename-it.nl>
Date: Fri, 20 Jun 2014 09:51:07 +0200
From: Stephan Bosch <stephan@rename-it.nl>
To: Alissa Cooper <alissa@cooperw.in>, The IESG <iesg@ietf.org>
References: <20140620004041.5801.22430.idtracker@ietfa.amsl.com>
In-Reply-To: <20140620004041.5801.22430.idtracker@ietfa.amsl.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/apps-discuss/lqF7m-F5JvptoFOzBcpaA94DK80
Cc: appsawg-chairs@tools.ietf.org, ned+ietf@mrochek.com, draft-ietf-appsawg-sieve-duplicate@tools.ietf.org, apps-discuss@ietf.org
Subject: Re: [apps-discuss] Alissa Cooper's Discuss on draft-ietf-appsawg-sieve-duplicate-07: (with DISCUSS and COMMENT)

Hi Alissa,

On 6/20/2014 2:40 AM, Alissa Cooper wrote:
> Alissa Cooper has entered the following ballot position for
> draft-ietf-appsawg-sieve-duplicate-07: Discuss
>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> o Section 3.3:
> "A default expiration time of around 7 days is usually
>    appropriate. ... If that limit is exceeded by the ":seconds" argument,
>    the maximum value MUST silently be substituted"
>
> The suggested default seems really long, especially for the example use
> case described in Section 1. On the other hand, it seems odd that the
> user's choice would be overridden by the preset maximum. This would make
> more sense to me if the default expiration were shorter and the user
> could override it with a longer :seconds argument if he wanted. What is
> the rationale for doing it the opposite way?

The default was chosen in Sieve mailing list discussions, but in essence
it is fairly arbitrary. It is mainly based on the period over which a
series of duplicate messages may arrive, plus a margin of a few days.

The rationale for a maximum is to prevent users from creating duplicate
tracking list entries that linger indefinitely.
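
As a rough illustration of the behavior quoted from Section 3.3, the
following Python sketch applies the default and silently substitutes the
maximum when ":seconds" exceeds it. The function name and the 30-day
maximum are assumptions made for this example only; the draft leaves the
actual limit to the site.

```python
DEFAULT_EXPIRY = 7 * 24 * 3600   # "around 7 days", the suggested default
MAX_EXPIRY = 30 * 24 * 3600      # hypothetical site-configured maximum


def effective_expiry(seconds=None, max_expiry=MAX_EXPIRY):
    """Return the expiration period (in seconds) for a tracking entry.

    A missing ":seconds" argument falls back to the default; a value
    above the site maximum is silently replaced by that maximum, with
    no error raised.
    """
    if seconds is None:
        return min(DEFAULT_EXPIRY, max_expiry)
    return min(seconds, max_expiry)
```

With these assumed values, a script asking for one hour gets one hour,
while a script asking for a year is capped at the site maximum.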

> o Section 6: 
> It seems like this section is missing a discussion of the privacy
> considerations associated with enabling the duplicate test. In the case
> where the Sieve filter is implemented server side, making use of the
> duplicate test means that some record of the receipt of a particular
> message will be persisted (for as long as specified by the logic in
> Section 3.3) even if the user downloads his messages or deletes a
> received message on the server that matches a test string in the
> meantime. If I want to filter all messages with duplicate message IDs,
> for example, this puts a requirement on the server to maintain a list
> of the message IDs of all messages I’ve received (for the amount of
> time specified by the timing parameters). So if a third party wanted
> to find out that list, it could go to the operator of the server and
> ask for it. This introduces a privacy risk that is not
> discussed in the document and is exacerbated by the long default
> expiration time mentioned above. Tests that make use of the :header or
> :uniqueid arguments are also potentially problematic since the list of
> strings to match is kept on the server. It seems like these aspects
> should at least be noted in the document for implementers to consider.
>
> o Section 6: 
> Sieve scripts that include duplicate tests contain potentially sensitive
> information (e.g., subject or body strings). So it seems like the scripts
> should be confidentiality protected in transit. I checked with Barry and
> he said that there is no RFC that specifies if/when scripts should be
> protected in transit, and I understand that this document is probably not
> the right place to specify required behavior there, but I'd like to
> discuss (more with the ADs than the authors) if there is some plan for
> specifying that behavior somewhere. 
>
> o Section 6: 
> I think it would make sense to require that the tracked message list be
> stored with an equivalent level of protection as the user's messages
> themselves. E.g., if message headers and bodies are stored encrypted,
> then the duplicate tracking list should be as well. And the duplicate
> tracking list should be subject to the same access controls as the
> mailbox (perhaps this is obvious but I'm not sure).

This is a very interesting point of view. I had not thought of privacy
concerns relating to the duplicate tracking list.

Probably, the main reason we have not considered this is that the
document states the following:

   NOTE: The necessary mechanism to track duplicate messages is very
   similar to the mechanism that is needed for tracking duplicate
   responses for the "vacation" [VACATION] action. One way to implement
   the necessary mechanism for the "duplicate" test is therefore to
   store a hash of the tracked unique ID and, if provided, the ":handle"
   argument.

There is no tractable way to reverse a hash into its original string
value, meaning that the entries in the duplicate tracking list would
have very little value for an attacker.
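
A minimal sketch of such a hashed entry, in Python, might look as
follows. The choice of SHA-256, the length-prefixed encoding and the
function name are assumptions for illustration; the draft does not
prescribe a particular hash or layout.

```python
import hashlib


def tracking_key(unique_id: str, handle: str = "") -> str:
    """Derive the value stored in the duplicate tracking list.

    Only the digest is stored, never the unique ID or handle itself.
    Each part is length-prefixed so that different (handle, unique ID)
    pairs cannot run together into the same hash input.
    """
    h = hashlib.sha256()
    for part in (handle, unique_id):
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()
```

The same unique ID tracked under two different ":handle" values yields
two unrelated entries, and neither entry contains the original string.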

Of course, if implementations choose a different approach that uses
entries that can yield the original string values, there would be a real
concern. I propose adding a sentence or two pointing out that duplicate
tracking list entries should not contain the plain user strings for
privacy reasons, or, alternatively, that the tracked message list should
be stored with a level of protection equivalent to that of the user's
messages themselves.

The same considerations apply to the vacation extension (RFC 5230).

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
> o Section 3.1:
> "When the ":uniqueid" argument is used, such normalization
>    concerns are the responsibility of the user."
>
> I don't quite get this. Do we expect users to, e.g., specify that
> uniqueid strings should only be compared after conversion to UTF-8? That
> would seem to rely on a level of technical sophistication that almost no
> users actually have.

I am not sure what exactly you mean here. The normalization concerns
listed before that sentence all apply to header field content, something
the author of a Sieve script cannot control directly. With ":uniqueid"
the script author is directly responsible for the composition of the
unique ID value. As long as the encoding is chosen consistently, the
encoding does not matter for successfully matching it to past values in
the duplicate tracking list. It can also be some sort of binary value if
appropriate; whatever the application demands. The quoted sentence
mainly points to the fact that for some applications it can be useful to
have such normalization for ":uniqueid" as well, e.g. to trim leading
and trailing white space. However, the script author will have to
implement this explicitly, since the duplicate test will not do this
implicitly.
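
To make the point concrete, here is a small Python model of an
exact-match tracking list. The names are hypothetical and this is not
how any particular Sieve implementation works; it only shows that two
values differing in white space are treated as distinct unless the
script author normalizes them first.

```python
def is_duplicate(seen: set, unique_id: str) -> bool:
    """Exact-match check; no trimming or other normalization is applied."""
    if unique_id in seen:
        return True
    seen.add(unique_id)
    return False


seen = set()
first = is_duplicate(seen, "order-1234")              # first sighting
untrimmed = is_duplicate(seen, " order-1234 ")        # differs by white space
trimmed = is_duplicate(seen, " order-1234 ".strip())  # normalized explicitly
```

Here only the explicitly trimmed value is recognized as a duplicate of
the first one.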

> o Section 3.2:
> "This means that it does not matter whether values are
>     obtained from the message ID header, from an arbitrary header
>     specified using the ":header" argument or explicitly from the
>     ":uniqueid" argument."
>
> I had trouble understanding what "it does not matter" meant in this
> sentence. I would suggest:
>
> "This means that the values in the duplicate list should be used for
> duplicate testing regardless of whether they were
>    obtained from the message ID header, from an arbitrary header
>    specified using the ":header" argument or explicitly from the
>    ":uniqueid" argument."

Ok, works for me.

Regards,

Stephan.