Re: [core] Incoming AD review of draft-ietf-core-block-19

Alexey Melnikov <alexey.melnikov@isode.com> Mon, 16 May 2016 12:55 UTC

Return-Path: <alexey.melnikov@isode.com>
X-Original-To: core@ietfa.amsl.com
Delivered-To: core@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id AF7C012D601 for <core@ietfa.amsl.com>; Mon, 16 May 2016 05:55:40 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.427
X-Spam-Level:
X-Spam-Status: No, score=-3.427 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RP_MATCHES_RCVD=-1.426, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=isode.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id C6FL1fBZOp6k for <core@ietfa.amsl.com>; Mon, 16 May 2016 05:55:38 -0700 (PDT)
Received: from statler.isode.com (Statler.isode.com [62.232.206.189]) by ietfa.amsl.com (Postfix) with ESMTP id 62E5612D0F1 for <core@ietf.org>; Mon, 16 May 2016 05:55:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1463403335; d=isode.com; s=selector; i=@isode.com; bh=kNqjeOMMgm4ZIX+75ePD82kSkKOEN0CClSbt21HzVB8=; h=From:Sender:Reply-To:Subject:Date:Message-ID:To:Cc:MIME-Version: In-Reply-To:References:Content-Type:Content-Transfer-Encoding: Content-ID:Content-Description; b=h1pp0FzGnzXBysKrq9a2XfE6zOgOU9l2YcQT5auQ9Yu+mNRHl4lQMDrYbKkWgauMLHcIKq geSjDmxYvMMkXGvJd9WmJEvAzSGaQCdsP15fiAWoOAtlX56uyiCUVI1kWVoFnez9cN9JDv 9TGFK7SUCR+0NP7aiBXJby6JxSly99I=;
Received: from [172.22.50.30] ((unknown) [62.232.206.186]) by statler.isode.com (submission channel) via TCP with ESMTPSA id <VznDRwB-mwsh@statler.isode.com>; Mon, 16 May 2016 13:55:35 +0100
X-SMTP-Protocol-Errors: NORDNS PIPELINING
From: Alexey Melnikov <alexey.melnikov@isode.com>
X-Mailer: iPad Mail (13E238)
In-Reply-To: <571B8563.2090508@tzi.org>
Date: Mon, 16 May 2016 14:03:46 +0100
Message-Id: <0E85B4E9-3F6F-43ED-9622-E09DA8782B69@isode.com>
References: <5707D6F8.40000@isode.com> <57186FC3.9010405@tzi.org> <571B6571.5010602@isode.com> <571B8563.2090508@tzi.org>
To: Carsten Bormann <cabo@tzi.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/core/YYQSjB8FlvFDVrX4B_r87sXPJmk>
Cc: core@ietf.org
Subject: Re: [core] Incoming AD review of draft-ietf-core-block-19
X-BeenThere: core@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Constrained RESTful Environments \(CoRE\) Working Group list" <core.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/core>, <mailto:core-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/core/>
List-Post: <mailto:core@ietf.org>
List-Help: <mailto:core-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/core>, <mailto:core-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 16 May 2016 12:55:41 -0000

Hi Carsten,
Continuing the discussion:

> On 23 Apr 2016, at 15:23, Carsten Bormann <cabo@tzi.org> wrote:
> 
> Alexey Melnikov wrote:
>> Hi Carsten,
>> Thank you for your responses. Further discussions below:
>> 
>>> On 21/04/2016 07:14, Carsten Bormann wrote:
>>> Alexey Melnikov wrote:
>>>> Hi,
>>>> I am mostly happy with this document, but I have a few comments/questions:
>>>> 
>>>> On page 11:
>>>> 
>>>>   Clients that want to retrieve
>>>>   all blocks of a resource SHOULD strive to do so without undue delay.
>>>> 
>>>> This is not an interoperability issue and it would be impossible to
>>>> verify compliance with it, unless you have a number that specifies what
>>>> is "undue delay". So I think you shouldn't use RFC 2119 SHOULD here.
>>>> Just use lowercased "should" instead.
>>> Indeed, you cannot measure compliance with this SHOULD.  I still think
>>> that it is important for interoperability to point out that clients will
>>> have more successful exchanges if they heed this.  (From an
>>> interoperability point of view, this is a statement that relieves
>>> servers of a potential onus to somehow cater for clients that don't.)
>>> 
>>>> Similarly, in 2.5:
>>>> 
>>>>   Clients SHOULD strive to send all blocks of a
>>>>   request without undue delay.
>>>> 
>>>> (Similar text in 2.6)
>>> (Ditto.)
>> 
>> I think I prefer to have some recommendations on what is "undue delay",
>> if you can add some text.
> 
> Delay for which there isn't a good reason?
> 
> Another way to say this would be: "Servers will not go out of their way
> to accommodate clients that take forever to finish a block-wise
> transfer.  If the resource changes while this proceeds, the ETag for a
> further block obtained may be different.  To avoid this happening all
> the time for a fast-changing resource, a server MAY try to keep a cache
> around for a specific client for a short amount of time, but the
> lifetime for such a cache may be short, on the order of a few expected
> round-trip times, counting from the previous block transferred."
> 
> Should we go to this level of detail here?

I think your suggested text above is better, so I think the answer is yes.
> 
>>>> In 2.9.2:
>>>> 
>>>> Should probably also mention that this response code is also used for
>>>> mismatching content-format options
>>> That is one way to see this; section 2.3 takes the view that mismatching
>>> content-formats aren't reassembled into one body in the first place so
>>> an incomplete body is the result of not having all parts.
>>> (I added back reference in the editor's draft.)
>> 
>> What is the state of the resource in such condition?
> 
> We didn't make a guarantee here; after all, the client just violated a
> MUST.  A good server will just reject a block-wise transfer with NUM≠0
> and a different content-format than the current state of the resource:
> Either it is stateless, and it matches up the content-format of the
> block against that of the existing resource, or it is atomic, in which
> case it matches up the block against the partially reassembled piece of
> representation that is going to replace the state of the resource.

I think adding such suggestion (about server behaviour) would be a good thing.
> 
>>>> In 2.10:
>>>> 
>>>>   A response with a Block1 Option in control usage
>>>>   with the M bit set invalidates cached responses for the target URI.
>>>> 
>>>> Can you explain the reason for this?
>>> 
>>> If the M bit had not been set the response would have been a final
>>> response and would be used to update the cache entries for this URI.
>>> Now, with the M bit set, we know that there will be a final response
>>> later, but we don't know what that will be.  Continuing to serve a
>>> previous response from the cache doesn't sound right.  But then, it
>>> could be argued that the server just promised to perform the request
>>> atomically later, so nothing has happened yet.  Good question.

I thought keeping cached version until an update is final would be a reasonable implementation choice. But I will not insist on this changing.

>>> 
>>>> In 3.2:
>>>> 
>>>>   A stateless server that simply builds/updates the resource in place
>>>>   (statelessly) may indicate this by not setting the more bit in the
>>>>   response (Figure 8); in this case, the response codes are valid
>>>>   separately for each block being updated.
>>>> 
>>>> What is the behaviour of both the client and the server if PUT on a
>>>> particular block fails? Is this clear enough in the document?
>>> In the stateless case, the resource is now probably broken (unless the
>>> resource is somehow intrinsically robust to this case).  The client
>>> should not be using the resource (e.g., try to initiate a firmware
>>> update from an image it just has been building).  The server is
>>> stateless with respect to individual requests, so it is patiently
>>> sitting there for the broken resource to be mended.
>> 
>> How can a resource be "mended" if a PUT failed? I think it would be
>> reasonable for a server implementation to discard the whole accumulated
>> payload, so there would be no way to mend it other than by uploading the
>> whole thing from the beginning. If my interpretation is invalid, I
>> welcome some clarifications on this.
> 
> If the server is stateless (in-place replace), the failed PUT may have
> had no effect (which should be the case for 4.xx response), so the
> client can try doing something else to that block.  If there was a 5.xx
> response, that is harder to say.  But the real problem is that the
> previous blocks may already have had an effect on the resource, so it
> may be inconsistent/incomplete.

I think this demonstrates my concern. If a PUT fails, the client that did the PUT goes away and another client requests the same resource, what does the server do? Will it serve a corrupted version?

Lack of predictable behaviour in this case bothers me.

> 
>> So I think this needs more clarifying text, either describing what
>> client might be able to do to fix the resource or explaining that the
>> client need to restart upload.
> 
> Right, I'll try to separate out the cases and add some text (but not
> here in the examples).

 [...]
> 
>>>> As block2 is a critical option, how can a server know if a particular
>>>> client supports this option?
>>> The assumption here is that CoAP clients generally do, unless they are
>>> very specialized and never have to deal with non-trivial amounts of
>>> information (such as a /.well-known/core).
>> 
>> Is this generally true for all COAP extensions or just some?
> 
> Just this one.  Block-wise transfers were part of the design for the
> CoAP protocol from the start, and implementers have been aware that they
> had to do block-wise in order to support non-trivial payload sizes (but
> they don't have to do them if they don't need to).
> 
>> Extensibility for most protocols is done by capability
>> discovery/negotiation and graceful service degradation in absence of a
>> particular extension. This seems to violate this principle.
> 
> Right.  So, for instance Observe was designed so that it can be
> gracefully ignored by a server that doesn't implement it.  We still put
> a mechanism in RFC 6690 so a server can signal that it offers Observe
> for a resource.  I would expect similar out of band information to be
> provided for future extensions, so a client doesn't have to waste a
> round-trip trying out the extension.  Block is slightly weird in that a
> server may need to offer the (critical) extension unsolicited for an
> unextended request; we'd try to avoid that for any new extension, but
> here we do have the luxury.

Because Block is special, I think this needs extra text early in the document.

Also the document should have Update value for the base COAP RFC in the header to signal this.