Re: [core] Incoming AD review of draft-ietf-core-block-19

Alexey Melnikov <alexey.melnikov@isode.com> Mon, 16 May 2016 12:55 UTC

From: Alexey Melnikov <alexey.melnikov@isode.com>
In-Reply-To: <571B8563.2090508@tzi.org>
Date: Mon, 16 May 2016 14:03:46 +0100
Message-Id: <0E85B4E9-3F6F-43ED-9622-E09DA8782B69@isode.com>
References: <5707D6F8.40000@isode.com> <57186FC3.9010405@tzi.org> <571B6571.5010602@isode.com> <571B8563.2090508@tzi.org>
To: Carsten Bormann <cabo@tzi.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/core/YYQSjB8FlvFDVrX4B_r87sXPJmk>
Cc: core@ietf.org
Subject: Re: [core] Incoming AD review of draft-ietf-core-block-19
Precedence: list

Hi Carsten,
Continuing the discussion:

> On 23 Apr 2016, at 15:23, Carsten Bormann <cabo@tzi.org> wrote:
> 
> Alexey Melnikov wrote:
>> Hi Carsten,
>> Thank you for your responses. Further discussions below:
>> 
>>> On 21/04/2016 07:14, Carsten Bormann wrote:
>>> Alexey Melnikov wrote:
>>>> Hi,
>>>> I am mostly happy with this document, but I have a few comments/questions:
>>>> 
>>>> On page 11:
>>>> 
>>>>   Clients that want to retrieve
>>>>   all blocks of a resource SHOULD strive to do so without undue delay.
>>>> 
>>>> This is not an interoperability issue and it would be impossible to
>>>> verify compliance with it, unless you have a number that specifies what
>>>> is "undue delay". So I think you shouldn't use RFC 2119 SHOULD here.
>>>> Just use lowercased "should" instead.
>>> Indeed, you cannot measure compliance with this SHOULD.  I still think
>>> that it is important for interoperability to point out that clients will
>>> have more successful exchanges if they heed this.  (From an
>>> interoperability point of view, this is a statement that relieves
>>> servers of a potential onus to somehow cater for clients that don't.)
>>> 
>>>> Similarly, in 2.5:
>>>> 
>>>>   Clients SHOULD strive to send all blocks of a
>>>>   request without undue delay.
>>>> 
>>>> (Similar text in 2.6)
>>> (Ditto.)
>> 
>> I think I prefer to have some recommendations on what is "undue delay",
>> if you can add some text.
> 
> Delay for which there isn't a good reason?
> 
> Another way to say this would be: "Servers will not go out of their way
> to accommodate clients that take forever to finish a block-wise
> transfer.  If the resource changes while this proceeds, the ETag for a
> further block obtained may be different.  To avoid this happening all
> the time for a fast-changing resource, a server MAY try to keep a cache
> around for a specific client for a short amount of time, but the
> lifetime for such a cache may be short, on the order of a few expected
> round-trip times, counting from the previous block transferred."
> 
> Should we go to this level of detail here?

I think your suggested text above is better, so I think the answer is yes.
> 
>>>> In 2.9.2:
>>>> 
>>>> Should probably also mention that this response code is also used for
>>>> mismatching content-format options
>>> That is one way to see this; section 2.3 takes the view that mismatching
>>> content-formats aren't reassembled into one body in the first place so
>>> an incomplete body is the result of not having all parts.
>>> (I added back reference in the editor's draft.)
>> 
>> What is the state of the resource in such condition?
> 
> We didn't make a guarantee here; after all, the client just violated a
> MUST.  A good server will just reject a block-wise transfer with NUM≠0
> and a different content-format than the current state of the resource:
> Either it is stateless, and it matches up the content-format of the
> block against that of the existing resource, or it is atomic, in which
> case it matches up the block against the partially reassembled piece of
> representation that is going to replace the state of the resource.

I think adding such suggestion (about server behaviour) would be a good thing.
> 
>>>> In 2.10:
>>>> 
>>>>   A response with a Block1 Option in control usage
>>>>   with the M bit set invalidates cached responses for the target URI.
>>>> 
>>>> Can you explain the reason for this?
>>> 
>>> If the M bit had not been set the response would have been a final
>>> response and would be used to update the cache entries for this URI.
>>> Now, with the M bit set, we know that there will be a final response
>>> later, but we don't know what that will be.  Continuing to serve a
>>> previous response from the cache doesn't sound right.  But then, it
>>> could be argued that the server just promised to perform the request
>>> atomically later, so nothing has happened yet.  Good question.

I thought keeping cached version until an update is final would be a reasonable implementation choice. But I will not insist on this changing.

>>> 
>>>> In 3.2:
>>>> 
>>>>   A stateless server that simply builds/updates the resource in place
>>>>   (statelessly) may indicate this by not setting the more bit in the
>>>>   response (Figure 8); in this case, the response codes are valid
>>>>   separately for each block being updated.
>>>> 
>>>> What is the behaviour of both the client and the server if PUT on a
>>>> particular block fails? Is this clear enough in the document?
>>> In the stateless case, the resource is now probably broken (unless the
>>> resource is somehow intrinsically robust to this case).  The client
>>> should not be using the resource (e.g., try to initiate a firmware
>>> update from an image it just has been building).  The server is
>>> stateless with respect to individual requests, so it is patiently
>>> sitting there for the broken resource to be mended.
>> 
>> How can a resource be "mended" if a PUT failed? I think it would be
>> reasonable for a server implementation to discard the whole accumulated
>> payload, so there would be no way to mend it other than by uploading the
>> whole thing from the beginning. If my interpretation is invalid, I
>> welcome some clarifications on this.
> 
> If the server is stateless (in-place replace), the failed PUT may have
> had no effect (which should be the case for 4.xx response), so the
> client can try doing something else to that block.  If there was a 5.xx
> response, that is harder to say.  But the real problem is that the
> previous blocks may already have had an effect on the resource, so it
> may be inconsistent/incomplete.

I think this demonstrates my concern. If a PUT fails, the client that did the PUT goes away and another client requests the same resource, what does the server do? Will it serve a corrupted version?

Lack of predictable behaviour in this case bothers me.

> 
>> So I think this needs more clarifying text, either describing what
>> client might be able to do to fix the resource or explaining that the
>> client need to restart upload.
> 
> Right, I'll try to separate out the cases and add some text (but not
> here in the examples).

 [...]
> 
>>>> As block2 is a critical option, how can a server know if a particular
>>>> client supports this option?
>>> The assumption here is that CoAP clients generally do, unless they are
>>> very specialized and never have to deal with non-trivial amounts of
>>> information (such as a /.well-known/core).
>> 
>> Is this generally true for all COAP extensions or just some?
> 
> Just this one.  Block-wise transfers were part of the design for the
> CoAP protocol from the start, and implementers have been aware that they
> had to do block-wise in order to support non-trivial payload sizes (but
> they don't have to do them if they don't need to).
> 
>> Extensibility for most protocols is done by capability
>> discovery/negotiation and graceful service degradation in absence of a
>> particular extension. This seems to violate this principle.
> 
> Right.  So, for instance Observe was designed so that it can be
> gracefully ignored by a server that doesn't implement it.  We still put
> a mechanism in RFC 6690 so a server can signal that it offers Observe
> for a resource.  I would expect similar out of band information to be
> provided for future extensions, so a client doesn't have to waste a
> round-trip trying out the extension.  Block is slightly weird in that a
> server may need to offer the (critical) extension unsolicited for an
> unextended request; we'd try to avoid that for any new extension, but
> here we do have the luxury.

Because Block is special, I think this needs extra text early in the document.

Also the document should have Update value for the base COAP RFC in the header to signal this.

[core] Incoming AD review of draft-ietf-core-bloc… Alexey Melnikov
Re: [core] Incoming AD review of draft-ietf-core-… Carsten Bormann
Re: [core] Incoming AD review of draft-ietf-core-… Alexey Melnikov
Re: [core] Incoming AD review of draft-ietf-core-… Carsten Bormann
Re: [core] Incoming AD review of draft-ietf-core-… Alexey Melnikov