Re: [core] Incoming AD review of draft-ietf-core-block-19

Carsten Bormann <cabo@tzi.org> Sat, 23 April 2016 14:23 UTC

Message-ID: <571B8563.2090508@tzi.org>
Date: Sat, 23 Apr 2016 16:23:31 +0200
From: Carsten Bormann <cabo@tzi.org>
To: Alexey Melnikov <alexey.melnikov@isode.com>
In-Reply-To: <571B6571.5010602@isode.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/core/3L6RbzGFd6WAbk0GA0PH03eyfRA>
Cc: core@ietf.org
Subject: Re: [core] Incoming AD review of draft-ietf-core-block-19

Alexey Melnikov wrote:
> Hi Carsten,
> Thank you for your responses. Further discussions below:
> 
> On 21/04/2016 07:14, Carsten Bormann wrote:
>> Alexey Melnikov wrote:
>>> Hi,
>>> I am mostly happy with this document, but I have a few comments/questions:
>>>
>>> On page 11:
>>>
>>>    Clients that want to retrieve
>>>    all blocks of a resource SHOULD strive to do so without undue delay.
>>>
>>> This is not an interoperability issue and it would be impossible to
>>> verify compliance with it, unless you have a number that specifies what
>>> is "undue delay". So I think you shouldn't use RFC 2119 SHOULD here.
>>> Just use lowercased "should" instead.
>> Indeed, you cannot measure compliance with this SHOULD.  I still think
>> that it is important for interoperability to point out that clients will
>> have more successful exchanges if they heed this.  (From an
>> interoperability point of view, this is a statement that relieves
>> servers of a potential onus to somehow cater for clients that don't.)
>>
>>> Similarly, in 2.5:
>>>
>>>    Clients SHOULD strive to send all blocks of a
>>>    request without undue delay.
>>>
>>> (Similar text in 2.6)
>> (Ditto.)
> 
> I think I prefer to have some recommendations on what is "undue delay",
> if you can add some text.

Delay for which there isn't a good reason?

Another way to say this would be: "Servers will not go out of their way
to accommodate clients that take forever to finish a block-wise
transfer.  If the resource changes while the transfer proceeds, the
ETag obtained with a further block may differ.  To keep this from
happening all the time for a fast-changing resource, a server MAY keep
a cache around for a specific client, but the lifetime of such a cache
may be short, on the order of a few expected round-trip times, counting
from the previous block transferred."

Should we go to this level of detail here?
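
To make it concrete, here is a rough sketch (Python; all names and the
specific lifetime are mine, just for illustration, not proposed text)
of such a short-lived per-client snapshot:

    import time

    # Per-client snapshot of a representation, kept only for a short time,
    # on the order of a few expected round-trip times (the 8 seconds here
    # is an arbitrary example value).
    SNAPSHOT_LIFETIME = 8.0

    class SnapshotCache:
        def __init__(self):
            self._entries = {}   # (client, uri) -> (etag, body, last_access)

        def store(self, client, uri, etag, body):
            self._entries[(client, uri)] = (etag, body, time.monotonic())

        def get_block(self, client, uri, num, size):
            entry = self._entries.get((client, uri))
            if entry is None:
                return None            # no snapshot; serve the current state
            etag, body, last = entry
            if time.monotonic() - last > SNAPSHOT_LIFETIME:
                # The client took too long between blocks; drop the snapshot.
                del self._entries[(client, uri)]
                return None
            self._entries[(client, uri)] = (etag, body, time.monotonic())
            return etag, body[num * size:(num + 1) * size]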

>>> In 2.9.2:
>>>
>>> Should probably also mention that this response code is also used for
>>>  mismatching content-format options
>> That is one way to see this; section 2.3 takes the view that mismatching
>> content-formats aren't reassembled into one body in the first place, so
>> an incomplete body is the result of not having all parts.
>> (I added a back-reference in the editor's draft.)
> 
> What is the state of the resource in such condition?

We didn't make a guarantee here; after all, the client just violated a
MUST.  A good server will simply reject a block-wise transfer with
NUM≠0 and a content-format different from that of the current state of
the resource: either the server is stateless, and it matches the
content-format of the block against that of the existing resource, or
it is atomic, in which case it matches the block against the partially
reassembled representation that is going to replace the state of the
resource.
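
In code terms, the distinction might look roughly like this (Python;
purely illustrative, the names and the bare "reject" result are my
shorthand, and 2.9.2 is where the actual error response is discussed):

    # e.g. resource = {"content_format": 50, "body": bytearray()}
    #      partial  = {"blocks": {}, "content_format": None}

    def handle_block1_put(resource, partial, num, szx, more,
                          content_format, payload, stateless):
        block_size = 1 << (szx + 4)      # SZX encoding: 16..1024 bytes
        if stateless:
            # In-place update: the block's content-format has to match
            # that of the existing resource; each block stands on its own.
            if content_format != resource["content_format"]:
                return "reject"
            offset = num * block_size
            body = resource["body"]
            body.extend(b"\x00" * max(0, offset - len(body)))
            body[offset:offset + len(payload)] = payload
            return "done"
        # Atomic update: the block has to match the partially reassembled
        # representation that will eventually replace the resource.
        if partial["blocks"] and content_format != partial["content_format"]:
            return "reject"
        partial["content_format"] = content_format
        partial["blocks"][num] = payload
        return "continue" if more else "done"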

>>> In 2.10:
>>>
>>>    A response with a Block1 Option in control usage
>>>    with the M bit set invalidates cached responses for the target URI.
>>>
>>> Can you explain the reason for this?
>>
>> If the M bit had not been set the response would have been a final
>> response and would be used to update the cache entries for this URI.
>> Now, with the M bit set, we know that there will be a final response
>> later, but we don't know what that will be.  Continuing to serve a
>> previous response from the cache doesn't sound right.  But then, it
>> could be argued that the server just promised to perform the request
>> atomically later, so nothing has happened yet.  Good question.
>>
>>> In 3.2:
>>>
>>>    A stateless server that simply builds/updates the resource in place
>>>    (statelessly) may indicate this by not setting the more bit in the
>>>    response (Figure 8); in this case, the response codes are valid
>>>    separately for each block being updated.
>>>
>>> What is the behaviour of both the client and the server if PUT on a
>>> particular block fails? Is this clear enough in the document?
>> In the stateless case, the resource is now probably broken (unless the
>> resource is somehow intrinsically robust to this case).  The client
>> should not be using the resource (e.g., to initiate a firmware update
>> from an image it has just been building).  The server is stateless
>> with respect to individual requests, so it patiently sits there,
>> waiting for the broken resource to be mended.
> 
> How can a resource be "mended" if a PUT failed? I think it would be
> reasonable for a server implementation to discard the whole accumulated
> payload, so there would be no way to mend it other than by uploading the
> whole thing from the beginning. If my interpretation is invalid, I
> welcome some clarifications on this.

If the server is stateless (in-place replacement), the failed PUT may
have had no effect (which should be the case for a 4.xx response), so
the client can try doing something else with that block.  If there was
a 5.xx response, that is harder to tell.  But the real problem is that
the previous blocks may already have had an effect on the resource, so
it may now be inconsistent or incomplete.
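
One conservative client-side policy, as a sketch (Python;
coap_put_block is a placeholder for whatever block-wise PUT primitive
the client's CoAP library offers, and nothing here is mandated by the
draft):

    def upload_blockwise(coap_put_block, blocks):
        """Send all blocks; on any failure, restart the whole transfer."""
        for num, payload in enumerate(blocks):
            more = num < len(blocks) - 1
            code = coap_put_block(num, payload, more)
            if code.startswith("2."):
                continue              # this block was accepted/applied
            # 4.xx: this block probably had no effect, but earlier blocks
            # did, so the resource may be inconsistent; 5.xx: unclear
            # either way.  The safe recovery in both cases is to fix the
            # request and restart from block 0.
            return "restart"
        return "done"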

> So I think this needs more clarifying text, either describing what a
> client might be able to do to fix the resource or explaining that the
> client needs to restart the upload.

Right, I'll try to separate out the cases and add some text (but not
here in the examples).

>>> Other questions I have after reviewing the document. If I missed where
>>> they are answered, please point me out to existing text in the document
>>> or another RFC:
>>>
>>> Is there a special error for block size mismatch between multiple requests?
>> A block size mismatch is not an error (maybe I don't understand the
>> question).
> 
> There are MUSTs in the document saying that if one end signalled a
> certain block size, the other party needs to use the same or smaller
> size. What happens if the other party doesn't obey this rule? Is there a
> special error code that can be used to signal that a request is rejected
> because the specified block size is too big?

Well, one problem a client gets if it *increases* the block size is
that it probably can't hit the place where it was (which, at least
initially, is an odd block number at the smaller size, so it is not an
integer block number at the larger size).  Say a client asks for 128
for block 0, but the server only sends 64; then asking for 128 for
block 1 is going to leave a hole (bytes 64 to 127).  Now, if the client
instead asks for 64 for block 1 and then goes back to 128 for block 1
(!), the parts do fit together, but I'd expect the server to be
stubborn and again send 64 for block 2 (!).  This is the Block2 case.
For Block1, we have 4.13 (Request Entity Too Large).
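
In byte offsets, the example works out like this (a quick sketch; block
size is 2**(SZX+4) per the option encoding):

    def byte_range(num, size):
        """First and last byte covered by block NUM at a given block size."""
        return num * size, num * size + size - 1

    print(byte_range(0, 64))    # (0, 63):    block 0 as the server sent it
    print(byte_range(1, 128))   # (128, 255): 128 for block 1 leaves a hole
                                #             at bytes 64..127
    print(byte_range(1, 64))    # (64, 127):  64 for block 1 fills the gap
    print(byte_range(1, 128))   # (128, 255): going back to 128 for "block 1"
                                #             continues at byte 128; a stubborn
                                #             server answers with block 2 at
                                #             size 64, i.e. bytes 128..191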

>>> As block2 is a critical option, how can a server know if a particular
>>> client supports this option?
>> The assumption here is that CoAP clients generally do, unless they are
>> very specialized and never have to deal with non-trivial amounts of
>> information (such as a /.well-known/core).
> 
> Is this generally true for all CoAP extensions or just some?

Just this one.  Block-wise transfers were part of the design of the
CoAP protocol from the start, and implementers have been aware that
they need to implement block-wise transfers in order to support
non-trivial payload sizes (but they don't have to implement them if
they don't need them).

> Extensibility for most protocols is done by capability
> discovery/negotiation and graceful service degradation in the absence
> of a particular extension. This seems to violate that principle.

Right.  So, for instance, Observe was designed so that it can be
gracefully ignored by a server that doesn't implement it.  We still put
a mechanism in RFC 6690 so a server can signal that it offers Observe
for a resource.  I would expect similar out-of-band information to be
provided for future extensions, so a client doesn't have to waste a
round-trip trying out the extension.  Block is slightly weird in that a
server may need to offer the (critical) extension unsolicited in
response to an unextended request; we'd try to avoid that for any new
extension, but here we do have that luxury, since CoAP clients
generally implement Block anyway.
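
For instance, a server could list something like this (made-up
resource) in its /.well-known/core, so a client learns about
observability before even trying:

    </sensors/temp>;rt="temperature";obs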

Regards, Carsten