Re: [icnrg] Comments on draft-oran-icnrg-flowbalance-02.txt

"David R. Oran" <daveoran@orandom.net> Sun, 09 February 2020 14:40 UTC

Return-Path: <daveoran@orandom.net>
X-Original-To: icnrg@ietfa.amsl.com
Delivered-To: icnrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id ACCFD12004A for <icnrg@ietfa.amsl.com>; Sun, 9 Feb 2020 06:40:24 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.9
X-Spam-Level:
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id csSPQP10KuUV for <icnrg@ietfa.amsl.com>; Sun, 9 Feb 2020 06:40:22 -0800 (PST)
Received: from spark.crystalorb.net (spark.crystalorb.net [IPv6:2607:fca8:1530::c]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id DCFEA12001B for <icnrg@irtf.org>; Sun, 9 Feb 2020 06:40:22 -0800 (PST)
Received: from [192.168.15.102] ([IPv6:2601:184:407f:80ce:314c:6cb:6c24:63fe]) (authenticated bits=0) by spark.crystalorb.net (8.14.4/8.14.4/Debian-4+deb7u1) with ESMTP id 019EeEUt007114 (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256 verify=NO); Sun, 9 Feb 2020 06:40:16 -0800
From: "David R. Oran" <daveoran@orandom.net>
To: Ken Calvert <calvert@netlab.uky.edu>
Cc: ICNRG <icnrg@irtf.org>
Date: Sun, 09 Feb 2020 09:40:09 -0500
Subject: Re: [icnrg] Comments on draft-oran-icnrg-flowbalance-02.txt

Ken,

Thanks for these insightful comments; I find them really helpful. 
I’m going to wait a bit before re-spinning the draft so I can gather 
comments from other folks, but in the meantime here are some responses 
and thoughts based on your reading of the draft.

On 5 Feb 2020, at 15:46, Ken Calvert wrote:

> Hi Dave -
>
> I've just read your interesting flow balance draft. (I am back at 
> Kentucky full-time now, trying to get up to speed on ICN 
> developments.)
>
> Some comments follow.  Caveat: I'm not a congestion control guy, so I 
> don't really have a good feel for the importance of the problem being 
> addressed, and everything I say should be discounted appropriately.  
> Also, feel free to respond on the list or not, whatever you think is 
> best.
>
The usual modesty from somebody who knows what they’re talking about! 
:-)

> High-level comments/questions:
>
> The question occurred to me whether the problem is significant enough 
> to merit the additional complexity in the forwarder and the protocol, 
> especially adding another piece of information to be included in 
> Interests, the veracity of which has to be monitored/compensated.  In 
> particular:  the context here, as I understand it, is a CC scheme 
> where downstream BW is allocated/reserved on a byte basis.  Presumably 
> that has to somehow be related to the passage of time to get a rate.

Yes. Since link capacity is ultimately consumed in bytes, you need to 
allocate link resources in bytes, accounting for edge conditions like 
minimum packet sizes, header overhead, inter-packet gaps, etc. 
Rate-based congestion control schemes directly include time in the 
calculation, while window-based schemes look only at the outstanding 
byte count relative to the available buffer capacity. The latter is 
then estimated using some heuristic, like the square-root limit, which 
assumes you need a bandwidth-delay product (BDP) of buffering, with 
delay supplying the time divisor.
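
To make that concrete, here is a tiny sketch (my own illustration, not 
from the draft) of the byte/time relationship in the two styles; the 
square-root rule here assumes desynchronized flows sharing the buffer:

    from math import sqrt

    def rate_allocation_bytes(link_rate_bps, interval_s):
        # Rate-based: bytes a forwarder may admit per accounting interval.
        return (link_rate_bps / 8) * interval_s

    def window_limit_bytes(link_rate_bps, rtt_s):
        # Window-based: outstanding bytes bounded by the bandwidth-delay
        # product (BDP); the RTT supplies the time divisor.
        return (link_rate_bps / 8) * rtt_s

    def buffer_bytes(link_rate_bps, rtt_s, n_flows):
        # Square-root heuristic: a full BDP of buffering shared by n
        # desynchronized flows needs roughly BDP / sqrt(n) bytes.
        return window_limit_bytes(link_rate_bps, rtt_s) / sqrt(n_flows)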

> So is there an implicit assumption that the forwarder is adjusting 
> reservations to account for time passing?

In rate-based schemes, yes, this is done directly. Another complicating 
factor (pointed out in an earlier set of comments by Klaus Schneider) is 
that many link types themselves have varying bandwidth (e.g. wireless or 
tunnels), so the divisor of link capacity is not a constant either. This 
all essentially devolves to guessing, with guesses either being 
optimistic, with mis-estimation ameliorated by some kind of active queue 
management (AQM), or pessimistic, assuming worst-case RTTs if you need 
to meet hard service guarantees (with the concomitant loss of capacity).

> In the middle of the network a node doesn't know how long it will take 
> to get the data back (beyond the upper bound provided by the PIT 
> timeout).

Correct, which is why in separate papers/drafts, I and others argue for 
making Interest lifetimes be on the scale of measured RTTs rather than 
application response times.

> I wonder (N.B. I haven't done any calculations, though it should be 
> straightforward to do some BoE estimation to convert between size and 
> delay variation for different link speeds) how inaccuracy of rate 
> estimation due to the dynamic range of the Data size compares to other 
> sources of inaccuracy due to unknowable aspects like upstream path 
> length and losses?  I'm guessing this is old news, but maybe it's 
> worth saying something about it explicitly.
>

This is a really good idea to explore further. Let me think on it. I 
will observe, however, that knowing the data sizes more-or-less 
accurately can only make things better. You are right to question 
whether the added complexity makes enough of a difference given the 
other error sources.

Perhaps it would make sense for the draft to be more explicit in showing 
that once you have a congestion control algorithm that counts 
outstanding Interests and keeps that state (as a number of the cited 
schemes do), the additional overhead of parsing one number and 
adding/subtracting it, rather than adding/subtracting 1, is minimal. If 
you have just a simple AQM scheme that doesn’t track Interests, you of 
course would not bother counting bytes either.
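
As a hedged illustration (the class and field names are hypothetical, 
not from the draft or any forwarder code base), the per-face accounting 
changes only in what gets added and subtracted:

    class FaceAccounting:
        def __init__(self, limit_bytes):
            self.limit_bytes = limit_bytes
            self.outstanding = 0   # bytes of Data still expected back

        def admit_interest(self, declared_size):
            # A packet-counting scheme would add 1 here; using the size
            # declared in the Interest is one extra parse and add.
            if self.outstanding + declared_size > self.limit_bytes:
                return False       # reject rather than over-commit
            self.outstanding += declared_size
            return True

        def data_returned(self, declared_size):
            self.outstanding -= declared_size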

(Aside: one of the arguably big advantages of CCNx/NDN hop-by-hop state 
is that you can do hop-by-hop congestion control. Simple AQM on 
returning Data negates nearly all of that advantage, since you end up 
falling back on end-to-end machinery and timeouts when you throw Data 
away. It’s not quite so bad as TCP, since you can cache the Data 
upstream of the bottleneck where AQM dropped it, but you still have a 
pretty severe timeout penalty compared to rejecting the Interest with 
explicit feedback, which allows the consumer to immediately retry after 
adjusting its rate/window.)
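
A consumer-side sketch of that difference (my illustration; 
issue_interest() and window_state are hypothetical stand-ins):

    import time

    def fetch(issue_interest, window_state, rto_s):
        result = issue_interest()
        if result == "interest_return":
            window_state.shrink()    # explicit feedback: adjust...
            return issue_interest()  # ...and reissue immediately
        if result is None:           # Data dropped by AQM upstream
            time.sleep(rto_s)        # pay the full retransmission timeout
            window_state.shrink()
            return issue_interest()
        return result                # Data arrived normally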

> Also, you are essentially proposing a network-standard maximum "chunk 
> size" here (64KB), as a by-product of specifying a maximum dynamic 
> range. Seems like that might merit a discussion all its own, although 
> you do a good job motivating it. (Or maybe that discussion has already 
> happened.)
>

Well, I’m only taking the number that is allowed by CCNx, which is 
considerably larger than the current compiled-in 4K maximum in the NDN 
code base. Some folks, like Lixia, have argued that even larger chunk 
sizes might make sense and should be allowed, but in a completely 
unrelated thread I argued that history shows the likelihood of this 
being a good tradeoff is essentially zero.

> Specific comments:
>
> Introduction - p. 3 - Path MTU/"Link PMTU" - what's the meaning of 
> that term in this context?  The traditional interpretation doesn't 
> seem to apply.  Since you assume the use of a fragmentation protocol, 
> seems like you could just say "MTU".
>

I suppose so, but the fragmentation scheme is either a different one, or 
possibly a single scheme operating differently, for the single-link case 
(for which a hop-by-hop-only fragmentation scheme is adequate) versus 
the Path MTU case. If you know the Path MTU, either a priori or by 
measurement, a multi-hop scheme with cut-through like the one cited 
[Ghali2013] has substantial advantages. I didn’t want to make this 
specification about fragmentation if possible, since the flow balance 
issues are independent, aside from the slight bandwidth efficiency lost 
through a higher ratio of header size to data size after fragmentation.

I think it makes sense for the reader to have the distinction clear. If 
it still isn’t, I’ll try to come up with some additional explanation 
that doesn’t wander too far off-topic.

> Section 3. - Last para on p.4 - "by accepting one Interest packet from 
> an [sic] downstream node, implicitly this provides a guarantee (either 
> hard or soft) that there is sufficient bandwidth on the inverse 
> direction of the link to send back one Data packet."  - What does it 
> mean to "accept" an incoming Interest packet?  It seems to me that 
> saying instead "...by forwarding one Interest packet..." would be more 
> precise and accurate.

I’ll change this, since the corner cases of accepting an Interest and 
putting it in the transmission queue, as opposed to actually succeeding 
at forwarding it, are not relevant to the discussion.

> Also, in this context, where you are assuming fragmentation for larger 
> objects, the meaning of "Data packet" is ambiguous.  Suggest "Data 
> message", which I think you've used elsewhere in the document.
>
Yes, good change.

> Second paragraph on p.5 - "This allows...to accurately allocate 
> bandwidth on the inverse path for the returning Data message."  
> Suggest "to more accurately allocate" - as noted above, there are 
> other sources of inaccuracy.
>
Yes. Good change.

> Aside at the end of Section 3.1 - can you cite one of the 
> existing/proposed fragmentation protocols as a "well-designed" example 
> that satisfies your requirement?
>

A bit self-serving, since I’m a co-author, but the already-cited 
[Ghali2013] is multi-hop with cut-through, and has good security 
properties to boot! I’ll include the cross-reference here to help the 
reader.

> First full para on p.7 - T_MTU_TOO_LARGE - I am confused by this name. 
>  The error being indicated doesn't seem to have anything to do with 
> MTU. If I understand rightly, the meaning is that the returned Data 
> object is larger than the *estimated* size that was in the Interest 
> (or the size used in allocating BW).  In any case, it's not the MTU 
> that's too big!
>

Well, T_MTU_TOO_LARGE is an existing CCNx error code, and I hoped the 
text was clear that it would be wrong to hijack this existing error code 
for the case cited. I’ll make it clear that I’m not inventing a new 
error. I don’t want to revisit the names of existing error codes; I 
think the intended reading of this one is “Packet is too large to fit 
in the MTU”.
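
For concreteness, a hypothetical sketch of the check being discussed; 
the error name below is a placeholder, deliberately not the existing 
T_MTU_TOO_LARGE:

    ERR_SIZE_EXCEEDS_DECLARED = "placeholder"  # not a real CCNx code

    def check_returning_data(pit_declared_size, actual_size):
        # Bandwidth on the inverse path was allocated against the size
        # declared in the Interest, so a larger Data message cannot be
        # honored and triggers an Interest Return instead.
        if actual_size > pit_declared_size:
            return ("interest-return", ERR_SIZE_EXCEEDS_DECLARED)
        return ("forward-data", None)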

> Same paragraph - "When the Interest Return eventually arrives back to 
> the issuer of the Interest, the user can, [insert: if] they desire, 
> reissue the Interest..."
> Better(?): "the user MAY reissue the Interest..." - since you included 
> the requirements language paragraph.
>

Yes, I like the requirements language formulation. Thanks.

> Last paragraph of Section 3.3 Handling 'too small' cases - this seems 
> like it should be the first paragraph of Section 3.4.
>

Yes. Good catch. Will fix.

> Section 3.4 - Last paragraph on p. 8 - missing closing paren after 
> "Data".
>

Will fix.

> Section 3.5 - I have trouble parsing the first sentence.  Is there a 
> missing word somewhere?

Gack. How’d I let that slip by? Probably the ugliness of calling a 
hop-by-hop option “optional” and then deleting the redundancy. 
I’ll rewrite the sentence.

>
> Section 4, second paragraph - "This is at most a minor concern given 
> the above discussion of overestimation by honest clients."  Suggest a 
> more specific pointer than "above" to where you make the argument 
> (i.e., Section 3.2).

I’ll put in an explicit cross-reference.

> More substantively, you are claiming that accurate size estimates are 
> important enough that we need to add a fair amount of complexity to 
> the network infrastructure.  This seems to be arguing the opposite 
> position!?  If it is a minor concern to mitigate fake estimates that 
> are off by the whole dynamic range of the parameter, then one might 
> ask why do we need the T_DATASIZE facility at all?
>

I’m trying to get the point across that if you handle mistakes by 
honest clients reasonably, the malicious clients are adequately handled 
as well; it’s not that mis-estimation in general is an unimportant 
problem. I’ll try to clarify.

> Cheers,
>
> Ken

DaveO