Re: Last Call: draft-ietf-rmt-bb-norm-revised (Multicast Negative-Acknowledgment (NACK) Building Blocks) to Proposed Standard

Brian Adamson <adamson@itd.nrl.navy.mil> Mon, 14 April 2008 17:13 UTC

Message-Id: <p06240808c425883a5645@[132.250.92.151]>
In-Reply-To: <alpine.LRH.1.10.0804071058380.23953@netcore.fi>
References: <20080403140021.9211F28C5C9@core3.amsl.com> <alpine.LRH.1.10.0804071058380.23953@netcore.fi>
Date: Fri, 11 Apr 2008 18:22:17 -0400
To: Pekka Savola <pekkas@netcore.fi>, ietf@ietf.org
From: Brian Adamson <adamson@itd.nrl.navy.mil>
Subject: Re: Last Call: draft-ietf-rmt-bb-norm-revised (Multicast Negative-Acknowledgment (NACK) Building Blocks) to Proposed Standard
Cc: adamson@itd.nrl.navy.mil, macker@itd.nrl.navy.mil, rmt@ietf.org

Pekka,

I appreciate your comments here.  I plan to issue a new version of 
the draft that addresses these to the extent I can.  I have some 
questions about your concerns with comments in-line below:

At 11:01 AM +0300 4/7/08, Pekka Savola wrote:
>On Thu, 3 Apr 2008, The IESG wrote:
>>The IESG has received a request from the Reliable Multicast Transport WG
>>(rmt) to consider the following document:
>>
>>- 'Multicast Negative-Acknowledgment (NACK) Building Blocks '
>>   <draft-ietf-rmt-bb-norm-revised-04.txt> as a Proposed Standard
>>
>>The IESG plans to make a decision in the next few weeks, and solicits
>>final comments on this action.  Please send substantive comments to the
>>ietf@ietf.org mailing lists by 2008-04-17. Exceptionally,
>>comments may be sent to iesg@ietf.org instead. In either case, please
>>retain the beginning of the Subject line to allow automated sorting.
>
>Meta-level comments
>-------------------
>
>Looking at the document, my main question is, "is this ripe for standards
>track?".  Looking at it, my inclination would be to say "probably not, at
>least in parts *)".  As Section 6 says, there have not been 
>substantial changes
>since the preceding experimental RFC 3941 was published in 2004.  All the
>cited material (research etc.) predates RFC 3941.  So it seems that either
>1) there has not been significant experience since the Experimental document
>was published, 2) the experiences have been fully aligned with the earlier
>document ("the document was already good enough"), or 3) the lessons learned
>have not been reflected in this document revision.


I will see what updated references I can find.  As you mention later 
below, there are definitely updated references on SSM that would be 
better!

I think one aspect here is the fact that this is one of our RMT 
"Building Block" documents and the _general_ techniques that it 
describes with respect to NACK-based reliable multicast protocol 
design have stood the test of time, including some that predate the 
Experimental (RFC 3941) specification.  There have been some 
long-term deployments of these protocols, one of which I mention later 
below.  I think it is accepted within the RMT community that these 
are mature techniques.  I think your #2 above "the experiences have 
been fully aligned with the earlier document" is the case.  We 
actually had some considerable history with these types of protocols 
even prior to Experimental RFC publication.


>
>The one thing I'd have been interested in seeing is an applicability
>statement of reliable multicast and its different bits and pieces (beyond
>what's in Section 3.11) but that seems out of scope of this document.
>For example, it is not obvious to me which (if any) RMT mechanisms would be
>applicable in a context where I want to distribute video or voice where it
>isn't acceptable to buffer the stream too long to accommodate for data
>resends; it seems this NACK mechanism is geared towards bulk file transfer
>where this is not applicable.
>
>*) the parts I'm mostly concerned with are router assistance and 
>security (also touching the protocol/ops aspects when some receivers 
>are misconfigured or behind slow links).


The _focus_ of the current RMT protocols was purposefully scoped to 
address "bulk transfer".  I think this is described in RFC 3048 
(which this proposed document _should_ reference but fails to). 
While "bulk transfer" was the focus here, the NACK-Oriented Reliable 
Multicast (NORM) protocol (RFC 3940, a "Protocol Instantiation" that 
was derived from the earlier version of this "Building Block" 
document) does, in fact, provide _optional_ "stream" support that we 
have used for video and voice streams.  This feature was made 
"optional" so that one could have a compliant implementation that 
solely provided "bulk transfer" capability.


I agree that "router assistance" has not been followed through.  It 
was originally part of the RMT WG charter but we were unable to 
sustain activity in that area.

My _personal_ opinion is that, since this is a "Building Block" 
document, it would have been incomplete without some mention of the 
_potential_ of intermediate system "assistance" (along the lines that 
were discussed in the working group at one point) to improve 
scalability and/or performance of NACK-based reliable multicast.  In 
fact, I have some work in progress in the context of wireless 
networks to re-examine what sort of "assistance" to end-to-end 
reliable flows intermediate systems may be able to provide.  But 
this is certainly not a "fully-baked" area.  So this discussion 
_could_ be removed ... I suppose it will persist in RFC 3941 for 
historical purposes until there is further interest in the area?

Similarly, since this is a "Building Block" document, it does not do 
an extremely deep dive on security solutions.  We have another 
document in development within the RMT WG that may serve to describe 
security vulnerabilities, etc., for RMT protocols.  And we (under the 
good guidance of our area director) have strived in the revised 
"Protocol Instantiation" documents (which are detailed protocol 
specifications) to fully address security, providing a description of 
how to secure the protocols with IPsec, etc.


>
>Substantial
>-----------
>
>I was expecting to see some discussion of MTU and application framing issues
>with multicast.  Specifically, in a big multicast tree with dynamic
>membership, it could very well happen that when a new member joins, the
>lowest common denominator MTU decreases.  How is this scenario expected to
>be handled?   It may be that this issue has already been discussed somewhere
>else as it isn't specific to this document.

I think MTU discovery for multicast was not in the scope of the RMT 
working group.  In my personal opinion, MTU discovery for multicast 
in general has not been well addressed, and there is more work that 
could be done here by someone.  Practical deployment tends to count 
on multicast apps being properly preconfigured for MTU.  (The RMT 
protocols do allow for configurable packet payload sizes to 
accommodate the MTUs of different deployments.)

It is _possible_ that the scalable feedback mechanisms described here 
_could_ be applied to find the lowest MTU (the techniques are used to 
get scalable feedback of group-wide minima/maxima for purposes of 
congestion control, NACK suppression, etc., as mentioned in this 
document and some of the other RMT building blocks (e.g., TFMCC)).
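
As a rough illustration of that idea (a toy model, not something 
specified in the draft): each receiver draws a random feedback timer, 
every report is assumed to be heard by all receivers instantly, and a 
receiver stays silent unless its own value is lower than the lowest 
one reported so far.  The function name `collect_minimum`, the zero 
propagation delay, and the MTU values are all assumptions made for 
this sketch.

```python
import random

def collect_minimum(values, t_max=1.0, seed=0):
    """Toy model of suppressed feedback for a group-wide minimum.

    Hypothetical helper, not from the draft. Assumes every report is
    heard by all receivers instantly (zero propagation delay).
    Returns (minimum learned by the sender, number of reports sent).
    """
    rng = random.Random(seed)
    # Each receiver picks a random timer in [0, t_max]; process in firing order.
    timers = sorted((rng.uniform(0.0, t_max), v) for v in values)
    best = None
    reports = 0
    for _, value in timers:
        # A receiver reports only if it would lower the known minimum;
        # otherwise an earlier report suppresses it.
        if best is None or value < best:
            best = value
            reports += 1
    return best, reports

# Hypothetical path MTUs (bytes) for a 1,000-member group.
mtus = [1500] * 900 + [1400] * 90 + [1280] * 10
lowest, sent = collect_minimum(mtus)
print(lowest, sent)
```

Because a report is sent only when it lowers the running minimum, the 
number of reports in this model can never exceed the number of 
distinct values (three here), however large the group.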



>
>I doubt router/intermediate system assistance has seen very wide deployment
>and I don't think it is very feasible to expect to see that.  As this
>document is moving to Standards Track I would very much like to remove any
>recommendations for router assistance because I don't see those being
>implemented in any significant router implementation.  That means removing
>and rewording e.g. sections 2.7, 2.4, 3.10 and some others.


See my comments above on this.  This change could be made.


>
>    The sender's transmissions SHOULD make good utilization
>    of the available capacity (which may be limited by the application
>    and/or by congestion control).
>
>How do you figure out what is the available capacity?  Are you 
>referring to the capacity on sender's uplink or the collective 
>capacity of the receivers or both?  Do you adapt to the lowest 
>common denominator of all receivers (e.g., document previously 
>quoted 56Kbit/s modems..)?  Does this have security impact? (Similar 
>comment would apply to MTU/application framing aspects already 
>mentioned above.)

The TFMCC (congestion control) building block addresses automated 
rate adjustment.  We have made congestion control distinct from 
reliability.  It is a "single-rate" scheme and is subject to the 
lowest common denominator of all receivers.  If this is not 
sufficient or acceptable for an application, additional mechanisms to 
eject poor-performing receivers from the group may be needed.

The intent of the sentence above is that the protocol should strive 
to not have "dead-air" time to the extent possible.  In the past, 
some reliable multicast protocols (incl. NACK-based) have conducted 
sender/receiver interaction in distinct "rounds" (i.e., the sender 
sends some data and then waits for feedback before continuing, or 
some variant of that), and this has resulted in poor goodput.

I agree that some clarification of that statement above is needed to 
make this point clear.
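
The "dead-air" cost of round-based operation can be seen with a 
back-of-the-envelope model (an illustration only; the function name 
and the numbers are assumptions, not figures from the draft or from 
any deployment): if the sender transmits a burst and then idles for 
one round-trip time awaiting feedback, goodput falls below the sending 
rate by the fraction of time spent waiting.

```python
def round_based_goodput(rate_bps, burst_bits, rtt_s):
    """Goodput (bits/s) when each round is: send burst_bits at rate_bps, then idle for rtt_s.

    Hypothetical model: ignores loss, repair traffic, and feedback processing.
    """
    send_time = burst_bits / rate_bps
    return burst_bits / (send_time + rtt_s)

# Assumed example: 1 Mbit/s link, 1 Mbit bursts, 0.5 s round trip.
rate = 1_000_000
goodput = round_based_goodput(rate, burst_bits=1_000_000, rtt_s=0.5)
print(f"utilization: {goodput / rate:.0%}")
```

With these assumed numbers the sender is idle for a third of each 
round, so utilization is about two thirds; a sender that keeps 
transmitting while feedback arrives avoids that idle time.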



>
>    In absence of a group size determination mechanism
>    a default group size value of 10,000 is RECOMMENDED for reasonable
>    management of feedback given the scalability of expected NACK-based
>    reliable multicast usage.
>
>What is the impact of this recommendation?  Is it safer to recommend 
>too small or big?  Given that this would likely be close to a world 
>record in production multicast group size, I'm not sure if this 
>recommendation is reasonable; if it is deemed reasonable, it would 
>be nice to have a citation justifying the number.  This is one area 
>where figures based on experimentation would have helped. However, 
>if recommending too big doesn't cause a problem even when the 
>typical default size would be 10, 50 or 100 receivers, then it would 
>be OK.


With the timer-based feedback suppression mechanism described, the 
"group size" estimate doesn't have to be very accurate to work and it 
is "safer" (with respect to impact on the network) to err on a larger 
group size.

In retrospect, the recommended value of 10,000 was chosen to be 
close to the maximum group size for which these protocols may be 
useful.

In fact, the U.S. Postal Service has used a NACK-based protocol to 
deliver bulk data content to a group of 10,000 - 20,000 receivers in 
a single multicast group over a fairly limited IP-based VSAT delivery 
system.  This system has been (and, as far as I know, still is) used 
operationally on a daily basis for more than 5 years.

One of the references for this document is some work I did to assess 
(and to predict) the volume of feedback of these types of protocols 
with group sizes through this scale.

I can probably provide more clarification on the impact ... erring on 
the large size may add some extra latency to the NACK-based 
reliability process and require more buffering in the implementation 
to maintain state.
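
A sketch of how the group size estimate feeds a timer-based 
suppression backoff, assuming a truncated-exponential timer of the 
general kind used for NACK suppression.  The draft is the authority on 
the exact formula; the names `nack_backoff` and `mean_backoff`, the 
1-second maximum backoff, and the sample count are assumptions for 
illustration.

```python
import math
import random

def nack_backoff(t_max, group_size, u=None):
    """Draw a backoff in [0, t_max] from a truncated exponential distribution.

    Hypothetical helper. u is an optional uniform sample in [0, 1];
    one is drawn at random if it is omitted.
    """
    if u is None:
        u = random.random()
    lam = math.log(group_size) + 1.0
    scale = math.exp(lam) - 1.0
    # x is uniform between lam/(t_max*scale) and lam*e^lam/(t_max*scale).
    x = u * (lam / t_max) + lam / (t_max * scale)
    # Inverse transform: density of the result grows like exp(lam * t / t_max).
    return (t_max / lam) * math.log(x * scale * (t_max / lam))

def mean_backoff(t_max, group_size, samples=20000, seed=1):
    """Empirical mean backoff for a given group size estimate."""
    rng = random.Random(seed)
    total = sum(nack_backoff(t_max, group_size, rng.random()) for _ in range(samples))
    return total / samples

# Assumed 1-second maximum backoff; compare small and large estimates.
for size in (10, 100, 10000):
    print(size, round(mean_backoff(1.0, size), 3))
```

In this model a larger estimate skews the timers toward the end of 
the backoff window, so the mean wait grows.  That is the extra 
latency mentioned above, traded for stronger suppression.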


>
>    NACK-based reliable multicast is compatible with IP security (IPsec)
>    authentication mechanisms [RFC4301] that are RECOMMENDED for
>    protection against session intrusion and denial of service attacks.
>
>The details how one might apply IPsec to the unicast channel are absent.
>I'm not commenting on the multicast delivery part because that is somewhat
>covered though details are fuzzy.  Unicast has two major issues that I did
>not see clearly addressed:
>
>  1) malicious, misconfigured or under-performing (beyond small capacity
>     links etc.) receivers.  Is there even a way to differentiate between
>     these classes of receivers?  When these send a lot of NACK feedback,
>     progress of the stream is deterred.  How do you deal with this issue
>     (this is partly operations, protocol, and security problem)?


This is an issue.  I did try to point that out (but perhaps still too 
subtly) in the "Security Considerations" section.  The idea in the 
text there was to point out that SSM operation eliminates direct 
receiver<->receiver messaging, simplifying security such that only 
the sender needs to authenticate/trust receiver operations.  For the 
case of IPsec, that means the sender implementation may have a lot of 
Security Association state depending upon group size.  But I thought 
it beyond the scope of this "Building Block" document to go into the 
details of this.  It should be more thoroughly addressed in any 
"Protocol Instantiations" that are made.


>
>  2) receiver authentication for the feedback back-channel; how could you
>     do it?  This seems unfeasible in practise if the expected default
>     group sizes (e.g. the recommended default of 10,000 receivers) would
>     be realized.

There indeed may be practical limitations on group size due to 
security.  I suppose it is comparable to a server that would need to 
maintain a lot of simultaneous secure TCP connections?  But again I am 
not sure it is in the scope of the NACK Building Block document to 
predict scalability limits of IPsec implementations.  But I suppose 
that some language could be added to point out these issues.  Would 
that address your concerns here?




>
>editorial
>---------
>
>The document header should have "Obsoletes: 3941" or similar; likewise in
>abstract/introduction.
>
>[McastModel] refers to a (good) SSM PhD dissertation, but I'd say reference
>to either RFC3569 or RFC4607 is probably more readily available and more
>appropriate in the IETF context.
>
>    1.  Multicast Sender Transmission
>
>    2.  NACK Repair Process
>
>    3.  Multicast Receiver Join Policies
>
>    1.  Node (member) Identification
>...
>
>In section 3, the building block numbering wraps around; there are two
>instances of building blocks 1-3.


I will fix _all_ of these.



-- 
Brian
__________________________________
Brian Adamson
<mailto:adamson@itd.nrl.navy.mil>