RE: [NSIS] AD review: draft-ietf-nsis-ntlp-09 (M1: congestion control/rate limiting issues)

hi,

> -----Original Message-----
> From: Magnus Westerlund [mailto:magnus.westerlund@ericsson.com] 
> Sent: 16 June 2006 15:27
> To: Hancock, Robert
> Cc: nsis@ietf.org
> Subject: Re: [NSIS] AD review: draft-ietf-nsis-ntlp-09 (M1: 
> congestion control/rate limiting issues)
> 
> 
> Hancock, Robert wrote:
> > Hi all,
> > 
> > comments on the congestion control/rate limiting issues
> > (M1 and in fact L16-part as well):
> > 
> >> -----Original Message-----
> >> From: Magnus Westerlund [mailto:magnus.westerlund@ericsson.com] 
> >> Sent: 01 June 2006 15:23
> >> To: nsis@ietf.org
> >> Subject: [NSIS] AD review: draft-ietf-nsis-ntlp-09
> >>
> >> M1. Congestion Control
> >>
> >> This affects text not only in 5.3.3 but also 7.1.3 and possibly other
> >> places. But I do have a general concern that the congestion control
> >> measures described in the specification are underspecified.
> >>
> >> First, in 5.3.3 I don't see any normative minimal values, or even
> >> recommended values, for T1 and T2 that will be safe to deploy on the
> >> Internet. I don't find it acceptable that the developer needs to
> >> investigate which values are safe to use and which are not. There
> >> should also be some criteria documented for when it is acceptable to go
> >> beyond these values. So my concern here is that the retransmission runs
> >> havoc and creates way too many packets to be sent.
> > 
> > This is a fair point. I suppose it is impossible to get away
> > permanently without giving concrete numbers in this case.
> > 
> > In terms of what sort of values to use and how to describe the
> > constraints on them, I think a good model to follow is probably
> > the SIP INVITE transaction specification (17.1.1.2 of rfc3261)
> > which says [fortunately the timer names are the same...]
> > 
> > "  The default value for T1 is 500 ms.  T1 is an estimate of the RTT
> >    between the client and server transactions.  Elements MAY (though it
> >    is NOT RECOMMENDED) use smaller values of T1 within closed, private
> >    networks that do not permit general Internet connection.  T1 MAY be
> >    chosen larger, and this is RECOMMENDED if it is known in advance
> >    (such as on high latency access links) that the RTT is larger.
> >    Whatever the value of T1, the exponential backoffs on retransmissions
> >    described in this section MUST be used."
> > 
> > and later that T2 should be 64*T1. In our case, 500ms seems a reasonable
> > default also; I think T2<=64*T1 since there is a separate bound on the
> > T2 value from the signalling application (see the second paragraph of
> > 5.3.3). I would be tempted to relax the NOT RECOMMENDED clause, since
> > a smaller timeout would be valid and possibly quite useful on a wider
> > range of networks, in particular Internet-connected networks but
> > where it is known that the Query should be answered within the local
> > network. Comments and text suggestions welcome.
> 
> I think this is a reasonable start. It seems to me to be a good idea to
> allow the adaptation of the timers based on feedback from the peer. It
> seems possible to use the GIST query and the response to measure the
> current RTT. Allowing modification of the timers to track that value +
> some processing delay + fudge factors seems one way forward.

ok, we can work on including this.
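
as a starting point for text, here is a minimal sketch of what the timer
behaviour discussed above might look like. It is only an illustration under
stated assumptions: the interval starts at T1 (500 ms default), doubles on
each retransmission and is capped at T2 (defaulting to 64*T1). The function
names, the max_tries limit and the processing_delay and fudge values are
hypothetical and are not taken from the draft.

```python
def backoff_intervals(t1=0.5, t2=None, max_tries=8):
    """Waits (in seconds) before each successive retransmission.

    Assumption: binary exponential backoff starting at t1, capped at t2.
    """
    if t2 is None:
        t2 = 64 * t1  # upper bound discussed in this thread
    if t1 <= 0 or t2 < t1:
        raise ValueError("need 0 < t1 <= t2")
    intervals = []
    interval = t1
    for _ in range(max_tries):
        intervals.append(interval)
        interval = min(interval * 2, t2)  # double, but never exceed t2
    return intervals


def adapted_t1(measured_rtt, processing_delay=0.05, fudge=1.5):
    """Hypothetical adaptation: track the measured Query/Response RTT
    plus a processing allowance, scaled by a fudge factor."""
    return (measured_rtt + processing_delay) * fudge


if __name__ == "__main__":
    print(backoff_intervals())
    print(backoff_intervals(t1=adapted_t1(0.2)))
```

whether the adapted value may go below the 500 ms default is exactly the
NOT RECOMMENDED question above, so the sketch deliberately leaves that
policy to the caller.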

> 
> 
> > 
> >> Secondly, there are also no values documented for the rate control. I
> >> think it is necessary to document what Internet-safe values are here so
> >> that one does not cause problems. In addition, it seems a bit simplistic
> >> to use a token bucket with some parameters selected based on the local
> >> link, as GIST clearly sends messages beyond the local link. Thus one
> >> might have to consider being a bit smarter and more adaptive to what is
> >> seen for different flows.
> > 
> > Here I am not so sure. The text here was informed by the equivalent
> > discussion for ICMPv6 (reference [26], now RFC4443), which caused
> > an extensive thread on the v6 mailing list (start at
> > http://www1.ietf.org/mail-archive/web/ipv6/current/msg01343.html +
> > another 60 or so messages).
> > 
> > GIST is not ICMP but many of the same issues arise: messages are
> > generated in the IP 'control plane' (in so far as this is a
> > meaningful term), partly autonomously but mainly in response to
> > events initiated by end systems, the messages go beyond the local
> > link, rules have to be written so they apply to a host and a core
> > router and everything in between. The end result for ICMP was to 
> > write something minimal. (The link bandwidth here is used as an
> > indicator for where a router is in a network - core/access/whatever.
> > It's clearly imperfect but there's nothing else apart from dynamic
> > adaptiveness to make use of, and fixed values seem even worse.)
> > 
> > We'd really like to avoid adaptiveness in the D-mode state machine.
> > The main use of D-mode should be for Queries/Responses for which 
> > adaptation is not meaningful for initial messages (there is no
> > pre-existing state); if there is a large amount of signalling data
> > to send for a given flow, then GIST should transition to C-mode
> > anyway, and the rate limits should be chosen to be cautious to encourage
> > that. We aimed for robustness and simplicity rather than performance.
> > 
> > It might be possible to use some sort of adaptiveness to select
> > an appropriate rate to apply to refresh queries used for GIST
> > probing (see also your point at the end of L16). The current situation
> > is that you can probe as fast as you like until you hit the rate limit,
> > and that it's up to the implementer to decide how fast is really
> > necessary depending on an assessment of route stability (for which
> > I don't know any good objective estimator). On the assumption that 
> > most probes will go to the peer you already know about, one could
> > refine this to apply a separate token bucket limiter for probe 
> > messages towards that peer, which was adapted according to knowledge
> > of congestion state with that peer (based on message loss). We need
> > input on whether that complexity is really necessary however, since
> > it doesn't change the situation for the whole of D-mode but just a
> > particular subset of it.
> 
> I can understand the reluctance to use adaptation. I am also fine with
> not using adaptation as long as the traffic is not causing any severe
> problem. The question is whether the parameters present in the ICMP RFC are
> good enough also for GIST. Has someone made any analysis of this? What
> amount of bit-rate are we talking about? Do we need to provide guidance
> on when the default parameters present in ICMP are appropriate to use for
> GIST?

that would be too easy ... ;-)

the end result of the ICMP discussion (as I read it) has been that
there *are* no default parameters in the ICMP spec. (The relevant
section is 2.4(f) of rfc4443.) There is a definition of a set of
parameters (bucket rate N and size B), but the closest one gets to
parameters is "For example, in a small/mid-size device, the possible 
defaults could be B=10, N=10/s." which is about as tentative as it
comes. And I would not say that the numbers are that useful; what to
reuse would be the text that says
- it's configurable
- take into account where you are in the network
(and implicitly that limiting based purely on inter-packet timers
or absolute bandwidth caps is not useful, which was what was removed
compared to the previous version of the spec.)

There have been analysis documents in the distant past about message
rates, but only applying to QoS (based on RSVP experience) and even
they have been controversial because they depend so strongly on
assumptions about refresh reduction. Now we have a wider range of signalling
applications, so the situation is likely to be even less clear.

the best advice to implementors I can think of would be to say that 
*) the bucket size (B) should be related to the number of signalling
   sessions that the node is expected to support
*) the rate should allow the bucket to be discharged in a second

(and these would be SHOULD starting points).

That still doesn't give an actual number, but it does relate the number
to a more 'objective' system sizing parameter that implementors will
have to think about anyway.
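
to make the two starting points concrete, here is a minimal token bucket
sketch. It assumes the bucket size B is set equal to the expected number of
signalling sessions and the rate N is B tokens per second, so that a full
bucket can be discharged (or refilled) in one second. The class name, the
expected_sessions parameter and the injectable clock are hypothetical and
only there for illustration; neither the draft nor rfc4443 defines them.

```python
import time


class TokenBucket:
    """Rate limiter for D-mode messages (illustrative sketch only)."""

    def __init__(self, expected_sessions, now=time.monotonic):
        if expected_sessions <= 0:
            raise ValueError("expected_sessions must be positive")
        self.size = float(expected_sessions)  # B: bucket size
        self.rate = self.size / 1.0           # N: tokens/s, one-second discharge
        self._now = now                       # clock source, replaceable in tests
        self.tokens = self.size               # start with a full bucket
        self._last = now()

    def allow(self):
        """Return True if one message may be sent now, consuming a token."""
        t = self._now()
        # Refill for the elapsed time, never beyond the bucket size.
        self.tokens = min(self.size, self.tokens + (t - self._last) * self.rate)
        self._last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(expected_sessions=10)
    sent = sum(bucket.allow() for _ in range(15))
    print("messages allowed in initial burst:", sent)
```

a node sized for 10 sessions would then allow a burst of 10 messages and
sustain 10 per second afterwards; both numbers scale with the sizing
parameter rather than being fixed in the spec.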

> 
> 
> > 
> >> Third, the implications and congestion issues with local repair seem to
> >> have been brushed over. Section 7.1.3 does indicate that you need to take
> >> care, but nothing more. Is there some potential for aggregation of the
> >> queries to minimize the load and have quicker convergence?
> > 
> > There is no mechanism that I can think of. Certainly there is
> > no aggregation possible in general, since every affected flow
> > might be affected differently, especially if next-NSIS-routers
> > are many hops away. We depend on the rate limiting to prevent
> > the generated Queries causing a flood, but that's about it. 
> > (There are aggregation techniques for transmitting the notifications,
> > but they take place at the NSLP level.)
> > 
> 
> Okay, it might not be a real issue, as long as there is no risk that the
> load caused by local repair creates instabilities by preventing other
> operations from happening. But if I understand correctly, most
> keep-alives towards NSLPs are done in C-mode.

certainly, signalling application state refresh (distinct from
routing state refresh) can be done this way.

cheers,

r.

> 
> Cheers
> 
> Magnus Westerlund
> 
> Multimedia Technologies, Ericsson Research EAB/TVA/A
> ----------------------------------------------------------------------
> Ericsson AB                | Phone +46 8 4048287
> Torshamsgatan 23           | Fax   +46 8 7575550
> S-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com
> 
