RE: [NSIS] AD review: draft-ietf-nsis-ntlp-09 (M1: congestion control/rate limiting issues)

Hi all,

>> M1. Congestion Control
>> 
>> This affects text not only in 5.3.3 but also 7.1.3 and possibly other
>> places. But I do have a general concern that the congestion control
>> measurements described in the specification are underspecified.
>> 
>> First in 5.3.3 I don't see any normative minimal values, or even
>> recommended values for T1 and T2 that will be safe to deploy on the
>> internet. I don't find it acceptable that the developer needs to
>> investigate which values are safe to use and which are not. There
>> should also be some criteria documented for when it is acceptable to
>> go beyond these values. So my concern here is that the retransmission
>> runs havoc and creates way too many packets to be sent.
>
>This is a fair point. I suppose it is impossible to get away 
>permanently without giving concrete numbers in this case.
>
>In terms of what sort of values to use and how to describe the 
>constraints on them, I think a good model to follow is 
>probably the SIP INVITE transaction specification (17.1.1.2 of 
>rfc3261) which says [fortunately the timer names are the same...]
>
>"  The default value for T1 is 500 ms.  T1 is an estimate of the RTT
>   between the client and server transactions.  Elements MAY (though it
>   is NOT RECOMMENDED) use smaller values of T1 within closed, private
>   networks that do not permit general Internet connection.  T1 MAY be
>   chosen larger, and this is RECOMMENDED if it is known in advance
>   (such as on high latency access links) that the RTT is larger.
>   Whatever the value of T1, the exponential backoffs on retransmissions
>   described in this section MUST be used."
>
>and later that T2 should be 64*T1. In our case, 500ms seems a 
>reasonable default also; I think T2<=64*T1 since there is a 
>separate bound on the T2 value from the signalling application
>(see the second paragraph of 5.3.3). I would be tempted to relax the NOT
>RECOMMENDED clause, since a smaller timeout would be valid and 
>possibly quite useful on a wider range of networks, in 
>particular Internet-connected networks but where it is known 
>that the Query should be answered within the local network. 
>Comments and text suggestions welcome.

What do you think of this, Magnus?
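
To make the proposed numbers concrete, here is a minimal sketch of the
transmission schedule they would give. It is only an illustration, not
text from the draft: the function name is made up, and it assumes that
the wait doubles after every transmission and that T2 bounds the total
time since the first transmission.

```python
def retransmit_times(t1=0.5, t2=None):
    """Return the send times (seconds from first transmission) of a
    message that is never answered.

    Assumptions (not taken from the draft): the wait starts at t1 and
    doubles after each send, and no send happens at or after t2.
    """
    if t1 <= 0:
        raise ValueError("t1 must be positive")
    if t2 is None:
        t2 = 64 * t1  # default upper bound suggested in this thread
    times = [0.0]      # the initial transmission
    wait = t1
    while times[-1] + wait < t2:
        times.append(times[-1] + wait)
        wait *= 2      # exponential backoff
    return times


if __name__ == "__main__":
    print(retransmit_times())          # T1 = 500 ms, T2 = 64*T1
    print(retransmit_times(0.5, 4.0))  # tighter T2 from the application
```

Under those assumptions the defaults give seven transmissions in total
before giving up, and a signalling application that sets a tighter T2
simply cuts the schedule short.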

>> Secondly, there are also no values documented for the rate control. I
>> think it is necessary to document what internet safe values are here
>> so that one does not cause problems. In addition it seems a bit
>> simplistic to use a token bucket with some parameters selected based 
>> on the local link as GIST clearly sends messages beyond the local 
>> link. Thus one might have to consider being a bit smarter and more 
>> adaptive to what is seen for different flows.
>
>Here I am not so sure. The text here was informed by the 
>equivalent discussion for ICMPv6 (reference [26], now 
>RFC4443), which caused an extensive thread on the v6 mailing 
>list (start at
>http://www1.ietf.org/mail-archive/web/ipv6/current/msg01343.html
>+ another 60 or so messages).
>
>GIST is not ICMP but many of the same issues arise: messages 
>are generated in the IP 'control plane' (in so far as this is 
>a meaningful term), partly autonomously but mainly in response 
>to events initiated by end systems, the messages go beyond the 
>local link, rules have to be written so they apply to a host 
>and a core router and everything in between. The end result 
>for ICMP was to write something minimal. (The link bandwidth 
>here is used as an indicator for where a router is in a 
>network - core/access/whatever. It's clearly imperfect but
>there's nothing else apart from dynamic adaptiveness to make
>use of, and fixed values seem even worse.)
>
>We'd really like to avoid adaptiveness in the D-mode state machine.
>The main use of D-mode should be for Queries/Responses for 
>which adaptation is not meaningful for initial messages (there 
>is no pre-existing state); if there is a large amount of 
>signalling data to send for a given flow, then GIST should 
>transition to C-mode anyway, and the rate limits should be chosen
>to be cautious to encourage that. We aimed for robustness and
>simplicity rather than performance.
>
>It might be possible to use some sort of adaptiveness to 
>select an appropriate rate to apply to refresh queries used 
>for GIST probing (see also your point at the end of L16). The 
>current situation is that you can probe as fast as you like 
>until you hit the rate limit, and that it's up to the 
>implementer to decide how fast is really necessary depending 
>on an assessment of route stability (for which I don't know 
>any good objective estimator). On the assumption that most 
>probes will go to the peer you already know about, one could 
>refine this to apply a separate token bucket limiter for probe 
>messages towards that peer, which would be adapted according to
>knowledge of congestion state with that peer (based on message 
>loss). We need input on whether that complexity is really 
>necessary however, since it doesn't change the situation for 
>the whole of D-mode but just a particular subset of it.

Comments on this?
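
For anyone weighing up the extra complexity, a rough sketch of what a
per-peer probe limiter might look like follows. Everything in it is
hypothetical: the class and method names, the halving on loss and the
additive recovery step are assumptions chosen for illustration, and
none of it comes from the draft.

```python
class TokenBucket:
    """Plain token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, now):
        """Return True and consume one token if a message may be sent at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PeerProbeLimiter:
    """Hypothetical limiter for probe messages towards one known peer.

    Assumed policy: halve the refill rate on each detected loss (never
    below `min_rate`), and add `step` per answered probe (never above
    `max_rate`).
    """

    def __init__(self, max_rate, burst, min_rate, step=0.5):
        self.max_rate = float(max_rate)
        self.min_rate = float(min_rate)
        self.step = float(step)
        self.bucket = TokenBucket(max_rate, burst)

    def allow(self, now):
        return self.bucket.allow(now)

    def on_loss(self):
        # Treat loss as a congestion signal and back off.
        self.bucket.rate = max(self.min_rate, self.bucket.rate / 2)

    def on_response(self):
        # Recover slowly while probes are being answered.
        self.bucket.rate = min(self.max_rate, self.bucket.rate + self.step)
```

The plain TokenBucket on its own corresponds to the current situation
(probe as fast as you like until the limit is hit); the wrapper shows
the amount of additional per-peer state the refinement would need.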

>> Third, the implication and congestion issues with local repair seem
>> to have been brushed over. Section 7.1.3 does indicate that you need to
>> take care, but nothing more. Is there some potential for aggregation
>> of the queries to minimize the load and have quicker convergence?
>
>There is no mechanism that I can think of. Certainly there is 
>no aggregation possible in general, since every affected flow 
>might be affected differently, especially if next-NSIS-routers 
>are many hops away. We depend on the rate limiting to prevent 
>the generated Queries causing a flood, but that's about it. 
>(There are aggregation techniques for transmitting the 
>notifications, but they take place at the NSLP level.)

Comments on this? I'm wondering if we need to specify anything here,
or just indicate what one should be aware of.

John
