Re: [tcpm] Genart last call review of draft-ietf-tcpm-rto-consider-14

See a few comments (marked GF) from the perspective of other transport 
RFCs, in case this helps you find text...

-------- Forwarded Message --------
Subject: 	Re: [tcpm] Genart last call review of 
draft-ietf-tcpm-rto-consider-14
Date: 	Thu, 18 Jun 2020 11:00:15 +0100
From: 	Stewart Bryant <stewart.bryant@gmail.com>
To: 	Martin Duke <martin.h.duke@gmail.com>
CC: 	tcpm <tcpm@ietf.org>, Review Team <gen-art@ietf.org>, Mark Allman 
<mallman@icir.org>, Last Call <last-call@ietf.org>, Stewart Bryant 
<stewart.bryant@gmail.com>, tom petch <daedulus@btconnect.com>, 
draft-ietf-tcpm-rto-consider.all@ietf.org

> On 17 Jun 2020, at 18:20, Martin Duke <martin.h.duke@gmail.com> wrote:
>
> Hi Stewart,
>
> If there are no further objections, I'm going to declare consensus.
>
> On Thu, Jun 11, 2020 at 1:45 PM Martin Duke <martin.h.duke@gmail.com> wrote:
>
>     Stewart,
>
>     do we need more cycles for this, or is draft-15 sufficient to
>     address your concerns?
>
>     On Mon, Jun 8, 2020 at 12:52 PM Mark Allman <mallman@icir.org> wrote:
>
>
>         Hi Stewart, et al.!
>
>         I just submitted a new version of rto-consider. Please ask the
>         datatracker for diffs between this and rev -14.  The highlights:
>
>           - The diffs with the last rev are here:
>         https://tools.ietf.org/rfcdiff?difftype=--hwdiff&url2=draft-ietf-tcpm-rto-consider-15.txt
>

*In the general case, delay across a network path depends not only on 
distance, but also a number of variable components such as the route and 
the level of buffering in intermediate devices.*

It is more the contending/conflicting traffic than the buffering, or 
perhaps the time spent in queues, but “buffering” may be a colloquial 
transport term.

GF: The word being sought might be "queueing" (I think that buffering is 
thought of as memory, and hence the maximum queue size).

*Since our wide-area network paths are best effort, packet loss is a 
regular occurrence. *

No, the best effort Internet experiences this. There are many well 
engineered WANs that do not.

What I am not seeing is clearer text that distinguishes between user 
traffic and “engineering” traffic that is used to make the network work, 
and between the end to end traffic and traffic within an AS that may be 
there for other purposes (high value service also offered by the 
provider) and WANs that are well engineered.

Perhaps we could include a clearer disclaimer regarding the 
non-best-effort-internet-end-to-end traffic?

You have some text on this down in section 2 but it is a bit buried.

Perhaps something early on of the form: This document is specifically 
concerned with end to end behaviour over the best effort Internet. As 
noted in section 2 it may not be applicable to other types of WAN, or to 
the traffic used in affecting the operation of the Internet itself.

GF: Actually, I do think a well-engineered WAN can be in scope of your 
spec. The two terms I was expecting were "controlled environment" or 
"pre-provisioned" capacity; these might not see the same path 
properties. A DC is typically regarded in transport specs as a 
"controlled environment".

*An exception to this rule is if an IETF standardized mechanism 
determines that a particular loss is due to a non-congestion event 
(e.g., packet corruption). *

That is a bit heavy. It should be “a protocol” rather than an IETF 
standardized mechanism. The IETF does not have a monopoly on 
pre-blessing protocols before they are deployed.

GF: Unsure myself what is needed - isn't this guidance for design of 
protocol mechanisms?

>
>           - All small comments addressed.
>
>           - I think we all agree that this is not a one-size-fits-all
>             situation.  Rather, this document is meant to be a default
>             case.  So, the main action of this rev is to make that point
>             more clearly.  The first paragraph in the intro is new.
>             Also, there are some more words fleshing out the context
>             more in section 2.  In particular, more emphatically making
>             the point that other loss detectors are fine for specific
>             cases.
>

As I note above, from a routing and packet transport (as opposed to the 
transport layer) perspective, I think we should more clearly recognise at 
the beginning the fact that this is for the worst case network, not for 
well engineered (WAN and DC) networks and the mechanisms fundamental to 
the operation of the network itself.

>
>           - The first paragraph in the intro also makes clear we adopt the
>             loss == congestion model (as that is the conservative default,
>             not because it is always true).
>
>           - I made one other change that wasn't exactly called for, but
>             seems like an oversight.
>
>             Previously guideline (4) said loss MUST be taken as an
>             indication of congestion and some standard response taken.
>             But, this guideline has an explicit exception for cases
>             where we know the loss was caused by some non-congestion
>             event.  Guideline (3) says you MUST backoff.  But, it did
>             not have this exception for cases where we can tell the
>             cause.  But, I think based on the spirit of (4), (3) should
>             also have these words.  So, I added them.
>

In some cases you cannot tell the cause, but it is more important to 
ignore the loss. OAM being a particularly good example.

>
>             Also, I swapped (3) and (4) because it seemed more natural in
>             re-reading to first think about taking congestion action and
>             then dealing with backoff.  I think the ordering is a small
>             thing, but folks can yell and I'll put it back if there is
>             angst.
>
>         Please take a look and let me know if this helps things along or
>         not.
>
>         allman
>

We are getting there, but I would ask that you take the transport hat 
off and look again from an infrastructure and packet transport perspective.

Best regards

Stewart
