[LOOPS] Transport recovery and interaction with, link, and LOOPS loss recovery

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Mon, 27 July 2020 15:22 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/loops/yon5kgkZ8nPWbzyTzAmrRK7FVac>

Thank you for taking the trouble to update
draft-li-tsvwg-loops-problem-opportunities. The new text does help in
many places, and I'm sorry not to have responded to it earlier. I am
still not clear what is anticipated from the transport
congestion-control response. Please see comments below:

(1) Moving Figure 1 seems helpful to introduce the context. Thanks.

(2) In Section 6: I read Section 6 as describing two recovery cases. One
is about DupACK-based recovery. The other says it is about RACK
recovery. I wonder if this is mainly the case where DupACK-based
recovery is not triggered, where I think this is about TLP (which uses
RACK) short-cutting the RTO by trying to recover after ~2 RTTs?

(3) I think the text suggests these might have different implications for
LOOPS, which seems reasonable, although I have questions:

“If LOOPS network does not buffer the out-of-order packets caused
       by packet loss, TCP sender which uses a time based loss detection
       like RACK [I-D.ietf-tcpm-rack] will perform well here.  It uses
       the notion of time to replace the conventional DUPACK threshold
       approach to detect losses.  Hence it prevents the TCP sender from
       invoking fast retransmit too early. “

(4) In Section 4.1:

“When a sender does not receive an ACK for a given segment within
    a certain amount of time called retransmission timeout (RTO), it re-
    sends the segment [RFC6298].  RTO can be as long as several seconds.”
…
“Even when
    the lost packet is not an exact tail, it can possibly add another RTT
    because there may not be enough packets in flight to trigger the fast
    retransmit).”

- I think this text argues simply for TLP?

- How would the method improve performance if TLP were to be widely
deployed, since it has been promoted for TCP and specified for QUIC?
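
To make the timing question concrete, here is a rough sketch comparing
an RTO with a TLP probe timeout. It assumes the RTO form from RFC 6298
(with its 1 second minimum) and approximates the probe timeout as
2 * SRTT, capped at the RTO. The function names and the sample values
are illustrative assumptions, not taken from the draft.

```python
def rto(srtt, rttvar, min_rto=1.0, granularity=0.001):
    """RTO in seconds, in the RFC 6298 form: SRTT + max(G, 4 * RTTVAR),
    rounded up to a minimum (1 second in RFC 6298)."""
    return max(min_rto, srtt + max(granularity, 4 * rttvar))


def tlp_probe_timeout(srtt, rttvar, min_rto=1.0):
    """Approximate tail-loss probe timeout: about 2 * SRTT, never later
    than the RTO. This is a simplification (assumption): it ignores the
    delayed-ACK allowance used when only one segment is in flight."""
    return min(2 * srtt, rto(srtt, rttvar, min_rto))


if __name__ == "__main__":
    # Hypothetical path: 50 ms smoothed RTT, 10 ms RTT variation.
    srtt, rttvar = 0.050, 0.010
    print("RTO (s):", rto(srtt, rttvar))
    print("TLP probe timeout (s):", tlp_probe_timeout(srtt, rttvar))
```

With values like these, the probe fires roughly an order of magnitude
sooner than the RTO, which is why I ask what LOOPS adds for tail losses
once TLP is deployed.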


(5) Also, maybe I misunderstand, but can’t RACK trigger a retransmission
earlier than the timeout for a TLP, which the start of the I-D said was
an important use case?

“Local retransmission will not
       interfere the sender's retransmission generally in this case.  If
       time based loss detection is not supported at the sender, end to
       end retransmission may be invoked as usual.  It consumes extra
       bandwidth Because the lost packets (i.e. recovered packet) is
       normally a very small percentage of the total packets.  Then extra
       bandwidth cost is not significant.”
- I don’t really understand this argument. Is this saying there will be
few losses, and hence that the additional capacity used by
retransmission is small? If my interpretation is correct, how does LOOPS
know there are few losses? Does it measure this and rate-control its
retransmissions to ensure this, or is some other method used to avoid
significant load from two levels of retransmission?
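
As a back-of-envelope illustration of the load question, the sketch
below estimates the extra traffic on a segment as a fraction of the
original traffic. It is a first-order model of my own, not something
the draft specifies: it assumes independent losses at a fixed rate,
unlimited local retries, and (in the pessimistic case) that the sender
also retransmits every packet that LOOPS recovered locally.

```python
def extra_load(loss_rate, sender_also_retransmits):
    """Extra traffic on the segment, as a fraction of original traffic.

    loss_rate: assumed independent per-packet loss probability (0 <= p < 1).
    sender_also_retransmits: True if the end-to-end sender resends each
    locally recovered packet as well (a pessimistic assumption).
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    # Expected transmissions per packet with local retry is 1 / (1 - p),
    # so the local retransmission overhead is p / (1 - p).
    local = loss_rate / (1.0 - loss_rate)
    # First-order estimate: one end-to-end duplicate per lost packet.
    end_to_end = loss_rate if sender_also_retransmits else 0.0
    return local + end_to_end


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1):
        print(p, extra_load(p, False), extra_load(p, True))
```

The overhead is only small while the loss rate is small, which is the
assumption I am asking about.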

(6) Is some sort of check perhaps needed anyway? LOOPS could potentially
work over a path with L2 retransmissions (e.g. WiFi), and would any
LOOPS retransmissions then multiply the number of L2 retransmissions?
- Can LOOPS ever be tunnelled over LOOPS?
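
The multiplication concern can be sketched as follows. This assumes a
simple nested model in which every attempt by an upper layer may use up
the full retry budget of the layer below it. The retry limits shown are
hypothetical, not values from the draft or from any L2 specification.

```python
import math


def worst_case_transmissions(retries_per_layer):
    """Worst-case number of link transmissions for a single packet.

    retries_per_layer: retransmission limits, innermost layer first
    (e.g. L2, then LOOPS, then the end-to-end transport). Each layer
    makes 1 + retries attempts, and the attempts multiply across layers.
    """
    return math.prod(1 + r for r in retries_per_layer)


if __name__ == "__main__":
    # Hypothetical limits: 7 L2 retries, 2 LOOPS retries, 1 sender retry.
    print(worst_case_transmissions([7, 2, 1]))
    # Adding a second LOOPS layer (LOOPS tunnelled over LOOPS).
    print(worst_case_transmissions([7, 2, 2, 1]))
```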

(7) I now think I might understand that the scope of the proposed work
is constrained to cases where ECN has been deployed and is enabled
end-to-end. In Section 5.2:


“LOOPS can CE(Congestion Experienced) marks
    its recovered packets as the loss signal to end-to-end. Converting a
    packet loss signal to CE marking signal brings the benefits of
    reducing Head-of-Line blocking and probability of RTO expiry
    [RFC8087] without affecting TCP sender's loss based congestion
    control behaviour while enjoying the faster local recovery. ECN
    based indication is equivalent to a loss event at the TCP sender
    [RFC3168].”
…
“ In this way, a requirement is set for applying LOOPS.”

I still think this would benefit from being clearer at the start of the
document, because only ECT (ECN-Capable Transport) flows should be
directed to a LOOPS-enabled path segment, as I think it says?

Best wishes,

Gorry