Re: [tsvwg] New Version Notification for draft-wang-tsvwg-tcp-coding-00.txt

"Scheffenegger, Richard" <rs@netapp.com> Wed, 25 February 2015 12:59 UTC

From: "Scheffenegger, Richard" <rs@netapp.com>
To: "Black, David" <david.black@emc.com>, wangjinzhu <wangjinzhu@chinamobile.com>, "Eggert, Lars" <lars@netapp.com>
Date: Wed, 25 Feb 2015 12:54:37 +0000
Message-ID: <898dc3284c314a04a570dcfc96bd9f46@hioexcmbx05-prd.hq.netapp.com>
References: <20150208011821.32324.57024.idtracker@ietfa.amsl.com> <012501d0433e$ebb3d910$c31b8b30$@com> <E1A9E674-3FE0-463A-B947-E4DF152DA4FA@netapp.com> <00e801d044f7$51b8c9a0$f52a5ce0$@com> <7D01DD65-90C8-470C-BFB0-1B9D79E95772@netapp.com> <CE03DB3D7B45C245BCA0D243277949363624B8@MX104CL02.corp.emc.com> <002501d046a5$b42fe410$1c8fac30$@com> <CE03DB3D7B45C245BCA0D24327794936365DB2@MX104CL02.corp.emc.com>
In-Reply-To: <CE03DB3D7B45C245BCA0D24327794936365DB2@MX104CL02.corp.emc.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/tsvwg/UoF8Ja1VX34DWZsh-GVhV2WXvrs>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>, '邓灵莉/Lingli Deng' <denglingli@chinamobile.com>
Subject: Re: [tsvwg] New Version Notification for draft-wang-tsvwg-tcp-coding-00.txt

Hi Jinzhu,

Two points, in addition to David's, which are all spot-on:


> [Jinzhu] I think the work on distinguishing link error loss from
> congestion loss can be classified into explicit and implicit
> solutions. ECN belongs to the explicit solutions. An explicit
> solution requires routers to disclose the network state and in
> some cases may not be easy to deploy, since some routers may not
> enable ECN. Thus, if an implicit solution can be applied to coded
> TCP, it will be easier to deploy. I’m not sure whether there is a
> method that can distinguish link error loss from congestion loss
> without feedback from the network. Suggestions on an implicit
> solution will be greatly appreciated.

As David tried to point out, and I will make this more explicit: the TCP part of the ECN feedback mechanism, as it has existed since RFC 3168, can be used entirely without any support from the network routers.

Obviously, in such an environment, the receiver will never actually get any CE-marked IP packets that it would have to signal back to the sender. But in your case, the loss and successful reconstruction of a packet would constitute just as much of a congestion signal, so the RFC 3168 section on TCP is fully applicable.
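
A minimal sketch of the receiver side of this idea, assuming the decoder can tell the TCP layer that it just reconstructed a lost segment. The class and method names are hypothetical and come neither from the draft nor from any real TCP stack. The only behavior taken from RFC 3168 is that the receiver keeps setting ECE on its ACKs until it sees CWR from the sender.

```python
from dataclasses import dataclass


@dataclass
class EcnEchoReceiver:
    """Receiver-side ECN echo state (hypothetical names, illustration only)."""

    echo_pending: bool = False

    def on_ce_marked_packet(self) -> None:
        # Regular RFC 3168 trigger: a CE-marked IP packet arrived.
        self.echo_pending = True

    def on_loss_recovered_by_coding(self) -> None:
        # Assumed extension: a loss repaired by the redundancy coding is
        # reported to the sender exactly like a CE mark.
        self.echo_pending = True

    def on_cwr_received(self) -> None:
        # The sender signaled that it reduced its window; stop echoing.
        self.echo_pending = False

    def ece_flag_for_next_ack(self) -> bool:
        # ECE stays set on every ACK until CWR is seen.
        return self.echo_pending
```

With this shape, a path that strips or never sets the IP ECN codepoints simply never calls on_ce_marked_packet(), and the recovered-loss trigger alone drives the feedback.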

(Recent ECN testing has shown that end-hosts often support ECN in the TCP layer, but paths more often have issues dealing with the IP-layer signals - masking the IP ECN codepoints, setting them inappropriately, etc.).


As a side note, you may want to participate in the TCPM WG discussions on accurate ECN feedback. It has been found that a single feedback signal per RTT does not carry enough information for more fine-grained responses. In your case, think of an encoding scheme (Reed-Solomon, low-density parity-check, diagonal parity, ...) that can recover more than a single segment of the encoded packets, or of one RTT actually covering a multitude of encoded groups of packets. Potentially, you want to give a differentiated feedback signal on the extent of that loss - not for TCP congestion control purposes alone, but also, for example, for your encoder. A less lossy network could do with larger groups of encoded packets, while with a more lossy network, you'd probably want to increase the encoding ratio...
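
To illustrate how an encoder could use such differentiated feedback, here is a rough sketch that assumes the receiver reports a count of lost (including recovered) packets rather than a single bit. The function name, the 50% headroom, and the clamping limits are assumptions made up for this example. They are not taken from the draft or from the accurate ECN work.

```python
def repair_packets_per_group(group_size: int, lost: int, sent: int,
                             max_repair: int = 10) -> int:
    """Pick how many repair (parity) packets to add to each coded group.

    Hypothetical policy: cover the observed loss fraction with 50% headroom,
    always send at least one repair packet, never more than max_repair.
    """
    if sent <= 0:
        return 1  # no measurement yet; assumed conservative default
    # Integer ceiling of group_size * (lost / sent) * 3/2.
    needed = -(-(group_size * lost * 3) // (sent * 2))
    return max(1, min(needed, max_repair))
```

For a group of 20 source packets, 10 losses out of 100 packets sent gives 3 repair packets per group, while a nearly loss-free path falls back to the minimum of 1.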



The second observation I want to point out is that this work seems quite similar to 


TCP Instant Recovery: Incorporating Forward Error Correction in TCP
Tobias Flach, USC; N. Dukkipati, Y. Cheng, B. Raghavan, Google
http://tools.ietf.org/html/draft-flach-tcpm-fec-00


which actually also looked into secondary problems like stateful firewall traversal, backwards-compatible encoding of the parity packets, etc.


and, of course, the work on Coded TCP by Muriel Médard et al.
http://www.ietf.org/proceedings/87/slides/slides-87-nwcrg-4.pdf


As David and Lars pointed out, you will have to be very careful NOT to inadvertently conceal congestion loss from the sender. You may want to collaborate with these groups...
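
On the sender side, not concealing congestion means reacting to the echoed signal the same way as to a detected loss, but at most once per RTT. The sketch below is illustration only: the class name, the window measured in segments, and the floor of 2 segments are assumptions, not something specified in the draft.

```python
class EceReactingSender:
    """Hypothetical sender-side sketch; cwnd is counted in segments."""

    def __init__(self, cwnd: int = 10) -> None:
        self.cwnd = cwnd
        self.reduced_this_rtt = False
        self.cwr_pending = False  # set CWR on the next data segment sent

    def on_ack(self, ece: bool) -> None:
        # React to ECE like a loss, but only once per round trip.
        if ece and not self.reduced_this_rtt:
            self.cwnd = max(self.cwnd // 2, 2)  # assumed floor of 2 segments
            self.reduced_this_rtt = True
            self.cwr_pending = True

    def on_rtt_elapsed(self) -> None:
        # A new round trip begins; a further ECE may trigger a new reduction.
        self.reduced_this_rtt = False
```

Starting from a window of 16 segments, two ACKs carrying ECE within the same RTT reduce the window to 8 only once; a further ECE in the next RTT reduces it to 4.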

Best regards,
  Richard
 



> -----Original Message-----
> From: tsvwg [mailto:tsvwg-bounces@ietf.org] On Behalf Of Black, David
> Sent: Friday, 13 February 2015 00:50
> To: wangjinzhu; Eggert, Lars
> Cc: tsvwg@ietf.org; '邓灵莉/Lingli Deng'
> Subject: Re: [tsvwg] New Version Notification for
> draft-wang-tsvwg-tcp-coding-00.txt
> 
> Jinzhu,
> 
> > >+1 - the baseline approach should be ECN-like behavior where the
> > >recovered data is delivered, but the loss is communicated to the
> > >sender in order to cause the sender's congestion control logic to
> > >react (ECN treats a CE-marked packet as if it were dropped). I would
> > >suggest using at least the ECN signaling, even if the packets aren't
> > >marked as using ECN, as ECN has the receiver-to-sender signaling
> > >worked out (i.e., please don't try to reinvent that).
> >
> > [Jinzhu] Thanks, this suggestion is important to coded TCP. By using
> > ECN, network congestion can be explicitly signaled, and the sender
> > can adjust its congestion window promptly when congestion occurs. I will
> > work on it and further update the draft to combine coded TCP with ECN.
> 
> That helps, but it's not that simple, sorry ...
> 
> When ECN is used, drops may still indicate congestion (i.e., use of ECN
> does not result in all drops being due to link errors or the like and
> never due to congestion).  There are at least two reasons for this:
> 
> 	- Bottleneck is at a node that does not implement ECN and hence
> 		drops instead of applying a CE mark.
> 	- ECN-using node gets sufficiently overloaded that it has to drop
> 		instead of forwarding with a CE mark.
> 
> The primary piece of ECN that I'm recommending that you use is the
> receiver-to-sender TCP signaling (i.e., the CWR and ECE flags, and
> associated procedures - see Section 6 of RFC 3168).  Every time the
> receiver sees a loss, it should signal congestion to the sender, even if
> the receiver successfully recovers the data from the redundancy coding.
> As noted above, please just use the ECN signaling support in TCP, don't
> try to reinvent it.
> 
> > [Jinzhu] I think the work on distinguishing link error loss from
> > congestion loss can be classified into explicit and implicit
> > solutions. ECN belongs to the explicit solutions.
> 
> Actually, ECN can't do that, sorry, see above.
> 
> > [Jinzhu] On the other hand, even if the current coded TCP cannot
> > distinguish random loss from congestion loss, it can control
> > congestion in another way, though this behavior differs from
> > current Internet congestion control. For example, suppose we set the
> > redundancy to 10% in coded TCP. In standard TCP, when the network
> > bottleneck utilization reaches 100%, the TCP sender reduces its
> > sending rate in response to the congestion (this is just a simple
> > illustration; the 100% is inaccurate). In coded TCP, by contrast,
> > when the bottleneck utilization reaches a higher value (perhaps
> > 110%), the redundancy cannot recover the loss, and thus the coded
> > TCP sender reduces its sending rate to control Internet congestion.
> > Currently, I’m not sure whether this behavior will impact the
> > network or whether it will work well in the network. Discussions
> > and suggestions will be greatly appreciated.
> 
> This behavior can be expected to impact the network - it will be your
> responsibility to demonstrate that this is safe (i.e., the default
> hypothesis to start from is that it is unsafe, and you need to show us
> under what assumptions and for what sort of networks it is safe).  Among
> other things, this looks like it could worsen bufferbloat problems by
> piling more packets into a congested network than ordinary TCP would.
> 
> Thanks,
> --David
> 
> 
> > -----Original Message-----
> > From: wangjinzhu [mailto:wangjinzhu@chinamobile.com]
> > Sent: Thursday, February 12, 2015 4:24 AM
> > To: Black, David; 'Eggert, Lars'
> > Cc: tsvwg@ietf.org; '邓灵莉/Lingli Deng'
> > Subject: Re: [tsvwg] New Version Notification for
> > draft-wang-tsvwg-tcp-coding-00.txt
> >
> > Hi David
> >
> > Thanks for your comments.
> > Please see inline.
> >
> > Best regards,
> > Jinzhu
> >
> > > -----Original Text----
> > > Sender: Black, David [mailto:david.black@emc.com]
> > > Date: 2015.2.11 3:01
> > > Receiver: Eggert, Lars; wangjinzhu
> > > cc: tsvwg@ietf.org; 邓灵莉/Lingli Deng
> > > Subject: RE: [tsvwg] New Version Notification for
> > draft-wang-tsvwg-tcp-coding-00.txt
> > >
> > >> Your TCP seems to not reduce CWND if lost data can be recovered
> > >> through coding, which is a change to standard TCP congestion control.
> > >
> > >+1 - the baseline approach should be ECN-like behavior where the
> > >recovered data is delivered, but the loss is communicated to the
> > >sender in order to cause the sender's congestion control logic to
> > >react (ECN treats a CE-marked packet as if it were dropped). I would
> > >suggest using at least the ECN signaling, even if the packets aren't
> > >marked as using ECN, as ECN has the receiver-to-sender signaling
> > >worked out (i.e., please don't try to reinvent that).
> >
> > [Jinzhu] Thanks, this suggestion is important to coded TCP. By using
> > ECN, network congestion can be explicitly signaled, and the sender
> > can adjust its congestion window promptly when congestion occurs. I will
> > work on it and further update the draft to combine coded TCP with ECN.
> >
> > >The bigger concern is that this draft's mechanism is intended for
> > >deployment in an environment that exhibits high non-congestive losses,
> > >as described in the Introduction:
> > >
> >    In wireless network, there are lot of factors (e.g., weather
> >    conditions, urban obstacles, multi-path interferences, limited
> >    coverage, mobility of the handset, etc.,) leading to unstable air-
> >    link.  As a result, wireless links exhibit much higher BERs than
> >    wired links.  Since all packet losses are considered as network
> >    congestion in standard TCP, packet loss caused by the high BER of the
> >    wireless link would trigger the TCP sender to reduce its sending rate
> >    unnecessarily.  This leads to the drastic decrease of TCP's
> >    throughput in the wireless network.
> > >
> > >Unfortunately, this draft goes to the other extreme, and treats any
> > >recoverable loss (and with good choice of coding algorithm, most of
> > >them will be) as not indicating congestion.  As Lars notes, that's not
> > >acceptable because it fails to react to recovered losses that do
> > >actually indicate congestion.
> > >
> > >There has been some prior work on trying to determine when a loss is
> > >caused by congestion vs. link errors, but I can't quickly provide any
> > >pointers - perhaps Lars can.
> >
> > [Jinzhu] Yes, we are trying to recover link error loss at the receiver
> > side by using redundancy coding. However, since the coded TCP sender
> > cannot distinguish wireless link error loss from congestion loss, it
> > treats link error loss the same as congestion loss.
> >
> > [Jinzhu] I think the work on distinguishing link error loss from
> > congestion loss can be classified into explicit and implicit
> > solutions. ECN belongs to the explicit solutions. An explicit solution
> > requires routers to disclose the network state and in some cases may
> > not be easy to deploy, since some routers may not enable ECN. Thus,
> > if an implicit solution can be applied to coded TCP, it will be
> > easier to deploy. I’m not sure whether there is a method that can
> > distinguish link error loss from congestion loss without feedback
> > from the network. Suggestions on an implicit solution will be greatly
> > appreciated.
> >
> > [Jinzhu] On the other hand, even if the current coded TCP cannot
> > distinguish random loss from congestion loss, it can control
> > congestion in another way, though this behavior differs from
> > current Internet congestion control. For example, suppose we set the
> > redundancy to 10% in coded TCP. In standard TCP, when the network
> > bottleneck utilization reaches 100%, the TCP sender reduces its
> > sending rate in response to the congestion (this is just a simple
> > illustration; the 100% is inaccurate). In coded TCP, by contrast,
> > when the bottleneck utilization reaches a higher value (perhaps
> > 110%), the redundancy cannot recover the loss, and thus the coded
> > TCP sender reduces its sending rate to control Internet congestion.
> > Currently, I’m not sure whether this behavior will impact the
> > network or whether it will work well in the network. Discussions
> > and suggestions will be greatly appreciated.
> >
> >
> > >Thanks,
> > >--David
> >
> > >> -----Original Message-----
> > >> From: tsvwg [mailto:tsvwg-bounces@ietf.org] On Behalf Of Eggert,
> > >> Lars
> > >> Sent: Tuesday, February 10, 2015 3:55 AM
> > >> To: wangjinzhu
> > >> Cc: tsvwg@ietf.org; 邓灵莉/Lingli Deng
> > >> Subject: Re: [tsvwg] New Version Notification for
> > >> draft-wang-tsvwg-tcp-coding-00.txt
> > >>
> > >> Hi,
> > >>
> > >> I'm sorry, but this statement:
> > >>
> > >> On 2015-2-10, at 07:03, wangjinzhu <wangjinzhu@chinamobile.com> wrote:
> > >> > The coded TCP does not modify the TCP's congestion control.
> > >>
> > >> is contradicted by this statement:
> > >>
> > >> > In the transmission, when packet loss occurs, if the receiver
> > >> > side can recover the loss by using redundancy (packet loss is
> > >> > low), the sender side does not see the packet loss and thus keeps
> > >> > on increasing the congestion window.
> > >>
> > >> Your TCP seems to not reduce CWND if lost data can be recovered
> > >> through coding, which is a change to standard TCP congestion control.
> > >>
> > >> Lars
> >
> >