Re: [rtcweb] RTP Usage: Is RTP Retransmission REQUIRED or RECOMMENDED

Randell Jesup <randell-ietf@jesup.org> Mon, 02 July 2012 07:24 UTC

On 7/1/2012 8:17 AM, Stefan Holmer wrote:
>
>
> On Sun, Jul 1, 2012 at 8:44 AM, Randell Jesup <randell-ietf@jesup.org> wrote:
>
>     On 6/29/2012 3:30 AM, Stefan Holmer wrote:
>
>         Doesn't this in the end boil down to what tools we have for baseline
>         resilience? Retransmissions are fairly easy to implement and will in
>         most cases give a much better experience than for example relying on
>         periodic key frames or doing key frame requests. Also it isn't
>         necessary for the decoder to support decoding streams with lost
>         packets.
>
>
>     "Better experience" incorporates a bunch of assumptions.
>
>     Assuming we're talking video only:
>
>     If you want it to be seamless, you have to be running a jitter
>     buffer depth of at least 1 RTT plus
>     time-to-decide-packet-may-be-lost plus random-constant.  The
>     second part of that can be tricky and network-state dependent.  On
>     corporate networks, I'd see sudden 100ms changes in base delay over
>     1-3 packets; a short time-to-decide would flag each of those as
>     missing.  This is ok (the re-xmit will get ignored as a dup), but
>     adds packets at an apparent bad point for the network.
>
>     That said, small RTTs certainly happen within LANs and corporations,
>     and between neighbors in a single provider often.
>
>     Note that often/normally with small RTTs you also get reasonably
>     small jitter, and thus the adaptive jitter buffer might run very
>     small numbers, so one might need to put a lower limit on the jitter
>     buffer in these cases.
>
>     The alternatives:
>
>     1) request re-xmit and freeze video queue when the jitter buffer
>     runs dry.  If the packet comes in ASAP, freeze <= RTT+constant
>     (modulo application scheduling delay at the sender side).  Then you
>     can repair the broken frame, decode it, and then decode any
>     following frames. Typically after a freeze you drop a frame or more;
>     you might end up showing some of them depending on decode speed and
>     how long the freeze lasted.  If the re-xmit (or the request) is
>     lost, then delays can stretch much longer, since you need to wait
>     RTT+bigger constant before requesting again.
>
>
> Yes, this is what I would consider the simplest approach, which leads to
> a system which can guarantee a complete stream thus not requiring the
> decoder to be able to decode with errors, and at the same time having a
> good chance of recovering within slightly longer than one RTT.

Simple, yes.  There are aspects of this I don't like:

* Frozen for a minimum of 1 RTT+.  With 100ms RTT, you're probably at a
minimum of 4 frames, perhaps 5, and if you lose another packet during
those frames, playback will move forward but not catch up on sync, which
can lead to jerky, out-of-sync video.  If you lose a re-xmit packet,
you're talking a minimum of 2x the delay (and in reality a fair bit more,
because you have to decide when to ask for a re-re-xmit...).
* Decoding with errors: not generally a problem.  The issue is whether
you prefer (or think the users prefer) a short (1 frame) freeze plus a
1RTT-1frame-ish period with motion but errors, or a freeze of at least
1 RTT+, maybe/occasionally much more.  Our experience at WorldGate was
that keeping motion active was preferable to freezing or, worse, loss of
sync, even if there are errors.

That conclusion is debatable; the answer may vary with RTT, type of
call, resolution, bandwidth, and type of application, and it isn't
something the IETF should mandate here.
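
A rough sketch of the arithmetic behind the frame counts in the first
bullet above. It assumes 30 fps video, and it treats the
time-to-decide-a-packet-is-lost plus scheduling slack as a single
overhead_ms value; the 20/40 ms overheads and the 150 ms re-request wait
are illustrative assumptions, not measurements.

```python
def freeze_frames(rtt_ms, overhead_ms, fps=30):
    """Frames shown frozen while waiting rtt_ms + overhead_ms, rounded up."""
    total_ms = rtt_ms + overhead_ms
    return -(-(total_ms * fps) // 1000)  # integer ceiling division


def freeze_frames_after_lost_rexmit(rtt_ms, overhead_ms, rerequest_wait_ms, fps=30):
    """Freeze length when the first re-xmit (or the request for it) is lost.

    The receiver waits rerequest_wait_ms before asking again, then needs
    another rtt_ms + overhead_ms for the second attempt to arrive.
    """
    return freeze_frames(rerequest_wait_ms + rtt_ms, overhead_ms, fps)
```

Under these assumptions, freeze_frames(100, 20) gives 4 frames and
freeze_frames(100, 40) gives 5. With a 150 ms re-request wait,
freeze_frames_after_lost_rexmit(100, 20, 150) gives 9 frames, i.e.
somewhat more than 2x the single-loss case.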

>     2) request IDR (or refresh of corrupted slice).  Downside here is
>     that you need to wait for the next frame encode at the sender side
>     (so 0-33ms normally, or perhaps 0-100ms), and then the frame
>     typically takes many packets to send (which also may be lost, and
>     take time to receive especially on low-BW links), and IDR initial
>     quality may be noticeably lower.  The receiver may freeze the video
>     until the IDR is received or keep decoding with errors.  If there's
>     a significant chance of the IDR losing a packet, a freeze could be
>     substantial time.
>
>
> If the losses are related to congestion, sending an IDR is usually a bad
> idea. And as you are saying, if the losses are due to corruption, the
> chance of having a loss in an IDR is much bigger than for a P-frame. I
> don't think this approach is nearly as good as retransmissions, but it
> is a possible baseline option.

Agreed, though in practice if you're not getting hammered with high loss 
and you keep the quantization high on the IDR, it will usually get 
through.  Obviously as bitrates increase the number of packets in the 
IDR goes up, and risk of loss in the IDR goes up.  If correction is only 
for a slice, this risk may not be high.
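
To put rough numbers on how IDR size drives the risk, here is a small
sketch. It assumes packet losses are independent with a fixed loss rate
(real losses are often bursty, so treat this as a first approximation),
and the 1200-byte payload size is an assumed value, not something from
this thread.

```python
def idr_packets(idr_bytes, payload_bytes=1200):
    """Number of packets needed to carry an IDR of idr_bytes (rounded up)."""
    return -(-idr_bytes // payload_bytes)  # integer ceiling division


def p_idr_damaged(num_packets, loss_rate):
    """Probability that at least one of num_packets is lost.

    Assumes independent losses: P = 1 - (1 - loss_rate) ** num_packets.
    """
    return 1.0 - (1.0 - loss_rate) ** num_packets
```

At 1% loss, a 10-packet IDR is damaged roughly 9.6% of the time, and the
figure grows with the packet count; a single-packet slice refresh stays
at about 1%.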

>     3) request repair.  Note that #2 and #3 may be the same at the
>     receiver side, with the sender deciding the best available repair.
>     In this case, the sender would, instead of an IDR, use some other
>     set of (smaller) packets to create an error-free up-to-date state at
>     the receiver, such as encoding using a long-term-reference-frame, or
>     using some other known-error-free frame in the receiver.  Note that
>     repair is basically just a normal p-frame, albeit with somewhat
>     lower quality and/or somewhat higher bandwidth used.  A big plus is
>     that the end state is close to the same as re-transmit, but you
>     don't have to speed through decoding any skipped frames.  Repair
>     also allows the receiver to use alternate recovery mechanisms than
>     simply freezing, including simply continuing to decode p-frames
>     ignoring the lost frame.  This induces artifacts, but keeps motion
>     loss to a minimum, especially in longer-RTT links.
>
>
> Yes, I agree that this is also a good approach. It may be more expensive
> than #1 if the long-term reference is old, but in general this would be
> a good baseline as well.

There are mechanisms in VP8 that can be leveraged here.

> It is possible to continue decoding p-frames, ignoring the loss, with
> retransmissions as well. However it's a bit more complicated and you
> will have to speed through some frames at the decoder when the rtx
> arrives, although you can skip rendering them.

Yes, you'd have to clone the decoder state at the loss point in order to 
"rewind" and then correct.

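The clone-and-rewind idea can be sketched with a toy model. Everything
here is hypothetical: ToyDecoder stands in for a real video decoder (each
"P-frame" is just a delta added to the previous picture), and
copy.deepcopy stands in for whatever state-cloning a real decoder would
need. It is only meant to show the control flow, not a codec API.

```python
import copy


class ToyDecoder:
    """Stand-in decoder: the picture is one number, a P-frame is a delta."""

    def __init__(self):
        self.picture = 0

    def decode(self, delta):
        self.picture += delta
        return self.picture


def play_with_rewind(deltas, lost_index, rtx_after_index):
    """Return the pictures rendered when frame lost_index is lost and its
    retransmission arrives right after frame rtx_after_index is decoded."""
    decoder = ToyDecoder()
    snapshot = None  # clone of the decoder state taken at the loss point
    replay = []      # frames decoded (with errors) since the loss
    rendered = []
    for i, delta in enumerate(deltas):
        if i == lost_index:
            snapshot = copy.deepcopy(decoder)  # clone state, skip the frame
            continue
        rendered.append(decoder.decode(delta))  # keep motion, maybe with errors
        if snapshot is not None:
            replay.append(delta)
            if i == rtx_after_index:
                decoder = snapshot                  # rewind to the loss point
                decoder.decode(deltas[lost_index])  # apply the repaired frame
                for d in replay:                    # speed through the frames
                    decoder.decode(d)               # decoded since, no render
                snapshot, replay = None, []
    return rendered
```

For deltas [5, 3, -2, 4, 1] with frame 1 lost and the rtx arriving after
frame 3, the rendered pictures are [5, 3, 7, 11]: two frames are shown
with errors, and the last one matches the loss-free result.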
>
>
>     Note that #3 in particular should result in similar freeze length as
>     #1, perhaps 1 frame longer, if it freezes.  If it plays with errors
>     instead, the freeze is typically 1 frame (the lost one).
>
>     (Others?)
>
>     My normal preference is #3.  But, as discussed above, in low-RTT
>     networks re-xmit may be a good option and might give 0 frames of
>     freeze.  But you still have all the complexities about how to decide
>     when and how to use re-xmit; I think the 1 frame freeze is a
>     reasonable option and a reasonable fallback if the source doesn't
>     re-transmit.
>
>     Also, if retransmit is REQUIRED, and the media is gatewayed to other
>     sources (non-webrtc), and they don't support retransmit (likely),
>     then the gateway would need to buffer and act as a retransmit agent
>     (complicating the gateway).  If it's RECOMMENDED, this is not a big
>     deal and the gateway can stay simpler.
>
>
> Sure, that's a downside, and that may be a good enough reason for
> RECOMMENDED. The same goes for #3, right? Non-webrtc sources may not
> support long-term references and therefore have to rely on IDRs.

Yup, and most do today.  The point is you report the loss as best you 
can (PLI, etc) and let the sender fix it as best it can.

I do think this is a strong argument for RECOMMENDED (or even something
less strong).  We could of course REQUIRE support but warn that even if
it is supported, an endpoint might not agree to retransmits.  That does
make the position of a gateway with respect to this clause ...
interesting, but in practice it could reject re-xmits unless it knows the
non-webrtc source supports them.  So this doesn't mandate RECOMMENDED over
REQUIRED, but it is justification for RECOMMENDED.
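
The "report the loss and let the sender fix it as best it can" approach
could look something like the sketch below on the sender side. The names
and the order of preference are assumptions made for illustration; the
actual policy is an implementation choice.

```python
from enum import Enum


class Repair(Enum):
    RETRANSMIT = "retransmit the lost packet"
    REFERENCE_REPAIR = "p-frame from a known-error-free reference"
    IDR = "full IDR"


def choose_repair(rtx_negotiated, packet_still_buffered, has_known_good_reference):
    """Pick the best available repair for a reported loss (assumed policy)."""
    if rtx_negotiated and packet_still_buffered:
        return Repair.RETRANSMIT
    if has_known_good_reference:
        return Repair.REFERENCE_REPAIR
    return Repair.IDR  # last resort, e.g. a legacy source behind a gateway
```

A gateway in front of a non-webrtc source that supports neither
retransmission nor long-term references would end up with
choose_repair(False, False, False), which is Repair.IDR.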

> Just saying that it would be nice to have a baseline which is something
> better than requesting IDRs.

Yes.  But I don't think this is in the purview of the spec, unless you 
want to tie it to congestion control issues.

-- 
Randell Jesup
randell-ietf@jesup.org