Re: [rtcweb] RTP Usage: Is RTP Retransmission REQUIRED or RECOMMENDED

On Mon, Jul 2, 2012 at 9:24 AM, Randell Jesup <randell-ietf@jesup.org> wrote:

> On 7/1/2012 8:17 AM, Stefan Holmer wrote:
>
>>
>>
>> On Sun, Jul 1, 2012 at 8:44 AM, Randell Jesup <randell-ietf@jesup.org> wrote:
>>
>>     On 6/29/2012 3:30 AM, Stefan Holmer wrote:
>>
>>         Doesn't this in the end boil down to what tools we have for
>>         baseline resilience? Retransmissions are fairly easy to implement
>>         and will in most cases give a much better experience than, for
>>         example, relying on periodic key frames or doing key frame
>>         requests. Also, it isn't necessary for the decoder to support
>>         decoding streams with lost packets.
>>
>>
>>     "Better experience" incorporates a bunch of assumptions.
>>
>>     Assuming we're talking video only:
>>
>>     If you want it to be seamless, you have to be running a jitter
>>     buffer depth of at least 1 RTT plus
>>     time-to-decide-packet-may-be-lost plus random-constant.  The
>>     second part of that can be tricky and network-state dependent.  On
>>     corporate networks, I'd see sudden 100ms changes in base delay over
>>     1-3 packets; a short time-to-decide would flag each of those as
>>     missing.  This is ok (the re-xmit will get ignored as a dup), but
>>     adds packets at an apparent bad point for the network.
>>
>>     That said, small RTTs certainly happen within LANs and corporations,
>>     and between neighbors in a single provider often.
>>
>>     Note that often/normally with small RTTs you also get reasonably
>>     small jitter, and thus the adaptive jitter buffer might run very
>>     small numbers, so one might need to put a lower limit on the jitter
>>     buffer in these cases.
>>
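
A rough sketch of the depth rule quoted above, assuming the adaptive jitter buffer proposes its own target and a floor is applied on top of it. The function and parameter names are illustrative and are not taken from any implementation discussed in this thread.

```python
def min_jitter_buffer_ms(rtt_ms, loss_decision_ms, margin_ms, adaptive_target_ms):
    """Return a jitter buffer depth that leaves room for one retransmission.

    The floor is 1 RTT + time-to-decide-a-packet-may-be-lost + a constant
    margin. The adaptive target is used only when it is already larger.
    """
    floor_ms = rtt_ms + loss_decision_ms + margin_ms
    return max(adaptive_target_ms, floor_ms)


# Low-RTT link where the adaptive buffer alone would run too shallow.
depth = min_jitter_buffer_ms(rtt_ms=20, loss_decision_ms=10,
                             margin_ms=5, adaptive_target_ms=15)
```

With these example numbers the floor (35 ms) wins over the adaptive target (15 ms), which is the "lower limit" case described for small-RTT, small-jitter networks.
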
>>     The alternatives:
>>
>>     1) request re-xmit and freeze video queue when the jitter buffer
>>     runs dry.  If the packet comes in ASAP, freeze <= RTT+constant
>>     (modulo application scheduling delay at the sender side).  Then you
>>     can repair the broken frame, decode it, and then decode any
>>     following frames. Typically after a freeze you drop a frame or more;
>>     you might end up showing some of them depending on decode speed and
>>     how long the freeze lasted.  If the re-xmit (or the request) is
>>     lost, then delays can stretch much longer, since you need to wait
>>     RTT+bigger constant before requesting again.
>>
>>
>> Yes, this is what I would consider the simplest approach, which leads to
>> a system which can guarantee a complete stream thus not requiring the
>> decoder to be able to decode with errors, and at the same time having a
>> good chance of recovering within slightly longer than one RTT.
>>
>
> Simple, yes.  There are aspects of this I don't like:
>
> * Frozen for minimum of 1RTT+.  With 100ms RTT, you're probably at a
> minimum of 4 frames, perhaps 5, and if you lose another packet during those
> frames, it will move forward but not catch up sync, which can lead to
> jerky, out of sync video.  If you lose a re-xmit packet, you're talking
> minimum 2x the delay (and in reality a fair bit more because you have to
> decide when to ask for a re-re-xmit...).
> * Decoding with errors: not generally a problem.  The issue is whether you
> prefer (or think the users prefer) a short (1 frame) freeze plus
> 1RTT-1frame-ish period with motion but errors, or a 1+RTT min freeze,
> maybe/occasionally much more.  Our experience at WorldGate was that keeping
> motion active was preferable to freezing or worse loss of sync, even if
> there are errors.
>

Yes, it doesn't solve all problems.
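
For reference, the frame counts quoted above can be reproduced with a small calculation. This is only a sketch: the 30 fps frame rate and the overhead values (loss detection plus scheduling delay) are assumptions, and the model simply multiplies the delay by the number of retransmission attempts, which is the optimistic "minimum 2x" case.

```python
import math


def frozen_frames(rtt_ms, overhead_ms, fps=30, rtx_attempts=1):
    """Estimate how many frame intervals pass while waiting for a re-xmit.

    Each attempt is assumed to cost one RTT plus a fixed overhead; a lost
    re-xmit adds another full attempt.
    """
    freeze_ms = rtx_attempts * (rtt_ms + overhead_ms)
    return math.ceil(freeze_ms * fps / 1000)


one_try_small = frozen_frames(100, 20)                   # 100 ms RTT, 20 ms overhead
one_try_large = frozen_frames(100, 50)                   # same RTT, 50 ms overhead
lost_rtx = frozen_frames(100, 20, rtx_attempts=2)        # first re-xmit lost
```

Under these assumptions the single-attempt cases come out at 4 and 5 frames, and a lost re-xmit at 8.
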

>
> That conclusion is debatable, and the answer may vary according to RTT,
> type of call, resolution, bandwidth, type of application, and isn't
> something the IETF should mandate here.
>
>
>>     2) request IDR (or refresh of corrupted slice).  Downside here is
>>     that you need to wait for the next frame encode at the sender side
>>     (so 0-33ms normally, or perhaps 0-100ms), and then the frame
>>     typically takes many packets to send (which also may be lost, and
>>     take time to receive especially on low-BW links), and IDR initial
>>     quality may be noticeably lower.  The receiver may freeze the video
>>     until the IDR is received or keep decoding with errors.  If there's
>>     a significant chance of the IDR losing a packet, a freeze could be
>>     substantial time.
>>
>>
>> If the losses are related to congestion, sending an IDR is usually a bad
>> idea. And as you are saying, if the losses are due to corruption, the
>> chance of having a loss in an IDR is much bigger than for a P-frame. I
>> don't think this approach is nearly as good as retransmissions, but it
>> is a possible baseline option.
>>
>
> Agreed, though in practice if you're not getting hammered with high loss
> and you keep the quantization high on the IDR, it will usually get through.
>  Obviously as bitrates increase the number of packets in the IDR goes up,
> and risk of loss in the IDR goes up.  If correction is only for a slice,
> this risk may not be high.
>
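
A small sketch of why the risk grows with the number of packets in the IDR. It assumes independent per-packet loss, which is a simplification (congestion loss is usually bursty), and the packet counts are made up for illustration.

```python
def frame_loss_probability(packet_loss_rate, packets_in_frame):
    """Probability that at least one packet of a frame is lost.

    Assumes each packet is lost independently with the same probability.
    """
    return 1.0 - (1.0 - packet_loss_rate) ** packets_in_frame


# 1% packet loss: a small P-frame or single slice versus a large IDR.
p_small = frame_loss_probability(0.01, 3)
p_idr = frame_loss_probability(0.01, 30)
```

With 1% loss, a 3-packet frame is hit roughly 3% of the time and a 30-packet IDR roughly 26% of the time, which is why keeping the IDR small (or repairing only a slice) matters.
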
>
>>     3) request repair.  Note that #2 and #3 may be the same at the
>>     receiver side, with the sender deciding the best available repair.
>>     In this case, the sender would, instead of an IDR, use some other
>>     set of (smaller) packets to create an error-free up-to-date state at
>>     the receiver, such as encoding using a long-term-reference-frame, or
>>     using some other known-error-free frame in the receiver.  Note that
>>     repair is basically just a normal p-frame, albeit with somewhat
>>     lower quality and/or somewhat higher bandwidth used.  A big plus is
>>     that the end state is close to the same as re-transmit, but you
>>     don't have to speed through decoding any skipped frames.  Repair
>>     also allows the receiver to use alternate recovery mechanisms than
>>     simply freezing, including simply continuing to decode p-frames
>>     ignoring the lost frame.  This induces artifacts, but keeps motion
>>     loss to a minimum, especially in longer-RTT links.
>>
>>
>> Yes, I agree that this is also a good approach. It may be more expensive
>> than #1 if the long-term reference is old, but in general this would be
>> a good baseline as well.
>>
>
> There are mechanisms in VP8 that can be leveraged here.
>
>
Yes.

>
>> It is possible to continue decoding p-frames, ignoring the loss, with
>> retransmissions as well. However it's a bit more complicated and you
>> will have to speed through some frames at the decoder when the rtx
>> arrives, although you can skip rendering them.
>>
>
> Yes, you'd have to clone the decoder state at the loss point in order to
> "rewind" and then correct.
>
>
>
>>
>>     Note that #3 in particular should result in similar freeze length as
>>     #1, perhaps 1 frame longer, if it freezes.  If it plays with errors
>>     instead, the freeze is typically 1 frame (the lost one).
>>
>>     (Others?)
>>
>>     My normal preference is #3. But, as discussed above, in low-RTT
>>     networks re-xmit may be a good option and might give 0-frames of freeze.
>>     But, you still have all the complexities about how to decide when
>>     and how to use re-xmit; I think the 1 frame freeze is a reasonable
>>     option and a reasonable fallback if the source doesn't re-transmit.
>>
>>     Also, if retransmit is REQUIRED, and the media is gatewayed to other
>>     sources (non-webrtc), and they don't support retransmit (likely),
>>     then the gateway would need to buffer and act as a retransmit agent
>>     (complicating the gateway).  If it's RECOMMENDED, this is not a big
>>     deal and the gateway can stay simpler.
>>
>>
>> Sure, that's a downside, and that may be a good enough reason for
>> RECOMMENDED. The same goes for #3, right? Non-webrtc sources may not
>> support long-term references and therefore have to rely on IDRs.
>>
>
> Yup, and most do today.  The point is you report the loss as best you can
> (PLI, etc) and let the sender fix it as best it can.
>

Yes, that is nice with #3, and if we can assume most endpoints will support
#3 I agree that is the way to go, even though the baseline (hopefully used
for a very small %) will have to be IDRs.
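
As a sketch of "report the loss and let the sender fix it as best it can", the sender-side choice could look roughly like this. Everything here is hypothetical: the names, the ordering, and in particular the 30 ms low-RTT threshold are assumptions for illustration, not something proposed for the spec.

```python
from enum import Enum


class Repair(Enum):
    RETRANSMIT = "re-xmit the lost packet"                    # option #1
    REFERENCE_REPAIR = "p-frame from a known-good reference"  # option #3
    IDR = "full IDR"                                          # option #2, baseline


def choose_repair(rtx_negotiated, rtt_ms, has_known_good_reference,
                  low_rtt_threshold_ms=30):
    """Pick a repair after the receiver reports a loss (e.g. via PLI).

    The threshold is an assumed tuning knob, not a recommended value.
    """
    if rtx_negotiated and rtt_ms <= low_rtt_threshold_ms:
        return Repair.RETRANSMIT
    if has_known_good_reference:
        return Repair.REFERENCE_REPAIR
    return Repair.IDR


# A gatewayed non-webrtc source with no re-xmit and no long-term reference.
fallback = choose_repair(rtx_negotiated=False, rtt_ms=100,
                         has_known_good_reference=False)
```

The receiver behaves the same in every case; only the sender's answer changes, and a source that supports none of the better options ends up on the IDR baseline.
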

>
> I do think this is a strong argument for RECOMMEND (or even something less
> strong).  We could of course REQUIRE support but warn that even if
> supported, an endpoint might not agree to retransmits.  It does make the
> position of a gateway to this clause ... interesting, but in practice it
> could reject re-xmits unless it knows the non-webrtc source supports them.
>  So this doesn't mandate RECOMMEND over REQUIRE, but it is justification
> for RECOMMEND.

>
>> Just saying that it would be nice to have a baseline which is something
>> better than requesting IDRs.
>>
>
> Yes.  But I don't think this is in the purview of the spec, unless you
> want to tie it to congestion control issues.
>
>
> --
> Randell Jesup
> randell-ietf@jesup.org
>
> _______________________________________________
> rtcweb mailing list
> rtcweb@ietf.org
> https://www.ietf.org/mailman/listinfo/rtcweb
>