Re: [rtcweb] Transcoding

John Leslie <john@jlc.net> Wed, 15 January 2014 14:11 UTC

Espen Berger (espeberg) <espeberg@cisco.com> wrote:
> 
> Comments on transcoding, based on lab testing and research.

   :^)

> * In a side-by-side comparison of single encode/decode and
>   encode/transcode/decode, we can see that the untranscoded video
>   stream can have a 20-35% lower bitrate at the same visual quality...

   This tradeoff will be acceptable in many cases, IMHO.
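
   For scale, the quoted 20-35% saving can be turned around into the
extra bitrate a transcoded path would need for the same quality. This
is only a back-of-envelope sketch: it assumes the percentages are
measured relative to the transcoded stream's bitrate (the quoted text
does not say), and the function name is hypothetical.

```python
def transcoding_overhead(saving_fraction):
    """Extra bitrate (as a fraction) the transcoded path needs for equal quality.

    Assumption: the untranscoded stream is `saving_fraction` lower than the
    transcoded one, i.e. untranscoded = transcoded * (1 - saving_fraction).
    """
    if not 0.0 <= saving_fraction < 1.0:
        raise ValueError("saving_fraction must be in [0, 1)")
    return 1.0 / (1.0 - saving_fraction) - 1.0


# The two ends of the range quoted above.
for saving in (0.20, 0.35):
    overhead = transcoding_overhead(saving)
    print(f"{saving:.0%} saving -> {overhead:.0%} more bitrate when transcoding")
```

   Under that assumption the transcoded path needs roughly 25% to 54%
more bitrate for the same visual quality.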

> * Lip-sync should be within 50 ms; this is the conclusion of a
>   Norwegian-based Ph.D. student and also the recommendation from EBU...

   The EBU document discusses HDTV programming (and indeed recommends
audio no more than 5 msec early or 15 msec late). They claim to base
this on failure of lip-sync becoming "perceptible to 50% of observers".

   The 50 msec number seems plausible, but I have no basis to judge.

   Lip-sync can always be accomplished by delaying one or the other:
this gives us a trade-off (which I'd prefer to be under application
control).
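
   A minimal sketch of that trade-off, assuming the application can
measure each stream's capture-to-render latency and picks its own skew
tolerance. The function and parameter names are hypothetical, not taken
from any WebRTC API.

```python
def lipsync_delays(audio_latency_ms, video_latency_ms, tolerance_ms=0.0):
    """Return (extra_audio_delay_ms, extra_video_delay_ms).

    Only the stream that is ahead gets delayed, and only by enough to
    bring the remaining skew down to `tolerance_ms`. A larger tolerance
    trades lip-sync accuracy for lower end-to-end latency.
    """
    skew = video_latency_ms - audio_latency_ms  # > 0 means video lags audio
    excess = max(abs(skew) - tolerance_ms, 0.0)
    if skew > 0:
        return (excess, 0.0)   # hold back the audio
    return (0.0, excess)       # hold back the video (or nothing at all)


# Illustrative latencies only: audio at 80 ms, video at 180 ms.
print(lipsync_delays(80, 180))                   # strict sync
print(lipsync_delays(80, 180, tolerance_ms=50))  # accept 50 ms of skew
```

   With strict sync the audio is held back by the full 100 msec skew;
accepting 50 msec of skew halves the added delay.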

> * Advanced media resilience techniques like LTRF, disposable frames
>   and more are hard to map between different video codecs, so
>   transcoding will likely reduce robustness for packet loss. 

   (discussed elsewhere)

> * Transcoding adds delay, between 60 and 200 ms (depending on packet
>   loss and implementation).

   I have no basis to judge those numbers; but they seem high.

>   This has an impact since camera-to-screen latency should be below
>   300 ms to get a smooth conversation between two participants in a
>   video call. Tests on people show that glass-to-glass latency
>   should be below 330 ms for the latency to go unnoticed.

   Do you have a source for that?

   My gut feeling is that folks will tolerate at least 200 msec
glass-to-glass, but have problems when mouth-to-ear exceeds 150 msec.
Alas, I don't have a source for that... :^(
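
   To make the budget concrete, here is a rough sketch that subtracts
the quoted transcoding delay (60 to 200 msec) from a glass-to-glass
limit. The limits used are the 300 msec figure quoted above and the
200 msec gut-feel figure; the helper name is hypothetical, and lumping
everything else into one "remaining" number is a simplifying assumption.

```python
def remaining_budget_ms(glass_to_glass_limit_ms, transcoding_delay_ms):
    """Latency left for everything other than transcoding.

    Assumption: "everything else" covers capture, encode, network,
    jitter buffer, decode and render, treated here as one lump.
    """
    return glass_to_glass_limit_ms - transcoding_delay_ms


for limit in (200, 300):          # gut-feel and quoted limits, in msec
    for transcode in (60, 200):   # quoted transcoding delay range, in msec
        left = remaining_budget_ms(limit, transcode)
        print(f"limit {limit} ms, transcoding {transcode} ms -> {left} ms left")
```

   At the high end of the quoted range, transcoding alone would use up
the whole 200 msec figure and two thirds of the 300 msec one.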

> All in all, we should avoid transcoding for video conferencing use
> cases to get a good user experience.

   I quite agree it's worth "avoiding" -- but that's not at all the
same as saying we should try to prevent it.

--
John Leslie <john@jlc.net>