[rtcweb] Transcoding Delay

John Leslie <john@jlc.net> Wed, 15 January 2014 13:44 UTC

Date: Wed, 15 Jan 2014 08:44:38 -0500
From: John Leslie <john@jlc.net>
To: Stephan Wenger <stewe@stewe.org>
Cc: "rtcweb@ietf.org" <rtcweb@ietf.org>
Subject: [rtcweb] Transcoding Delay

Stephan Wenger <stewe@stewe.org> wrote:
> 
> As for the delay introduced by the transcoding engine alone, let me note
> that the single or two frame delay commonly attributed to transcoding is
> realistic only for straightforward PPP type coding.

   I agree that seems optimistic...

> My understanding is that VP8, in its default encoder settings, uses
> much more complex GOP structures for error resilience and coding
> efficiency reasons.  H.264 constrained baseline certainly also allows
> that (it's even more flexible in that regard). Complex GOP structures
> are anywhere between helpful to essential (depending on your viewpoint)
> for error resilience.

   I would hope that error resilience could become orthogonal to
transcoding -- though I agree it doesn't seem to be there yet.

> Complex GOP structures do not necessarily increase the latency in an
> overall system design in the absence of errors (they are not based on
> MPEG-2 B frame type of thinking), but when error correction becomes
> necessary, a transcoder would need to emulate receiver behavior, and
> that certainly would induce delay of several hundred milliseconds
> which would not be observable in a transcoder-free operation (under
> the assumption that the end system chooses to display slightly
> distorted video) but will be observable and unavoidable in the
> transcoder-based design.

   The mistake here is optimizing for resilience assuming the same
coding/decoding rules when, in fact, the endpoints don't share the
same rules. Yes, that very much gets in the way of minimal latency.

> I'm very familiar with H.264 baseline, and somewhat familiar with the VP8
> syntax.  Based on this knowledge, I doubt that one can transcode between
> the two formats in the compressed domain, i.e. without full reconstruction
> to the sample level.  I am absolutely convinced that compressed domain
> transcoding in the direction from H.264 to VP8 is impossible, due to the
> larger feature set of H.264.  "Impossible" means here that I could create
> an H.264 compliant bitstream that cannot be transcoded into VP8 in the
> compressed domain, because there is no syntax equivalent for an H.264 tool
> in VP8.

   I agree. (And this is worth re-reading several times for anyone
participating in discussion of transcoding.)

> For example, H.264 constrained baseline supports more than three
> reference pictures, whereas VP8 uses no more than three (last frame,
> golden frame, and alternate reference frame).  It wouldn't take me long to
> extend this list to several pages.  I'm also fairly certain that people
> more familiar with the VP8 syntax could similarly identify VP8 features
> that have no direct counterpart in H.264.  Insofar, I take any statement
> about "availability" of compressed domain transcoding between the two
> coding schemes with a large grain of salt.  It may be possible (I don't
> know) to transcode in the compressed domain between H.264/VP8 bitstreams
> know) to transcode in the compressed domain between H.264/VP8 bitstreams
> when the respective input bitstream is specifically tailored for that
> purpose, and perhaps that is what people have in mind.

   It would also be possible to declare transcoding failure in cases like
these -- that's a tradeoff between precision and latency which should
be available at the application layer.
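
   A minimal sketch of such an application-layer policy, in Python. It
is only an illustration: H264StreamInfo, Plan and choose_plan are
hypothetical names, not any real codec API, and the reference-picture
count is just the one mismatch cited above. Passing this check is
necessary but not sufficient for compressed-domain transcoding.

```python
from dataclasses import dataclass
from enum import Enum

# VP8 reference buffers named in the quoted text: last, golden, alternate.
VP8_MAX_REFERENCE_FRAMES = 3


class Plan(Enum):
    TRY_COMPRESSED_DOMAIN = "try compressed-domain transcoding"
    FULL_DECODE = "decode to samples and re-encode (adds latency)"
    FAIL = "declare transcoding failure"


@dataclass
class H264StreamInfo:
    """Hypothetical summary of an incoming H.264 stream."""
    num_ref_frames: int  # reference pictures the stream may use


def choose_plan(info: H264StreamInfo, allow_full_decode: bool) -> Plan:
    """Pick a plan; the application decides whether latency is acceptable."""
    if info.num_ref_frames <= VP8_MAX_REFERENCE_FRAMES:
        # Only one of many possible mismatches has been ruled out here.
        return Plan.TRY_COMPRESSED_DOMAIN
    if allow_full_decode:
        return Plan.FULL_DECODE
    return Plan.FAIL


if __name__ == "__main__":
    print(choose_plan(H264StreamInfo(num_ref_frames=5), allow_full_decode=False))
```

   The allow_full_decode flag is where the application would express
its preference: False favors latency, True favors getting a picture
through.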

   But, IMHO, it's probably better to disable error-resilience features
that poison transcoding when we know transcoding will be used.

   YMMV, of course...

--
John Leslie <john@jlc.net>