Re: [AVTCORE] WGLC for draft-ietf-avtext-framemarking

Jonathan Lennox <jonathan@vidyo.com> Fri, 05 January 2018 21:01 UTC

References: <9207B4E8-6531-4E7E-8F81-06CD0326CF56@vidyo.com>
To: IETF AVTCore WG <avt@ietf.org>
In-Reply-To: <9207B4E8-6531-4E7E-8F81-06CD0326CF56@vidyo.com>
Message-Id: <9DEE1EBD-6B42-4E5C-8188-F6E11FC25BC9@vidyo.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/rGbHrt87jwdmjmSCPy9k9fLCLOQ>

Here are my comments (as an individual) on draft-ietf-avtext-framemarking.  (I’m not bothering to repeat any issues that Magnus raised; his are all good points.)

Section 3.1:

   o  The remaining (4 bits) - MUST be 0 for non-scalable streams.

I think this should say MUST be set to 0 by senders and MUST be ignored by receivers, for the sake of future extensibility (such as Roni’s proposal).
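
As a rough sketch of that sender/receiver split for the one-byte, non-scalable form: the bit order below (S, E, I, D in the top four bits, then the four remaining bits) is an assumed reading of the Section 3.1 layout, and the function names are hypothetical.

```python
# Assumed layout of the 1-byte non-scalable frame marking element:
#   S E I D r r r r   (r = the four remaining bits)
S_BIT, E_BIT, I_BIT, D_BIT = 0x80, 0x40, 0x20, 0x10


def build_short_marking(start, end, independent, discardable):
    """Sender side: the four remaining bits are always written as 0."""
    value = 0
    if start:
        value |= S_BIT
    if end:
        value |= E_BIT
    if independent:
        value |= I_BIT
    if discardable:
        value |= D_BIT
    return bytes([value])


def parse_short_marking(data):
    """Receiver side: read only the known flags, ignore the remaining bits."""
    b = data[0]
    return {
        "S": bool(b & S_BIT),
        "E": bool(b & E_BIT),
        "I": bool(b & I_BIT),
        "D": bool(b & D_BIT),
    }
```

With this split, a receiver parses 0xAF and 0xA0 identically, so a future extension could assign meaning to the low bits without breaking deployed receivers.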


Section 3.2:

   o  I: Independent Frame (1 bit) - MUST be 1 for frames that can be
      decoded independent of temporally prior frames, e.g. intra-frame,
      VPX keyframe, H.264 IDR [RFC6184], H.265 IDR/CRA/BLA/RAP
      [RFC7798]; otherwise MUST be 0.  Note that this bit only signals
      temporal independence, so it can be 1 in spatial or quality
      enhancement layers that depend on temporally co-located layers but
      not temporally prior frames.

One issue that might arise — in a base spatial layer refresh, i.e. this coding structure (Figure 2 from draft-ietf-avtext-lrr):

        ... <--  S1  <--  S1  <--  S1  <--  S1  <-- ...
                  |        |        |        |
                 \/       \/       \/       \/
        ... <--  S0  <--  S0       S0  <--  S0  <-- ...

                  1        2        3        4

the S0 packets of Frame 3 will have the I bit set, even though this is not a true IDR / keyframe.  In order for a receiver to detect a true IDR, it must look for an I bit set on every spatial layer of a frame.  Is this definitely what we want?  If so, Section 3.4 is wrong, and this procedure should be spelled out there.

Alternatively, if I=1 and LID=0 means true IDR, this should be stated explicitly; it simplifies IDR detection but would mean that the Figure 2 coding structure cannot be described by frame marking.
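
To make the two options concrete, here is a minimal sketch of each detection rule. The Packet tuple and its field names are hypothetical, the spatial layer is assumed to have already been extracted from LID, and the list is assumed to hold every packet of one frame.

```python
from collections import namedtuple

# Hypothetical per-packet view of the frame marking fields.
Packet = namedtuple("Packet", ["spatial_id", "i_bit"])


def is_true_idr(frame_packets, all_layers_rule=True):
    """Decide whether one frame should be treated as a true IDR.

    all_layers_rule=True:  the I bit must be set on every packet of the
                           frame, i.e. on every spatial layer present.
    all_layers_rule=False: I=1 on spatial layer 0 alone is sufficient.
    """
    if not frame_packets:
        return False
    if all_layers_rule:
        return all(p.i_bit for p in frame_packets)
    return any(p.i_bit for p in frame_packets if p.spatial_id == 0)


# Frame 3 of the figure: S0 is refreshed, S1 still depends on the prior S1.
frame3 = [Packet(spatial_id=0, i_bit=True), Packet(spatial_id=1, i_bit=False)]
```

Under the first rule Frame 3 is correctly rejected as an IDR; under the second it would be accepted, which is why that rule only works if the Figure 2 structure is never marked this way.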


   o  D: Discardable Frame (1 bit) - MUST be 1 for frames that can be
      discarded, and still provide a decodable media stream; otherwise
      MUST be 0.

I feel “MUST” for the first half of this definition is too strong here: there are scenarios where this information might not be fully available to the entity creating the frame marking, which might then want to set the value to 0 when it’s uncertain.  I think this should be permitted.


   o  B: Base Layer Sync (1 bit) - MUST be 1 if this frame only depends
      on the base layer; otherwise MUST be 0.  If no scalability is
      used, this MUST be 0.

It should be made clear whether “on the base layer” refers to the base temporal layer, the base spatial layer, or both.  I think the intention is that it mean the base temporal layer, but I’m not sure.

This is another case where the entity creating the frame marking might not have full information, and so might want to set the value to 0 if it’s uncertain. 

Additionally, it’s weird that “If no scalability is used, this MUST be 0”, since conceptually a stream that consists entirely of base layer frames would set this to 1.  In general, I think all the “If no scalability is used” clauses in Section 3.2 should be removed, since they’re redundant with Section 3.1.


   o  TL0PICIDX: Temporal Layer 0 Picture Index (8 bits) - Running index
      of base temporal layer 0 frames when TID is 0.  When TID is not 0,
      this indicates a dependency on the given index.  If no scalability
      is used, this MUST be 0 or omitted.  When omitted, LID MUST also
      be omitted.

How do we want TL0PICIDX values to work across IDR frames?  Some codecs (H.264 SVC) reset their TL0PICIDX on IDR, others (VP8) insist on continuous values across them.

The former requires an IDRPICID (which I don’t think we want to add) for full correctness, but the latter can make splicing annoying (requiring a splicer to rewrite TL0PICIDX values forever).
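
For the continuous-values option, the splicer's rewriting burden might look roughly like the sketch below. The class and method names are hypothetical, and it assumes each splice lands on a base temporal layer frame (TID 0) of the incoming stream.

```python
class Tl0PicIdxRewriter:
    """Keeps outgoing TL0PICIDX values continuous across splices (mod 256)."""

    def __init__(self):
        self.offset = 0       # added to every incoming value from now on
        self.last_out = None  # last TL0PICIDX value sent downstream

    def splice(self, first_in_of_new_stream):
        """Call at a splice point with the new stream's first TL0PICIDX."""
        if self.last_out is not None:
            # Choose the offset so the new stream continues at last_out + 1.
            self.offset = (self.last_out + 1 - first_in_of_new_stream) % 256

    def rewrite(self, tl0picidx):
        """Map an incoming 8-bit TL0PICIDX to the outgoing value."""
        out = (tl0picidx + self.offset) % 256
        self.last_out = out
        return out
```

The offset has to be applied to every later packet of the spliced-in stream, which is the "forever" cost noted above.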


Section 3.2.1.5:

I think this section needs to call out the requirement in Section 3.2 that LID=0 always indicates the base layer; i.e., it’s not valid for a future LID mapping to be defined in a way that breaks that invariant.  Otherwise, receivers have no way to tell whether they’ve received the first spatial layer of a frame.

Also, I think it’s necessary — either here, or somewhere else — to require that frame marking only be used with RTP payload formats that follow the usual marker bit rule for video, that a marker bit indicates the last packet of the picture/access unit/temporal unit/whatever.  Otherwise, there’s no way for a receiver to tell whether it’s received the *last* spatial layer of a frame.


Section 3.4.1:

This should mention that because frame marking can only be used with temporally-nested streams, temporal-layer LRR refreshes are unnecessary for frame-marked streams.

Other refreshes can be detected based on the I bit being set for the spatial layers in question (modulo the decision on my point about the I bit, above).