Re: [AVTCORE] WG Last Call: "Frame Marking RTP Header Extension"
Bernard Aboba <bernard.aboba@gmail.com> Sat, 05 December 2020 07:30 UTC
From: Bernard Aboba <bernard.aboba@gmail.com>
Date: Fri, 04 Dec 2020 23:30:39 -0800
Message-ID: <CAOW+2ds+pgpG8cd+iZJpvhMsu5Q77zAmNf9C3Dycx4TpnVfpiA@mail.gmail.com>
To: IETF AVTCore WG <avt@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/SznfLrr7YorwYjPEYXdlH5AU4VA>

Here are my comments.

Overall, I think the document needs to be more clear about goals. For example, even handling temporal scalability in a codec-agnostic way may not be easily achieved; implementers have indicated that peculiarities of the VP8 RTP Payload, described in RFC 7741 Section 4.2, require parsing (and rewriting) the VP8 payload descriptor.

Section 1

   The goal is to provide a set of streams back to the participants which enable them to render the right media content. In a simple video configuration, for example, the goal will be that each participant sees and hears just the active speaker. In that case, the goal of the switch is to receive the voice and video streams from each participant, determine the active speaker based on energy in the voice packets, possibly using the client-to-mixer audio level RTP header extension [RFC6464 <https://tools.ietf.org/html/rfc6464>], and select the corresponding video stream for transmission to participants; see Figure 1.

[BA] Is the goal only to switch to the active speaker? Most SFUs now attempt to do more than this, such as to select an operating point based on the available bandwidth of each participant.

   o Because of inter-frame dependencies, it should ideally switch video streams at a point where the first frame from the new speaker can be decoded by recipients without prior frames, e.g switch on an intra-frame.

[BA] Rather than "switching video streams", it seems to me that we are really talking about "switching operating points". If so, it should be noted that upswitch points can exist outside of an intra-frame.

   o Furthermore, it is highly desirable to do this in a payload format-agnostic way which is not specific to each different video codec. Most modern video codecs share common concepts around frame types and other critical information to make this codec-agnostic handling possible.

[BA] Are we sure that this goal is achievable, with framemarking or a successor RTP header extension?
Perhaps the goal should be reset.

   By providing meta-information about the RTP streams outside the encrypted media payload, an RTP switch can do codec-agnostic selective forwarding without decrypting the payload.

[BA] Based on some of the peculiarities of codecs such as VP8, it appears that "codec-agnostic forwarding" is difficult.

Overall, it seems to me that Section 1 needs to contain an applicability statement.

Section 3.3.4 VP8 LID mapping

[BA] Implementers have reported that framemarking is not suitable for dealing with VP8 temporal scalability. The problem is due to the following peculiarity noted in RFC 7741 Section 4.2:

   PictureID: 7 or 15 bits (shown left and right, respectively, in Figure 2) not including the M bit. This is a running index of the frames, which MAY start at a random value, MUST increase by 1 for each subsequent frame, and MUST wrap to 0 after reaching the maximum ID (all bits set). The 7 or 15 bits of the PictureID go from most significant to least significant, beginning with the first bit after the M bit. The sender chooses a 7- or 15-bit index and sets the M bit accordingly. The receiver MUST NOT assume that the number of bits in PictureID stays the same through the session. Having sent a 7-bit PictureID with all bits set to 1, the sender may either wrap the PictureID to 0 or extend to 15 bits and continue incrementing.

The problem is that the PictureID "MUST increase by 1 for each subsequent frame". This means that an SFU may need to rewrite the PictureID field, so as to compensate for the frames that it does not forward.

Note that this issue is *not* unique to this specification, but will also occur with other frame forwarding RTP header extensions such as the Dependency Descriptor (DD) <https://aomediacodec.github.io/av1-rtp-spec/#dependency-descriptor-rtp-header-extension>.

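To make the PictureID renumbering concrete, here is a rough sketch of the bookkeeping an SFU might keep per forwarded VP8 stream. The class and method names are hypothetical and not taken from any implementation. It assumes packets arrive in order, that the forwarding decision is made per frame, and that the PictureID width stays fixed (which RFC 7741 does not guarantee).

```python
class PictureIdRewriter:
    """Renumbers VP8 PictureIDs so that forwarded frames stay consecutive.

    Hypothetical helper for illustration only; assumes in-order packets
    and a fixed PictureID width for the lifetime of the stream.
    """

    def __init__(self, bits=15):
        self.modulus = 1 << bits  # 128 for 7-bit IDs, 32768 for 15-bit IDs
        self.last_in = None       # last incoming PictureID seen
        self.last_out = None      # last PictureID written to the output

    def rewrite(self, picture_id, forward):
        """Return the outgoing PictureID, or None if the frame is dropped.

        Called once per packet; packets of the same frame share a PictureID
        and therefore map to the same outgoing value.
        """
        new_frame = picture_id != self.last_in
        self.last_in = picture_id
        if not forward:
            return None
        if self.last_out is None:
            # First forwarded frame: keep the sender's starting value.
            self.last_out = picture_id
        elif new_frame:
            # Increase by exactly 1 and wrap to 0, as RFC 7741 requires.
            self.last_out = (self.last_out + 1) % self.modulus
        return self.last_out


# Example: 7-bit IDs, dropping every other frame across the wrap point.
rewriter = PictureIdRewriter(bits=7)
frames = [(126, True), (127, False), (0, True), (1, False), (2, True)]
outgoing = [rewriter.rewrite(pid, fwd) for pid, fwd in frames]
```

In this example the receiver sees PictureIDs 126, 127, 0 even though the sender's frames 127 and 1 were never forwarded, which is exactly the rewriting that forces the SFU to touch the payload descriptor.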
If the goal is to be able to handle VP8 temporal scalability without requiring the SFU to parse the VP8 Payload Descriptor, it seems that you would need to include the PictureID in this (or another) RTP header extension, so as to allow the SFU to modify it. This is somewhat ugly because it implies that the receiver will need to trust the modified PictureID instead of the PictureID that it receives in the VP8 payload descriptor.
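
For comparison, the payload descriptor parsing that an SFU has to do today in order to locate and overwrite the PictureID looks roughly like the sketch below. It follows the descriptor layout in RFC 7741 Section 4.2 (X bit in the first octet, I bit in the extension octet, M bit selecting 7 or 15 bits); the function names are illustrative, and the sketch keeps the PictureID width unchanged when rewriting.

```python
def find_picture_id(payload: bytes):
    """Return (offset, bits, value) for the PictureID, or None if absent."""
    if len(payload) < 1 or not payload[0] & 0x80:  # X bit clear: no extension
        return None
    if len(payload) < 2 or not payload[1] & 0x80:  # I bit clear: no PictureID
        return None
    if len(payload) < 3:
        return None
    if payload[2] & 0x80:                          # M bit set: 15-bit PictureID
        if len(payload) < 4:
            return None
        return 2, 15, ((payload[2] & 0x7F) << 8) | payload[3]
    return 2, 7, payload[2] & 0x7F                 # M bit clear: 7-bit PictureID


def set_picture_id(payload: bytes, new_id: int) -> bytes:
    """Return a copy of the payload with the PictureID replaced.

    The payload is returned unchanged when no PictureID is present.
    """
    found = find_picture_id(payload)
    if found is None:
        return payload
    offset, bits, _ = found
    new_id %= 1 << bits                            # wrap to the field width
    out = bytearray(payload)
    if bits == 15:
        out[offset] = 0x80 | (new_id >> 8)         # keep the M bit set
        out[offset + 1] = new_id & 0xFF
    else:
        out[offset] = new_id                       # M bit stays clear
    return bytes(out)
```

Carrying the PictureID in a header extension would move this logic out of the SFU, at the cost of the trust issue described above.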