Re: [AVTCORE] [Phishing Risk] Re: [External] Re: Comments on draft-ietf-avtcore-rtp-vvc-05.txt

Ye-Kui Wang <> Thu, 05 November 2020 23:42 UTC

From: "Ye-Kui Wang" <>
To: "'Sanchez de la Fuente, Yago'" <>
Cc: "'Stephan Wenger'" <>, "'Martin Pettersson M'" <>, "'shuaiizhao\(Shuai Zhao\)'" <>, <>
Date: Thu, 5 Nov 2020 15:41:59 -0800

Hi Yago,


Regarding GDR pictures with ph_recovery_poc_cnt equal to 0, my opinion is the same as for CRA pictures: when responding to a FIR, there is no reason for the encoder to encode an intra refresh picture as CRA, or as GDR with ph_recovery_poc_cnt equal to 0, instead of as IDR.




From: Sanchez de la Fuente, Yago <> 
Sent: Thursday, November 5, 2020 13:12
To: Ye-Kui Wang <>
Cc: Stephan Wenger <>; Martin Pettersson M <>; shuaiizhao(Shuai Zhao) <>
Subject: [Phishing Risk] Re: [AVTCORE] [External] Re: Comments on draft-ietf-avtcore-rtp-vvc-05.txt


Dear Ye-Kui, all, 


After reading Ye-Kui’s email I reconsidered my previous email. Even using GDR pictures with ph_recovery_poc_cnt equal to 0 would be questionable as a response to a FIR, since in that case the same picture could have a NAL unit type of IDR. The reasoning is the same as Ye-Kui gave for CRA: there is no good reason for the encoder to encode an intra refresh picture as a GDR picture with ph_recovery_poc_cnt equal to 0 instead of as IDR if it could be an IDR.


As for the I bit, I agree with Ye-Kui: if an indication of GDR is desirable, it seems cleaner to include a separate indication in draft-ietf-avtext-framemarking than to reuse the existing one, which states that a frame is independent. However, should we allow or disallow GDR pictures with ph_recovery_poc_cnt equal to 0 for this case?


Best regards,

Yago Sánchez



Department Video Coding & Analytics

Group Multimedia Communications

Fraunhofer HHI - Heinrich Hertz Institute
Einsteinufer 37, 10587 Berlin, Germany

Tel.: +49 30 310 02663 <> 



On 5. Nov 2020, at 21:51, Ye-Kui Wang < <> > wrote:


Hi Martin, Stephan, All,


I was also hesitant to say yes when I first saw most of the suggestions, so hesitant that I was hoping Stephan et al. would reply and address them 😊


Great that Stephan did respond. Thanks!


Now a few additional comments from my side.


Firstly, regarding the suggestion of changing “Upon reception of a FIR, a sender must send an IDR picture.” to “Upon reception of a FIR, a sender must send an IDR, a CRA or a GDR picture.” My hesitation here is that a FIR, as the name indicates, requests a full intra picture that immediately stops any prediction from earlier pictures, in the hope that once the receiver has received it, all pictures and all picture areas are correct. GDR won’t do that. Note also that we are in a low-delay environment: with GDR the receiver would need to wait much longer (than with IDR) to have correct full pictures. RFC 5104 also mentions that with GDR the user experience would not be as good. My personal opinion is that if using GDR were really acceptable, we would have added it to RFC 7798 (the HEVC RTP payload format), even though in HEVC GDR is indicated by the recovery point SEI message instead of by a NAL unit type as in VVC.


Secondly, on adding CRA. My hesitation here is that CRA is not really supposed to be used in low-delay conversational applications. If, as an encoder, you don’t plan to encode associated leading pictures for a CRA picture, there is no reason to encode an intra refresh picture as CRA instead of as IDR. That’s why we did not allow CRA as a response to FIR in RFC 7798.


Thirdly, including GDR in the I bit constraint confuses both the name and the semantics of the bit, “independent of temporally prior frames”. To me, if an indication of GDR is important, it should be included in draft-ietf-avtext-framemarking, preferably as a separate indication. If so, it should also cover GDR in AVC and HEVC as indicated by the recovery point SEI message, and this draft would then carry it over in the same way as it carries over the I bit. (BTW, Shuai, we should update the status of the [FrameMarking] reference.)


However, adding the GDR abbreviation is good, and I think we should also add a brief description of GDR into clause 1.1.2 (Systems and Transport Interfaces).
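
To make the two positions on the FIR text concrete, here is a minimal sketch that checks whether a picture would count as a valid FIR response under either wording. The NAL unit type values 7-10 are the ones quoted in this thread (with the VVC names IDR_W_RADL, IDR_N_LP, CRA_NUT, GDR_NUT); the policy names `IDR_ONLY` and `IDR_CRA_GDR` and the function `satisfies_fir` are hypothetical, introduced only for illustration and not taken from the draft.

```python
from enum import IntEnum


class VvcNalType(IntEnum):
    """VVC NAL unit types discussed in this thread."""
    IDR_W_RADL = 7
    IDR_N_LP = 8
    CRA_NUT = 9
    GDR_NUT = 10


# Hypothetical policy names (assumptions, not from the draft).
# IDR_ONLY mirrors the current text: "a sender must send an IDR picture".
IDR_ONLY = frozenset({VvcNalType.IDR_W_RADL, VvcNalType.IDR_N_LP})
# IDR_CRA_GDR mirrors the proposed text: "an IDR, a CRA or a GDR picture".
IDR_CRA_GDR = IDR_ONLY | {VvcNalType.CRA_NUT, VvcNalType.GDR_NUT}


def satisfies_fir(nal_unit_type: int, allowed=IDR_ONLY) -> bool:
    """Return True if a picture of this NAL unit type answers a FIR
    under the given policy."""
    return nal_unit_type in allowed
```

Under `IDR_ONLY` a GDR picture (type 10) is rejected even when ph_recovery_poc_cnt is 0, which matches the argument that such a picture could simply have been coded as IDR.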




From: avt < <> > On Behalf Of Stephan Wenger
Sent: Thursday, November 5, 2020 10:50
To: Martin Pettersson M < <> >; shuaiizhao(Shuai Zhao) < <> >; <> 
Subject: [External] Re: [AVTCORE] Comments on draft-ietf-avtcore-rtp-vvc-05.txt


Hi Martin,


Thanks for those suggested changes, which I think would consistently implement the option to react to a “Full Intra Request” (FIR) with a gradual decoder refresh (GDR) series of pictures.


As to whether we should allow GDR as a reaction to a FIR: I’m a bit torn here.  Arguments can be made either way, please see below.  Would others in the WG please weigh in?  


On one hand, the argument (somewhat rephrased here) that VVC’s GDR pictures are a fully specified replacement for traditional “all intra” pictures is a good one.  I concur that this is somewhat new in VVC, compared to older video coding standards.  Pretty much all of those could do some form of GDR (even good old H.261 and MPEG-2), but things were clumsy, results were not guaranteed, or one had to rely on SEI messages and similar exotics for implementation.  In the environments where FIR matters—video conferencing mostly—no one ever used GDR in any context except in those ca. 1990 H.261-based systems which didn’t implement full intra pictures at all, and relied on intra macroblock walk-around during the initial communication setup.


On the other hand, there’s a reason why FIR until now was consistently interpreted as a requirement of sending a single “all intra” picture (whatever that translates to in the various video coding standards and technologies).  That reason was related to the architecture of the MCUs that were around when RFC 5104 was written, back in the 2005-2008 timeframe.  What people requested then was that the internal architecture of an MCU should stay as independent of the codec in use as possible.  For FIR, that means: if an MCU sends out a FIR to a sending endpoint, it expects exactly one intra picture at the earliest opportunity that can be used to sync in added decoders of unknown state.  That logic would now have to change to receive either a single IDR picture or a series of pictures that make up a GDR.  A transcoding MCU would have to go further and include the decoding of those multiple pictures with all the tricky (though now fully specified!) stuff that goes on in VVC, before transcoding.
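
The change in MCU logic described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not an MCU implementation: `Picture` and `first_clean_poc` are hypothetical names, the input is assumed to be the pictures received after the FIR was sent, and the sync point for a GDR is taken to be the picture whose picture order count equals the GDR picture's POC plus ph_recovery_poc_cnt.

```python
from typing import Iterable, NamedTuple, Optional

IDR_TYPES = {7, 8}   # IDR_W_RADL, IDR_N_LP
GDR_NUT = 10


class Picture(NamedTuple):
    """Hypothetical per-picture summary an MCU might extract."""
    nal_unit_type: int
    poc: int                    # picture order count
    recovery_poc_cnt: int = 0   # ph_recovery_poc_cnt, meaningful for GDR only


def first_clean_poc(pictures: Iterable[Picture]) -> Optional[int]:
    """Return the POC from which a newly joined decoder can be assumed
    to have fully correct pictures, or None if no refresh was seen."""
    for pic in pictures:
        if pic.nal_unit_type in IDR_TYPES:
            # Single all-intra picture: synced immediately.
            return pic.poc
        if pic.nal_unit_type == GDR_NUT:
            # Gradual refresh: synced only at the recovery point.
            return pic.poc + pic.recovery_poc_cnt
    return None
```

With an IDR the sync point is the refresh picture itself; with a GDR it lies ph_recovery_poc_cnt pictures later in POC terms, so the MCU has to track a series of pictures rather than a single one. A GDR with ph_recovery_poc_cnt equal to 0 collapses to the IDR case.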





From: avt <> on behalf of Martin Pettersson M < <> >
Date: Tuesday, November 3, 2020 at 08:38
To: "shuaiizhao(Shuai Zhao)" < <> >, " <> " < <> >
Subject: [AVTCORE] Comments on draft-ietf-avtcore-rtp-vvc-05.txt




Thanks for the good progress on the VVC RTP payload format. Below are some suggested modifications for your consideration:


1.	In section 3.2, add “GDR                       Gradual Decoding Refresh”


2.	In section 8.4, change “Upon reception of a FIR, a sender must send an IDR picture.” to “Upon reception of a FIR, a sender must send an IDR, a CRA or a GDR picture.”



One of the versatile features in VVC is its support for low-latency coding where the GDR picture is a key component to achieve low latency. Compared to AVC and HEVC where GDR is signaled in an SEI message with optional support by the decoder, the GDR picture in VVC is a normative part of the specification and the decoder must be able to tune in at a GDR picture. Therefore it makes sense to allow a sender to respond with a GDR picture upon receiving a FIR. Note also that a gradual decoding refresh point is mentioned as a possible Decoder Refresh Point in response to the FIR command in


Sending a CRA picture as a response to FIR would be fine as well in my opinion. I don’t see the reason to exclude that.



3.	In section 9.1, change “The I bit MUST be 1 when the NAL unit type is 7-9 (inclusive), otherwise it MUST be 0.” to “The I bit MUST be 1 when the NAL unit type is 7-10 (inclusive), otherwise it MUST be 0.”


In section 9.2, change “The I bit MUST be 1 when the NAL unit type is 7-9 (inclusive), otherwise it MUST be 0.” to “The I bit MUST be 1 when the NAL unit type is 7-10 (inclusive), otherwise it MUST be 0.”



NAL unit type 10 is GDR_NUT. 


In the I bit is specified as:

I: Independent Frame (1 bit) - MUST be 1 for frames that can be decoded independent of temporally prior frames, e.g. intra-frame, VPX keyframe, H.264 IDR [RFC6184], H.265 IDR/CRA/BLA/RAP [RFC7798]; otherwise MUST be 0.


The GDR picture is typically not fully refreshed in one frame, but it does not need prior temporal pictures to start the decoding process, i.e. a bitstream that starts with a GDR picture in VVC is a valid bitstream.
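
The I bit rule in item 3 can be written out as a small sketch covering both the current range (7-9) and the suggested range (7-10). The function name `i_bit` and the `include_gdr` switch are hypothetical, used here only to contrast the two wordings.

```python
def i_bit(nal_unit_type: int, include_gdr: bool = False) -> int:
    """Return the frame-marking I bit for a VVC NAL unit type.

    include_gdr=False: current draft text, types 7-9 (IDR_W_RADL,
    IDR_N_LP, CRA_NUT) set the bit.
    include_gdr=True: suggested text, type 10 (GDR_NUT) sets it too.
    """
    upper = 10 if include_gdr else 9
    return 1 if 7 <= nal_unit_type <= upper else 0
```

The only input on which the two variants differ is NAL unit type 10.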


Best regards,

Martin Pettersson



From: avt < <> > On Behalf Of shuaiizhao(Shuai Zhao)
Sent: den 2 november 2020 23:11
To: <> 
Subject: [AVTCORE] FW: I-D Action: draft-ietf-avtcore-rtp-vvc-05.txt(Internet mail)


In this revision, Yago’s proposal for SDP parameters was implemented in section 7.2.1.


Editor’s notes were added for things we will clarify in the next revision.  So do review and criticize lightly. ☺





From: avt < <> > on behalf of " <> " < <> >
Reply-To: " <> " < <> >
Date: Monday, November 2, 2020 at 14:07
To: " <> " < <> >
Cc: " <> " < <> >
Subject: [AVTCORE] I-D Action: draft-ietf-avtcore-rtp-vvc-05.txt(Internet mail)



A New Internet-Draft is available from the on-line Internet-Drafts directories.

This draft is a work item of the Audio/Video Transport Core Maintenance WG of the IETF.


        Title           : RTP Payload Format for Versatile Video Coding (VVC)
        Authors         : Shuai Zhao
                          Stephan Wenger
                          Yago Sanchez
                          Ye-Kui Wang
        Filename        : draft-ietf-avtcore-rtp-vvc-05.txt
        Pages           : 61
        Date            : 2020-11-02

   This memo describes an RTP payload format for the video coding
   standard ITU-T Recommendation H.266 and ISO/IEC International
   Standard 23090-3, both also known as Versatile Video Coding (VVC) and
   developed by the Joint Video Experts Team (JVET).  The RTP payload
   format allows for packetization of one or more Network Abstraction
   Layer (NAL) units in each RTP packet payload as well as fragmentation
   of a NAL unit into multiple RTP packets.  The payload format has wide
   applicability in videoconferencing, Internet video streaming, and
   high-bitrate entertainment-quality video, among other applications.







Audio/Video Transport Core Maintenance <>


