Re: [payload] Gen-art 2nd LC review of draft-ietf-payload-vp8-16

"Ben Campbell" <ben@nostrum.com> Tue, 25 August 2015 13:56 UTC

Return-Path: <ben@nostrum.com>
X-Original-To: payload@ietfa.amsl.com
Delivered-To: payload@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1F5C71B31FD; Tue, 25 Aug 2015 06:56:35 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.31
X-Spam-Level:
X-Spam-Status: No, score=-1.31 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, J_CHICKENPOX_26=0.6, T_RP_MATCHES_RCVD=-0.01] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id h8_1AW9R_9vw; Tue, 25 Aug 2015 06:56:31 -0700 (PDT)
Received: from nostrum.com (raven-v6.nostrum.com [IPv6:2001:470:d:1130::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 6482A1B31C8; Tue, 25 Aug 2015 06:56:25 -0700 (PDT)
Received: from [10.0.1.23] (cpe-70-119-203-4.tx.res.rr.com [70.119.203.4]) (authenticated bits=0) by nostrum.com (8.15.2/8.14.9) with ESMTPSA id t7PDu5ud084391 (version=TLSv1 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Tue, 25 Aug 2015 08:56:15 -0500 (CDT) (envelope-from ben@nostrum.com)
X-Authentication-Warning: raven.nostrum.com: Host cpe-70-119-203-4.tx.res.rr.com [70.119.203.4] claimed to be [10.0.1.23]
From: Ben Campbell <ben@nostrum.com>
To: Henrik Lundin <hlundin@google.com>
Date: Tue, 25 Aug 2015 08:56:04 -0500
Message-ID: <272CCA38-0FE6-4E95-BCAD-C8CB090C5250@nostrum.com>
In-Reply-To: <CAOhzyfmZe9tQSsKYA87r+AkyUe7CQWheWKPR8qtzvbx_0GBxtQ@mail.gmail.com>
References: <55A3A7F1.8010109@dial.pipex.com> <CCE6B6D4-E5AC-43D0-B876-3B539694228B@nostrum.com> <FC225BB3-C81D-4C2F-BDC3-B9605EF12CF2@nostrum.com> <55B9DCCE.1050902@alvestrand.no> <73A192FD-294B-4D94-8B27-75E7CF835E58@nostrum.com> <CAOhzyfmZe9tQSsKYA87r+AkyUe7CQWheWKPR8qtzvbx_0GBxtQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
X-Mailer: MailMate (1.9.2r5107)
Archived-At: <http://mailarchive.ietf.org/arch/msg/payload/jmtqTldA2dzPh6fD8mMQURg4N6U>
Cc: draft-ietf-payload-vp8.all@ietf.org, Elwyn Davies <elwynd@dial.pipex.com>, payload-chairs@ietf.org, payload@ietf.org
Subject: Re: [payload] Gen-art 2nd LC review of draft-ietf-payload-vp8-16
X-BeenThere: payload@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Audio/Video Transport Payloads working group discussion list <payload.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/payload>, <mailto:payload-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/payload/>
List-Post: <mailto:payload@ietf.org>
List-Help: <mailto:payload-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/payload>, <mailto:payload-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 25 Aug 2015 13:56:35 -0000

Excellent, thanks!

On 25 Aug 2015, at 1:48, Henrik Lundin wrote:

> Hi Ben,
>
> We have an update. We're reviewing it internally right now. You can expect
> a new submission shortly.
>
> Thanks!
> /Henrik
>
>
> On Mon, Aug 24, 2015 at 11:21 PM, Ben Campbell <ben@nostrum.com> wrote:
>
>> (- GenART and IETF lists)
>>
>> Hi Harald,
>>
>> Any updates?
>>
>> Thanks!
>>
>> Ben.
>>
>> On 30 Jul 2015, at 3:14, Harald Alvestrand wrote:
>>
>>> Unfortunately this is vacation time in Scandinavia - won't do anything
>>> with this until mid-August, I'm afraid. (Henrik might.)
>>>
>>> So it'll have to wait another 2 telechats or so.....
>>>
>>>
>>> On 07/29/2015 11:58 PM, Ben Campbell wrote:
>>>> Hi VP8 Authors:
>>>>
>>>> Has there been a response to Elwyn's Gen-ART review from the authors?
>>>> If so, I haven't seen it. (I did see Colin's response).
>>>>
>>>> Based Elwyn's review, and on Harald's response to my earlier
>>>> questions, I assume that the authors will wish to update the draft
>>>> before it goes back to an IESG telechat. If that's not the case,
>>>> please let me know. Please be advised that tomorrow (July 30) is the
>>>> nominal deadline for getting things on the August 6 telechat.
>>>>
>>>> Thanks!
>>>>
>>>> Ben.
>>>>
>>>>
>>>> On 13 Jul 2015, at 15:57, Ben Campbell wrote:
>>>>
>>>>> (+payload and Harald)
>>>>>
>>>>> Elwyn: Thanks for the detailed review!
>>>>>
>>>>> Authors: Please address Elwyn's comments at your earliest
>>>>> convenience. There's quite a bit here, and I assume we will need a
>>>>> new revision before taking this to the IESG telechat, but it would be
>>>>> helpful to discuss any proposed changes via email before submitting.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Ben.
>>>>>
>>>>>
>>>>> On 13 Jul 2015, at 6:58, Elwyn Davies wrote:
>>>>>
>>>>>> I am the assigned Gen-ART reviewer for this draft. For background on
>>>>>> Gen-ART, please see the FAQ at
>>>>>>
>>>>>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>>>>>>
>>>>>> Please resolve these comments along with any other Last Call comments
>>>>>> you may receive.
>>>>>>
>>>>>> Document: draft-ietf-payload-vp8-16.txt
>>>>>> Reviewer: Elwyn Davies
>>>>>> Review Date: 2015/07/10
>>>>>> IETF LC End Date: 2015/07/13
>>>>>> IESG Telechat date: (if known) -
>>>>>>
>>>>>> Summary: Gosh, it has been a long time since the last review!
>>>>>>
>>>>>> Major issues:
>>>>>>
>>>>>> Minor issues:
>>>>>> s4.2: Various of the frame indices either SHOULD or, in the case of
>>>>>> KEYIDX, MAY start at random values.  The rationale(s) for these
>>>>>> requirements are not explained - nor is the reason why it it is only a
>>>>>> MAY for KEYIDX whereas it is SHOULD for PictureID and what might
>> happen
>>>>>> were the MAY/SHOULD to be ignored.
>>>>>>
>>>>>> Is the rationale something that should be mentioned in the security
>>>>>> considerations?
>>>>>>
>>>>>> s10.1:  The downref to RFC 6386 identified by idnits was duly noted
>>>>>> in the last call
>>>>>> announcement.  I don't have a problem with the downref.
>>>>>>
>>>>>> Nits/editorial comments:
>>>>>>
>>>>>> General: Consistent usage of "prediction and mode" vs
>>>>>> "prediction/mode" would be desirable.
>>>>>>
>>>>>> s1, para 2: Needs to introduce the 'macroblock' jargon defined in
>>>>>> RFC 6386. Suggest:
>>>>>> OLD:
>>>>>> VP8 is based on decomposition of frames into square sub-blocks of
>>>>>> pixels, prediction of such sub-blocks using previously constructed
>>>>>> blocks, and...
>>>>>> NEW:
>>>>>> VP8 is based on decomposition of frames into square sub-blocks of
>>>>>> pixels known as "macroblocks" (see Section 2 of [RFC6386]).
>> Prediction
>>>>>> of such sub-blocks using previously constructed blocks, and...
>>>>>> END
>>>>>>
>>>>>> s2: There are a number of technical terms brought over from RFC 6386.
>>>>>> These should be listed in s2.  At least the following would be in
>>>>>> the list:
>>>>>> key frame, interframe, golden frame, altref frame, macroblock.
>>>>>> Also the terms picture selection index (RPSI) and slice loss
>>>>>> indication (SLI) from RFC 4585.
>>>>>>
>>>>>> s3, para 2: Providing a reference to explain what temporal scalability
>>>>>> and the hierarchy selection is all about would be useful.  One
>>>>>> possibility
>>>>>> would be:
>>>>>>
>>>>>> Lim, J. Yang, and B. Jeon,”Fast Coding Mode Decision for Scalable
>>>>>> Video Coding”,
>>>>>> 10th Int’l Conf. on Advanced Communication Technology, vol. 3, pp.
>>>>>> 1897-1900, 2008.
>>>>>>
>>>>>> It would also help to add a pointer to the TL0PICIDX description in
>>>>>> s4.2, thus:
>>>>>> ADD:
>>>>>> Support for temporal scalability is provided by the optional
>>>>>> TL0PICIDX and TID/Y/KEYIDX
>>>>>> fields described in Section 4.2.
>>>>>> END
>>>>>>
>>>>>> ss3 and 4: The discussion of how the frame payload might or might
>>>>>> not be distributed across
>>>>>> multiple packets is not very clear.   The draft has in mind a
>>>>>> 'preferred use case' (para 1
>>>>>> of s4.4 says:
>>>>>>
>>>>>>> In the preferred use case, the S bit and PID fields
>>>>>>> described in Section 4.2 should be used to indicate what the packet
>>>>>>> contains.
>>>>>>
>>>>>> I think it would be helpful to be a bit more positive about this in
>>>>>> s3, para 3:
>>>>>> OLD:
>>>>>> This memo
>>>>>> allows the partitions to be sent separately or in the same RTP
>>>>>> packet.  It may be beneficial for decoder error-concealment to send
>>>>>> the partitions in different packets, even though it is not mandatory
>>>>>> according to this specification.
>>>>>> NEW:
>>>>>> Accordingly, this document RECOMMENDS that the frame is packetized
>>>>>> by the sender
>>>>>> with each data partition in a separate packet or packets.  This may
>>>>>> be beneficial
>>>>>> for decoder error concealment and the payload format described in
>>>>>> Section 4 provides
>>>>>> fields that allow the partitions to be identified even if the first
>>>>>> partition is
>>>>>> not available.  The sender can, alternatively, aggregate the data
>>>>>> partitions into
>>>>>> a single data stream and, optionally, split it into several packets
>>>>>> without
>>>>>> consideration of the partition boundaries.  The receiver can use the
>>>>>> length
>>>>>> information in the first partition to identify the partitions during
>>>>>> decoding.
>>>>>> END
>>>>>>
>>>>>> s4/Figure 1: s/bytes/octets/ (2 places)
>>>>>>
>>>>>> Figure 1, caption: s/sequel/Sections 4.2 and 4.3/
>>>>>>
>>>>>> Figure 1: The format shown is correct only for the first packet in a
>>>>>> frame.  Subsequent
>>>>>> packets will not contain the payload header and will have later
>>>>>> octets of the frame payload.
>>>>>> This should be mentioned in drawing and/or in the caption.
>>>>>>
>>>>>> s4.1, last two paras:
>>>>>>> Sequence number:  The sequence numbers are monotonically increasing
>>>>>>> and set as packets are sent.
>>>>>>>
>>>>>>> The remaining RTP header fields are used as specified in
>>>>>>> [RFC3550].
>>>>>> Redefining the Sequence Number field seems gratuitous and
>>>>>> potentially dangerous,
>>>>>> given that it is *very carefully* defined in RFC 3550 and the
>>>>>> overall intent does
>>>>>> not seem any different from that in RFC 3550.  OTOH a note about the
>>>>>> (non?-)use of the X bit
>>>>>> and the Payload Type field (PT) would be useful.  I suggest
>>>>>> replacing the paras above with:
>>>>>> NEW:
>>>>>> X bit: MUST be 0.  The VP8 RTP profile does not use the RTP Header
>>>>>> Extension capability.
>>>>>>
>>>>>> PT (Payload Type):  In line with the policy in Section 3 of
>>>>>> [RFC3551], applications
>>>>>> using the VP8 RTP payload profile MUST assign a dynamic payload type
>>>>>> number to
>>>>>> be used in each RTP session and provide a mechanism to indicate the
>>>>>> mapping.
>>>>>> See Section 6.2 for the mechanism to be used with the Session
>>>>>> Description Protocol (SDP)
>>>>>> [RFC4566].
>>>>>>
>>>>>> The remaining RTP Fixed Header Fields (V, P, CC, sequence number,
>>>>>> SSRC and CSRC
>>>>>> identfiers) are used as specified in Section 5.1 of [RFC5330].
>>>>>> END
>>>>>> Note that this may be considered to make the reference to RFC 3551
>>>>>> normative.
>>>>>>
>>>>>> s4.2, X bit:  There is potential for confusion between the X bit in
>>>>>> the fixed header
>>>>>> and the X bit in the VP8 Payload Descriptor.  Perhaps changing this
>>>>>> to C for control bits
>>>>>> would avoid the problem.
>>>>>>
>>>>>> s4.2, M bit: Similarly, it might be better to use B (for BIG) in
>>>>>> place of M to avoid confusion.
>>>>>>
>>>>>> s4.2, para 7 after Fig 2: For consistency:
>>>>>> OLD:
>>>>>> When the X bit is set to 1 in the first octet, the extension bit
>>>>>> field octet MUST be provided as the second octet.  If the X bit is 0,
>>>>>> the extension bit field octet MUST NOT be present, and no extensions
>>>>>> (I, L, T, or K) are permitted.
>>>>>> NEW:
>>>>>> When the X [or C] bit is set to 1 in the first octet, the Extended
>>>>>> Control Bits
>>>>>> field octet MUST be provided as the second octet.  If the X [or C]
>>>>>> bit is 0,
>>>>>> the extension bit field octet MUST NOT be present, and no extensions
>>>>>> (I, L, T, or K) are permitted.
>>>>>> END
>>>>>>
>>>>>> s4.2: The issue of using either 'wrap' or 'restart' but not both for
>>>>>> restarting number sequences has been raised in various comments by
>>>>>> IESG members.
>>>>>>
>>>>>> s4.2, TL0PICIDX: I think, given the statements in s3 about not
>>>>>> defining the hierarchy, that more explanation is needed of what the
>>>>>> 'lowest or base layer' actually means.
>>>>>>
>>>>>> s4.2, TL0PICIDX:
>>>>>>> If TID is larger than 0, TL0PICIDX
>>>>>>> indicates which base layer frame the current image depends on.
>>>>>>> TL0PICIDX MUST be incremented when TID is 0.
>>>>>> What happens if L=1 but both T=0 and K=0 so that there is no TID
>>>>>> value present? Or indeed if T=0 but K=1 so that the TID field is
>>>>>> there but 'MUST be ignored by the receiver'  (definition of TID
>> field)?
>>>>>>
>>>>>> s4.3: It would be useful to include the section number (9.1) where
>>>>>> the uncompressed data chunk specification is found.  I was somewhat
>>>>>> surprised that the ordering is reversed in the protocol spec.
>>>>>>
>>>>>> s4.3, VER: The receiver should validate this field - currently only
>>>>>> values 0-3 are valid.
>>>>>>
>>>>>> s4.3, SizeN:  Make it clear these are integer fields: s/in Size0,/in
>>>>>> integer fields Size0,/
>>>>>>
>>>>>> s4.3 and s4.4:  It would be helpful to implementers to explain what
>>>>>> the offsets in the  first partition payload are for the additional
>>>>>> partition count and additional partition sizes.  Having rummaged
>>>>>> around in RFC  6386, I have to say that I am unclear whether the
>>>>>> values are placed after the full chunk of data indicated by
>>>>>> 1stPartitionSize or are part of this data. And if the latter is the
>>>>>> case, how to work out where the partition sizes are.  I guess that
>>>>>> they ought to be at offset (1stPartitionSize+10 - key frame - or
>>>>>> 1stPartitionSize+3 - interframe) in the VP8 Payload field as it is
>>>>>> difficult to work out where they are without decoding the partition
>>>>>> otherwise.  Also exactly how many partition sizes to expect - I
>>>>>> think I read in RFC 6386 that the last partition size is implicit.
>>>>>> Help!
>>>>>>
>>>>>> In the course of clearing this up, it might be appropriate to move
>>>>>> the last sentence of the first para of s4.4 and the second para of
>>>>>> s4.4 into s4.3 as part of the explanation.  This is the relevant text:
>>>>>>> Note that the length of the first partition can
>>>>>>> always be obtained from the first partition size parameter in the VP8
>>>>>>> payload header.
>>>>>>>
>>>>>>> The VP8 bitstream format [RFC6386] specifies that if multiple DCT/WHT
>>>>>>> partitions are produced, the location of each partition start is
>>>>>>> found at the end of the first (prediction/mode) partition.  In this
>>>>>>> RTP payload specification, the location offsets are considered to be
>>>>>>> part of the first partition.
>>>>>>
>>>>>>
>>>>>> s4.4:  I found this section very difficult to parse.  It is a bit
>>>>>> 'stream of consciousness' and would benefit from a reordering and
>>>>>> rewrite.  Suggestion (assuming the second para gets moved to s4.3):
>>>>>> NEW SECTION 4.4:
>>>>>> An encoded VP8 frame can be divided into two or more partitions, as
>>>>>> described in Section 1.  It is OPTIONAL for a packetizer implementing
>>>>>> this RTP specification
>>>>>> to pay attention to the partition boundaries within an encoded frame.
>>>>>> If packetization of a frame is done without considering the partition
>>>>>> boundaries, the PID field MAY be set to zero for all packets, and the
>>>>>> S bit MUST NOT be set to one for any other packet than the first.
>>>>>>
>>>>>> If the preferred usage suggested in Section 3 is followed, with each
>>>>>> packet
>>>>>> carrying data from exactly one partition, the S bit and PID fields
>>>>>> described in Section 4.2 SHOULD be used to indicate what the packet
>>>>>> contains.  The PID field should indicate which partition the first
>>>>>> octet of the payload belongs to, and the S bit indicates that the
>>>>>> packet starts on a new partition.
>>>>>>
>>>>>> If the packetizer does not pay attention to the partition
>>>>>> boundaries, one
>>>>>> packet can contain a fragment of a  partition, a complete partition,
>>>>>> or an
>>>>>> aggregate of fragments and partitions.  There is no explicit
>>>>>> signaling of
>>>>>> partition boundaries in the payload and the partition lengths at the
>>>>>> end of
>>>>>> the first partition have to be used to identify the boundaries.
>>>>>> Partitions
>>>>>> MUST be aggregated in decoding order.  Two fragments from different
>>>>>> partitions MAY be aggregated into the same packet along with one or
>>>>>> more
>>>>>> complete partitions.
>>>>>>
>>>>>> In all cases, the payload of a packet MUST contain data from only
>>>>>> one video
>>>>>> frame.  Consequently the set of packets carrying the data from a
>>>>>> particular
>>>>>> frame will contain exactly one VP8 Payload Header (see Section 4.3)
>>>>>> carried
>>>>>> in the first packet of the frame.  The last, or only, packet
>>>>>> carrying data for the
>>>>>> frame MUST have the M bit set in the RTP header.
>>>>>>
>>>>>> s4.5.1:  Th algorithm only applies for the RECOMMENDED use case with
>>>>>> partitions in separate packets.
>>>>>> This should be noted.
>>>>>>
>>>>>> s5.2, last para: s/only parts of the frame is corrupted./only part
>>>>>> of the frame is corrupted./
>>>>>>
>>>>>> s6: s/This payload format has two required parameters./This payload
>>>>>> format has two optional parameters./
>>>>>> [I think this is probably a hangover from the previous version.]
>>>>>>
>>>>>> s7: s/Cryptographic system may also allow/The cryptographic system
>>>>>> may also allow/
>>>>>>
>>>>>> s7:
>>>>>>> the usage of SRTP [RFC3711] is recommended.
>>>>>> RECOMMENDED?
>>>>>>
>>>>>> s10.2: I suspect that RFC 3551 is normative.
>>>
>>>
>>> --
>>> Surveillance is pervasive. Go Dark.
>>
>
>
>
> -- 
> Henrik Lundin | WebRTC Software Eng | hlundin@google.com | +46 70 646 13 41