Re: [xrblock] ??: Comments on draft-ietf-xrblock-rtcp-xr-synchronization-01

Mario Montagud Climent <mamontor@posgrado.upv.es> Mon, 26 November 2012 13:43 UTC

Message-ID: <20121126144319.111456x6edehp5o7@webmail.upv.es>
Date: Mon, 26 Nov 2012 14:43:19 +0100
From: Mario Montagud Climent <mamontor@posgrado.upv.es>
To: Qin Wu <bill.wu@huawei.com>
References: <51E6A56BD6A85142B9D172C87FC3ABBB4438F1AB@szxeml539-mbx.china.huawei.com> <20121103195158.715146ywbr8lhqz2@webmail.upv.es> <B8F9A780D330094D99AF023C5877DABA4304850D@szxeml523-mbx.china.huawei.com> <20121108203028.18646pziiczpi6vo@webmail.upv.es> <4DA06E997ACD400DA0B6B2D1E9948078@china.huawei.com>
In-Reply-To: <4DA06E997ACD400DA0B6B2D1E9948078@china.huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; DelSp="Yes"; format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.6)
Cc: xrblock@ietf.org
Subject: Re: [xrblock] ??: Comments on draft-ietf-xrblock-rtcp-xr-synchronization-01
Precedence: list

Hi Qin, all,

We were happy to help! :)

See our comments inline.


Qin Wu <bill.wu@huawei.com> wrote:

> Hi, Mario:
> Thanks for your lengthy comments. :-) I have removed the comments on
> which we have no issues.
> Please see my reply inline below.
>
> Regards!
> -Qin
> ----- Original Message -----
> From: "Mario Montagud Climent" <mamontor@posgrado.upv.es>
> To: "Qin Wu" <bill.wu@huawei.com>
> Cc: <xrblock@ietf.org>; "Huangyihong (Rachel)"  
> <rachel.huang@huawei.com>; "Hitoshi Asaeda" <asaeda@sfc.wide.ad.jp>
> Sent: Friday, November 09, 2012 3:30 AM
> Subject: Re: ??: Comments on draft-ietf-xrblock-rtcp-xr-synchronization-01
>
>
>>
>> Hi Qin, Rachel, all,
>>
>> See our comments inline.
>>
>>
>> Qin Wu <bill.wu@huawei.com> wrote:
>>
>>> Hi, Mario and Fernando:
>>> Thanks for your valuable reviews. Let me try to clarify your concerns.
>>> Also please see my reply below inline.
>>>
>>> Regards!
>>> -Qin
>>>
>>> -----Original Message-----
>>> From: Mario Montagud Climent [mailto:mamontor@posgrado.upv.es]
>>> Sent: 4 November 2012 2:52
>>> To: xrblock@ietf.org
>>> Cc: Huangyihong (Rachel); Hitoshi Asaeda; Qin Wu
>>> Subject: Comments on draft-ietf-xrblock-rtcp-xr-synchronization-01
>>>
>>>
>>> Hi all,
>>>
>>> We (Fernando Boronat and I) have reviewed the updated version of
>>> draft-ietf-xrblock-rtcp-xr-synchronization-01 and read the issues
>>> associated with this draft that were raised recently on the mailing list.
>>> Here are our comments and suggestions:
>>>
>>> Comments regarding the 'Initial Synchronization Delay' Metric:
>>>
>>> - We still find the definition of this metric a bit confusing.
>>>
>>> We are dealing with INTER-STREAM Synchronization Delay, so we think
>>> the word INTER-STREAM should be added in the definition of this metric
>>> for better clarity.
>>>
>>> [Qin]: The Initial Synchronization Delay we are using for this metric
>>> is clearly specified in RFC 6051. The key feature is "Initial" rather
>>> than "inter-stream".
>>> I don't believe it refers to the time difference between two streams.
>>> Rather, it means how long it takes to receive all the components of a
>>> multimedia session or layered session. But I agree that "inter-stream"
>>> applies to the synchronization offset metric we defined in this draft.
>>>
>>> As we understand from the RFC 6051, an appropriate definition could
>>> be: "In multimedia streaming services, the (inter-stream)
>>> synchronization delay refers to the time difference between the moment
>>> a user joins a (multicast) multimedia session, probably involving more
>>> than one media stream (e.g., audio and video, or when using layered
>>> and/or multi-description codecs), and the instant when the correlated
>>> media streams can be synchronously presented to that user, i.e. when
>>> RTCP packets (including SDES and SR reports), or when the first RTP
>>> packets with header extensions including in-band synchronization
>>> metadata, have been received on all the involved RTP sessions in the
>>> multimedia session".
>>>
>>> [Qin]: Looks good to me except the wording about "inter-stream".
>>>
>>
>> [F & M] We didn't refer to the time difference (offset) between
>> different streams, but to the time difference between joining a
>> multimedia session (or a reference RTP session) and the instant at
>> which all involved media streams can be initially presented to the
>> users in a synchronous way. Therefore, we think this metric gives the
>> INITIAL delay for INTER-STREAM synchronization, because all the
>> involved media streams cannot be synchronously presented to the users
>> until the info needed for synchronizing all of them (included in RTCP
>> packets or in RTP header extensions, as specified in RFC 6051) has
>> been received on all the component RTP sessions. This was our
>> rationale for including the term INTER-STREAM in this definition.
>>
>
> [Qin]: My understanding is that the initial synchronization delay we
> care about is the total time difference between when the first stream
> is joined and when all the streams are synchronized.
> We don't care about the initial synchronization delay between any two
> streams in the multimedia session.
>
> So adding "inter-stream" may lead people to think we are measuring the
> time difference or synchronization offset between two streams.
>

[F&M] If you think adding "inter-stream" could confuse readers, that is
fine with us.
But in compound sessions, we think "inter-stream" sync relates to the
timing offset between all involved media streams, not only between two
streams. Anyway, it is up to you.


>>
>>>
>>> We understand that the minimization of this metric is important in
>>> multimedia streaming services, e.g. for minimizing zapping delays, but
>>> we would like to see in this draft the utility of this RFISD block.
>>> For example, should the receiver of this report (we assume the media
>>> source) do something when receiving it? Is this RFISD block only used
>>> for informational purposes?
>>>
>>> [Qin]: The information in this metric report can be used by the
>>> receiver of this report to compare the actual initial synchronization
>>> delay to targets (i.e., a numerical objective or Service Level
>>> Agreement) to help ensure the quality of real-time application
>>> performance.
>>>
>>
>> [F&M] We suggest adding a similar paragraph to the draft.
>
> [Qin]: Okay.
>
>>> Furthermore, is it expected that all the receivers join the multimedia
>>> session (or the group of RTP sessions) almost simultaneously? For
>>> instance, there could be significant delay differences between the
>>> instants at which different receivers join the same multimedia session
>>> (or the set of RTP sessions). In such a case, the measurement of the
>>> inter-stream synchronization delay should not have the same reference
>>> point for all of the receivers. Do we need some mechanisms to
>>> establish the same reference point or to indicate the exact instant
>>> for the reference point in each receiver?
>>>
>>> [Qin]: We allow different receivers to report different initial
>>> synchronization delays, since these receivers join at different times.
>>> This doesn't matter, since what we report is a per-receiver metric.
>>
>> [F & M] Ok. But, in this way, if different receivers, even under
>> similar network conditions (e.g., delays, jitter...), join the session
>> at different instants, but inside the same RTCP report interval (i.e.,
>> between two consecutive RTCP packets sent by the media source for that
>> session), there could be significant differences between the "Initial
>> Sync Delays" reported by each one of them. For example, depending on
>> the joining time, a receiver A could receive the RTCP packets from all
>> the RTP sessions with larger/lower initial delay than another receiver
>> B if the latter joined the session before/later than the former.
>>
>> Even though you only want to "compare actual initial synchronization
>> delay to targets (i.e., a numerical objective or Service Level
>> Agreement)", as pointed out in your previous comment, could the above
>> situation be problematic?
>>
>>>
>>> We also think that the terms "start/beginning of session" in the text
>>> should be replaced or better explained.
>>>
>>> - We also think that using 1/65536 second units (giving 15 microsecond
>>> accuracy), instead of a 64-bit timestamp, should be accurate enough
>>> for Initial Sync Delay Reporting. But wouldn't it be more practical and
>>> simpler to use the same measurement units for both XR blocks?
>>>
>>> [Qin]: It looks good to me; however, what accuracy requirements do we
>>> have for the initial sync delay?
>>> Why is the accuracy of the initial sync delay we currently define
>>> not sufficient?
>>
>> [F & M] We think this accuracy is sufficient. Our comment was because
>> of simplicity (this way, both reports could employ the same
>> measurement unit).
>
> [Qin]: Looks good to me. Let's see what other people think.
>

[F&M] Ok
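
As a rough illustration of the 1/65536 second unit discussed above, here is a minimal sketch. The 32-bit field width, the clamping behaviour, and the function names are assumptions made for the example; they are not taken from the draft. Only the unit and the "all bits set to 1" marker for an unavailable measurement come from this thread.

```python
UNITS_PER_SECOND = 65536              # one unit is 1/65536 of a second
FIELD_BITS = 32                       # assumption: width of the delay field
UNAVAILABLE = (1 << FIELD_BITS) - 1   # all bits set to 1


def encode_delay(seconds):
    """Convert a delay in seconds to 1/65536 s units (None = unavailable)."""
    if seconds is None:
        return UNAVAILABLE
    units = round(seconds * UNITS_PER_SECOND)
    # Assumption: clamp just below the all-ones value so that a real
    # measurement is never mistaken for "unavailable".
    return max(0, min(units, UNAVAILABLE - 1))


def decode_delay(units):
    """Convert 1/65536 s units back to seconds (None = unavailable)."""
    if units == UNAVAILABLE:
        return None
    return units / UNITS_PER_SECOND


# Resolution of one unit in microseconds (roughly 15 microseconds).
resolution_us = 1e6 / UNITS_PER_SECOND
```

With this encoding a one-second delay becomes 65536 units, and the resolution of a single unit works out to roughly 15 microseconds, which matches the accuracy figure quoted above.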

>>>
>>> - "SSRC of Media Source" -> Shouldn't this draft specify a policy for
>>> choosing the component SSRC of a multimedia session to report on this
>>> metric? The draft indicates an arbitrary stream, maybe an option
>>> should be the SSRC identifier of a multimedia session with the longest
>>> RTCP reporting interval ...
>>>
>>> [Qin]: no, we may report on each media stream that belongs to the
>>> same multimedia session.
>>
>>
>> [F & M] Not sure we are following you here.
>
>
>> Must one RFISD block be
>> sent for each SSRC (of each one of the involved RTP sessions) in a
>> multimedia session? Would it not be easier to report only on one
>> reference RTP session?
>
> [Qin]: Sorry, I thought you were commenting on the synchronization
> offset. I take back what I said here.
> I think you are correct. It is more reasonable to report only on one
> reference RTP session.


[F&M] Ok :) Thanks!


>>
>> We see different options for selecting the reference RTP session for
>> the "initial sync delay" (i.e. the reference point for this
>> measurement).
>>
>> One option could be selecting the session with the longest RTCP
>> reporting interval, as you proposed. Here, we think the "delay for
>> RTCP reporting interval" concept must be defined precisely to avoid
>> possible confusion. On one hand, we can see the delay for the RTCP
>> reporting interval for a media session as the time difference between
>> two successive RTCP packets from this media session. On the other
>> hand, we can also see, from the receiver point of view, the delay for
>> the RTCP reporting interval as the time difference between joining a
>> media session and receiving the first RTCP packet from that media
>> session. In the latter case, the delay for the RTCP reporting interval
>> will depend on the specific joining time of each receiver (each
>> receiver can join a session at a different instant during the RTCP
>> reporting interval). We propose clarifying this in the draft, if
>> adopted.
>>
>> Another option could be "choosing the time when the first/ last RTP
>> session is joined as the beginning of the multimedia session", as
>> Rachel proposed.
>>
>> We see as a more feasible option to choose the time when a receiver
>> joins the FIRST RTP session of the multimedia session as the
>> starting/reference point for the "initial sync delay" measurements.
>> The use of the RTCP reporting delay for choosing a reference session
>> could be problematic.
>>
>> But, in the way we are proposing, we are assuming that the joining
>> time for the first RTP session can be known for all the other RTP
>> sessions involved in the multimedia session (if we do not want to
>> assume this info can be accessible between the RTP sessions, we will
>> have to assume that the joining time is almost the same for all the
>> RTP sessions).
>>
> [Qin]: Good points. We did discuss this issue in the meeting. We think
> choosing the RTP session with the longest interval is not reasonable,
> since the report interval may change with session size.
> We believe choosing the time when the first RTP session is joined is
> a good choice for the measurement starting point.
>

[F&M] Ok
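
A minimal sketch of the measurement as agreed above: the starting point is the time the receiver joins the first RTP session, and the end point is the time at which the synchronization information (RTCP packets or RTP header extensions, per RFC 6051) has been received on all component RTP sessions. The function name, the dictionary layout, and the sample values are hypothetical and only serve the example.

```python
def initial_sync_delay(join_times, sync_info_times):
    """Return the initial synchronization delay in seconds, or None.

    join_times:      session id -> time the receiver joined that RTP session
    sync_info_times: session id -> time the sync info first arrived on it
    Both use the same local clock, in seconds. None is returned while some
    session has not yet received its synchronization information.
    """
    if not join_times:
        return None
    if any(session not in sync_info_times for session in join_times):
        return None
    start = min(join_times.values())                     # first session joined
    end = max(sync_info_times[s] for s in join_times)    # last sync info seen
    return end - start


# Illustrative values only.
joins = {"audio": 10.0, "video": 10.25}
sync_info = {"audio": 11.5, "video": 12.0}
delay = initial_sync_delay(joins, sync_info)
```

Here the delay is measured from the audio join (the first session joined) to the arrival of the video sync info (the last one received). Since each receiver uses its own joining time, the result is a per-receiver value.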

>>>
>>> Comments regarding the 'Synchronization Offset' Metric:
>>>
>>> - We also agree that reporting the synchronization offset on a
>>> per-report basis (instead of a per-packet basis) can be sufficient.
>>>
>>> - We think that the definition of this metric is clearer as it is now
>>> in the draft.
>>>
>>> We agree with the importance of minimizing the "sync offset" for
>>> guaranteeing QoE. As specified in RFC 3550, synchronization between
>>> two media streams, i.e. inter-stream synchronization, can be achieved
>>> by using the source identification (i.e. the CNAME item), included in
>>> the SDES reports, and the NTP-RTP timestamps correlation info,
>>> included in the SRs, from the different media streams.
>>>
>>> So, as for the other RFISD block, we would like to see the utility of
>>> this block in the draft. Should the receiver of the RFSO block (we
>>> assume the media source/s) do something (e.g. adaptation mechanisms)
>>> when receiving it? Is this RFSO block only used for informational
>>> purposes? We think this should be clarified in the draft for both XR
>>> blocks.
>>>
>>> [Qin]: I think the information received from the RFSO block is
>>> valuable to network managers in troubleshooting network and user
>>> experience issues.
>>>
>>> - The draft specifies a new XR block for reporting Synchronization
>>> Offset between correlated media streams. One of the streams is
>>> selected as the reference, so we think that some criteria for choosing
>>> that reference (i.e., master) media stream should be added in the
>>> draft. Otherwise, different receivers could select different streams
>>> as the reference one.
>>>
>>> Now, the draft indicates (on page 6) that the reference stream
>>> "can be chosen as the arbitrary stream with minimum delay according to
>>> the common criterion defined in section 6.2.2.1 of [Y.1540]".
>>>
>>> Using this mechanism, different receivers could select different
>>> streams as the reference one. Could this be problematic?
>>>
>>> [Qin]: Good question. We may choose the SSRC identifier of the
>>> session in the multimedia session with the longest RTCP reporting
>>> interval, since RFSO deals with multiple sessions that belong to the
>>> same multimedia session.
>>
>> [F & M] Thanks. We are not sure about the suitability of this
>> assumption. We do not see why the "stream with the longest RTCP
>> reporting interval" should be selected as the reference stream for the
>> "sync offset" measurement because of the above discussion. We think
>> other mechanisms should be discussed. Possible options include the
>> most lagged/advanced RTP media streams (i.e. the ones with the
>> highest/lowest reception or presentation, i.e. end-to-end, delays) or
>> a fixed reference stream selected based on other criteria.
>
> [Qin]: Looks better than my proposal. :-)

[F&M] Ok. Thanks!
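
One of the options mentioned above (taking the most lagged or the most advanced RTP stream as the reference) could be sketched as follows. This is only an illustration: the function name, the policy labels, and the tie-breaking on the SSRC value are assumptions, and the per-stream delays are whatever reception or presentation delays the receiver has measured.

```python
def pick_reference_stream(delays, policy="most_lagged"):
    """Pick the reference SSRC from a mapping of SSRC -> delay in seconds.

    policy "most_lagged" picks the stream with the highest delay,
    policy "most_advanced" picks the stream with the lowest delay.
    Ties are broken by SSRC value so that the choice is deterministic.
    """
    if not delays:
        raise ValueError("no streams to choose from")
    if policy == "most_lagged":
        chooser = max
    elif policy == "most_advanced":
        chooser = min
    else:
        raise ValueError("unknown policy: " + policy)
    return chooser(delays, key=lambda ssrc: (delays[ssrc], ssrc))


# Illustrative values only: two streams of the same multimedia session.
measured = {0x1111: 0.120, 0x2222: 0.085}
lagged = pick_reference_stream(measured, "most_lagged")
advanced = pick_reference_stream(measured, "most_advanced")
```

Note that the delays are measured locally, so two receivers may still end up with different reference streams under either policy.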

>>
>> Besides, the delay for the RTCP reporting interval does not have to be
>> necessarily linked to the delay for the RTP media stream. A media
>> session could have the longest RTCP reporting interval, but the delay
>> for its RTP media stream could be acceptable.
>
>>
>> For the "sync offset", we think it is better that the reference stream
>> is selected based on the experienced delay for the involved RTP
>> streams, because the sync offset is measured for the RTP streams, and
>> the associated sync adjustments to minimize this offset are also
>> performed on the RTP streams.
>>
>
> [Qin]: Agree.
>
>>>
>>> - And, finally, in our opinion, a very important issue is:
>>>
>>> This draft specifies the Synchronization Offset between correlated
>>> media streams taking into account the arrival times of RTP packets for
>>> the considered streams (see formula on page 8).
>>>
>>> Working with reception times might not be accurate enough for use cases
>>> with stringent inter-stream synchronization requirements, especially
>>> when different types of media streams are involved. This is because
>>> the different RTP streams could experience variable delays at the
>>> receiver side, i.e. from the reception instant of RTP packets until
>>> the instant at which the media units (e.g. video frames or audio
>>> samples) included in these RTP packets are played out, mainly due to
>>> different de-packetizing, de-payloading, de-coding, rendering,
>>> processing delays, etc.  So, if we want to report on accurate sync
>>> offset values, we should consider presentation times for the involved
>>> media streams, as in our IDMS draft. Do you think this requirement is
>>> also needed for this draft?
>>>
>>> [Qin]: Not sure about this, would you like to clarify how to use
>>> presentation times to calculate sync offset?
>>>
>>> Another issue when measuring "sync offset" (per report basis)
>>> considering RTP arrival times is that this measurement can be
>>> significantly affected by the existence of network jitter. As the
>>> streams are sent independently, the RTP packets (or the specific RTP
>>> packet for which this metric is reported) of media stream A could
>>> experience low jitter delays, whilst the RTP packets (or the specific
>>> RTP packet for which this metric is reported) of media stream B could
>>> (sporadically) experience high jitter delays, so this would lead to
>>> the reporting of high and variable "sync offset" values. So, this
>>> will provide a variable, rather than a smooth, measurement.
>>>
>>> [Qin]: We have proposed changing the RTP timestamps into NTP timestamps.
>>
>> [F & M] In order to use presentation times, we would need to track RTP
>> packets from their arrival to their presentation (or play out) times.
>> This can be seen as a form of layer-violation in some RTP
>> implementations, as previously discussed in the AVTCORE list for our
>> IDMS draft. That is the reason why, in our IDMS draft, reporting on
>> presentation times is optional, but reporting on arrival times is
>> mandatory.
>>
>> But, if presentation times are supported, the sync offset could be
>> calculated easily (and more accurately than in the current proposal).
>> The calculation is as follows:
>>
>> Different times for stream A: t_i_A (transmission time, i.e. RTP
>> timestamp of i-th RTP packet of stream A), r_i_A (NTP-based arrival
>> time of i-th RTP packet of stream A), p_i_A (NTP-based presentation
>> time of i-th RTP packet of stream A)
>>
>> Different times for stream B: t_j_B (transmission time, i.e. RTP
>> timestamp of j-th RTP packet of stream B), r_j_B (NTP-based arrival
>> time of j-th RTP packet of stream B), p_j_B (NTP-based presentation
>> time of j-th RTP packet of stream B).
>>
>> Therefore, the sync offset between stream A and B can be calculated as:
>>
>> - Using presentation times: (p_i_A - t_i_A) - (p_j_B - t_j_B), this
>> gives the end-to-end delay variability
>>
>> - Using reception times: (r_i_A - t_i_A) - (r_j_B - t_j_B), this gives
>> the network delay variability
>>
>> * Note that we are assuming that RTP timestamps can be mapped to
>> NTP-format timestamps (for the RTP transmission timestamps), based on
>> the correlation timing info included in RTCP SRs.
>>
>> Therefore, we think that using presentation timestamps the measurement
>> of the "sync offset" metric will be more accurate (and smoother) than
>> using reception times, because of the variable delays at the
>> distribution and at the receiver sides.
>
> [Qin]: Your proposal uses presentation timestamps, which is quite
> different from what we are currently proposing in the draft. We will
> examine your proposal to see which approach is better.
>

[F&M] Ok.
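
As a rough sketch of the two calculations quoted above: the function name and the sample values are illustrative assumptions. All times are taken to be in seconds on a common NTP-derived timeline, with the RTP timestamps already mapped to it using the correlation info in the RTCP SRs.

```python
def sync_offset(t_a, x_a, t_b, x_b):
    """Return (x_a - t_a) - (x_b - t_b).

    t_a, t_b: transmission times of the packets of streams A and B
    x_a, x_b: either their arrival times (r) or presentation times (p)
    A positive result means stream A is delayed more than stream B.
    """
    return (x_a - t_a) - (x_b - t_b)


# Illustrative values only (seconds): transmission, arrival, presentation.
t_i_A, r_i_A, p_i_A = 100.000, 100.080, 100.180
t_j_B, r_j_B, p_j_B = 100.000, 100.060, 100.120

# Using reception times: reflects the network delay difference only.
offset_reception = sync_offset(t_i_A, r_i_A, t_j_B, r_j_B)

# Using presentation times: also includes the receiver-side delays.
offset_presentation = sync_offset(t_i_A, p_i_A, t_j_B, p_j_B)
```

With these made-up numbers the reception-based offset is about 20 ms, while the presentation-based offset is about 60 ms, because stream A spends longer in the receiver before play-out than stream B does.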

>>>
>>> - For both XR blocks defined in this draft, it is stated that: "If the
>>> measurement is unavailable, the value of this field with all bits set
>>> to 1 SHOULD be reported". But, if the measurements are unavailable,
>>> why are these XR blocks needed? Would it not be better simply not to
>>> send these XR blocks?
>>>
>>> [Qin]: These XR blocks may be sent in each RTCP report interval. If
>>> we do not send them to the receiver of the XR blocks, the receiver
>>> will regard these XR blocks as lost, which is not what we expect.
>>>
>>> - Finally, we think that this draft should indicate when the proposed
>>> XR blocks are sent. Should the RFISD block be sent only once per media
>>> session? Should the RFSO block be sent in each RTCP report interval in
>>> a compound RTCP packet?
>>>
>>> [Qin]: For RFISD, both are allowed, but I think it makes more sense to
>>> send it only once per media session.
>>
>>
>> [F & M] We assume that these RTCP report blocks will be sent in
>> compound RTCP packets. Regarding the RFISD block, we assume that it is
>> only needed once per session (since it reports on INITIAL Sync Delay,
>> it is not needed during the session lifetime). Regarding the RFSO
>> block, we think that two options could be employed: 1) to send this
>> block in each RTCP report interval, independently of the value of the
>> "sync offset"; 2) to send this block only if the value of the "sync
>> offset" exceeds a configurable allowed asynchrony threshold.
>>
>> Therefore, if these XR blocks are not included in the received RTCP
>> compound packet, we think it can be assumed that the metrics these
>> blocks report on are not available, without the need of sending the
>> block reports (because in such a case, these reports do not contain
>> any further statistics).
>
> [Qin]: What if these RTCP report blocks are not sent in the
> compound RTCP packet?
>

[F&M] Do you mean sending these reports as immediate and reduced-size  
RTCP packets?

Best Regards,

Fernando & Mario


>>
>> Best Regards,
>>
>> Fernando & Mario.
>>
