Re: [AVT] RTP streaming and adaptation to AVC of an SVC temporal scalable bitstream

Dear Ye Kui,

I've thought a little more about the solution of having layer-specific 
sequence numbers and reached the following conclusions (please 
correct me if something is wrong):
- each combination of temporal_id (3 bits), dependency_id (3 bits) and 
quality_id (5 bits) completely identifies a scalable layer;
- the streamer should keep a separate sequence number variable for each of 
these layers (combination of temporal_id, dependency_id and quality_id);
- separate sequence number variables don't mean different fields in the RTP 
payload header (NALU header), as any NALU belongs to a specific layer and 
not to other layers (1:1 mapping between NALU and layer);
- so, we'd have just one additional field in the NALU header (16 bits, like 
the sequence number in RTP); this field could be added only to the prefix 
NALU syntax to save some bandwidth; this wouldn't make the RTP payload 
structure that ugly, at least in my opinion...;
- I agree that having a prefix NALU for any slice NALU would definitely 
increase the bandwidth occupation and that for temporal scalability the SEI 
message based solution could be better; however, this would mean that in any 
SVC streaming scenario the single RTP session mode must be used in 
conjunction with the sub-sequence SEI message to prevent packet loss 
problems when dropping higher layers;
- if everything above is correct, I suppose that without this additional 
layer-specific sequence number field, the smoothest way to handle 
packet losses is in any case to transport the different layers of an SVC 
bitstream in different RTP sessions.
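
The per-layer counters listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names LayerSequencer, next_seq and lost_in_layer are hypothetical, the field widths (3, 3 and 5 bits) are the ones quoted in this thread, and the counter is assumed to be 16 bits wide and to wrap like the RTP sequence number.

```python
from collections import defaultdict

SEQ_MODULUS = 1 << 16  # assumed 16-bit counter, wrapping like the RTP sequence number


class LayerSequencer:
    """Hypothetical streamer-side helper: one counter per scalable layer."""

    def __init__(self):
        # Key: (temporal_id, dependency_id, quality_id); every counter starts at 0.
        self._next = defaultdict(int)

    def next_seq(self, temporal_id, dependency_id, quality_id):
        # Field widths as quoted in this thread: 3, 3 and 5 bits.
        if not (0 <= temporal_id < 8 and 0 <= dependency_id < 8
                and 0 <= quality_id < 32):
            raise ValueError("layer identifier out of range")
        key = (temporal_id, dependency_id, quality_id)
        seq = self._next[key]
        self._next[key] = (seq + 1) % SEQ_MODULUS
        return seq


def lost_in_layer(prev_seq, curr_seq):
    """Count NALUs of one layer missing between two received NALUs of that layer."""
    return (curr_seq - prev_seq - 1) % SEQ_MODULUS
```

Because each NALU belongs to exactly one layer, a single field per NALU is enough: a receiver or adapter that tracks the last value seen for a layer can tell from lost_in_layer whether a gap in the RTP sequence numbers affected that layer.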

Thanks

BR,

Daniele

Daniele Renzi
bSoft -- www.bsoft.info
+39-0733-57707  (tel/fax)

----- Original Message ----- 
From: "daniele renzi (bsoft)" <daniele@bsoft.info>
To: <Ye-Kui.Wang@nokia.com>
Sent: Tuesday, August 21, 2007 9:43 AM
Subject: Re: [AVT] RTP streaming and adaptation to AVC of an SVC temporal 
scalable bitstream - Packet loss

> Dear Ye Kui,
>
> I agree with you that having layer-specific sequence numbers would make 
> the payload structure pretty ugly. I also think that RTP session 
> multiplexing (different RTP sessions for different layers) is more 
> sensible for spatial and quality scalability than for temporal 
> scalability, as one of the biggest benefits of having a single RTP 
> session for a temporal scalable bitstream using prefix NAL units is that 
> an AVC decoder can decode even the original SVC stream (with all the 
> layers) by simply discarding the prefix NAL units.
> This would avoid the need for specific RTP sequence numbers for spatial 
> and quality scalability and combinations of them.
>
> Anyway, as you suggested, using the SEI message based solution in addition 
> to the RTP sequence number seems a very good solution, as well as thinking 
> about removing in our scenario the single RTP session requirement.
>
> I'll try to estimate which is the best solution for us.
>
> Thanks a lot for your help.
>
> Best regards,
>
> Daniele
>
> Daniele Renzi
> bSoft -- www.bsoft.info
> +39-0733-57707  (tel/fax)
>
> ----- Original Message ----- 
> From: <Ye-Kui.Wang@nokia.com>
> To: <daniele@bsoft.info>
> Cc: <avt@ietf.org>; <csp@csperkins.org>; <schierl@hhi.fhg.de>
> Sent: Monday, August 20, 2007 9:58 PM
> Subject: RE: [AVT] RTP streaming and adaptation to AVC of an SVC temporal 
> scalable bitstream - Packet loss
>
>
> Daniele,
>
> My understanding is that the NAL unit header of a NAL unit contained in a 
> single NAL unit packet is considered as the payload header. However, for 
> an aggregation packet, the "NAL unit header" of the aggregation packet 
> itself is considered as the payload header, but not the NAL unit headers 
> of individual NAL units contained in the aggregation packet. Otherwise we 
> have to understand that the payload header is interleaved with payload 
> data.
>
> The SEI message based solution can be used together with the normal RTP 
> sequence number to detect the loss of a part of a picture. The same is 
> true of the PACSI + TL0PicIdx solution mentioned by Thomas.
>
> However, you are right that a temporal_id-specific RTP sequence number 
> would be smoother, as parsing of the sub-sequence information SEI message 
> would not be required. But to me the complexity reduction is not much. 
> Furthermore, having such layer-specific RTP sequence numbers would make 
> the payload structure pretty ugly, because in general SVC use cases with 
> other scalability dimensions, each combination of temporal_id (3 bits), 
> dependency_id (3 bits) and quality_id (5 bits) would then need its own 
> RTP sequence number. The total number is up to 8x8x32 = 2048.
>
> BR, YK
>
> ________________________________
>
> From: ext daniele renzi (bsoft) [mailto:daniele@bsoft.info]
> Sent: Monday, August 20, 2007 6:48 PM
> To: Wang Ye-Kui (Nokia-NRC/Tampere)
> Cc: avt@ietf.org; csp@csperkins.org; schierl@hhi.fhg.de
> Subject: Re: [AVT] RTP streaming and adaptation to AVC of an SVC temporal 
> scalable bitstream - Packet loss
>
>
> Dear Ye Kui,
>
> sorry for not being clear.
>
> Currently the adapter parses the NALU header to get all the information it 
> needs, e.g. temporal_id.
>
> When I mentioned the RTP payload header I meant also the NALU header, as, 
> according to RFC-3984:
> "All NAL units consist of a single NAL unit type octet, which also 
> co-serves as the payload header of this RTP payload format".
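
For reference, the NAL unit type octet quoted above packs three fields: forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits). A minimal Python sketch of splitting that octet follows; the function name parse_nal_header is hypothetical, and the sketch deliberately does not parse the SVC header extension that carries temporal_id.

```python
def parse_nal_header(octet):
    """Split the one-octet H.264 NAL unit header into its three fields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("expected a single octet")
    return {
        "forbidden_zero_bit": octet >> 7,    # most significant bit
        "nal_ref_idc": (octet >> 5) & 0x03,  # next two bits
        "nal_unit_type": octet & 0x1F,       # low five bits
    }
```

With this split, nal_unit_type 6 identifies an SEI NAL unit and nal_unit_type 14 a prefix NAL unit, which is the minimum an adapter needs to decide whether to look further into the packet.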
>
> Concerning the packet loss handling in the SEI message based solution, if 
> I understood correctly the sub_seq_frame_num is used to detect a reference 
> picture loss in the sub-sequence.
> However, this way of detecting a loss looks inefficient to me (for our 
> scenario) when the lost packet carries a slice that is just a sub-part 
> of a picture, as the detection is based on sub_seq_frame_num gaps and 
> sub_seq_frame_num is the same for every slice in the same 
> picture.
>
> For our purposes this solution could be equivalent to simply delimiting an 
> access unit by the marker bit and inferring the layer of a lost packet 
> by assuming that every packet in that access unit has the same 
> temporal_id, which we can get either from prefix NAL units or from the 
> SEI message.
> Maybe the SEI message based solution could look more useful than the one 
> based on prefix NALUs if we consider the SEI message as a better delimiter 
> of an access unit and consequently of a temporal layer.
>
> I still think that a parameter simulating the RTP sequence number inside 
> the NALU header information, specific to each temporal layer, would have 
> been the smoothest solution for keeping the single RTP session mode.
>
> Please correct me if my understanding is somewhere wrong.
>
> Sorry for being a little long-winded with my emails...
>
> Thanks a lot
>
> BR,
>
> Daniele
>
>
> Daniele Renzi
> bSoft -- www.bsoft.info
> +39-0733-57707  (tel/fax)
>
> ----- Original Message ----- 
> From: Ye-Kui.Wang@nokia.com
> To: daniele@bsoft.info ; csp@csperkins.org ; schierl@hhi.fhg.de
> Cc: avt@ietf.org
> Sent: Monday, August 20, 2007 3:35 PM
> Subject: RE: [AVT] RTP streaming and adaptation to AVC of an SVC temporal 
> scalable bitstream - Packet loss
>
> Dear Daniele,
>
> So you were assuming that the adapter does not parse anything other than 
> the RTP payload header as specified in the payload format. Then how could 
> it find out which packets contain a prefix NAL unit to be discarded? To do 
> the adaptation, parsing more than the RTP payload header is needed anyway. 
> In that case, parsing a certain SEI message does not add much burden. 
> In the sub-sequence information SEI message based solution, the 
> adapter parses a sub-sequence information SEI message to detect whether 
> an earlier packet containing a slice to be included in the outgoing 
> stream was lost, and then knows how to set the RTP sequence number of the 
> current outgoing packet.
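
A rough Python sketch of the renumbering step described above, under loudly stated assumptions: the class AdapterRenumberer and its forward method are hypothetical, sub_seq_frame_num is assumed to increase by one per kept reference picture modulo an assumed max_frame_num, and each lost picture is assumed to have occupied exactly one packet. It is not taken from any payload format specification.

```python
RTP_SEQ_MODULUS = 1 << 16  # RTP sequence numbers are 16 bits wide


class AdapterRenumberer:
    """Hypothetical adapter helper: renumber kept packets, preserving gaps
    for losses that affected the kept temporal layer."""

    def __init__(self, max_frame_num=16, first_out_seq=0):
        self.max_frame_num = max_frame_num  # assumed modulus of sub_seq_frame_num
        self.next_out_seq = first_out_seq
        self.last_frame_num = None

    def forward(self, sub_seq_frame_num):
        """Return the outgoing RTP sequence number for one kept packet."""
        if self.last_frame_num is not None:
            delta = (sub_seq_frame_num - self.last_frame_num) % self.max_frame_num
            # delta == 0: another slice of the same picture; delta == 1: next picture.
            missing = max(delta - 1, 0)
            # Leave one outgoing sequence number unused per missing picture
            # (assumption: one packet per lost picture).
            self.next_out_seq = (self.next_out_seq + missing) % RTP_SEQ_MODULUS
        self.last_frame_num = sub_seq_frame_num
        seq = self.next_out_seq
        self.next_out_seq = (seq + 1) % RTP_SEQ_MODULUS
        return seq
```

Note that when only one slice of a multi-slice picture is lost, delta stays 0 or 1 and no gap is produced, so this kind of check cannot see a partial-picture loss on its own.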
>
> BR, YK
>
>
> ________________________________
>
> From: ext daniele renzi (bsoft) [mailto:daniele@bsoft.info]
> Sent: Monday, August 20, 2007 1:12 PM
> To: Wang Ye-Kui (Nokia-NRC/Tampere); csp@csperkins.org; schierl@hhi.fhg.de
> Cc: avt@ietf.org
> Subject: Re: [AVT] RTP streaming and adaptation to AVC of an SVC temporal 
> scalable bitstream - Packet loss
>
>
> Dear Ye-Kui,
>
> thanks for your help.
>
> Just one (two) more question(s).
> In the SEI message based solution, how is a packet loss handled?
> Isn't it the same as in the prefix NALUs based solution, that is by the 
> RTP sequence number?
> It seems that the problem would persist if for example the SEI message 
> gets lost, but I could be wrong.
>
> From RFC-3984 and RFC-3550 I gather that the sequence number would not, 
> in any case, be specific to a single sub-sequence (temporal layer).
>
> So far I can't see any other solution than multiplexing different RTP 
> sessions for different temporal layers (or sub-sequences), unless a 
> layer-specific sequence number was present in the RTP payload when using a 
> single RTP session.
>
> Thanks.
>
> Best regards,
>
> Daniele
>
> Daniele Renzi
> bSoft -- www.bsoft.info
> +39-0733-57707  (tel/fax)
>
> ----- Original Message ----- 
> From: <Ye-Kui.Wang@nokia.com>
> To: <daniele@bsoft.info>; <csp@csperkins.org>; <schierl@hhi.fhg.de>
> Cc: <avt@ietf.org>
> Sent: Sunday, August 19, 2007 11:16 PM
> Subject: RE: [AVT] RTP streaming and adaptation to AVC of an SVC temporal 
> scalable bitstream - Packet loss
>
>
>
> Yet another solution is to use AVC itself instead of SVC (you can also use 
> RFC 3984 instead of the SVC RTP payload draft), as you need only temporal 
> scalability. This requires the use of sub-sequence information SEI 
> messages. The sub_seq_layer_num indicates the temporal layer. You set each 
> sub-sequence layer (i.e. temporal layer) as one sub-sequence, then the 
> sub_seq_frame_num indicates the frame number of each reference frame 
> inside a temporal layer.
>
> In the prefix NAL unit plus PACSI with TL0PicIdx solution, the adapter 
> needs to parse prefix NAL units, and the outgoing stream can only be of 
> temporal_id equal to 0 (i.e. the lowest temporal layer). In the 
> sub-sequence information SEI message based solution, the adapter needs to 
> parse sub-sequence information SEI messages, and the outgoing stream can 
> be of any lower temporal layer.
>
> BR, YK
>
>>-----Original Message-----
>>From: ext daniele renzi (bsoft) [mailto:daniele@bsoft.info]
>>Sent: Sunday, August 19, 2007 8:55 PM
>>To: Wang Ye-Kui (Nokia-NRC/Tampere); csp@csperkins.org; Thomas Schierl
>>Cc: avt@ietf.org
>>Subject: Re: [AVT] RTP streaming and adaptation to AVC of an
>>SVC temporal scalable bitstream - Packet loss
>>
>>Dear Ye-Kui, Colin, Thomas, all,
>>
>>many thanks for your clarifications.
>>
>>Anyway, I'd like to define precisely the scenario.
>>Indeed, as Ye-Kui specified, the SVCtoAVC adapter assigns new
>>sequence numbers to the outgoing stream.
>>Here is the problem: if a packet is lost in the incoming
>>stream, it would be good if the adapter reported this anyway
>>to the receiver, even though the packet loss was in a
>>different RTP *segment* (I'm not sure if that can be defined
>>as a *session*...), as in any case there has been a loss
>>between the sender and the receiver that should be handled by
>>the receiver itself.
>>But the adapter cannot say whether this loss affected the base
>>layer or an enhancement layer, as the prefix NALU could have
>>been lost and with it the scalability information
>>(temporal_id). It could then conclude that the sequence number
>>gap in the incoming stream is due to a loss in the enhancement
>>layer (and do nothing) even when this isn't true, or vice versa.
>>
>>We're trying to get a solution (maybe different than forcing
>>the insertion of a sequence number gap in the outgoing stream)
>>to make the receiver able to handle a loss in both the RTP *segments*.
>>
>>I'll try to evaluate Thomas's proposal and have a closer
>>look at RFC-3550 and draft-ietf-avt-topologies-06.txt.
>>
>>Thanks again.
>>
>>Best regards,
>>
>>Daniele
>>
>>
>>Daniele Renzi
>>bSoft -- www.bsoft.info
>>+39-0733-57707  (tel/fax)
>>
>>----- Original Message -----
>>From: <Ye-Kui.Wang@nokia.com>
>>To: <csp@csperkins.org>
>>Cc: <daniele@bsoft.info>; <avt@ietf.org>
>>Sent: Sunday, August 19, 2007 7:34 AM
>>Subject: RE: [AVT] RTP streaming and adaptation to AVC of an
>>SVC temporal scalable bitstream - Packet loss
>>
>>
>>
>>OK, forget about my naïve question, because I found the
>>following sentence
>>in RFC 3550, "If multiple data packets are re-encoded into one, or vice
>>versa, a translator MUST assign new sequence numbers to the outgoing
>>packets."
>>
>>BR, YK
>>
>>>-----Original Message-----
>>>From: Wang Ye-Kui (Nokia-NRC/Tampere)
>>>Sent: Sunday, August 19, 2007 12:59 AM
>>>To: 'ext Colin Perkins'
>>>Cc: daniele@bsoft.info; avt@ietf.org
>>>Subject: RE: [AVT] RTP streaming and adaptation to AVC of an
>>>SVC temporal scalable bitstream - Packet loss
>>>
>>>
>>>Hi Colin,
>>>
>>>Thanks for your clarification. But according to the following
>>>sentence copied from Daniele's email,
>>>
>>>"... then the SVCtoAVC adapter doesn't know whether this loss
>>>has to be signaled to the receiver, i.e. whether it must
>>>insert a sequence number gap in the outcoming RTP stream...",
>>>
>>>the SVCtoAVC adapter uses a different RTP sequence number
>>>value space for the outgoing RTP stream than for the incoming RTP
>>>stream. Is this something a translator can do? It is not clear how
>>>the SVCtoAVC adapter handles the CC, SSRC and CSRC fields and
>>>RTCP traffic, though.
>>>
>>>BR, YK
>>>
>>>>-----Original Message-----
>>>>From: ext Colin Perkins [mailto:csp@csperkins.org]
>>>>Sent: Saturday, August 18, 2007 9:07 PM
>>>>To: Wang Ye-Kui (Nokia-NRC/Tampere)
>>>>Cc: daniele@bsoft.info; avt@ietf.org
>>>>Subject: Re: [AVT] RTP streaming and adaptation to AVC of an SVC
>>>>temporal scalable bitstream - Packet loss
>>>>
>>>>On 18 Aug 2007, at 18:36, <Ye-Kui.Wang@nokia.com> wrote:
>>>>> In your example the SVCtoAVC adapter is an RTP mixer, which terminates
>>>>> the RTP session between the sender and itself and restarts another RTP
>>>>> session between itself and the receiver.
>>>>> Therefore the RTP sequence number needs to be updated for the base
>>>>> layer packets anyway.
>>>>
>>>>If the SVCtoAVC adapter is a transcoder from an SVC stream to an AVC
>>>>stream, it will be an RTP translator, not an RTP mixer.
>>>>Neither an RTP translator nor an RTP mixer terminates the RTP session.
>>>>RFC 3550 and draft-ietf-avt-topologies-06.txt discuss this in more detail.
>>>>
>>>>--
>>>>Colin Perkins
>>>>http://csperkins.org/

_______________________________________________
Audio/Video Transport Working Group
avt@ietf.org
https://www1.ietf.org/mailman/listinfo/avt
