Re: [rtcweb] [tsvwg] Diffserv QoS for Video

Tue, 10 May 2016 06:57 UTC


Hi Paul,

I think we agree that audio and video frames, if both are part of the same (interactive) media flow, should be transported by the same PHB [PJ] or the same queue [RG]. The latter is ensured if the same PHB is picked for audio and video. To me, the text of the draft so far doesn't express that both audio and video are supposed to use an "Interactive Video..." PHB if both are present. I'd prefer to have text with a non-binding standard requirement saying:

     However, if the application wishes to send both interactive 
     video and audio, it is RECOMMENDED to transport audio 
     and video packets by the same per hop behavior. For example, 
     audio and video packets would both be marked as AF42 or

I don't insist on descriptive text proposing to transport audio by an AF4 PHB offering a lower drop ratio than that used to transport video. My audio/video experts do support that approach, however, and I'm pretty sure that Cisco representatives have also mentioned that audio quality ranks above video quality in telepresence sessions.
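
As a rough illustration of the point above (not text proposed for the draft): the AF codepoints defined in RFC 2597 carry the class and the drop precedence in separate bits, so audio and video can sit in the same AF class, and hence typically the same queue, while still differing in drop precedence. The helper names below (`af_class`, `drop_precedence`, `same_af_class`) are hypothetical and exist only for this sketch.

```python
# AF codepoints per RFC 2597: DSCP = (class << 3) | (drop_precedence << 1).
# For example AF42 -> (4 << 3) | (2 << 1) = 36.
AF_DSCP = {
    f"AF{cls}{drop}": (cls << 3) | (drop << 1)
    for cls in range(1, 5)      # AF classes 1..4
    for drop in range(1, 4)     # drop precedence 1 (low) .. 3 (high)
}


def af_class(name):
    """Return the AF class (1..4) of an AF codepoint name such as 'AF42'."""
    return AF_DSCP[name] >> 3


def drop_precedence(name):
    """Return the drop precedence (1..3); a higher value is dropped first."""
    return (AF_DSCP[name] >> 1) & 0x3


def same_af_class(a, b):
    """True if both codepoints belong to the same AF class."""
    return af_class(a) == af_class(b)


# Audio as AF41 and video as AF42 share class AF4, but audio gets the
# lower drop precedence.
print(same_af_class("AF41", "AF42"), drop_precedence("AF41"), drop_precedence("AF42"))
```

EF (DSCP 46) is deliberately absent from the table: it is a separate PHB, not a member of any AF class.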



-----Original Message-----
From: Paul E. Jones [] 
Sent: Monday, 9 May 2016 21:03
To: Geib, Rüdiger
Subject: Re: AW: [tsvwg] Diffserv QoS for Video


Perhaps an example might be helpful.  How about I add this text for illustrative purposes?

     To illustrate the use of the above table, let us assume the
     application assigns a priority of "medium" to audio and video
     flows.  Given that assumption, if the application wishes to send
     only audio then packets would be marked EF.  However, if the
     application wishes to send both interactive video and audio,
     then audio and video packets would both be marked as AF42 or
     AF43.  The intent is to ensure that when both audio and video
     are being sent together that they receive similar per-hop

This doesn't get into the preference for AF42 vs. AF43. If it were me, I'd mark all audio as AF42 and only key video frames as AF42.  All predictive frames would be sent with an AF43 marking.  I might even take it a step further and classify all audio as "high".  However, I've seen a tremendous amount of debate on this before, so I'd prefer not to go too far in dictating audio markings vs. video.  I do think most people generally agree about at least ensuring the class is the same; otherwise, wildly different PHBs introduce skew between A/V packet arrival, thus inflating the size of the buffers managing the A/V streams.  However, we do not want to dictate that audio should be treated significantly better than video.  For deaf users, for example, the audio really isn't important at all.  That is perhaps an extreme example, but it nonetheless highlights why we should be cautious about exactly what we normatively mandate.
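
A minimal sketch of the marking preference described above, assuming "medium" priority flows: audio alone is marked EF; when interactive video is also sent, audio and key video frames are marked AF42 and predictive frames AF43. The function names (`choose_marking`, `tos_byte`, `apply_marking`) are hypothetical, and this is one personal preference from the discussion, not what the draft mandates. The DSCP values are the standard ones (EF = 46, AF42 = 36, AF43 = 38). `apply_marking` assumes a platform that exposes and honors `IP_TOS` on IPv4 sockets.

```python
import socket

# Standard DSCP codepoints (EF: RFC 3246, AF: RFC 2597).
DSCP = {"EF": 46, "AF42": 36, "AF43": 38}


def choose_marking(media, with_video=False, key_frame=False):
    """Pick a PHB name for a packet of a 'medium' priority flow.

    media      -- "audio" or "video"
    with_video -- True if the application also sends interactive video
    key_frame  -- for video only: True for key frames, False for predictive
    """
    if media == "audio":
        # Audio alone is EF; with video it joins the video's AF4 class.
        return "AF42" if with_video else "EF"
    if media == "video":
        return "AF42" if key_frame else "AF43"
    raise ValueError(f"unknown media type: {media!r}")


def tos_byte(phb):
    """DSCP occupies the upper six bits of the former IPv4 TOS byte."""
    return DSCP[phb] << 2


def apply_marking(sock, phb):
    """Assumption: the OS allows setting IP_TOS on this IPv4 socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(phb))
```

For example, `choose_marking("audio")` gives `"EF"`, while `choose_marking("audio", with_video=True)` and `choose_marking("video", key_frame=True)` both give `"AF42"`, so the audio and the key frames share one class and one drop precedence.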


------ Original Message ------
Sent: 5/9/2016 3:34:25 AM
Subject: AW: [tsvwg] Diffserv QoS for Video

>Hi Paul,
>I've talked with audio/video experts of Deutsche Telekom and they too 
>favored what you recommend below: transport audio and video by the same 
>queue. Your statement below however stops there and the draft text 
>doesn't clarify the issue:
>If there's interactive video with audio, then they both should be 
>marked for the same PHB which is:
>- EF ?
>- AF4? Like AF41 Audio, AF42 Video (AF43 in addition, if P or B frames 
>are to receive a lower priority /
>   higher drop precedence PHB)?
>I personally prefer AF4 if audio and video are to be transported in the 
>same queue.
>I'd also ask for the draft text to be clear about when to mark audio 
>with the EF PHB. My understanding after reading your statement below 
>is: audio is marked EF only if there's no video flow.
>BC>Finally, why is audio not also subdivided into interactive and 
>BC>non-interactive? As far as I can see, both are logically possible.
>[PJ] For WebRTC, audio alone is "interactive" in nature (which is
>   why it's marked EF).  However, if one is sending audio and video
>   it makes sense to mark them both the same way to get the same PHB and
>   hopefully have them queued in the same buffers along the path.
>   Sending audio as EF and possibly having a PHB that results in
>   packets arriving much faster than corresponding video packets
>   marked as AF42 is not at all helpful for applications that have
>   to synchronize the audio and video flows.
>   Paul