Re: [AVTCORE] draft-ietf-avtcore-multi-party-rtt-mix Issue 1: transport

Gunnar Hellström <> Sat, 16 May 2020 07:07 UTC

To: Dan Mongrain <>, Brian Rosen <>
From: Gunnar Hellström <>
Date: Sat, 16 May 2020 09:07:17 +0200
Subject: Re: [AVTCORE] draft-ietf-avtcore-multi-party-rtt-mix Issue 1: transport

Thanks, Brian and Dan, for good points.

To Brian: Yes, text is more sensitive to dropouts than voice. That is
why we added two redundant transmissions at 300 ms intervals to
RTP-based RTT. That provides an enormous increase in reliability
compared to sending once. If you have an evenly distributed packet loss
of 2%, all three transmissions of a piece of text must be lost for the
text to be lost (0.02^3 = 0.000008), so the loss of characters will be
on the order of one per 100 000, which is far better than the sources
you have for text communication: typing or voice to text. Because the
packets are sent at 300 ms intervals, the text will also survive all
packet loss bursts of up to 599 ms and many of up to 899 ms. And if you
get a text loss, you get an indication. So it is only longer burst loss
that can cause real text loss that is noticeable compared to human
errors. And in such conditions, voice users might experience a gap that
makes them say "sorry, there was a dropout, what did you say". So I
think the current method with RTP and redundancy meets the initial
requirements well to serve as a human conversation medium.
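
For anyone who wants to check the figures above, here is a minimal
sketch. It assumes independent packet loss, one original plus two
redundant transmissions, and a loss burst modelled as a closed time
interval in whole milliseconds. The function names are my own and are
only for illustration; they are not taken from any specification.

```python
def char_loss_rate(packet_loss: float, transmissions: int = 3) -> float:
    """Probability that every copy of a piece of text is lost.

    Assumes each transmission is lost independently with the same probability.
    """
    return packet_loss ** transmissions


def survives_burst(burst_start_ms: int, burst_len_ms: int,
                   interval_ms: int = 300, transmissions: int = 3) -> bool:
    """True if at least one copy of text first sent at t=0 escapes the burst.

    The burst is modelled as the closed interval
    [burst_start_ms, burst_start_ms + burst_len_ms] (an assumption).
    """
    burst_end_ms = burst_start_ms + burst_len_ms
    send_times = [i * interval_ms for i in range(transmissions)]
    return any(not (burst_start_ms <= t <= burst_end_ms) for t in send_times)


def always_survives(burst_len_ms: int, interval_ms: int = 300,
                    transmissions: int = 3) -> bool:
    """True if the text survives the burst at every millisecond alignment."""
    last_send_ms = (transmissions - 1) * interval_ms
    return all(
        survives_burst(start, burst_len_ms, interval_ms, transmissions)
        for start in range(-burst_len_ms, last_send_ms + 1)
    )


if __name__ == "__main__":
    # 2% loss, three transmissions: how many characters per lost character?
    print(round(1 / char_loss_rate(0.02)))
    # The same with one original and three redundant transmissions.
    print(round(1 / char_loss_rate(0.02, transmissions=4)))
    print(always_survives(599), always_survives(600))
    # An 899 ms burst is survived at some alignments but not at others.
    print(survives_burst(1, 899), survives_burst(0, 899))
```

Under these assumptions the 2% case comes out at about one lost piece
of text per 125 000, which is in line with the rough figure above.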

You will only have problems in one of the following cases:

1) you are so used to the reliability of text messaging that you will 
have problems accepting the relative reliability of this RTT transmission.

2) we let important protocol elements be transported with the same
mechanism, so they can be lost in transmission. That is the case with
the "multi-party unaware mixing" included as one option in the current
draft for use when you are in contact with a legacy endpoint. One
example is at source switch, when a line separator and a label are sent
inline before text from another source is sent. If any part of these is
lost, it causes much more confusion than text loss in human-created
text. That is one reason why the multi-party aware mixing is the primary
one, and the multi-party unaware mixing comes with a list of warnings
about functional limitations.
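
To illustrate case 2, here is a rough sketch of a mixer inserting a
line separator and a label inline at each source switch, and of what
the receiver sees if that inline element is lost. The "[Name] " label
format and the function name are assumptions made for this example, and
U+2028 is used as the line separator; the draft itself should be
consulted for the actual presentation rules.

```python
from typing import Iterable, List, Tuple

LINE_SEPARATOR = "\u2028"  # Unicode LINE SEPARATOR, assumed here as the T.140 new-line


def mix_unaware(chunks: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn (source, text) pairs into inline fragments for a legacy endpoint.

    At every source switch a line separator and a label are emitted as one
    inline fragment before the text from the new source.
    """
    fragments: List[str] = []
    current = None
    for source, text in chunks:
        if source != current:
            prefix = LINE_SEPARATOR if current is not None else ""
            fragments.append(f"{prefix}[{source}] ")  # hypothetical label format
            current = source
        fragments.append(text)
    return fragments


if __name__ == "__main__":
    fragments = mix_unaware([("Alice", "Hello"), ("Bob", "Hi there"), ("Bob", "!")])
    print(repr("".join(fragments)))
    # Simulate loss of the inline switch element: Bob's text now appears
    # on Alice's line with nothing to show that the source changed.
    damaged = fragments[:2] + fragments[3:]
    print(repr("".join(damaged)))
```

The second output shows why losing a few inline characters here is
worse than losing a few characters of typed text.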

However, text messaging also has its moments of confusion when something 
goes wrong.

We already have one reliable transport for RTT, in the recently approved 
draft-ietf-mmusic-t140-usage-data-channel. So, my conclusion was to not 
introduce more options by specifying yet another reliable transport, and 
instead move on with good multi-party support also for RTP based RTT 
transmission with redundancy.
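
As a reminder of how the redundancy works, here is a simplified sketch
of the idea, not of the real RTP payload format. Each packet carries
the new text plus the text of the two previous packets, and the
receiver uses sequence numbers to fill gaps and to mark text that could
not be recovered. The class and function names are made up for this
example, and U+FFFD is assumed as the missing-text marker.

```python
from dataclasses import dataclass
from typing import List, Optional

MISSING_TEXT_MARKER = "\ufffd"  # assumed marker shown where text was lost


@dataclass
class Packet:
    seq: int
    generations: List[str]  # oldest redundant text first, new (primary) text last


def build_packets(chunks: List[str], redundancy: int = 2) -> List[Packet]:
    """One packet per transmission interval, each repeating earlier chunks."""
    # Trailing empty chunks let the last real text get its redundant copies.
    padded = chunks + [""] * redundancy
    return [
        Packet(seq, padded[max(0, seq - redundancy):seq + 1])
        for seq in range(len(padded))
    ]


def receive(packets: List[Packet], total: int) -> List[Optional[str]]:
    """Rebuild the chunk sequence from whatever packets arrived."""
    received: List[Optional[str]] = [None] * total
    for packet in packets:
        first = packet.seq - len(packet.generations) + 1
        for offset, text in enumerate(packet.generations):
            if received[first + offset] is None:
                received[first + offset] = text
    return received


def render(received: List[Optional[str]]) -> str:
    """Join recovered text, marking anything that could not be recovered."""
    return "".join(t if t is not None else MISSING_TEXT_MARKER for t in received)


if __name__ == "__main__":
    packets = build_packets(["He", "ll", "o ", "wo", "rld"])
    two_lost = [p for p in packets if p.seq not in {1, 2}]
    three_lost = [p for p in packets if p.seq not in {1, 2, 3}]
    print(render(receive(two_lost, len(packets))))    # all text recovered
    print(render(receive(three_lost, len(packets))))  # one chunk marked missing
```

With two consecutive packets lost, everything is recovered from the
redundancy. With three lost in a row, one chunk is gone and the
receiver can show an indication in its place.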

Dan, good point about the need to get through SBCs without much 

Would the WebRTC data channel also get through? It is based on the 



On 2020-05-16 at 05:20, Dan Mongrain wrote:
> Please keep UDP-based RTT conveyance, even for conference-aware 
> endpoints. While I do not mind if we were to add a reliable conveyance 
> protocol as an option, we must keep the UDP-based option for RTT. The 
> reason is that SBCs out there currently treat RTT as voice traffic and 
> allow it through. If we were to insist that conference-aware endpoints 
> only support a reliable protocol then the SBCs will need to be updated 
> in order for RTT to work. This is not a trivial change, especially 
> since the initial market for RTT is Public Safety, which is a small 
> market for SBC vendors and thus does not react quickly to these types of 
> changes.
> Thanx,
> Dan
> *Dan Mongrain, eng.
> *Principal Engineer, Standards
> *Motorola Solutions*
> *o*: +1.819.931.2129
> *m*: +1.613.558.0764
> On Fri, May 15, 2020 at 5:34 PM Brian Rosen wrote:
>     Ah, the two cases, multi-party aware and multi-party unaware,
>     slipped my mind.
>     I would say if we’re defining multi-party aware UAs, which is by
>     definition, new code, that we’re better off with a truly reliable
>     transport.  Keeping “repeat it enough times that it ought to work”
>     doesn’t strike me as the best choice given implementations must
>     change.  That was a hack.  It’s an okay hack, but we have the
>     problem that this is very high information density per bit, so
>     it’s quite a bit worse to lose information with RTT than with
>     audio or video.  Your mind can fill in some missing character
>     blanks, but not as well as it can deal with missing audio and
>     video packets.
>     Brian
>>     On May 15, 2020, at 2:29 PM, Gunnar Hellström wrote:
>>     Hi Brian,
>>     On 2020-05-15 at 17:03, Brian Rosen wrote:
>>>     I think we have to consider who has to do what.
>>>     If we are requiring all implementations to change because of
>>>     other multi-party issues, then I think we should use an actual
>>>     reliable protocol, and not just a “repeat enough times that the
>>>     probability it gets there is high enough”.
>>>     If we aren’t asking all implementations to change for
>>>     multi-party, but only the mixer, then I think that we’re
>>>     sticking with T.140.
>>>     We’re in the latter case, right?  The point of this work is
>>>     don’t change the endpoints, only the conference bridge.
>>     We are in both cases. And I hope you agree we should be.  And
>>     this is in general, not only for the transport. Both cases are
>>     there in the current draft, draft-ietf-avtcore-multi-party-rtt-mix-01:
>>     1) A mechanism that the mixer can use when it is revealed that
>>     the endpoint does not support proper multi-party presentation.
>>     There are functional limitations, but it works reasonably well,
>>     especially for few parties taking turns in reasonably good order.
>>     It is specified in section 13.2 and is called "Multi-party mixing
>>     for multi-party unaware endpoints".  You can check the functional
>>     limitations at the end of that section and tell if you agree that
>>     we also need something better.
>>     2) A mechanism to use when both the mixer and the endpoint can
>>     handle fully functional multi-party presentation of text. That
>>     requires active action by the endpoint to place received text in
>>     areas for each participant, and present them in a suitable way,
>>     both providing a good real-time impression, an impression of
>>     approximately when in time order the text entries were produced,
>>     and a collection of text from each participant in suitable
>>     chunks, phrases, sentences or messages, with source information
>>     attached. The latest draft has a format for multi-party transport
>>     that allows up to 16 sources per packet, and can by that provide
>>     text from about 32 simultaneously typing participants without
>>     introducing unacceptable delay. Earlier versions of the draft had
>>     different and much lower performance.  I am glad that the new
>>     format will not be the bottleneck for a good RTT multi-party
>>     experience.
>>     It would have been possible to specify another transport for
>>     mechanism 2), but my reasoning ended up in the same place as before:
>>     RTP with RFC2198-type redundancy, with one original and two
>>     redundant transmissions, most often 300 ms apart. You can see the
>>     format as 16-tuple-RFC-4103. Do you agree with this conclusion for
>>     case 2?
>>     Thanks,
>>     Gunnar
>>>     Brian
>>>>     On May 14, 2020, at 11:01 AM, Gunnar Hellström wrote:
>>>>     I have concluded that only two of the discussed transports are
>>>>     realistic.
>>>>     Comments below
>>>>     On 2020-05-11 at 12:22, Gunnar Hellström wrote:
>>>>>     In a recent e-mail, I listed 9 issues to act on in
>>>>>     draft-ietf-avtcore-multi-party-rtt-mix-00
>>>>>     I want to deal with them one by one or in small groups. Here
>>>>>     is number 1:
>>>>>     1. Consider rapidly if there is any more reliable transport
>>>>>     that is feasible to move to.
>>>>>>     (e.g. Comedia RFC 4145 and RFC 4572, or the recently approved
>>>>>>     WebRTC t140 data channel
>>>>>>     draft-ietf-mmusic-t140-usage-data-channel, or use of SAVPF
>>>>>>     with NACK and retransmission RFC 4588)
>>>>>     It may look strange with this issue after many months as an
>>>>>     individual draft. But I want to touch it anyway before we move
>>>>>     on in one fixed direction.
>>>>>     T.140 and its RTP transport (RFC 2793, later RFC 4103) were
>>>>>     created in 1998-2000 as the third real-time medium for human
>>>>>     conversations besides voice and video. The idea was to give
>>>>>     equal opportunities to persons wanting to communicate by text
>>>>>     as the ones who use voice or video. That means real-time
>>>>>     transmission while text is created and accepting some rare
>>>>>     dropouts just as we do with voice and video. However, users
>>>>>     are nowadays used to text messaging where it is customary to
>>>>>     accept a delay and get the text complete in most cases, rather
>>>>>     than to have loss. That user experience might be expected from
>>>>>     real-time text as well. I do not have any strong user
>>>>>     indications that this is the case, it is just my own thinking.
>>>>>     The reason to bring this up now, is that we seem to need to
>>>>>     introduce the multi-party mixed format at least as a new text
>>>>>     media subtype, text/rex instead of text/red. Then we are
>>>>>     anyway introducing signaling complexity of similar kind that
>>>>>     another transport will do.
>>>>>     Are any of the initially mentioned more reliable transports
>>>>>     realistic and easily implemented in the target implementation
>>>>>     environments: NG emergency services, 3GPP IMS MTSI, IETF RUM,
>>>>>     and plain SIP multimedia? Or are there any other not mentioned?
>>>>>     When considering this, we should have in mind that the
>>>>>     proposed transport should be with security so that we do not
>>>>>     need to introduce more options to negotiate between.
>>>>>     And we shall also keep in mind that NAT traversal needs to be
>>>>>     supported as well as multi-party-signaling through the SIP
>>>>>     central conferencing model RFC 4353.
>>>>>     Another complexity is that current regulation requires RFC
>>>>>     4103 and it would be best that the finally specified
>>>>>     multi-party solution can be perceived as an extension to RFC 4103.
>>>>>     What can be said about the options?
>>>>>      1. Comedia RFC 4145 and RFC 4572. Makes use of TLS for
>>>>>     transport, so it is secured. Should use RFC 6544 ICE for TCP
>>>>>     for NAT traversal. Requires specification of how to arrange
>>>>>     the streams and code the sources in the multi-party
>>>>>     environment. I do not know how well these RFCs are supported
>>>>>     in the target environments. Seems to increase complexity.
>>>>     --Increases complexity - not selected
>>>>>     2. draft-ietf-mmusic-t140-usage-data-channel. Has security,
>>>>>     NAT traversal and possibility to code multi-party source. Has
>>>>>     good opportunity for being supported in endpoint devices,
>>>>>     because all of them are expected to support WebRTC. Maybe less
>>>>>     supported in traditional SIP bridges.
>>>>     --A realistic solution. The base is already approved and is for
>>>>     a popular environment. Multi-party is briefly mentioned but
>>>>     should probably be a bit further specified. Should however not
>>>>     be the only solution. The RTP based solution is also needed.
>>>>>     3. SAVPF with NACK and RFC 4588 retransmission. I assume this
>>>>>     can be combined with OSRTP RFC 8643 for security negotiation.
>>>>>     When the immediate or early feedback option can be used, this
>>>>>     method can likely be used without redundancy to achieve a
>>>>>     reliability enhancement. That will not work well over networks
>>>>>     with high latency. Further study needed if redundancy or FEC
>>>>>     is needed as complement for high latency networks. Easy to
>>>>>     achieve up to 5 simultaneously sending users.
>>>>     --Increases complexity - not selected
>>>>>     4. (Not mentioned in the introduction above) Use RFC 4103 plus
>>>>>     one of the RTP based methods for multi-party source indication
>>>>>     but just increase redundancy to one original and three
>>>>>     (instead of two) redundant generations. Can easily be done if
>>>>>     reliability increase is really a concern. Has low overhead.
>>>>>     Easily applicable to OSRTP security, SIP conferencing model
>>>>>     and ICE NAT traversal.
>>>>     --Easily done by local recommendations if 3 generations
>>>>     redundancy (including the original) would not be felt
>>>>     sufficient somewhere.
>>>>>     5. Accept reliability that is quite good as it is with RTP
>>>>>     with one original and two redundant generations in the RFC
>>>>>     2198 - style ( with one of the additional methods discussed
>>>>>     for increasing switching performance)
>>>>     --Realistic and regarded sufficient. By moving to a mixer method
>>>>     allowing 300 ms transmission interval, the protection against
>>>>     bursty packet loss is quite good. Continue on this track.
>>>>     The conclusion is reflected in version -01 of the draft, just
>>>>     published.
>>>>     Regards
>>>>     Gunnar
>>>>>     Comments please so we can take a rapid decision and move on
>>>>>     with one solution.
>>>>>     Regards
>>>>>     Gunnar
>>>>     --
>>>>     Gunnar Hellström
>>>>     GHAccess
>>>>     _______________________________________________
>>>>     Audio/Video Transport Core Maintenance
>>     -- 
>>     Gunnar Hellström
>>     GHAccess
Gunnar Hellström