Re: [AVTCORE] draft-ietf-avtcore-multi-party-rtt-mix Issue 1: transport

Gunnar Hellström <> Sat, 16 May 2020 18:13 UTC

To: Paul Kyzivat
From: Gunnar Hellström
Date: Sat, 16 May 2020 20:13:30 +0200

On 2020-05-16 at 16:41, Paul Kyzivat wrote:
> On 5/15/20 5:33 PM, Brian Rosen wrote:
>> Ah,  two cases, multiparty aware and multi party unaware slipped my 
>> mind.
>> I would say if we’re defining multi-party aware UAs, which is by 
>> definition, new code, that we’re better off with a truly reliable 
>> transport.  Keeping “repeat it enough times that it ought to work” 
>> doesn’t strike me as the best choice given implementations must 
>> change.   That was a hack.  It’s an okay hack, but we have the 
>> problem that this is very high information density per bit, so it’s 
>> quite a bit worse to lose information with RTT than with audio or 
>> video.  Your mind can fill in some missing character blanks, but not 
>> as well as it can deal with missing audio and video packets.
> I agree with you in theory. But going that way has some problems in 
> practice. Notably, the multiparty aware and unaware implementations 
> need to coexist in order to be deployed. If they use the same 
> transport and the differences can be negotiated via O/A of SDP 
> parameters then the coexistence is easy. But with differences in 
> transport the O/A mechanism for discovering the best mechanism 
> supported by both ends becomes messy.

Yes, I agree. Initially I hoped that we could define the multi-party 
capability with just a media attribute and still call the format 
"text/red". But that gave poor source switching performance, about 
two sources per second. The new media subtype "text/rex" can be seen 
as up to 16 repetitions of the "text/red" format in each packet. With 
it I am sure we have sufficient performance, and negotiation is still 
easy to handle: just one more payload type in the same m-line.
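
To illustrate why one more payload type in the same m-line keeps 
negotiation simple, here is a minimal Python sketch of how an answerer 
could pick the text format. The preference order, the payload type 
numbers and the function name are assumptions made for the example, 
not taken from the draft, and a real answer may list more than one 
payload type.

```python
# Preference order is an assumption for illustration: richest format first.
PREFERENCE = ["rex", "red", "t140"]


def choose_text_format(offered, supported):
    """Return (payload_type, encoding) for the best common text format.

    offered   -- dict mapping payload type number -> encoding name, as it
                 would be read from the a=rtpmap lines of the text m-line
    supported -- set of encoding names the answerer implements
    """
    by_name = {name.lower(): pt for pt, name in offered.items()}
    for name in PREFERENCE:
        if name in by_name and name in supported:
            return by_name[name], name
    return None  # no common text format in this m-line


# Hypothetical offer; the payload type numbers are made up.
offer = {98: "t140", 100: "red", 101: "rex"}

aware = choose_text_format(offer, {"t140", "red", "rex"})
unaware = choose_text_format(offer, {"t140", "red"})
print(aware, unaware)
```

A multi-party aware answerer ends up with "rex", while an unaware one 
falls back to "red" from the same offer, with no second m-line involved.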

The next thing to include is security negotiation. I have recommended 
OSRTP (RFC 8643) as the way to select between no security and SRTP 
with DTLS, unless the application has a reason to specify something 
else. I have not yet included an SDP example, but I imagine it is 
manageable. It is also more realistic than having to negotiate between 
two media lines with different transports, with or without security, 
e.g. by means of SDP grouping and RFC 5939.
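
As a rough illustration of the OSRTP idea, the Python sketch below 
decides the outcome from a simplified view of offer and answer. The 
dict layout and the function name are hypothetical, and the logic is 
only my simplified reading of RFC 8643: the offer keeps a plain RTP 
profile but carries keying attributes, and media is encrypted only if 
the answer carries matching keying attributes as well.

```python
# SDP attribute names that carry keying material (DTLS-SRTP, SDES, MIKEY).
KEYING = {"fingerprint", "crypto", "key-mgmt"}


def osrtp_outcome(offer, answer):
    """Return "SRTP", "RTP" or "not an OSRTP offer".

    offer / answer -- dicts with "profile" (str) and "attributes" (set of
    attribute names in the media section). Hypothetical, simplified shape.
    """
    offered_keys = offer["attributes"] & KEYING
    answered_keys = answer["attributes"] & KEYING
    if offer["profile"] not in ("RTP/AVP", "RTP/AVPF") or not offered_keys:
        return "not an OSRTP offer"
    if offered_keys & answered_keys:
        return "SRTP"  # answerer accepted one of the offered keying methods
    return "RTP"       # answerer ignored the keying attributes


offer = {"profile": "RTP/AVP", "attributes": {"rtpmap", "fingerprint", "setup"}}
secure_answer = {"profile": "RTP/AVP", "attributes": {"rtpmap", "fingerprint", "setup"}}
legacy_answer = {"profile": "RTP/AVP", "attributes": {"rtpmap"}}

print(osrtp_outcome(offer, secure_answer), osrtp_outcome(offer, legacy_answer))
```

The point is that both outcomes come from one media line, so no extra 
grouping or capability negotiation is needed.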

It would be interesting to have comments on that approach as well.
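
For reference, the redundancy argument from the thread quoted below 
(one original plus two redundant generations, sent 300 ms apart) can 
be put in numbers. This is a minimal Python sketch. The function names 
are mine, and the first function assumes independent packet loss, 
which real networks only approximate.

```python
def loss_probability(p, generations=3):
    """Probability that a piece of text is lost, for independent loss rate p.

    The text travels in `generations` consecutive packets (one original
    plus redundant copies) and is lost only if all of them are lost.
    """
    return p ** generations


def outage_bound_ms(interval_ms=300, generations=3):
    """Any outage shorter than this many ms loses no text.

    Such an outage can hit at most generations - 1 consecutive packets,
    so at least one copy of every piece of text still arrives.
    """
    return (generations - 1) * interval_ms


print(loss_probability(0.1))     # about 0.001 for 10 % packet loss
print(outage_bound_ms())         # three generations, 300 ms apart
print(outage_bound_ms(300, 4))   # one original plus three redundant
```

With three generations at 300 ms, any outage shorter than 600 ms is 
covered. A fourth generation, as in option 4 of the quoted list, would 
raise that to 900 ms.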



>     Thanks,
>     Paul
>> Brian
>>> On May 15, 2020, at 2:29 PM, Gunnar Hellström wrote:
>>> Hi Brian,
>>> On 2020-05-15 at 17:03, Brian Rosen wrote:
>>>> I think we have to consider who has to do what.
>>>> If we are requiring all implementations to change because of other 
>>>> multi-party issues, then I think we should use an actual reliable 
>>>> protocol, and not just a “repeat enough times that the probability 
>>>> it gets there is high enough” approach.
>>>> If we aren’t asking all implementations to change for multi-party, 
>>>> but only the mixer, then I think that we’re sticking with T.140.
>>>> We’re in the latter case, right?  The point of this work is don’t 
>>>> change the endpoints, only the conference bridge.
>>> We are in both cases. And I hope you agree we should be.  And this 
>>> is in general, not only for the transport. Both cases are there in 
>>> the current draft draft-ietf-avtcore-multi-party-rtt-mix-01
>>> 1) A mechanism that the mixer can use when it is revealed that the 
>>> endpoint does not support proper multi-party presentation. There are 
>>> functional limitations, but it works reasonably well, especially for 
>>> few parties taking turns in reasonably good order. It is specified 
>>> in section 13.2 and is called "Multi-party mixing for multi-party 
>>> unaware endpoints".  You can check the functional limitations at the 
>>> end of that section and tell if you agree that we also need 
>>> something better.
>>> 2) A mechanism to use when both the mixer and the endpoint can 
>>> handle fully functional multi-party presentation of text. That 
>>> requires active action by the endpoint to place received text in 
>>> areas for each participant, and present them in a suitable way, both 
>>> providing a good real-time impression, an impression of 
>>> approximately when in time order the text entries were produced, and 
>>> a collection of text from each participant in suitable chunks, 
>>> phrases, sentences or messages, with source information attached. 
>>> The latest draft has a format for multi-party transport that allows 
>>> up to 16 sources per packet, and can by that provide text from about 
>>> 32 simultaneously typing participants without introducing 
>>> unacceptable delay. Earlier versions of the draft had different and 
>>> much lower performance.  I am glad that the new format will not be 
>>> the bottleneck for a good RTT multi-party experience.
>>> It would have been possible to specify another transport for 
>>> mechanism 2), but my reasoning ended up in the same place as before: 
>>> RTP with RFC 2198-type redundancy, with one original and two 
>>> redundant transmissions, most often 300 ms apart. You can see the 
>>> format as 16-tuple RFC 4103. Do you agree with this conclusion for 
>>> case 2?
>>> Thanks,
>>> Gunnar
>>>> Brian
>>>>> On May 14, 2020, at 11:01 AM, Gunnar Hellström wrote:
>>>>> I have concluded that only two of the discussed transports are 
>>>>> realistic.
>>>>> Comments below
>>>>> On 2020-05-11 at 12:22, Gunnar Hellström wrote:
>>>>>> In a recent e-mail, I listed 9 issues to act on in 
>>>>>> draft-ietf-avtcore-multi-party-rtt-mix-00
>>>>>> I want to deal with them one by one or in small groups. Here is 
>>>>>> number 1:
>>>>>> 1. Consider rapidly if there is any more reliable transport that 
>>>>>> is feasible to move to.
>>>>>>> (e.g. Comedia RFC 4145 and RFC 4572, or the recently approved 
>>>>>>> WebRTC T.140 data channel 
>>>>>>> draft-ietf-mmusic-t140-usage-data-channel, or use of SAVPF with 
>>>>>>> NACK and retransmission RFC 4588)
>>>>>> It may look strange to raise this issue after many months as an 
>>>>>> individual draft. But I want to touch on it anyway before we move 
>>>>>> on in one fixed direction.
>>>>>> T.140 and its RTP transport (RFC 2793, later RFC 4103) were 
>>>>>> created in 1998-2000 as the third real-time medium for human 
>>>>>> conversations beside voice and video. The idea was to give equal 
>>>>>> opportunities to persons wanting to communicate by text as the 
>>>>>> ones who use voice or video. That means real-time transmission 
>>>>>> while text is created and accepting some rare dropouts just as we 
>>>>>> do with voice and video. However, users are nowadays used to text 
>>>>>> messaging where it is customary to accept a delay and get the 
>>>>>> text complete in most cases, rather than to have loss. That user 
>>>>>> experience might be expected from real-time text as well. I do 
>>>>>> not have any strong user indications that this is the case, it is 
>>>>>> just my own thinking.
>>>>>> The reason to bring this up now, is that we seem to need to 
>>>>>> introduce the multi-party mixed format at least as a new text 
>>>>>> media subtype, text/rex instead of text/red. Then we are anyway 
>>>>>> introducing signaling complexity of a similar kind to what another 
>>>>>> transport would bring.
>>>>>> Are any of the initially mentioned more reliable transports 
>>>>>> realistic and easily implemented in the target implementation 
>>>>>> environments: NG emergency services, 3GPP IMS MTSI, IETF RUM, and 
>>>>>> plain SIP multimedia? Or are there any other not mentioned?
>>>>>> When considering this, we should have in mind that the proposed 
>>>>>> transport should be with security so that we do not need to 
>>>>>> introduce more options to negotiate between.
>>>>>> And we shall also keep in mind that NAT traversal needs to be 
>>>>>> supported as well as multi-party-signaling through the SIP 
>>>>>> central conferencing model RFC 4353.
>>>>>> Another complexity is that current regulation requires RFC 4103 
>>>>>> and it would be best if the finally specified multi-party 
>>>>>> solution can be perceived as an extension to RFC 4103.
>>>>>> What can be said about the options?
>>>>>>  1. Comedia RFC 4145 and RFC 4572. Makes use of TLS for 
>>>>>> transport, so it is secured. Should use RFC 6544 ICE for TCP for 
>>>>>> NAT traversal. Requires specification of how to arrange the 
>>>>>> streams and code the sources in the multi-party environment. I do 
>>>>>> not know how well these RFCs are supported in the target 
>>>>>> environments. Seems to increase complexity.
>>>>> --Increases complexity - not selected
>>>>>> 2. draft-ietf-mmusic-t140-usage-data-channel. Has security, NAT 
>>>>>> traversal and possibility to code multi-party source. Has good 
>>>>>> opportunity for being supported in endpoint devices, because all 
>>>>>> of them are expected to support WebRTC. Maybe less supported in 
>>>>>> traditional SIP bridges.
>>>>> --A realistic solution. The base is already approved and is for a 
>>>>> popular environment. Multi-party is briefly mentioned but should 
>>>>> probably be a bit further specified. Should however not be the 
>>>>> only solution. The RTP based solution is also needed.
>>>>>> 3. SAVPF with NACK and RFC 4588 retransmission. I assume this can 
>>>>>> be combined with OSRTP RFC 8643 for security negotiation. When 
>>>>>> the immediate or early feedback option can be used, this method 
>>>>>> can likely be used without redundancy to achieve a reliability 
>>>>>> enhancement. That will not work well over networks with high 
>>>>>> latency. Further study needed if redundancy or FEC is needed as 
>>>>>> complement for high latency networks. Easy to achieve up to 5 
>>>>>> simultaneously sending users.
>>>>> --Increases complexity - not selected
>>>>>> 4. (Not mentioned in the introduction above) Use RFC 4103 plus 
>>>>>> one of the RTP based methods for multi-party source indication 
>>>>>> but just increase redundancy to one original and three (instead 
>>>>>> of two) redundant generations. Can easily be done if reliability 
>>>>>> increase is really a concern. Has low overhead. Easily applicable 
>>>>>> to OSRTP security, SIP conferencing model and ICE NAT traversal.
>>>>> --Easily done by local recommendations if 3 generations redundancy 
>>>>> (including the original) would not be felt sufficient somewhere.
>>>>>> 5. Accept reliability that is quite good as it is with RTP with 
>>>>>> one original and two redundant generations in the RFC 2198 
>>>>>> style (with one of the additional methods discussed for 
>>>>>> increasing switching performance)
>>>>> --Realistic and regarded as sufficient. By moving to a mixer method 
>>>>> allowing a 300 ms transmission interval, the protection against 
>>>>> bursty packet loss is quite good. Continue on this track.
>>>>> The conclusion is reflected in version -01 of the draft, just 
>>>>> published.
>>>>> Regards
>>>>> Gunnar
>>>>>> Comments please so we can take a rapid decision and move on with 
>>>>>> one solution.
>>>>>> Regards
>>>>>> Gunnar
>>>>> -- 
>>>>> Gunnar Hellström
>>>>> GHAccess
>>>>> _______________________________________________
>>>>> Audio/Video Transport Core Maintenance

Gunnar Hellström