Re: [AVTCORE] Review of draft-ietf-avtcore-multi-party-rtt-mix

Gunnar Hellström <> Tue, 03 November 2020 22:31 UTC


Hi Brian,

Thanks for the review.

See brief answers below. I will follow up with text proposals.

On 2020-11-02 at 22:22, Brian Rosen wrote:
> I have reviewed -multi-party-mix-09 and have the following comments:
> 1. Section 1 Introduction provides the solution, in detail.  I think it would be better to have a high level description of the problem, and then introduce the solution in a subsequent section.  I would also like to see a little text on what “mixing” for rtt means.


This is a proposal for the mixing description.

"Real-time text mixers for multi-party sessions identify the
source of each transmitted group of text from a conference participant
so that the text can be transmitted interleaved with text groups from
different sources at the rate they are created. This enables endpoints
to present the text groups suitably grouped with other text from the
same source. The presentation can then be arranged so that text from
different sources is presented in real time and is easy to read, while
the reading user can still perceive how the text was created in real
time by the different parties. The transmission and mixing are intended
to be done in a general way so that the presentation can be arranged in
a layout decided by the endpoint."
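
As a rough illustration of that description (a sketch only; the names TextGroup, mix and present, and the created_at field, are made up for this example and do not come from the draft), the mixer keeps a source label on every text group and interleaves groups in creation order, and the endpoint regroups them per source:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class TextGroup:
    created_at: float  # hypothetical: time in seconds when the group was typed
    source: str        # hypothetical: identifier of the typing participant
    text: str


def mix(groups: Iterable[TextGroup]) -> List[Tuple[str, str]]:
    """Interleave text groups from all sources in creation order,
    keeping the source label attached to each group."""
    ordered = sorted(groups, key=lambda g: g.created_at)
    return [(g.source, g.text) for g in ordered]


def present(mixed: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Endpoint side: collect the interleaved groups per source, so that
    text from one source stays together whatever layout is chosen."""
    per_source: Dict[str, str] = {}
    for source, text in mixed:
        per_source[source] = per_source.get(source, "") + text
    return per_source
```

How the per-source text is then laid out (columns, labelled lines, and so on) is left to the endpoint.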

> 2. In the Selected solution and considered alternatives section, the first alternative (One RTP stream per source in same RTP session), please make it clearer that the clients have to support this, not just the mixer, and that’s the issue.
Will do.
> 3. I don’t really understand the text at the end of that section where it describes when multiple typers send text simultaneously.  ISTM that RTT behaves just like voice in that turn taking is often jumbled when more than one user responds to the previous user.  It takes time to recognize it, and for agreement on who has the “floor”.

If you let text from different parties merge as it is received in real 
time, you would create a lot of problems. It takes much longer with 
text than with voice to say "sorry, you were garbled, please retype". 
And the mixer does not send your own text back to you, so it is hard 
for you to detect the problem.

Still, in a conference, very little simultaneous typing will occur. It 
is only practical for one party at a time to send longer texts. 
Others can be expected to say

The presentation layer ITU-T T.140 requires text from different parties 
to be presented readably and in a way that the order of text 
transmission can be approximately perceived. So we have no choice. We 
are specifying the transport layer for T.140 and need to enable its 

I hope the clarification of mixing in point 1 will help.

> 4.  Need a better heading for “Specified Solutions”.  Actually, I wonder if you could delete this whole section, and add a definition for “multiparty aware” and “multiparty unaware” in 1.2.  Some of this text might move to 3 and 4.2, but that’s okay
I think section 2 works well as an intro to the two solutions. Some 
words under the main header might help. I will make the proposed 
addition to 1.2 and propose a new header for 2.
> 5. If you did remove Section 2, you could rename Section 3 to Procedures for Multi-party aware mixing
> 6. The first two paragraphs of 3.1 are O/A text.  The last 3 aren’t - they are how parties act after O/A
Will modify.
> 7. In 3.3, it says “As soon as a participant is known to participate in a session and being available for text reception, a Unicode BOM character SHALL be sent to it according to the procedures in this section.”  I think it is unclear who sends this to whom.  I believe based on the text in 3.8, that the participant sends BOM to the mixer, because it says “and deletion of 'BOM' characters from each participant”.  Does the participant send it to the mixer?  Does the mixer send it to the new participant?  To existing participants?

I will clarify. Every unit that comes into contact with another unit 
shall send a BOM. Thus the mixer sends it to each participant, the 
participants send it to the mixer, and in a p2p call the endpoints send 
it to each other.
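
A minimal sketch of that rule, assuming the BOM is sent once per new peer (the Unit class, its outbox list and the on_contact method are hypothetical names for this example, not taken from the draft). The BOM is U+FEFF, which is the three bytes EF BB BF in UTF-8:

```python
BOM = "\ufeff"  # Unicode byte order mark, EF BB BF when encoded as UTF-8


class Unit:
    """Any text-capable unit: a mixer or an endpoint (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.greeted = set()  # peers that have already been sent a BOM
        self.outbox = []      # (peer, payload) pairs queued for sending

    def on_contact(self, peer: str) -> None:
        """Queue a BOM for a peer the first time it becomes available
        for text reception. Sending only once is an assumption here."""
        if peer not in self.greeted:
            self.greeted.add(peer)
            self.outbox.append((peer, BOM.encode("utf-8")))
```

The same code covers all three cases: the mixer calls on_contact for each participant, each participant calls it for the mixer, and in a p2p call each endpoint calls it for the other.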

> 8. I don’t remember why you have to send the redundant text before switching sources.  You might want to explain why in 3.11.  It’s obvious you would like to do Participant 1, initial, Participant 2, initial, Participant 1, redundant 1, Participant 2 redundant 2, Participant 1, redundant 2, Participant 2, redundant 2.
Good point. Discussed in an earlier response.
> 9. Might want to add text before 3.18.1 that says why you want to send CSRC length 0.

One case is a point-to-point connection between two multi-party aware 
devices. The negotiation will result in multi-party awareness, but each 
party will have only one source of text. The source will be identified 
by the SSRC, and no CSRC list will be included.

This is also the case from an endpoint to a mixer, so the mixer in that 
case will not need to split the text received from that endpoint into 
different sources.

If instead two mixers are chained, they will negotiate multi-party 
awareness, code text from multiple sources and use the CSRC list.
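
For illustration only, a receiver could decide which case applies from the CC field of the fixed RTP header (RFC 3550). The function name text_sources is made up for this sketch, and the sketch deliberately says nothing about how the payload is divided among the CSRC entries:

```python
import struct
from typing import List


def text_sources(packet: bytes) -> List[int]:
    """Return the source identifiers of an RTP packet: the SSRC when the
    CSRC count is 0, otherwise the entries of the CSRC list."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    # Fixed header: V/P/X/CC byte, M/PT byte, sequence number,
    # timestamp and SSRC, all in network byte order.
    first, _m_pt, _seq, _timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = first & 0x0F  # CSRC count is the low 4 bits of the first byte
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    if cc == 0:
        # Single source: point-to-point, or endpoint to mixer.
        return [ssrc]
    # Several sources: for example from a mixer using the CSRC list.
    return list(struct.unpack("!%dI" % cc, packet[12:end]))
```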

> 10. Typo “evenso” in 3.19
> 11. Not sure I understand what “a negotiation between security and no security SHOULD be applied. ” (3.20) is supposed to mean.
Would  "negotiation of encryption or no encryption" be better? See RFC 
> 12. Showing examples of SHA-1 may be realistic, but not desirable.
Why not? Please explain.
> 13. Possible discussion of malicious participants in Security Considerations.  Not much can be done, other than require secure signaling and media (so authentication can be relied upon).

Will try

One type of DDoS attack may be mentioned, where many users send text 
simultaneously for a short time. Without protection, that will block 
the mixer for some time because the mixer serializes text transmission 
(with 100 ms between source switches). Some limit on the number of 
simultaneously sending users may be mentioned as a precaution.
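
A back-of-the-envelope sketch of why such a limit helps, assuming the mixer serves simultaneous senders in simple round-robin order (that scheduling, the function names and the delay budget are assumptions for this example; only the 100 ms switch interval is from the discussion above):

```python
SWITCH_INTERVAL_MS = 100  # time between source switches in the mixer


def worst_case_wait_ms(senders: int, interval_ms: int = SWITCH_INTERVAL_MS) -> int:
    """Longest time one sender waits for its turn when all senders are
    active and served one after another (assumed round robin)."""
    if senders < 1:
        raise ValueError("need at least one sender")
    return (senders - 1) * interval_ms


def max_senders(delay_budget_ms: int, interval_ms: int = SWITCH_INTERVAL_MS) -> int:
    """Largest number of simultaneous senders that keeps the worst-case
    wait within a chosen delay budget (hypothetical policy knob)."""
    return delay_budget_ms // interval_ms + 1
```

With 11 simultaneous senders the last one in turn waits (11 - 1) * 100 ms = 1 s, and a 500 ms budget allows at most 6 simultaneous senders under these assumptions.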




Gunnar Hellström