Re: [MMUSIC] draft-holmberg-mmusic-t140-usage-data-channel - multi-party

As I just replied to Christer, I think this discussion ought to pertain 
to a different document that we don't have yet.

Meanwhile, having a channel per stream is hopefully sufficient while 
exploring these other things.

	Thanks,
	Paul

On 8/28/19 2:26 PM, Gunnar Hellström wrote:
> Hi Paul, see comment at the end,
> 
> On 2019-08-28 at 16:49, Paul Kyzivat wrote:
>> On 8/27/19 2:57 PM, Gunnar Hellström wrote:
>>> Hi Paul,
>>>
>>> Please see inline,
>>>
>>> On 2019-08-27 at 20:39, Paul Kyzivat wrote:
>>>> On 8/26/19 3:59 PM, Gunnar Hellström wrote:
>>>>
>>>>> 4. A multi-party server S, combining a number of sources into one 
>>>>> call to a participant A, with real-time text from each other 
>>>>> participant (B,C,...) communicated in just one T140 data channel 
>>>>> between S and A. There is a need to indicate source for each 
>>>>> T140block sent to A. We currently have no way specified for that. 
>>>>> An extension of T.140 could do it.
>>>>
>>>> Is there a reason to ever do this?
>>>>
>>>> In audio and video the "mixing" actually does something 
>>>> irreversible. And it takes real work to do it, and doing so reduces 
>>>> the bandwidth considerably over what is required to transmit the 
>>>> individual streams.
>>>>
>>>> For RTT none of that is true. There is very little impact in 
>>>> bandwidth or processing in transmitting them all as separate channels.
>>>>
>>>> So why not just say "don't do that"?
>>>
>>> Yes, interesting and realistic thought. It would likely be the best 
>>> choice for many practical cases.
>>>
>>> I am not sure, however, how it will work with a huge conference with 
>>> hundreds of participants, with some of them occasionally asking for 
>>> the floor and sending a bit of RTT text. A server using one data channel 
>>> per participant would either be required to establish an enormous 
>>> number of T140 data channels, being prepared to send what has been 
>>> received, or take the effort to establish a new T140 data channel to 
>>> all users at the moment a user gets the floor, and then possibly 
>>> close it again.
>>>
>>> What do you think about that case?
>>
>> Initially each user would only have one channel. Then he gets another 
>> one added each time there is a new speaker who hasn't spoken before. I 
>> don't know if there would be formal floor control, or if everyone 
>> would always be allowed to talk. Formal floor control would provide an 
>> early hint to establish the new channels, possibly preventing delays. 
>> But adding a channel isn't an expensive operation and delaying 
>> messages until it is done shouldn't be a problem.
>>
>> But is this a realistic issue? Do RTT conferences with hundreds of 
>> speakers happen in practice?
> 
> It is realistic to the same degree as use of video and audio in 
> multi-party multi-media conferences.
> 
> It is not realistic to let hundreds of participants talk simultaneously, 
> but only one or a very low number. It is realistic to let one at a time 
> have the floor and speak, sending the audio to all the others, and have 
> the opportunity to hand over the floor to someone else with a loose or 
> strict protocol.
> 
> There is very little use for transmitting live video from hundreds of 
> participants simultaneously, but it is realistic to show a low number of 
> users and be prepared to switch who is seen.
> 
> Similarly, it is not realistic to let hundreds of participants present 
> new RTT simultaneously, but only one or a very low number. It is realistic 
> to let a few at a time transmit, sending the RTT to all the others. Anyone 
> should have the opportunity to transmit RTT, controlled by either no 
> protocol, a loose protocol or a strict protocol.
> 
> RTT is sadly often characterized as media for accessibility and thereby 
> expected to be used only in special cases. This is wrong from three points 
> of view:
> 
> 1. If used for accessibility, in cases when one or some users have no or 
> little use of audio or video for language communication because of a 
> disability, then it is very important that all other participants can 
> use the same media, so that communication can go directly to or from the 
> one(s) who need RTT. This is the whole idea of accessibility, that users 
> are not forced to limited technology corners, but are able to 
> participate anywhere with the same technology as others and have full 
> and equal participation. (Sometimes a translating service between 
> different modalities is still needed so that everybody is enabled to 
> use the modality they prefer.)
> 
> 2. In any multimedia conference, reasons arise to communicate 
> something in text. The three real-time media, audio, video and RTT, are 
> needed together for alternating or simultaneous use: talk, show and 
> write for complete and efficient communication. In that scenario, text 
> messaging is a slow tool, causing delays and risking the loss of 
> viewers' interest. When anyone starts typing RTT, it is possible 
> to start following the thoughts as they are expressed in text 
> immediately from the start, while with messaging the typing user needs 
> to complete the message first and risks entering it far too late to be 
> relevant to the discussion.
> 
> 3. Automatic subtitling by speech-to-text services and even language 
> translation before presentation would be useless without a form of RTT. 
> It is now realistic to use such services to enhance multimedia 
> conferences and make the contents available to people who use other 
> languages, people in noisy areas and deaf or hard-of-hearing users. 
> This use is increasing rapidly. In this case too, it is realistic 
> only with a few actively transmitting users, while the result can be of 
> interest to distribute to many.
> 
> That said, I still agree that the usage that will develop most rapidly 
> will likely be for meetings with no more than five participants having 
> audio, video and RTT available, and most commonly only three taking turns 
> sending RTT. Some of these calls will be in emergency services and 
> interpreting (or relay) services, while there is also a need for 
> user-to-user conferences.
> 
> I would also like to know which MCU model is most realistic for these 
> applications. Is it the one with separate text channels per active 
> source opened and closed when there is a need, or is it the single text 
> channel with in-line identification of the source for each new data 
> channel message?
> 
> 
> Regards
> 
> Gunnar
> 
>>
>>
>>     Thanks,
>>     Paul
>>