Re: [MMUSIC] draft-holmberg-mmusic-t140-usage-data-channel - multi-party

On 2019-08-28 at 20:58, Christer Holmberg wrote:
> Hi,
>
>> ISTM that this discussion is getting pretty far afield from the proper scope of the draft.
>> I think much of what is being discussed here belongs in a different draft, probably in a
>> different wg, like AVT.

We needed to have this discussion here in order to understand how the 
limitations of data channels compared to RTP (the lack of SSRC, CSRC, 
CNAME and NAME) influence what we can do for multi-party cases.

Now we have seen and understood that in a few situations we can 
compensate for these shortcomings by using session information, the 
label and the stream ID.  For the case of a single data channel carrying 
T140blocks from many sources, we have a solution for RTP, but not for 
the T.140 data channel.

In order to solve that, we would need an SDP attribute for indicating 
capability for a solution, and then a coding in the data channel 
messages for the case when the solution is used.

Both the attribute and the message coding could be part of the work with 
draft-holmberg-mmusic-t140-usage-data-channel. The need appeared because 
of the limitation in the data channel, and the requirement is specified 
in the WebRTC data channel specification.

However, we have not specified the attribute yet for negotiating 
multi-party functionality in the RTP case. I agree that it would be wise 
to use the same attribute.  Should that work be done in AVT then?

And the coding could be a T.140 extension. Or we could say that for RTP, 
the first CSRC in the packet is the source, and for the T140 data 
channel, the data channel message is a structure with two parts: the 
source (in a form similar to the stream ID) and the T140blocks. (Are 
there any established conventions for carrying structures as payload in 
data channel messages?)
I assume that this kind of limited work on the data format can be 
included in the current work in mmusic.
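
As an illustration only, the two-part message (source plus T140blocks) 
could be sketched as below. The framing is an assumption made for this 
example: a 2-byte length prefix, a UTF-8 source label, then the UTF-8 
T140block. None of it is defined in the draft, in T.140 or in RFC 4103, 
and the function names are hypothetical.

```python
import struct
from typing import Tuple


def pack_message(source: str, t140block: str) -> bytes:
    """Build one message: [2-byte source length][source][T140block].

    Hypothetical framing for illustration; not a specified format.
    """
    src = source.encode("utf-8")
    if len(src) > 0xFFFF:
        raise ValueError("source label too long for a 2-byte length")
    # "!H" is an unsigned 16-bit integer in network byte order.
    return struct.pack("!H", len(src)) + src + t140block.encode("utf-8")


def unpack_message(message: bytes) -> Tuple[str, str]:
    """Split a message built by pack_message into (source, T140block)."""
    (src_len,) = struct.unpack_from("!H", message, 0)
    if len(message) < 2 + src_len:
        raise ValueError("message shorter than its declared source length")
    source = message[2:2 + src_len].decode("utf-8")
    t140block = message[2 + src_len:].decode("utf-8")
    return source, t140block
```

With this framing every message carries exactly one source, so the 
receiver never has to look inside the T140 data to find out who sent it.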

If we do it with a T.140 extension, we need to decide whether every 
T140 data channel message must begin with a source indicator, and 
whether each message may carry T140blocks from only one source or the 
source may switch within a message. In the latter case the receiver 
needs to scan for new source indicators and let them influence the 
presentation (or further transmission).
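
A rough sketch of the scanning variant, where the source may switch 
inside a message, is given below. The indicator syntax is purely an 
assumption for this example: a source label enclosed between U+0098 and 
U+009C. No such T.140 extension has been specified, and the names used 
are hypothetical.

```python
SOURCE_START = "\u0098"  # hypothetical start-of-indicator delimiter (assumption)
SOURCE_END = "\u009c"    # hypothetical end-of-indicator delimiter (assumption)


def split_by_source(message, current_source=None):
    """Split one received message into (source, text) segments.

    current_source is the source in effect before this message, so text
    that precedes any indicator is attributed to it. Returns the segments
    and the source in effect after the message.
    """
    segments = []
    pos = 0
    while pos < len(message):
        start = message.find(SOURCE_START, pos)
        if start == -1:
            # No further indicator: the rest belongs to the current source.
            segments.append((current_source, message[pos:]))
            break
        if start > pos:
            segments.append((current_source, message[pos:start]))
        end = message.find(SOURCE_END, start + 1)
        if end == -1:
            raise ValueError("unterminated source indicator")
        current_source = message[start + 1:end]
        pos = end + 1
    return segments, current_source
```

The receiver has to keep the last source between messages, because a 
message without an indicator continues the previous source. That state 
is the cost of not requiring an indicator at the start of every message.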

In summary, some of this work belongs in mmusic or can be done here, 
but depending on some decisions, work in other groups or bodies may 
also be needed.

Gunnar

> Using a single channel for multiple users *IS* outside the scope of the draft. There will also be an explicit note about that in the next version of the draft. Gunnar still wanted to continue the discussion about using a single channel, but I guess we should have changed the subject :)
>
> I have also indicated that data plane extensions should be discussed elsewhere, e.g., in AVT. T.140 extensions should probably be taken to ITU-T.
>
> Regards,
>
> Christer
>
> 	Thanks,
> 	Paul
>
> On 8/28/19 2:37 PM, Christer Holmberg wrote:
>> ....and, a T140 extension would work both for a single data channel and multiple data channels, because the receiver will only need to look at the T140 data, and not care about what data channel it was received on.
>>
>> Regards,
>>
>> Christer
>>
>> -----Original Message-----
>> From: mmusic <mmusic-bounces@ietf.org> On Behalf Of Christer Holmberg
>> Sent: Wednesday, 28 August 2019 21:36
>> To: Gunnar Hellström <gunnar.hellstrom@omnitor.se>; Paul
>> Kyzivat <pkyzivat@alum.mit.edu>; mmusic@ietf.org
>> Subject: Re: [MMUSIC] draft-holmberg-mmusic-t140-usage-data-channel -
>> multi-party
>>
>>
>>> Also, whatever the solution is, I assume you also want it to work in
>>> interworking cases, where the MCU might be using RTP-based text (RFC
>>> 4103)? From that perspective a T.140 extension would be the best solution, because it would work both for T140 data channels and RTP streams without having to touch the T140 data.
>> .....and here I am talking about the case where you would use a single data channel for multiple participants.
>>
>> Regards,
>>
>> Christer
>>
>>
>> -----Original Message-----
>> From: mmusic <mmusic-bounces@ietf.org> On Behalf Of Gunnar Hellström
>> Sent: Wednesday, 28 August 2019 21:26
>> To: Paul Kyzivat <pkyzivat@alum.mit.edu>; mmusic@ietf.org
>> Subject: Re: [MMUSIC] draft-holmberg-mmusic-t140-usage-data-channel -
>> multi-party
>>
>> Hi Paul, see comment at the end,
>>
>> On 2019-08-28 at 16:49, Paul Kyzivat wrote:
>>> On 8/27/19 2:57 PM, Gunnar Hellström wrote:
>>>> Hi Paul,
>>>>
>>>> Please see inline,
>>>>
>>>> On 2019-08-27 at 20:39, Paul Kyzivat wrote:
>>>>> On 8/26/19 3:59 PM, Gunnar Hellström wrote:
>>>>>
>>>>>> 4. A multi-party server S, combining a number of sources into one
>>>>>> call to a participant A, with real-time text from each other
>>>>>> participant (B,C,...) communicated in just one T140 data channel
>>>>>> between S and A. There is a need to indicate source for each
>>>>>> T140block sent to A. We currently have no way specified for that.
>>>>>> An extension of T.140 could do it.
>>>>> Is there a reason to ever do this?
>>>>>
>>>>> In audio and video the "mixing" actually does something
>>>>> irreversible. And it takes real work to do it, and doing so reduces
>>>>> the bandwidth considerably over what is required to transmit the
>>>>> individual streams.
>>>>>
>>>>> For RTT none of that is true. There is very little impact in
>>>>> bandwidth or processing in transmitting them all as separate channels.
>>>>>
>>>>> So why not just say "don't do that"?
>>>> Yes, interesting and realistic thought. It would likely be the best
>>>> choice for many practical cases.
>>>>
>>>> I am not sure however how it will work with a huge conference with
>>>> hundreds of participants, some of them occasionally asking for
>>>> the floor and sending a bit of RTT text. A server using one data
>>>> channel per participant would either be required to establish an
>>>> enormous number of T140 data channels, prepared to send what
>>>> has been received, or take the effort to establish a new T140 data
>>>> channel to all users at the moment a user gets the floor, and then
>>>> possibly close it again.
>>>>
>>>> What do you think about that case?
>>> Initially each user would only have one channel. Then he gets another
>>> one added each time there is a new speaker who hasn't spoken before.
>>> I don't know if there would be formal floor control, or if everyone
>>> would always be allowed to talk. Formal floor control would provide
>>> an early hint to establish the new channels, possibly preventing delays.
>>> But adding a channel isn't an expensive operation and delaying
>>> messages until it is done shouldn't be a problem.
>>>
>>> But is this a realistic issue? Do RTT conferences with hundreds of
>>> speakers happen in practice?
>> It is realistic to the same degree as use of video and audio in multi-party multi-media conferences.
>>
>> It is not realistic to let hundreds of participants talk simultaneously, only one or a very small number. It is realistic to let one at a time have the floor and speak, sending the audio to all the others, and have the opportunity to hand over the floor to someone else with a loose or strict protocol.
>>
>> There is very little use for transmitting live video from hundreds of participants simultaneously, but it is realistic to show a small number of users and be prepared to switch who is seen.
>>
>> Similarly, it is not realistic to let hundreds of participants present new RTT simultaneously, only one or a very small number. It is realistic to let a few at a time transmit, sending the RTT to all the others. Anyone should have the opportunity to transmit RTT, controlled by either no protocol or a loose or a strict protocol.
>>
>> RTT is sadly often characterized as media for accessibility and thereby expected to be used only in special cases. This is wrong from three points of view:
>>
>> 1. If used for accessibility, in cases when one or some users have no
>> or little use of audio or video for language communication because of
>> a disability, then it is very important that all other participants
>> can use the same media, so that communication can go directly to or
>> from the one(s) who need RTT. This is the whole idea of accessibility,
>> that users are not forced into limited technology corners, but are
>> able to participate anywhere with the same technology as others and
>> have full and equal participation. (Sometimes a translating service
>> between different modalities is still needed so that everybody is
>> enabled to use the modality they prefer.)
>>
>> 2. In any multimedia conference, reasons arise to communicate something in text. The three real-time media (audio, video and RTT) are needed together for alternating or simultaneous use: talk, show and write for complete and efficient communication. In that scenario, text messaging is a slow tool, causing delays and a risk of losing the interest of viewers. When anyone starts typing RTT, others can follow the thoughts as they are expressed in text from the very start, while with messaging the typing user needs to complete the message and risks entering it far too late to be relevant in the discussion.
>>
>> 3. Automatic subtitling by speech-to-text services and even language translation before presentation would be useless without a form of RTT.
>> It is now realistic to use such services to enhance multimedia conferences and make the contents available for people with other languages, people in noisy areas, and deaf or hard-of-hearing users.
>> This is a rapidly increasing use. Also for this case it is realistic only with a few actively transmitting users, while the result can be of interest to distribute to many.
>>
>> This said, I anyway agree that the usage that will develop most rapidly will likely be for meetings with not more than five participants having audio, video and RTT available and most commonly only three taking turns on sending RTT. Some of these calls will be in emergency services and interpreting (or relay) services, while there is also a need for user-to-user conferences.
>>
>> I would also like to know which MCU model is most realistic for these applications. Is it the one with separate text channels per active source opened and closed when there is a need, or is it the single text channel with in-line identification of the source for each new data channel message?
>>
>>
>> Regards
>>
>> Gunnar
>>
>>>
>>>       Thanks,
>>>       Paul
>>>
>>> _______________________________________________
>>> mmusic mailing list
>>> mmusic@ietf.org
>>> https://www.ietf.org/mailman/listinfo/mmusic

-- 
-----------------------------------------
Gunnar Hellström
Omnitor
gunnar.hellstrom@omnitor.se
+46 708 204 288