Re: [Slim] Issue 43: How to know the modality of a language indication?

On 10/15/17 2:24 AM, Gunnar Hellström wrote:
> Paul,
> Den 2017-10-15 kl. 01:19, skrev Paul Kyzivat:
>> On 10/14/17 2:03 PM, Bernard Aboba wrote:
>>> Gunnar said:
>>>
>>> "Applications not implementing such specific notations may use the 
>>> following simple deductions.
>>>
>>> - A language tag in audio media is supposed to indicate spoken modality.
>>>
>>> [BA] Even a tag with "Sign Language" in the description??
>>>
>>> - A language tag in text media is supposed to indicate  written 
>>> modality.
>>>
>>> [BA] If the tag has "Sign Language" in the description, can this 
>>> document really say that?
>>>
>>> - A language tag in video media is supposed to indicate visual sign 
>>> language modality except for the case when it is supposed to indicate 
>>> a view of a speaking person mentioned in section 5.2 characterized by 
>>> the exact same language tag also appearing in an audio media 
>>> specification.
>>>
>>> [BA] It seems like an over-reach to say that a spoken language tag in 
>>> video media should instead be interpreted as a request for Sign 
>>> Language.  If this were done, would it always be clear which Sign 
>>> Language was intended?  And could we really assume that both sides, 
>>> if negotiating a spoken language tag in video media, were really 
>>> indicating the desire to sign?  It seems like this could easily 
>>> result interoperability failure.
>>
>> IMO the right way to indicate that two (or more) media streams are 
>> conveying alternative representations of the same language content is 
>> by grouping them with a new grouping attribute. That can tie together 
>> an audio with a video and/or text. A language tag for sign language on 
>> the video stream then clarifies to the recipient that it is sign 
>> language. The grouping attribute by itself can indicate that these 
>> streams are conveying language.
> <GH>Yes, and that is proposed in 
> draft-hellstrom-slim-modality-grouping    with two kinds of grouping: 
> One kind of grouping to tell that two or more languages in different 
> streams are alternatives with the same content and a priority order is 
> assigned to them to guide the selection of which one to use during the 
> call. The other kind of grouping telling that two or more languages in 
> different streams are desired together with the same language content 
> but different modalities ( such as the use for captioned telephony with 
> the same content provided in both speech and text, or sign language 
> interpretation where you see the interpreter,  or possibly spoken 
> language interpretation with the languages provided in different audio 
> streams ). I hope that that draft can be progressed. I see it as a 
> needed complement to the pure language indications per media.

Oh, sorry. I did read that draft but forgot about it.

> The discussion in this thread is more about how an application would 
> easily know that e.g. "ase" is a sign language and "en" is a spoken (or 
> written) language, and also a discussion about what kinds of languages 
> are allowed and indicated by default in each media type. It was not at 
> all about falsely using language tags in the wrong media type as Bernard 
> understood my wording. It was rather a limitation to what modalities are 
> used in each media type and how to know the modality with cases that are 
> not evident, e.g. "application" and "message" media types.

What do you mean by "know"? Is it for the *UA* software to know, or for 
the human user of the UA to know? Presumably a human user that cares 
will understand this if presented with the information in some way. But 
typically this isn't presented to the user.

For the software to know must mean that it will behave differently for a 
tag that represents a sign language than for one that represents a 
spoken or written language. What is it that it will do differently?

	Thanks,
	Paul

> Right now we have returned to a very simple rule: we define only use of 
> spoken language in audio media, written language in text media and sign 
> language in video media.
> We have discussed other use, such as a view of a speaking person in 
> video, text overlay on video, a sign language notation in text media, 
> written language in message media, written language in WebRTC data 
> channels, sign written and spoken in bucket media maybe declared as 
> application media. We do not define these cases. They are just not 
> defined, not forbidden. They may be defined in the future.
> 
> My proposed wording in section 5.4 got too many misunderstandings so I 
> gave up with it. I think we can live with 5.4 as it is in version -16.
> 
> Thanks,
> Gunnar
> 
> 
>>
>> (IIRC I suggested something along these lines a long time ago.)
>>
>>     Thanks,
>>     Paul
>>
>> _______________________________________________
>> SLIM mailing list
>> SLIM@ietf.org
>> https://www.ietf.org/mailman/listinfo/slim
>