Re: [Slim] Issue 43: How to know the modality of a language indication?

Paul Kyzivat <pkyzivat@alum.mit.edu> Sun, 15 October 2017 19:27 UTC

To: Bernard Aboba <bernard.aboba@gmail.com>
Cc: Gunnar Hellström <gunnar.hellstrom@omnitor.se>, slim@ietf.org
References: <CAOW+2dtSOgp3JeiSVAttP+t0ZZ-k3oJK++TS71Xn7sCOzMZNVQ@mail.gmail.com> <p06240606d607257c9584@172.20.60.54> <fb9e6b79-7bdd-9933-e72e-a47bc8c93b58@omnitor.se> <CAOW+2dtteOadptCT=yvfmk01z-+USfE4a7JO1+u_fkTp72ygNA@mail.gmail.com> <da5cfaea-75f8-3fe1-7483-d77042bd9708@alum.mit.edu> <b2611e82-2133-0e77-b72b-ef709b1bba3c@omnitor.se> <1b0380ef-b57d-3cc7-c649-5351dc61f878@alum.mit.edu> <CAOW+2dtVE5BDmD2qy_g-asXvxntif4fVC8LYO4j7QLQ5Kq2E+g@mail.gmail.com>
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
Message-ID: <3fc6d055-08a0-2bdb-f6e9-99b94efc49df@alum.mit.edu>
Date: Sun, 15 Oct 2017 15:27:40 -0400
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/4rRWfS7VZreBqwuspZ4tIE57gqg>
List-Id: Selection of Language for Internet Media <slim.ietf.org>

On 10/15/17 1:49 PM, Bernard Aboba wrote:
> Paul said:
> 
> "For the software to know must mean that it will behave differently for 
> a tag that represents a sign language than for one that represents a 
> spoken or written language. What is it that it will do differently?"
> 
> [BA] In terms of behavior based on the signed/non-signed distinction, in 
> -17 the only reference appears to be in Section 5.4, stating that 
> certain combinations are not defined in the document (but that 
> definition of those combinations was out of scope):

I'm asking whether this is a distinction without a difference. I'm not 
asking whether this makes a difference in the *protocol*, but whether in 
the end it benefits the participants in the call in any way. For instance:

- does it help the UA to decide how to alert the callee, so that the
   callee can better decide whether to accept the call or instruct the
   UA about how to handle the call?

- does it allow the UA to make a decision whether to accept the media?

- can the UA use this information to change how to render the media?

And if there is something like this, will the UA be able to do this 
generically based on whether the media is sign language or not, or will 
the UA need to already understand *specific* sign language tags?
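Whether a UA could do this generically depends on tag metadata. In the IANA Language Subtag Registry, individual sign languages such as "ase" (American Sign Language) carry a "Macrolanguage: sgn" field, so in principle a UA could classify tags without hard-coding every specific sign language. A minimal sketch of that idea (the registry excerpt and function name here are illustrative, not the full registry or an agreed API):

```python
# Sketch: classify a BCP 47 language tag as sign vs. non-sign language
# using the "Macrolanguage: sgn" field from the IANA Language Subtag
# Registry. Only a tiny excerpt of the registry is inlined here; a real
# UA would parse the full registry file.

REGISTRY_EXCERPT = {
    # primary subtag -> macrolanguage (None if the registry lists none)
    "en":  None,   # English (spoken/written)
    "ase": "sgn",  # American Sign Language
    "bfi": "sgn",  # British Sign Language
    "sgn": None,   # the sign-language macrolanguage subtag itself
}

def is_sign_language(tag: str) -> bool:
    """True if the tag's primary language subtag denotes a sign language."""
    primary = tag.split("-")[0].lower()
    if primary == "sgn":  # e.g. older "sgn-US"-style tags
        return True
    return REGISTRY_EXCERPT.get(primary) == "sgn"

print(is_sign_language("ase"))    # True
print(is_sign_language("en-US"))  # False
```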

E.g., a UA serving a deaf person might automatically introduce a sign 
language interpreter into an incoming audio-only call. If the incoming 
call has both audio and video, then the video *might* be for conveying 
sign language, or not. If not, then the UA will still want to bring in a 
sign language interpreter. But is knowing that the call generically 
contains sign language sufficient to decide against bringing in an 
interpreter? Or must that depend on it being a sign language that the 
user can use? If the UA is configured for all the specific sign 
languages the user can deal with, then there is no need to recognize 
other sign languages generically.
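For concreteness, an incoming offer using the hlang-send/hlang-recv 
attributes from the draft might look like the following (a hypothetical 
example; addresses, ports, and payload types are made up). The "ase" tag 
on the video stream is what would tell the UA that the video is intended 
to carry sign language rather than a view of a speaking person:

```
v=0
o=- 0 0 IN IP4 198.51.100.1
s=-
c=IN IP4 198.51.100.1
t=0 0
m=audio 49170 RTP/AVP 0
a=hlang-send:en
a=hlang-recv:en
m=video 51372 RTP/AVP 31
a=hlang-send:ase
a=hlang-recv:ase
```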

	Thanks,
	Paul

>       5.4
>       <https://tools.ietf.org/html/draft-ietf-slim-negotiating-human-language-17#section-5.4>.
>       Undefined Combinations
> 
> 
> 
>     The behavior when specifying a non-signed language tag for a video
>     media stream, or a signed language tag for an audio or text media
>     stream, is not defined in this document.
> 
>     The problem of knowing which language tags are signed and which are
>     not is out of scope of this document.
> 
> 
> 
> On Sun, Oct 15, 2017 at 10:13 AM, Paul Kyzivat <pkyzivat@alum.mit.edu> wrote:
> 
>     On 10/15/17 2:24 AM, Gunnar Hellström wrote:
> 
>         Paul,
>         Den 2017-10-15 kl. 01:19, skrev Paul Kyzivat:
> 
>             On 10/14/17 2:03 PM, Bernard Aboba wrote:
> 
>                 Gunnar said:
> 
>                 "Applications not implementing such specific notations
>                 may use the following simple deductions.
> 
>                 - A language tag in audio media is supposed to indicate
>                 spoken modality.
> 
>                 [BA] Even a tag with "Sign Language" in the description??
> 
>                 - A language tag in text media is supposed to indicate 
>                 written modality.
> 
>                 [BA] If the tag has "Sign Language" in the description,
>                 can this document really say that?
> 
>                 - A language tag in video media is supposed to indicate
>                 visual sign language modality except for the case when
>                 it is supposed to indicate a view of a speaking person
>                 mentioned in section 5.2 characterized by the exact same
>                 language tag also appearing in an audio media specification.
> 
>                 [BA] It seems like an over-reach to say that a spoken
>                 language tag in video media should instead be
>                 interpreted as a request for Sign Language.  If this
>                 were done, would it always be clear which Sign Language
>                 was intended?  And could we really assume that both
>                 sides, if negotiating a spoken language tag in video
>                 media, were really indicating the desire to sign?  It
>                 seems like this could easily result in
>                 interoperability failure.
> 
> 
>             IMO the right way to indicate that two (or more) media
>             streams are conveying alternative representations of the
>             same language content is by grouping them with a new
>             grouping attribute. That can tie together an audio with a
>             video and/or text. A language tag for sign language on the
>             video stream then clarifies to the recipient that it is sign
>             language. The grouping attribute by itself can indicate that
>             these streams are conveying language.
> 
>         <GH>Yes, and that is proposed in
>         draft-hellstrom-slim-modality-grouping, with two kinds of
>         grouping: one kind tells that two or more languages in
>         different streams are alternatives with the same content, and
>         a priority order is assigned to them to guide the selection
>         of which one to use during the call. The other kind tells
>         that two or more languages in different streams are desired
>         together, with the same language content but different
>         modalities (such as captioned telephony, with the same
>         content provided in both speech and text; sign language
>         interpretation, where you see the interpreter; or possibly
>         spoken language interpretation, with the languages provided
>         in different audio streams). I hope that that draft can be
>         progressed. I see it as a needed complement to the pure
>         language indications per media.
> 
> 
>     Oh, sorry. I did read that draft but forgot about it.
> 
>         The discussion in this thread is more about how an
>         application would easily know that e.g. "ase" is a sign
>         language and "en" is a spoken (or written) language, and also
>         about what kinds of languages are allowed and indicated by
>         default in each media type. It was not at all about falsely
>         using language tags in the wrong media type, as Bernard
>         understood my wording. It was rather about limiting what
>         modalities are used in each media type, and how to know the
>         modality in cases that are not evident, e.g. the
>         "application" and "message" media types.
> 
> 
>     What do you mean by "know"? Is it for the *UA* software to know, or
>     for the human user of the UA to know? Presumably a human user that
>     cares will understand this if presented with the information in some
>     way. But typically this isn't presented to the user.
> 
>     For the software to know must mean that it will behave differently
>     for a tag that represents a sign language than for one that
>     represents a spoken or written language. What is it that it will do
>     differently?
> 
>              Thanks,
>              Paul
> 
> 
>         Right now we have returned to a very simple rule: we define
>         only the use of spoken language in audio media, written
>         language in text media, and sign language in video media.
>         We have discussed other uses, such as a view of a speaking
>         person in video, text overlay on video, a sign language
>         notation in text media, written language in message media,
>         written language in WebRTC data channels, and signed,
>         written, and spoken language in bucket media, maybe declared
>         as application media. We do not define these cases. They are
>         just not defined, not forbidden. They may be defined in the
>         future.
> 
>         My proposed wording in section 5.4 attracted too many
>         misunderstandings, so I gave up on it. I think we can live
>         with 5.4 as it is in version -16.
> 
>         Thanks,
>         Gunnar
> 
> 
> 
>             (IIRC I suggested something along these lines a long time ago.)
> 
>                  Thanks,
>                  Paul
> 
>             _______________________________________________
>             SLIM mailing list
>             SLIM@ietf.org
>             https://www.ietf.org/mailman/listinfo/slim
> 
> 