Re: [Slim] Issue 43: How to know the modality of a language indication?

Paul Kyzivat <pkyzivat@alum.mit.edu> Sun, 15 October 2017 17:13 UTC

To: Gunnar Hellström <gunnar.hellstrom@omnitor.se>, slim@ietf.org
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
Date: Sun, 15 Oct 2017 13:13:18 -0400

On 10/15/17 2:24 AM, Gunnar Hellström wrote:
> Paul,
> On 2017-10-15 at 01:19, Paul Kyzivat wrote:
>> On 10/14/17 2:03 PM, Bernard Aboba wrote:
>>> Gunnar said:
>>>
>>> "Applications not implementing such specific notations may use the 
>>> following simple deductions.
>>>
>>> - A language tag in audio media is supposed to indicate spoken modality.
>>>
>>> [BA] Even a tag with "Sign Language" in the description??
>>>
>>> - A language tag in text media is supposed to indicate written 
>>> modality.
>>>
>>> [BA] If the tag has "Sign Language" in the description, can this 
>>> document really say that?
>>>
>>> - A language tag in video media is supposed to indicate visual sign 
>>> language modality, except for the case mentioned in section 5.2, when 
>>> it is supposed to indicate a view of a speaking person, characterized 
>>> by the exact same language tag also appearing in an audio media 
>>> specification.
>>>
>>> [BA] It seems like an over-reach to say that a spoken language tag in 
>>> video media should instead be interpreted as a request for Sign 
>>> Language.  If this were done, would it always be clear which Sign 
>>> Language was intended?  And could we really assume that both sides, 
>>> if negotiating a spoken language tag in video media, were really 
>>> indicating the desire to sign?  It seems like this could easily 
>>> result in interoperability failure.
>>
>> IMO the right way to indicate that two (or more) media streams are 
>> conveying alternative representations of the same language content is 
>> by grouping them with a new grouping attribute. That can tie an audio 
>> stream together with a video and/or text stream. A language tag for 
>> sign language on the video stream then clarifies to the recipient 
>> that it is sign language. The grouping attribute by itself can 
>> indicate that these streams are conveying language.
> <GH>Yes, and that is proposed in draft-hellstrom-slim-modality-grouping, 
> with two kinds of grouping. One kind indicates that two or more 
> languages in different streams are alternatives with the same content, 
> with a priority order assigned to them to guide the selection of which 
> one to use during the call. The other kind indicates that two or more 
> languages in different streams are desired together, with the same 
> language content but different modalities (such as captioned telephony 
> with the same content provided in both speech and text, sign language 
> interpretation where you see the interpreter, or possibly spoken 
> language interpretation with the languages provided in different audio 
> streams). I hope that draft can be progressed. I see it as a needed 
> complement to the pure language indications per media.

Oh, sorry. I did read that draft but forgot about it.
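
To make sure we are talking about the same thing, here is roughly the 
shape of offer I had in mind. The grouping token "SAMELANG" is purely a 
placeholder I made up for illustration (whatever token and semantics 
your draft defines would go there), and the hlang-send attributes are 
the ones from draft-ietf-slim-negotiating-human-language:

   a=group:SAMELANG 1 2
   m=audio 49170 RTP/AVP 0
   a=mid:1
   a=hlang-send:en
   m=video 51372 RTP/AVP 31
   a=mid:2
   a=hlang-send:ase

The group line says that the two streams carry the same language 
content, and the "ase" tag on the video stream tells the recipient that 
the visual alternative is American Sign Language. No per-media-type 
deduction rules are needed for that.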

> The discussion in this thread is more about how an application would 
> easily know that e.g. "ase" is a sign language and "en" is a spoken (or 
> written) language, and also about what kinds of languages are allowed 
> and indicated by default in each media type. It was not at all about 
> deliberately using language tags in the wrong media type, as Bernard 
> understood my wording. It was rather about limiting which modalities 
> are used in each media type, and about how to know the modality in 
> cases that are not evident, e.g. the "application" and "message" media 
> types.

What do you mean by "know"? Is it for the *UA* software to know, or for 
the human user of the UA to know? Presumably a human user who cares 
will understand this if presented with the information in some way. But 
typically this isn't presented to the user.

For the software to know must mean that it will behave differently for a 
tag that represents a sign language than for one that represents a 
spoken or written language. What is it that it will do differently?
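
That said, the classification half of the problem seems mechanical. The 
IANA Language Subtag Registry already marks the sign languages: each 
one is, as far as I can tell, registered as an extlang with 
"Prefix: sgn" in addition to its primary language subtag entry. Here is 
a rough sketch in Python of how a UA could exploit that; the parsing is 
deliberately simplified and the helper names are mine, so treat it as 
illustrative only:

import urllib.request

REGISTRY_URL = ("https://www.iana.org/assignments/"
                "language-subtag-registry/language-subtag-registry")

def load_sign_language_subtags():
    """Collect subtags registered as extlangs with 'Prefix: sgn'.

    Individual sign languages (e.g. 'ase', 'bfi') appear in the
    registry both as primary language subtags and as extlang records
    whose Prefix field is 'sgn'. The latter is a machine-readable
    marker, so there is no need to match "Sign Language" against
    free-text Description fields.
    """
    text = urllib.request.urlopen(REGISTRY_URL).read().decode("utf-8")
    signed = set()
    # Registry records are separated by lines containing only "%%".
    for record in text.split("\n%%\n"):
        fields = {}
        for line in record.splitlines():
            if line.startswith((" ", "\t")):
                continue  # folded continuation line; not needed here
            key, sep, value = line.partition(":")
            if sep:
                fields.setdefault(key.strip(), []).append(value.strip())
        if fields.get("Type") == ["extlang"] and "sgn" in fields.get("Prefix", []):
            signed.update(fields.get("Subtag", []))
    return signed

def is_sign_language(tag, signed_subtags):
    """True if the tag's primary subtag is 'sgn' or a known sign language."""
    primary = tag.lower().split("-")[0]
    return primary == "sgn" or primary in signed_subtags

With that, is_sign_language("ase", ...) comes out true and 
is_sign_language("en", ...) false, without any matching of "Sign 
Language" against free-text descriptions. But it only answers *which* 
tags denote sign languages, not what the UA should do differently once 
it knows.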

	Thanks,
	Paul

> Right now we have returned to a very simple rule: we define only the 
> use of spoken language in audio media, written language in text media, 
> and sign language in video media.
> We have discussed other uses, such as a view of a speaking person in 
> video, text overlay on video, a sign language notation in text media, 
> written language in message media, written language in WebRTC data 
> channels, and signed, written, and spoken language in bucket media, 
> maybe declared as application media. We do not define these cases. 
> They are just not defined, not forbidden. They may be defined in the 
> future.
> 
> My proposed wording in section 5.4 attracted too many 
> misunderstandings, so I gave up on it. I think we can live with 5.4 as 
> it is in version -16.
> 
> Thanks,
> Gunnar
> 
> 
>>
>> (IIRC I suggested something along these lines a long time ago.)
>>
>>     Thanks,
>>     Paul
>>