Re: [Slim] Moving forward on draft-ietf-slim-negotiating-human-language

Paul Kyzivat <pkyzivat@alum.mit.edu> Tue, 21 November 2017 20:54 UTC

To: slim@ietf.org
References: <CAOW+2dsZtuciPiKMfif=ZmUqBcUd9TyYtL5gPYDp7ZfLOHHDBA@mail.gmail.com> <p06240600d637c6f98ecc@99.111.97.136> <CAOW+2dv5NSiCbW=p1exvPV=PF8YCVdiz2gi-OCxmaUB-jGe22w@mail.gmail.com> <p06240600d6389cd2043f@99.111.97.136> <97d9a6b8-de3b-9f79-483b-18376fcf0ced@omnitor.se> <CAOW+2dtpRoeYkMJzX9vyNUojJDax4DQUU2F4PauBwt1sm-83Hg@mail.gmail.com> <55f2b336-3f14-f49a-ec78-f00b0373db00@omnitor.se> <bc5b7aa5-7c6a-5096-47d0-01e5ee079e93@omnitor.se> <CAOW+2duUPsTY=Ygzwfu0eOYbHaBAwMqm+oxA6AdMXinxTMNM1A@mail.gmail.com> <165d2c07-19ca-f2b1-2aac-3aac842b97e9@omnitor.se> <CAOW+2dsrRoCE4YQ1U+y448C4qmMY1Hb+8jM=aTyzvsPBYA0akg@mail.gmail.com> <14297efe-82c5-a6d9-94f7-fe82be6b423a@alum.mit.edu> <CAOW+2dtej1p3pD5-FbVjnkWbXH3O6TYO7zVBqy0PAfReA1Lh4A@mail.gmail.com> <e70a716e-0e43-c39e-9ac2-71b9d4afe20a@omnitor.se>
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
Message-ID: <4efeeff4-bcd7-661e-93c9-23e6fec3704b@alum.mit.edu>
Date: Tue, 21 Nov 2017 15:53:58 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <e70a716e-0e43-c39e-9ac2-71b9d4afe20a@omnitor.se>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/yIw8VgP5-kI7a3mFqLFWzz267kc>
Subject: Re: [Slim] Moving forward on draft-ietf-slim-negotiating-human-language

On 11/21/17 2:45 PM, Gunnar Hellström wrote:
> On 2017-11-21 at 18:06, Bernard Aboba wrote:
>> Paul said:
>>
>> "When using lip sync, is there any necessity to put the language tag 
>> on the video?"
>>
>> [BA] Good point.
>>
>> "   By including a language tag for spoken language in an audio
>>    description and using the "lip sync" grouping mechanism defined
>>    in [RFC5888] to group it with a video media stream it is possible
>>    to indicate synchronized audio and video so as to support lip
>>    reading."
>>
>> [BA] This seems like an improvement.
> <GH>I do not think that an indication of lip sync grouping can be
> assumed to mean that the user promises to be seen in video. I guess
> that most products implementing the lip sync grouping do it generally
> for all calls, regardless of whether the user wants to provide or see
> lips in sync.

I wonder if they do it even when it isn't true.

For instance, consider a movie that was created in English. So the 
actors are speaking English, the audio is labeled as English, and there 
is lip sync.

Now take the same movie and dub it in Spanish. Now the audio ought to be 
labeled as Spanish. But lip sync should no longer be indicated. Is that 
what happens in practice?

Of course, a totally deaf English-speaking lip reader will understand
both equally well, since the lip motion is English in either case. But
there may be nothing in the signaling to indicate that the video shows
English lip motion.
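
As a rough illustration of the two cases, here is a minimal Python
sketch. It assumes the 'hlang-send' attribute from the draft and the
"a=group:LS" and "a=mid" attributes from RFC 5888. The helper names
build_sdp and lip_readable_languages, and the addresses, ports and
payload types, are made up for the example. It only shows what a
receiver could infer from the signaling alone.

```python
# Sketch only: helper names, addresses and ports are hypothetical.
# Attribute names follow RFC 5888 (a=group:LS, a=mid) and the draft's
# hlang-send attribute.

def build_sdp(audio_lang, lip_sync):
    """Return a minimal SDP body with one audio and one video stream."""
    lines = ["v=0", "o=- 0 0 IN IP4 192.0.2.1", "s=-",
             "c=IN IP4 192.0.2.1", "t=0 0"]
    if lip_sync:
        # Session-level grouping: audio (mid 1) is lip-synced with video (mid 2).
        lines.append("a=group:LS 1 2")
    lines += [
        "m=audio 49170 RTP/AVP 0",
        "a=mid:1",
        f"a=hlang-send:{audio_lang}",
        "m=video 51372 RTP/AVP 31",
        "a=mid:2",
    ]
    return "\r\n".join(lines) + "\r\n"


def lip_readable_languages(sdp):
    """Language tags of audio streams that share an LS group with a video stream."""
    groups = []      # each entry is the list of mids in one LS group
    sections = []    # one dict per m= section
    current = None   # None while still in the session-level part
    for line in sdp.splitlines():
        if line.startswith("m="):
            current = {"kind": line[2:].split()[0], "mid": None, "langs": []}
            sections.append(current)
        elif current is None and line.startswith("a=group:LS "):
            groups.append(line[len("a=group:LS "):].split())
        elif current is not None and line.startswith("a=mid:"):
            current["mid"] = line[len("a=mid:"):].strip()
        elif current is not None and line.startswith("a=hlang-send:"):
            current["langs"] = line[len("a=hlang-send:"):].split()

    by_mid = {s["mid"]: s for s in sections if s["mid"]}
    result = []
    for group in groups:
        members = [by_mid[m] for m in group if m in by_mid]
        if any(s["kind"] == "video" for s in members):
            for s in members:
                if s["kind"] == "audio":
                    result.extend(s["langs"])
    return result


original = build_sdp("en", lip_sync=True)    # movie as shot in English
dubbed = build_sdp("es", lip_sync=False)     # Spanish dub, no lip sync claimed
print(lip_readable_languages(original))
print(lip_readable_languages(dubbed))
```

Under these assumptions the original movie yields English, while the
dubbed one yields nothing, even though the lips still move in English.
If a sender left the LS group on the dubbed version, a receiver using
this rule would wrongly infer Spanish lip motion.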

The above isn't assuming that the content is being created explicitly to 
accommodate deaf people. If the goal is specifically to focus on that 
then some different policies might help.

	Thanks,
	Paul

> But it is a good feature to use if you desire to see a speaker.
> The 'hlang' attribute in a video description is, on the other hand, a
> clear indication that you want to provide or receive language in the
> video media stream.
> Therefore I think we should either return to saying that a
> spoken/written language tag in a video media description means a view
> of the speaker if there is also a lip sync grouping, or even skip the
> dependency on lip sync grouping. (There is a risk that we introduce
> tricky corner cases by bundling lip sync and language use. What if,
> through further work, we agree on a way to indicate written captions in
> MPEG4 video, and want to indicate that in a product that always
> provides lip sync grouping? That will cause conflicts.)
> 
> Randall recently commented that use of text captions in the video
> stream is a far-fetched use case. MPEG4 has caption elements defined,
> and they can be provided in media declared as video, but it may be
> right that they are rarely or never used in conversational calls. If we
> can agree on that, we could simply return to saying that a
> spoken/written language tag in a video description means a view of a
> speaker, and skip the requirement to link it to the language in the
> audio stream.
> 
> Gunnar
>>
>> On Tue, Nov 21, 2017 at 8:44 AM, Paul Kyzivat
>> <pkyzivat@alum.mit.edu> wrote:
>>
>>     On 11/21/17 10:59 AM, Bernard Aboba wrote:
>>
>>         [BA] LGTM.  Do you recall what the objection was to the term
>>         "spoken/written language"?
>>
>>         Gunnar had said:
>>
>>         By including a language tag for spoken language in a video
>>         description and using the "lip sync" grouping mechanism
>>         defined in [RFC5888] it is possible to indicate synchronized
>>         audio and video so as to support lip reading.
>>
>>
>>     When using lip sync, is there any necessity to put the language
>>     tag on the video? ISTM that is irrelevant, as long as it is on the
>>     synced audio media. ISTM it would be better to say:
>>
>>        By including a language tag for spoken language in an audio
>>        description and using the "lip sync" grouping mechanism defined
>>        in [RFC5888] to group it with a video media stream it is possible
>>        to indicate synchronized audio and video so as to support lip
>>        reading.
>>
>>             Thanks,
>>             Paul
>>
>>
>>     _______________________________________________
>>     SLIM mailing list
>>     SLIM@ietf.org
>>     https://www.ietf.org/mailman/listinfo/slim
>>
>>
>>
>>
> 
> -- 
> -----------------------------------------
> Gunnar Hellström
> Omnitor
> gunnar.hellstrom@omnitor.se
> +46 708 204 288
> 