Re: [Slim] Proposed 5.4 text

Gunnar Hellström <gunnar.hellstrom@omnitor.se> Thu, 23 November 2017 11:23 UTC

To: Bernard Aboba <bernard.aboba@gmail.com>
Cc: slim@ietf.org, Randall Gellens <rg+ietf@randy.pensive.org>, Paul Kyzivat <pkyzivat@alum.mit.edu>, Brian Rosen <br@brianrosen.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/fKZQ0en2Vlt1Qsgx88cNFiX17pQ>

On 2017-11-23 at 03:02, Bernard Aboba wrote:
> On Nov 22, 2017, at 3:47 PM, Gunnar Hellström 
> <gunnar.hellstrom@omnitor.se> wrote:
>>
>> <GH>Yes, I agree completely that speech recognition is a reality for 
>> real-time captioning today. But do you know implementations that 
>> transmit it as part of a video media stream? What coding?
>
> Here is an example that used machine learning to provide realtime 
> captions in multiple languages:
> https://azure.microsoft.com/en-us/blog/live-real-time-captions-with-azure-media-services-and-player/?cdn=disable

<GH>Yes, a nice example. It is not from our application area, 
conversational calls: it is not initiated by SDP, and it does not use a 
video media stream but a multiplexed, TCP-based RTMP message stream with 
interleaved video chunks and text chunks. Still, it reminds us that 
something similar could be set up in a conversational call with SDP, 
using video media with MPEG-4 according to RFC 3640, 4337 or 6381.

That shows that there are cases that are even less well supported by 
our current draft. The video/mp4 media can contain video, audio and 
text. If that is used for a conversational call, we would need to 
collect 'hlang' attributes for all three modalities in the video media 
description, and possibly get them all agreed. That conflicts with one 
of our basic statements in section 3:

    (Negotiating multiple simultaneous languages within a media stream is
    out of scope of this document.)

So, OK, let us for now assume that we limit the application to 
traditional conversational calls with one medium and one language per 
stream and no multiplexing.
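
As a rough illustration of that limited case, here is a minimal Python 
sketch that groups language attributes by SDP media description. It is 
only a sketch under assumptions: the attribute names 'hlang-send' and 
'hlang-recv' are assumed spellings of the 'hlang' attributes mentioned 
above, and the ports, payload types and language tags in the sample 
offer are made up. The point is that the attributes attach to a whole 
m= section, so nothing tells them apart per modality if one section 
carried multiplexed video, audio and text.

```python
# Hypothetical SDP fragment: one modality and one language per stream.
# Ports, payload types and language tags are illustrative only.
SDP_OFFER = """\
m=audio 49170 RTP/AVP 0
a=hlang-send:en
a=hlang-recv:en
m=text 45020 RTP/AVP 103
a=hlang-send:sv
a=hlang-recv:sv
"""


def languages_per_stream(sdp):
    """Return [(media_type, {attribute_name: [language tags]}), ...]."""
    streams = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media description starts; its first field is the media type.
            media_type = line[2:].split()[0]
            streams.append((media_type, {}))
        elif line.startswith("a=hlang-") and streams:
            # Assumed attribute names; they belong to the latest m= section.
            name, _, value = line[2:].partition(":")
            streams[-1][1].setdefault(name, []).extend(value.split())
    return streams


if __name__ == "__main__":
    for media_type, attrs in languages_per_stream(SDP_OFFER):
        print(media_type, attrs)
```

Each m= section ends up with exactly one set of language attributes, 
which matches the one-modality-per-stream assumption.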

Gunnar


-- 
-----------------------------------------
Gunnar Hellström
Omnitor
gunnar.hellstrom@omnitor.se
+46 708 204 288