Re: [Slim] Extended functionality for the real-time language negotiation

Gunnar Hellström <gunnar.hellstrom@omnitor.se> Wed, 15 March 2017 06:46 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/wR7iFD_gEINQh3RMGRFnj5kzGvA>

This is an amended overview of possible solutions to the functionality 
extensions for the slim real-time draft.

It includes the alternative of using the -t- language subtag to indicate
transformed language and thereby enable simultaneity.

The selected solutions can be included in the current draft or specified
in a separate draft. If they are specified in a separate draft, the
current draft needs to be checked and possibly adjusted so that it allows
the selected syntax of the extensions.

The letters e), f), etc. refer to the summary of issues from LC.

*Indication of preference between media, and of simultaneous versus
alternative languages.*

  Discussion:
Issue e) says that there is a need to indicate which language/media
combinations in a set are preferred over the others.
Examples are:
1. User A wants to receive written English in text and can, as a less
preferred alternative, accept spoken English. An answering party B who
can use text will then respond with written text, which gives good
satisfaction, while another answering user C without text capability
will answer in spoken English, which still gives a reasonably
successful call.
Without this indication, the first answering party B might have seen the
spoken and written alternatives as equally preferred and answered in
spoken English, resulting in less satisfied users.
2. A prefers to receive spoken language and can accept text. When
answering party B can use spoken language, the preference is
satisfied; otherwise written language will be used.
3. A prefers to use spoken language in both directions and can accept
sign language in video in both directions. Answering party B has a
clear indication of why both spoken and signed languages are indicated
and can answer according to its capabilities, trying to satisfy the
preference for spoken language.
4. A prefers to use sign language in both directions and can accept
written language in both directions. Sign language users will use
sign language, others will use text.
5. A prefers to send sign language and receive text (deaf-blind user)
and can accept sending text. In a call with a person with similar
preferences, text will be used both ways, otherwise sign one way and
text the other.

etc.

Issue f) requires a way to indicate use of captioning and other
situations where simultaneous languages in different modalities are
needed:

1. Preference for hearing spoken language and simultaneously reading
written language in text (captioning). This can now be provided
automatically in some settings, as well as traditionally by a manned
service.
2. Preference for hearing spoken language and simultaneously seeing the
speaker in video (lip-reading). Easily and naturally provided once
the need is known.
3. Preference for seeing sign language and simultaneously hearing spoken
language in audio (for multiple users at the terminal). One of the
streams is provided by an interpreter.
4. Preference for hearing spoken language and simultaneously viewing
written language in video (captioning, if we accept specifying text as
an overlay on video; otherwise it is the same as number 1).

Some of these are acceptable even if just one of the language/media
combinations can be provided, but it is much preferred that both are
provided together. In other cases it is essential to get both
simultaneously. The indication therefore needs to show that receiving
the languages together is the preferred outcome.

Alternative coding proposals:
1. Preference between modalities
1.1 Based on draft -08, add an asterisk at the end of an attribute
value to mean lower preference for a language/media combination than
the one(s) without an asterisk.
Example where audio and text are alternatives and text is preferred:
m=audio
a=hlang-recv:en*
m=text
a=hlang-recv:en
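
As a rough illustration of alternative 1.1, the Python sketch below
reads the media-level hlang-recv attributes and lists the
language/media combinations, those without an asterisk first. The
function name rank_recv_preferences and the simplified parsing (one
language tag per attribute) are assumptions made for this illustration
and are not taken from the draft.

```python
def rank_recv_preferences(sdp: str) -> list[tuple[str, str, bool]]:
    """Return (media, language, preferred) tuples, preferred ones first."""
    entries = []
    media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # The media type is the first token of the m= line.
            media = line[2:].split()[0]
        elif line.startswith("a=hlang-recv:") and media is not None:
            value = line.split(":", 1)[1].strip()
            # Assumed reading of 1.1: a trailing asterisk marks lower preference.
            lower = value.endswith("*")
            entries.append((media, value.rstrip("*"), not lower))
    # Stable sort: entries without an asterisk come first.
    return sorted(entries, key=lambda e: not e[2])
```

Applied to the example above, this would place the text/en
combination ahead of audio/en.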


1.2. Change to the Accept-Language syntax and let the q-values have 
scope over the whole SDP.

Example where sign language is preferred over text:

  m=video 51372 RTP/AVP 31 32
  a=hlang-recv:ase;q=0.9
  a=hlang-send:ase;q=0.9


  m=text 49250 RTP/AVP 98 99
  a=hlang-send:en;q=0.5,*;q=0.1
  a=hlang-recv:en;q=0.5,*;q=0.1
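
To show what "scope over the whole SDP" could mean in practice, here
is a minimal Python sketch that parses Accept-Language style values
and ranks all language/media combinations for one direction by
q-value, regardless of which media section they appear in. The
function names parse_qlist and rank_across_media are hypothetical, and
the default q of 1.0 is carried over from Accept-Language as an
assumption.

```python
def parse_qlist(value: str) -> list[tuple[str, float]]:
    """Parse 'en;q=0.5,*;q=0.1' into [('en', 0.5), ('*', 0.1)]."""
    result = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag, q = parts[0], 1.0  # assumed default, as in Accept-Language
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if tag:
            result.append((tag, q))
    return result


def rank_across_media(sdp: str, direction: str) -> list[tuple[float, str, str]]:
    """Return (q, media, language) for one direction, highest q first."""
    prefix = f"a=hlang-{direction}:"
    ranked = []
    media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media = line[2:].split()[0]
        elif line.startswith(prefix) and media is not None:
            for tag, q in parse_qlist(line[len(prefix):]):
                ranked.append((q, media, tag))
    # Comparing q-values across media sections is the point of alternative 1.2.
    return sorted(ranked, key=lambda e: e[0], reverse=True)
```

For the example above, rank_across_media(sdp, "recv") would list
video/ase first, then text/en, then the text wildcard.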

1.3. Introduce a new a=modality attribute on media level, with 
parameters: <modality>, <direction>, <preference>

example:

m=text
a=modality:written,recv,hi
a=hlang-recv:en*
m=audio
a=modality:spoken,recv,med
a=hlang-recv:en*
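
A minimal Python sketch of how a receiver might read the proposed
a=modality attribute and order the media by preference. The names
Modality, PREFERENCE_ORDER and parse_modalities are hypothetical. Only
"hi" and "med" appear in the example above; the "lo" level is an
assumption added for illustration.

```python
from typing import NamedTuple

# "hi" and "med" appear in the example; "lo" is an assumed third level.
PREFERENCE_ORDER = {"hi": 0, "med": 1, "lo": 2}


class Modality(NamedTuple):
    media: str
    modality: str
    direction: str
    preference: str


def parse_modalities(sdp: str) -> list[Modality]:
    """Collect a=modality attributes, most preferred first."""
    found = []
    media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media = line[2:].split()[0]
        elif line.startswith("a=modality:") and media is not None:
            fields = [f.strip() for f in line.split(":", 1)[1].split(",")]
            # Parameters: <modality>, <direction>, <preference>
            modality, direction, preference = fields[:3]
            found.append(Modality(media, modality, direction, preference))
    return sorted(found, key=lambda m: PREFERENCE_ORDER[m.preference])
```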


2. Preference for simultaneous languages vs alternative languages:

2.1. Based on draft -08, add another notation alongside the asterisk,
e.g. an optional character used with or without the asterisk to mark
media that are wanted together (ugly). Example:

m=audio
a=hlang-recv:en*$c
m=text
a=hlang-recv:en$c

The $ is a simultaneity indication, the c is a grouping indicator 
telling that all modalities marked with the $c are wanted together. (we 
might be able to restrict the indication to just one set of languages 
that are wanted simultaneously.)
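
For reference, the value syntax of alternative 2.1 could be picked
apart as in the Python sketch below. The regular expression, the
restriction of the grouping indicator to a single letter, and the
function name parse_hlang_value are assumptions for illustration only.

```python
import re
from typing import Optional

# Language tag, optional asterisk, optional "$" followed by one group letter.
HLANG_VALUE = re.compile(
    r"^(?P<tag>[A-Za-z0-9-]+)(?P<star>\*)?(?:\$(?P<group>[A-Za-z]))?$"
)


def parse_hlang_value(value: str) -> tuple[str, bool, Optional[str]]:
    """Split 'en*$c' into (language tag, has asterisk, grouping letter)."""
    match = HLANG_VALUE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised hlang value: {value!r}")
    return match["tag"], match["star"] is not None, match["group"]
```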


2.2. Use the Accept-Language syntax for the hlang attributes and add the
usage rule that q-values differing by less than 0.1 mean languages that
are preferred to be used together. Larger differences indicate that
they are alternatives. This makes it possible to indicate both
simultaneity and the preference to apply if simultaneity cannot be
satisfied.

  m=audio 51372 RTP/AVP 0
  a=hlang-recv:ase;q=0.5
  a=hlang-send:ase;q=0.5


  m=text 49250 RTP/AVP 98 99
  a=hlang-send:en;q=0.51,*;q=0.1
  a=hlang-recv:en;q=0.51,*;q=0.1

The q-values differ by less than 0.1, so this expresses a preference for
getting both together, but if that is not possible, text is preferred.
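
The usage rule in 2.2 can be sketched in Python as below. Entries are
(media, language, q) tuples for one direction; entries whose q-value
lies less than 0.1 below the highest q-value of a group are treated as
wanted together, and each group is an alternative to the others. The
function name group_by_q, and the choice to measure the difference
from the top of each group rather than between neighbours, are
assumptions, since the rule above does not settle that detail.

```python
def group_by_q(entries, threshold=0.1):
    """Group (media, language, q) entries; each group is wanted together."""
    ordered = sorted(entries, key=lambda e: e[2], reverse=True)
    groups = []
    for entry in ordered:
        # q-values carry at most three decimals, so rounding removes float noise.
        if groups and round(groups[-1][0][2] - entry[2], 3) < threshold:
            groups[-1].append(entry)
        else:
            groups.append([entry])
    return groups
```

Within a group, the entry listed first has the highest q-value and is
the one to fall back on if the combination cannot be provided.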


2.3. Add a fourth, optional parameter [simultaneity] to the new
a=modality attribute from solution 1.3. Its value is any single letter,
indicating a preference for having that modality simultaneously with
another modality that carries the same value in its [simultaneity]
parameter. Without this parameter, the modalities are alternatives.
Use this solution together with solution 1.3.

Example: Indicate that written English and spoken English are desired
together, but the call shall not be denied if that combination is not
possible; in that case written English is preferred.

m=text
a=modality:written,recv,hi,d
a=hlang-recv:en
m=audio
a=modality:spoken,recv,hi,d
a=hlang-recv:en*

The "d" is a grouping identifier.
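
A small Python sketch of how the optional [simultaneity] parameter
could be collected into groups of media that are wanted together. The
function name simultaneity_groups is hypothetical, and the parsing
assumes the comma-separated parameter order shown in the example.

```python
from collections import defaultdict


def simultaneity_groups(sdp: str) -> dict[str, list[str]]:
    """Map each grouping letter to the media types that carry it."""
    groups = defaultdict(list)
    media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media = line[2:].split()[0]
        elif line.startswith("a=modality:") and media is not None:
            fields = [f.strip() for f in line.split(":", 1)[1].split(",")]
            # Fields: modality, direction, preference, optional grouping letter.
            if len(fields) >= 4 and fields[3]:
                groups[fields[3]].append(media)
    return dict(groups)
```

Media sections without the fourth parameter do not appear in the
result and are treated as plain alternatives.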

2.4. Use the -t- subtag for transformed content, defined in RFC 6497,
on a language indication to show that this language can be provided
or is desired together with a language in another modality. Use this
indication together with solution 1.1.

Example: Indicate that written English and spoken English are desired
together, with the written English expected to be transformed, but the
call shall not be denied if that combination is not possible; in that
case written English is preferred.

m=text
a=hlang-recv:en-t-en
m=audio
a=hlang-recv:en*

The -t- indicated with the -recv direction shall not be understood to
mean that the indicated language needs to be transformed. It is just an
expectation that enables it to be provided simultaneously.
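
To illustrate how an implementation might detect the -t- indication,
the Python sketch below splits a language tag into the part before
-t- and the source language that follows it. The function name
split_transformed is hypothetical, and the parsing is a simplification
of RFC 6497: it stops the source language at the next single-character
subtag or at a two-character letter-plus-digit field key, and does not
validate the subtags.

```python
from typing import Optional


def split_transformed(tag: str) -> tuple[str, Optional[str]]:
    """Split 'en-t-en' into ('en', 'en'); return (tag, None) without -t-."""
    subtags = tag.lower().split("-")
    try:
        # The first subtag is the primary language, so search from index 1.
        t_index = subtags.index("t", 1)
    except ValueError:
        return tag.lower(), None
    source = []
    for sub in subtags[t_index + 1:]:
        is_singleton = len(sub) == 1
        is_field_key = len(sub) == 2 and sub[0].isalpha() and sub[1].isdigit()
        if is_singleton or is_field_key:
            break
        source.append(sub)
    return "-".join(subtags[:t_index]), "-".join(source) or None
```

A receiver following 2.4 could treat any hlang value for which the
second element is not None as a candidate for simultaneous use with
another modality.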

My judgement of the alternatives:
I have a slight preference for solutions 1.2 and 2.2 because they are
logically cleanest, but they require that we accept the LC review
proposal to move to the Accept-Language syntax.
1.1 and 2.4 are easily added to the syntax of the current draft and have
sufficient functionality for most cases.
1.3 and 2.3 cause more work and longer SDP.
2.1 is ugly and kept here just for reference.


Gunnar






-- 
-----------------------------------------
Gunnar Hellström
Omnitor
gunnar.hellstrom@omnitor.se
+46 708 204 288