Re: [Slim] I-D Action: draft-ietf-slim-negotiating-human-language-07.txt

Gunnar Hellström <gunnar.hellstrom@omnitor.se> Sun, 26 February 2017 06:46 UTC

To: Randall Gellens <rg+ietf@randy.pensive.org>, slim@ietf.org, Natasha Rooney <nrooney@gsma.com>, Bernard Aboba <bernard.aboba@gmail.com>
References: <148782279664.31054.8793649134696520241.idtracker@ietfa.amsl.com> <p0624060cd4d4111cd79a@[99.111.97.136]> <49fd730e-6e90-1a49-eae8-80f8b1285a76@omnitor.se> <p06240604d4d6169921b5@[99.111.97.136]> <83152ba7-c3fb-25d8-f97d-59c7840cad56@omnitor.se> <p06240601d4d790fb8bb3@[99.111.97.136]>
From: Gunnar Hellström <gunnar.hellstrom@omnitor.se>
Message-ID: <4b36f347-955e-e2b9-12f2-f426d47d3d33@omnitor.se>
Date: Sun, 26 Feb 2017 07:46:01 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1
MIME-Version: 1.0
In-Reply-To: <p06240601d4d790fb8bb3@[99.111.97.136]>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/jgSmFu0FnRs-qSKeZl1ck8bLQdE>
Subject: Re: [Slim] I-D Action: draft-ietf-slim-negotiating-human-language-07.txt

On 2017-02-25 at 20:58, Randall Gellens wrote:
> Hi Gunnar,
>
> At 11:04 AM +0100 2/25/17, Gunnar Hellström wrote:
>
>>  Fine, I find that we have only issues 5, 6 and 12 still to discuss.
>>
>>  You did not answer issue 6, use of asymmetrical language rather than 
>> unidirectional media. I assume you accepted it.
>
> Yes, I thought I had indicated that, sorry if I didn't.
Good.
>
>>
>>  On 5, the request to reinsert wording about seeing the speaker in 
>> video, it is still a huge difference in specifying a preference to 
>> see the speaker for language perception reasons, versus only 
>> specifying that I want a video stream for supplementary purposes. 
>> With the current wording in version -07, section 5.3 says that that 
>> combination is undefined. Nothing in the LC discussion indicated that 
>> it should be undefined. Why did you suddenly want to delete it? It is 
>> useful. Please reinsert with the wording changes I propose.
>
> The email discussion led me to believe that the text was 
> controversial.  We need to get the draft finished, so it's better to 
> delete controversial text than to spend months fighting about it.
The comments were first about the uncertainty over how the "silly 
states" were to be interpreted.
We described them all but decided to keep only the view of the speaker, 
because it is a real and useful case.
The idea of differentiating spoken and written cases by script tags 
caused discussion and was dropped. The remaining real case, the view of 
the speaker, was mentioned twice in the draft, so the recommendation was 
to delete one of the two mentions, not both.
>
>>
>>  On 12, the meaning of the placement of the asterisk, you ask:
>>  "Making the asterisk a purely-advisory hint as to the 
>> least-preferred media/language combination seems harmless enough, as 
>> it would not be required to support it; however, I'm not sure it 
>> provides any benefit: if an offer contains some set of media with 
>> language, and the answerer can support all of them, should the 
>> answerer only include in its answer those without an asterisk? It 
>> seems simpler for the answerer to include everything in the offer 
>> that it can support."
>>
>>  The answering party should aim at answering with one of the 
>> languages that is without the asterisk in the offer. Only if the 
>> answering party does not have capability in a language without an 
>> asterisk, one with asterisk should be selected. Thereby you get the 
>> best opportunity to start the call in a language combination that 
>> satisfies both users.
>>
>>  Example: A hard-of-hearing user can just barely conduct spoken calls 
>> with persons she knows. From others it is much more reliable to get 
>> text.  She calls and declares:
>>
>>  m=audio
>>  a=huml-send:en
>>  a=huml-recv:en*
>>  m=text
>>  a=huml-recv:en
>>
>>  The answering party with text capabilities sees that matching text 
>> for sending is higher preferred than talking, and thus responds:
>>
>>  m=audio
>>  a=huml-recv:en
>>  m=text
>>  a=huml-send:en
>>
>>  The answering party sends the initial greeting in text and the call 
>> continues smoothly in well managed language/modality combinations.
>>
>>  Another called party may not have text capabilities, and may 
>> therefore select the less favoured alternative with using speech both 
>> ways, answering:
>>
>>  m=audio
>>  a=huml-recv:en
>>  a=huml-send:en
>>  m=text 0
>>
>>  The answering party starts talking and the parties try as well as 
>> possible to manage the call in this less preferred combination that 
>> may be less reliable.
>>
>>  If the placement of the asterisk had no special meaning, as in 
>> version -07, there is a high risk that the answering party in the first 
>> example would select to answer with spoken language that would be 
>> unreliably received. Time and effort would be spent by speech to make 
>> the answering party switch to sending text instead of talking in 
>> order to arrange for a more reliable call situation.
>>
>>  If instead the caller only indicated the most favoured combinations,
>>
>>  m=audio
>>  a=huml-send:en
>>  m=text
>>  a=huml-recv:en
>>
>>  Then the answering parties without text capability would not dare to 
>> try to answer, and a reasonably successful call would be missed.
>>
>>  Many other similar realistic examples can be created, where 
>> placement of the asterisk(s) would be a sufficient indication of 
>> lower preference for language match among alternatives that would 
>> make call establishment successful and smooth in many more cases than 
>> without this indication opportunity.
>>
>>  Do you want more examples?
>>
>>  Please accept proposal 12.
>
> This convinces me that we cannot accept the proposed text, as it would 
> introduce complexity that the WG explicitly decided to not pursue in 
> this draft.  In the examples you provided, it seems better for the 
> answerer to include all media and languages from the offer that it can 
> support.  This is much simpler, has only trivial drawbacks (extra 
> media negotiated that might not be used), and is what the WG agreed to.
Yes, you could let the answer SDP contain one common language per media 
and direction, but the answering human needs guidance on which language 
is best suited to start the conversation. Therefore the placement of the 
asterisk is used as a hint to the answering party on how to start the call.

The first example above can be modified to:

  Example: A hard-of-hearing user can just barely conduct spoken calls 
with persons she knows. From others it is much more reliable to get 
text.  She calls and declares:

  m=audio
  a=huml-send:en
  a=huml-recv:en*
  m=text
  a=huml-recv:en

  The answering party with capabilities for both written and spoken 
English sees that matching text for sending is higher preferred than 
talking and sends the answer indicating the capabilities:

  m=audio
  a=huml-recv:en
  a=huml-send:en
  m=text
  a=huml-send:en

  The answering party makes use of the hint that the caller prefers to 
receive written text and therefore sends the initial greeting in text 
and the call continues smoothly in well managed language/modality 
combinations.
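
As a rough illustration of how an answering client could act on this
hint, here is a minimal Python sketch. It is not from the draft: the
names LangAttr, parse_offer and pick_start are hypothetical, it assumes
one language tag per huml attribute line, and it uses the abbreviated
SDP from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LangAttr:
    media: str       # "audio", "text" or "video"
    direction: str   # "send" or "recv", seen from the offerer
    lang: str        # language tag, e.g. "en"
    low_pref: bool   # True when the value ended with an asterisk


def parse_offer(sdp):
    """Collect huml-send/huml-recv attributes per media section."""
    attrs = []
    media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media = line[2:].split()[0]
        elif line.startswith("a=huml-") and media is not None:
            name, _, value = line[2:].partition(":")
            direction = name[len("huml-"):]
            if direction not in ("send", "recv"):
                continue
            value = value.strip()
            attrs.append(LangAttr(media, direction,
                                  value.rstrip("*"), value.endswith("*")))
    return attrs


def pick_start(offer, can_send):
    """Pick what the answerer should start sending.

    can_send is a set of (media, lang) pairs the answerer can produce.
    Entries the offerer wants to receive without an asterisk are tried
    first; asterisk-marked entries are only a fallback.
    """
    candidates = [a for a in offer
                  if a.direction == "recv" and (a.media, a.lang) in can_send]
    candidates.sort(key=lambda a: a.low_pref)  # stable: False before True
    return candidates[0] if candidates else None


OFFER = """m=audio
a=huml-send:en
a=huml-recv:en*
m=text
a=huml-recv:en
"""

start = pick_start(parse_offer(OFFER), {("audio", "en"), ("text", "en")})
print(start.media)  # text
```

An answerer without text capability would pass only ("audio", "en")
and get the asterisk-marked audio entry back as the fallback.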

----------
There is no complexity left in this solution. It helps to motivate why 
we have the asterisk on media level, and it helps call initiation 
succeed, so I think it should be acceptable.
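
To show how little logic the answer itself needs, here is a minimal
Python sketch of an answerer that simply includes every media, language
and direction from the offer that it supports. It is only an
illustration: build_answer and its tuple format are hypothetical, the
SDP is abbreviated as in the examples, and declining a media with
"m=<media> 0" when no language matches is a simplifying assumption that
mirrors the text-less answer quoted earlier.

```python
def build_answer(offer, supported):
    """Build abbreviated answer lines from an offer.

    offer: list of (media, direction, value) from the offerer's point
           of view; value may end with an asterisk.
    supported: set of (media, direction, lang) from the answerer's
               point of view.
    """
    flip = {"send": "recv", "recv": "send"}
    by_media = {}  # keeps the media order of the offer
    for media, direction, value in offer:
        lang = value.rstrip("*")
        matched = by_media.setdefault(media, [])
        # What the offerer sends, the answerer receives, and vice versa.
        if (media, flip[direction], lang) in supported:
            matched.append(f"a=huml-{flip[direction]}:{lang}")

    lines = []
    for media, attrs in by_media.items():
        if attrs:
            lines.append(f"m={media}")
            lines.extend(attrs)
        else:
            lines.append(f"m={media} 0")  # assumption: decline the media
    return lines


offer = [("audio", "send", "en"), ("audio", "recv", "en*"),
         ("text", "recv", "en")]
both = {("audio", "recv", "en"), ("audio", "send", "en"),
        ("text", "send", "en")}
print("\n".join(build_answer(offer, both)))
```

The asterisk is not copied into the answer; it only guides which
modality the answering party starts with.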

Gunnar

>
> --Randy
>
>>  On 2017-02-25 at 01:32, Randall Gellens wrote:
>>>  At 5:35 PM +0100 2/24/17, Gunnar Hellström wrote:
>>>
>>>>   On 2017-02-23 at 05:15, Randall Gellens wrote:
>>>>>   Version -07 addresses all comments except for the unresolved 
>>>>> issue of renaming the two attributes which is currently being 
>>>>> discussed on the list, and adding a new attribute for 
>>>>> bidirectionality.
>>>>>
>>>>>   Per Dale's suggestion, the draft adds advice that if a call is 
>>>>> rejected due to no languages in common, SIP response code 488 (Not 
>>>>> Acceptable Here) or 606 (Not Acceptable) be used, along with a 
>>>>> Warning header field indicating the supported languages.  The 
>>>>> draft registers a new entry in the warn-code sub-registry of SIP 
>>>>> parameters for this purpose.  The draft also has an expanded set 
>>>>> of examples.
>>>>>
>>>>   Good progress. Good to see the enriched examples chapter 5.5.
>>>>   I have a few comments on version -07:
>>>>
>>>>   1.  Section  4. second line
>>>>   ------------old text----------------------
>>>>   but is not sufficiently sufficiently
>>>>   ------------new text--------------------------
>>>>   but is not sufficiently
>>>>   ----------end of change 1-----------------
>>>>   Motivation: New typo in version -07
>>>
>>>  Thanks.
>>>
>>>>
>>>>   2. Section 5.2, first line
>>>>   ----------------old text-----------------
>>>>   This document defines two new media-level ..
>>>>   ----------------new text----------------------
>>>>   This document defines two media-level ...
>>>>   ----------------end of change 2----------------
>>>>   Motivation: It was commented that when the draft is published, 
>>>> this is not new anymore.
>>>>   There are three more occasions of "new" in the document that may 
>>>> be modified as well.
>>>
>>>  OK.
>>>
>>>>
>>>>   3.  5.2 second paragraph
>>>>   -------------------old text--------------------------------
>>>>   In an offer, the 'humintlang-send' values indicates the language(s)
>>>>      the offerer is willing to use when sending using the media, and the
>>>>      'humintlang-recv' values indicates the language(s) the offerer is
>>>>      willing to use when receiving using the media.
>>>>   -----------------new text---------------------------------
>>>>   In an offer, the 'humintlang-send' values indicate the language(s)
>>>>   the offerer is willing to select from for use when sending using the
>>>>   media, and the 'humintlang-recv' values indicate the language(s) the
>>>>   offerer is willing to receive one of in the media stream.
>>>>   ----------------end of change----------------------------------
>>>>   Motivation 1:) change from "indicates" to "indicate" in two 
>>>> places to match the new use of plural "values".
>>>>   Motivation 2:) Be sure to indicate that we only intend to 
>>>> negotiate one language per media and direction, so that we do not 
>>>> end up as unspecified regarding number of matches required as the 
>>>> sdp "lang" attribute is.
>>>
>>>  Reworded.
>>>
>>>>
>>>>   4.  5.2 Second paragraph
>>>>   -----------------old text-----------------------
>>>>   When a media is intended
>>>>      for use in one direction only
>>>>   ----------------new text---------------------
>>>>   When a media is intended
>>>>      for use for language communication in one direction only
>>>>   ----------------end of change---------------------------
>>>>   Motivation: Deletion of a note in this sentence made it less 
>>>> obvious that we are only talking about directions of use of 
>>>> language communication, and not about establishing asymmetric media 
>>>> connections. Therefore add this clarification.
>>>
>>>  Reworded.
>>>
>>>>
>>>>   5.  5.2 Deleted paragraph 6 before "Clients acting on behalf..."
>>>>   ----------reinsert modified paragraph----------------------------
>>>>   While signed language tags are used with a video stream to
>>>>   indicate sign language, a spoken language tag for a video stream
>>>>   indicates a request or offer to see the speaker, when that is of
>>>>   importance for language perception.
>>>>   -------------end of change-------------------------------------------
>>>>   Motivation: There was in the LC mail exchange a discussion about 
>>>> sharpening up the specification of use of "unusual combinations".
>>>>   There was no agreement to delete them all. The one described in 
>>>> this paragraph is the main one that has widespread use and needs to 
>>>> be clearly specified for use by a large number of hard-of-hearing 
>>>> and deaf users.
>>>
>>>  The text as it is now does not prohibit anything and explicitly 
>>> mentions negotiating supplemental video by omitting language 
>>> attributes on a video media.
>>>
>>>>
>>>>   6.  5.2 Sixth paragraph
>>>>   --------------------current text--------------------
>>>>   (or for unidirectional streams, one of)
>>>>   ------------------new text ------------------------
>>>>   (or for asymmetrical use of languages, one of)
>>>>   -----------------end of change----------------------
>>>>   Motivation: We are not primarily talking about enabled 
>>>> transmission directions of the streams, but about language use in 
>>>> the streams. We do not want to limit the media stream directions 
>>>> just because we do not specify an initial language to use for that 
>>>> direction. There are other uses of media, and there may even be 
>>>> occasional use of language in the direction, just not worth 
>>>> mentioning as an initial and preferred use. The suggested change 
>>>> should make that clear.
>>>>
>>>>   7.   5.3 Next to last paragraph
>>>>   ------------------old text------------------------------
>>>>   a list of supported languages.
>>>>   -------------------new text-------------------------
>>>>   a list of supported languages, media and directions.
>>>>   -------------------end of change----------------
>>>>   Motivation: It is not sufficient to know which languages are 
>>>> supported, it is also essential to know in which media they are 
>>>> supported and in which directions. (Media could be replaced with 
>>>> modality, but the media can become ambiguous then, so use media here 
>>>> to be brief.)
>>>
>>>  I don't know that we can require this, but I'll add SHOULD list 
>>> supported languages and media. Demanding direction as well might be 
>>> too unwieldy.
>>>
>>>>
>>>>   8.      5.3, last line
>>>>   --------------old text----------------------------------
>>>>    Supported languages are: es, en"
>>>>   --------------new text-------------------------------
>>>>    Supported languages are: es, en transmission in audio; es, en 
>>>> reception in audio"
>>>>   ----------------------------------------------------------
>>>>   Motivation: Same as for 7.
>>>
>>>  Fixed as above.
>>>
>>>>
>>>>   9.  5.4 Undefined combinations
>>>>   ----------------------------old text--------------------------------------
>>>>      The behavior when specifying a non-signed language tag for a video
>>>>      media stream, or a signed language tag for an audio or text media
>>>>      stream, is not defined.
>>>>   ---------------------------new text-----------------------------------------
>>>>   There is no way specified for indicating use of text based 
>>>> language in a video media stream.
>>>>   There is no meaning assigned to specification of sign language 
>>>> in an audio or text media stream.
>>>>   --------------------------end of change-------------------------------
>>>>   Motivation: Seeing the speaker in video is an important 
>>>> combination reinserted above in section 5.2.
>>>>   This section therefore needed rewording to not include that 
>>>> combination.
>>>
>>>  The draft explicitly mentions video for supplemental purposes.
>>>
>>>>
>>>>
>>>>   10.     6.2 Last sentence
>>>>   -----------------current text---------------------
>>>>   Supported languages are: [list of supported languages]."
>>>>   -----------------new text------------------------
>>>>   Supported languages and media and transmission directions 
>>>> are:[list of supported languages and media and transmission 
>>>> directions.]"
>>>>   -----------------end of change--------------------------
>>>>   Motivation: Same as for 7.
>>>
>>>  Fixed as above.
>>>
>>>>
>>>>   11.  6.1 MUX Category
>>>>   ----------old text in two locations-------------------
>>>>   MUX Category:  normal
>>>>   ---------new text in same two locations--------------
>>>>   Mux Category:  NORMAL
>>>>   ---------end of change-----------------
>>>>   Motivation: Follow RFC 4566bis and IANA habits regarding use of 
>>>> capitals
>>>
>>>  Fixed.
>>>
>>>>
>>>>   12.  5.3
>>>>   -------------old text-----------------
>>>>   5.3 No Language in Common
>>>>   -------------new text----------------
>>>>   5.3 Preference parameter
>>>>   ------------end of change 1 in 5.3---------------
>>>
>>>  The section is more than just the asterisk, it also advises use of 
>>> specific SIP response codes if the call is failed.
>>>
>>>>
>>>>   -------------old text in 5.3, second paragraph-------------------------------
>>>>   The mechanism for indicating this preference is that, in an offer, if
>>>>   the last character of any of the 'humintlang-recv' or
>>>>   'humintlang-send' values is an asterisk, this indicates a request to
>>>>   not fail the call.
>>>>   --------------------------new text-------------------------------
>>>>   The mechanism for indicating this preference is that, in an offer, if
>>>>   the last character of any of the 'humintlang-recv' or
>>>>   'humintlang-send' values is an asterisk, this indicates a request to
>>>>   not fail the call.
>>>>   The asterisk should be attached to attributes with languages of lower
>>>>   preference to be matched if such difference can be specified. Thereby
>>>>   the location of the asterisk can be used to support the decision on
>>>>   which languages to use in the call.
>>>>   ---------------------------end of change 2 in 5.3--------------------------------------
>>>>   Motivation: There has not yet been any conclusion for my proposal 
>>>> no 5 in the IETF LC comments of Feb 12.
>>>>   This is a dramatically reduced version that may be easier to 
>>>> accept at this stage, still covering one of the missing 
>>>> functionalities in the draft.
>>>>   The asterisk is used as a preference parameter in the attributes. 
>>>> Thereby the proposed title change on 5.3
>>>>   With this additional rule about where the asterisk(s) are placed, 
>>>> the answering parties get good clues about the preferences between 
>>>> alternatives presented by the offeror. The chance to set up calls 
>>>> with satisfied users increases dramatically compared to letting the 
>>>> answering party select by chance between alternatives.
>>>
>>>  Making the asterisk a purely-advisory hint as to the 
>>> least-preferred media/language combination seems harmless enough, as 
>>> it would not be required to support it; however, I'm not sure it 
>>> provides any benefit: if an offer contains some set of media with 
>>> language, and the answerer can support all of them, should the 
>>> answerer only include in its answer those without an asterisk? It 
>>> seems simpler for the answerer to include everything in the offer 
>>> that it can support.
>>>
>>
>>  --
>>  -----------------------------------------
>>  Gunnar Hellström
>>  Omnitor
>>  gunnar.hellstrom@omnitor.se
>>  +46 708 204 288
>
>

-- 
-----------------------------------------
Gunnar Hellström
Omnitor
gunnar.hellstrom@omnitor.se
+46 708 204 288