Re: [Slim] I-D Action: draft-ietf-slim-negotiating-human-language-07.txt

Randall Gellens <rg+ietf@randy.pensive.org> Mon, 27 February 2017 00:56 UTC

Mime-Version: 1.0
Message-Id: <p06240608d4d927eaec67@[99.111.97.136]>
In-Reply-To: <4b36f347-955e-e2b9-12f2-f426d47d3d33@omnitor.se>
References: <148782279664.31054.8793649134696520241.idtracker@ietfa.amsl.com> <p0624060cd4d4111cd79a@[99.111.97.136]> <49fd730e-6e90-1a49-eae8-80f8b1285a76@omnitor.se> <p06240604d4d6169921b5@[99.111.97.136]> <83152ba7-c3fb-25d8-f97d-59c7840cad56@omnitor.se> <p06240601d4d790fb8bb3@[99.111.97.136]> <4b36f347-955e-e2b9-12f2-f426d47d3d33@omnitor.se>
X-Mailer: Eudora for Mac OS X
Date: Sun, 26 Feb 2017 16:55:59 -0800
To: Gunnar Hellström <gunnar.hellstrom@omnitor.se>, slim@ietf.org, Natasha Rooney <nrooney@gsma.com>, Bernard Aboba <bernard.aboba@gmail.com>
From: Randall Gellens <rg+ietf@randy.pensive.org>
Content-Type: text/plain; charset="iso-8859-1"; format="flowed"
Content-Transfer-Encoding: quoted-printable
X-Random-Sig-Tag: 1.0b28
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/SEiRJ3lS7pQW0EBNRY-vMPfmEK0>
Subject: Re: [Slim] I-D Action: draft-ietf-slim-negotiating-human-language-07.txt
X-BeenThere: slim@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Selection of Language for Internet Media <slim.ietf.org>

At 7:46 AM +0100 2/26/17, Gunnar Hellström wrote:

>  On 2017-02-25 at 20:58, Randall Gellens wrote:
>>  Hi Gunnar,
>>
>>  At 11:04 AM +0100 2/25/17, Gunnar Hellström wrote:
>>
>>>   Fine, I find that we have only issues 5, 6 and 12 still to discuss.
>>>
>>>   You did not answer issue 6, use of 
>>> asymmetrical language rather than 
>>> unidirectional media. I assume you accepted 
>>> it.
>>
>>  Yes, I thought I had indicated that, sorry if I didn't.
>  Good.
>>
>>>
>>>   On 5, the request to reinsert wording about 
>>> seeing the speaker in video: there is still a 
>>> huge difference between specifying a preference 
>>> to see the speaker for language perception 
>>> reasons and only specifying that I want a 
>>> video stream for supplementary purposes. With 
>>> the current wording in version -07, section 
>>> 5.3 says that that combination is undefined. 
>>> Nothing in the LC discussion indicated that 
>>> it should be undefined. Why did you suddenly 
>>> want to delete it? It is useful. Please 
>>> reinsert with the wording changes I propose.
>>
>>  The email discussion led me to believe that 
>> the text was controversial.  We need to get 
>> the draft finished, so it's better to delete 
>> controversial text than to spend months 
>> fighting about it.
>  The comments were first about the uncertainty 
> about how the "silly states" were to be 
> interpreted.
>  We described them all but decided to only keep 
> the view of the speaker because it is a real 
> and useful case.
>  The idea to differentiate spoken and written 
> cases by script tags caused discussions and was 
> dropped. The remaining real case with the view 
> of the speaker was mentioned twice in the 
> draft, so it was recommended that one of them 
> should be deleted, but not both.

OK, I'll restore the text in section 5.2:

    Note that while signed language tags are used with a video stream to
    indicate sign language, a spoken language tag for a video stream in
    parallel with an audio stream with the same spoken language tag
    indicates a request for a supplemental video stream to see the
    speaker.

And modify section 5.4 to exclude this case:

    With the exception of the case mentioned in Section 5.2 (a spoken
    language tag for a video stream in parallel with an audio stream with
    the same spoken language tag), the behavior when specifying a spoken/
    written language tag for a video media stream, or a signed language
    tag for an audio or text media stream, is not defined.
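
One possible reading of those two paragraphs, sketched in Python. This is an illustration only, not draft text: the function names are hypothetical, and SIGN_LANGUAGE_TAGS is a tiny hand-picked sample standing in for a lookup that a real implementation would presumably make against the IANA language subtag registry.

```python
# Illustrative subset only (assumption): American, British and Swedish
# Sign Language. Not a complete list of signed-language subtags.
SIGN_LANGUAGE_TAGS = {"ase", "bfi", "swl"}


def is_signed(tag):
    """True if the primary subtag is in the illustrative sign-language set."""
    return tag.split("-")[0].lower() in SIGN_LANGUAGE_TAGS


def classify(media, tag, audio_tags):
    """Classify one media/language-tag combination.

    media      -- "audio", "video" or "text"
    tag        -- language tag given for that media stream
    audio_tags -- language tags given for a parallel audio stream, if any
    """
    if media == "video":
        if is_signed(tag):
            return "sign language"
        # Exception restored in 5.2: same spoken tag on parallel audio.
        if tag.lower() in {t.lower() for t in audio_tags}:
            return "supplemental video"
        return "undefined"
    # Audio or text stream.
    if is_signed(tag):
        return "undefined"
    return "spoken or written language"
```

For example, classify("video", "en", {"en"}) falls under the Section 5.2 exception, while classify("video", "en", set()) and classify("text", "ase", set()) both land in the undefined case of Section 5.4.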


>>
>>>
>>>   On 12, the meaning of the placement of the asterisk, you ask:
>>>   "Making the asterisk a purely-advisory hint 
>>> as to the least-preferred media/language 
>>> combination seems harmless enough, as it 
>>> would not be required to support it; however, 
>>> I'm not sure it provides any benefit: if an 
>>> offer contains some set of media with 
>>> language, and the answerer can support all of 
>>> them, should the answerer only include in its 
>>> answer those without an asterisk? It seems 
>>> simpler for the answerer to include 
>>> everything in the offer that it can support."
>>>
>>>   The answering party should aim to answer 
>>> with one of the languages that has no 
>>> asterisk in the offer. Only if the answering 
>>> party has no capability in a language 
>>> without an asterisk should one with an asterisk 
>>> be selected. That gives the best 
>>> opportunity to start the call in a language 
>>> combination that satisfies both users.
>>>
>>>   Example: A hard-of-hearing user can just 
>>> barely conduct spoken calls with persons she 
>>> knows. From others it is much more reliable 
>>> to get text.  She calls and declares:
>>>
>>>   m=audio
>>>   a=huml-send:en
>>>   a=huml-recv:en*
>>>   m=text
>>>   a=huml-recv:en
>>>
>>>   The answering party with text capabilities 
>>> sees that matching text for sending is 
>>> preferred over talking, and thus responds:
>>>
>>>   m=audio
>>>   a=huml-recv:en
>>>   m=text
>>>   a=huml-send:en
>>>
>>>   The answering party sends the initial 
>>> greeting in text and the call continues 
>>> smoothly in well managed language/modality 
>>> combinations.
>>>
>>>   Another called party may not have text 
>>> capabilities, and may therefore select the 
>>> less favoured alternative of using speech 
>>> both ways, answering:
>>>
>>>   m=audio
>>>   a=huml-recv:en
>>>   a=huml-send:en
>>>   m=text 0
>>>
>>>   The answering party starts talking and the 
>>> parties try as well as possible to manage the 
>>> call in this less preferred combination that 
>>> may be less reliable.
>>>
>>>   If the placement of the asterisk had no 
>>> special meaning, as in version -07, there 
>>> is a high risk that the answering party in 
>>> the first example would choose to answer with 
>>> spoken language that would be unreliably 
>>> received. Time and effort would be spent in 
>>> speech getting the answering party to switch to 
>>> sending text instead of talking in order to 
>>> arrange for a more reliable call situation.
>>>
>>>   If instead the caller only indicated the most favoured combinations,
>>>
>>>   m=audio
>>>   a=huml-send:en
>>>   m=text
>>>   a=huml-recv:en
>>>
>>>   Then the answering parties without text 
>>> capability would not dare to try to answer, 
>>> and a reasonably successful call would be 
>>> missed.
>>>
>>>   Many other similar realistic examples can be 
>>> created, where placement of the asterisk(s) 
>>> would be a sufficient indication of lower 
>>> preference for language match among 
>>> alternatives that would make call 
>>> establishment successful and smooth in many 
>>> more cases than without this indication 
>>> opportunity.
>>>
>>>   Do you want more examples?
>>>
>>>   Please accept proposal 12.
>>
>>  This convinces me that we cannot accept the 
>> proposed text, as it would introduce 
>> complexity that the WG explicitly decided to 
>> not pursue in this draft.  In the examples you 
>> provided, it seems better for the answerer to 
>> include all media and languages from the offer 
>> that it can support.  This is much simpler, 
>> has only trivial drawbacks (extra media 
>> negotiated that might not be used), and is 
>> what the WG agreed to.
>  Yes, you could let the answer SDP contain one 
> common language per media and direction, but 
> the answering human needs guidance on which 
> language is best suited to start the 
> conversation. Therefore the placement of the 
> asterisk is used to hint to the answering party 
> how to start the call.

I believe the WG discussed proposals to provide 
information to the humans regarding the languages 
and media that were negotiated, and decided it is 
out of scope of the draft as an implementation 
issue, not a protocol issue.

>
>  The first example above can be modified to:
>
>   Example: A hard-of-hearing user can just 
> barely conduct spoken calls with persons she 
> knows. From others it is much more reliable to 
> get text.  She calls and declares:
>
>   m=audio
>   a=huml-send:en
>   a=huml-recv:en*
>   m=text
>   a=huml-recv:en
>
>   The answering party with capabilities for both 
> written and spoken English sees that matching 
> text for sending is preferred over 
> talking and sends the answer indicating the 
> capabilities:
>
>   m=audio
>   a=huml-recv:en
>   a=huml-send:en
>   m=text
>   a=huml-send:en
>
>   The answering party makes use of the hint that 
> the caller prefers to receive written text and 
> therefore sends the initial greeting in text 
> and the call continues smoothly in well managed 
> language/modality combinations.
>
>  ----------
>  There is no complexity left in this solution, 
> it helps to motivate why we have the asterisk 
> on media level, and it helps achieve successful call 
> initiations, so I think it should be acceptable.

My suggestion is that you write this idea as a 
new draft offering implementation guidance, and 
ask the WG to adopt it as either Informational or 
BCP.  It doesn't affect the protocol in the 
current draft, but provides guidance on how to 
use the protocol in both an offer and an answer 
and potentially in the UI.
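
For illustration, the kind of guidance such a document might contain can be sketched in Python. This is a sketch under assumptions, not part of the current draft: the function names are hypothetical, the attribute names follow the huml-send/huml-recv shorthand used in the examples in this thread, and the answerer's capabilities are modelled as a simple per-media set of languages usable in both directions.

```python
# Offer from the hard-of-hearing caller example in this thread.
OFFER = """
  m=audio
  a=huml-send:en
  a=huml-recv:en*
  m=text
  a=huml-recv:en
"""


def parse_offer(sdp):
    """Return (media, direction, language, starred) tuples from an SDP fragment."""
    entries = []
    media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media = line[2:].split()[0]
        elif line.startswith("a=huml-") and media is not None:
            name, _, value = line[2:].partition(":")
            direction = name[len("huml-"):]  # "send" or "recv"
            starred = value.endswith("*")
            entries.append((media, direction, value.rstrip("*"), starred))
    return entries


def answer_entries(offer, supported):
    """Keep everything the answerer can support, seen from the answerer's side.

    The offerer's recv becomes the answerer's send and vice versa. The
    asterisk flag is carried along only as an advisory hint.
    """
    flip = {"send": "recv", "recv": "send"}
    return [(m, flip[d], lang, starred)
            for (m, d, lang, starred) in offer
            if lang in supported.get(m, set())]


def suggested_start(answer):
    """Advisory only: prefer sending where the offerer put no asterisk."""
    sendable = [e for e in answer if e[1] == "send"]
    sendable.sort(key=lambda e: e[3])  # stable sort, non-asterisk first
    return sendable[0][:3] if sendable else None
```

With supported = {"audio": {"en"}, "text": {"en"}} the answer keeps all three entries and suggested_start() points at ("text", "send", "en"); with only {"audio": {"en"}} it falls back to ("audio", "send", "en"). Nothing in the answer itself changes, which is why this reads as implementation or UI guidance and not as protocol.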



>>
>>>   On 2017-02-25 at 01:32, Randall Gellens wrote:
>>>>   At 5:35 PM +0100 2/24/17, Gunnar Hellström wrote:
>>>>
>>>>>    On 2017-02-23 at 05:15, Randall Gellens wrote:
>>>>>>    Version -07 addresses all comments 
>>>>>> except for the unresolved issue of 
>>>>>> renaming the two attributes which is 
>>>>>> currently being discussed on the list, and 
>>>>>> adding a new attribute for 
>>>>>> bidirectionality.
>>>>>>
>>>>>>    Per Dale's suggestion, the draft adds 
>>>>>> advice that if a call is rejected due to 
>>>>>> no languages in common, SIP response code 
>>>>>> 488 (Not Acceptable Here) or 606 (Not 
>>>>>> Acceptable) be used, along with a Warning 
>>>>>> header field indicating the supported 
>>>>>> languages.  The draft registers a new 
>>>>>> entry in the warn-code sub-registry of SIP 
>>>>>> parameters for this purpose.  The draft 
>>>>>> also has an expanded set of examples.
>>>>>>
>>>>>    Good progress. Good to see the enriched examples chapter 5.5.
>>>>>    I have a few comments on version -07:
>>>>>
>>>>>    1.  Section  4. second line
>>>>>    ------------old text----------------------
>>>>>    but is not sufficiently sufficiently
>>>>>    ------------new text--------------------------
>>>>>    but is not sufficiently
>>>>>    ----------end of change 1-----------------
>>>>>    Motivation: New typo in version -07
>>>>
>>>>   Thanks.
>>>>
>>>>>
>>>>>    2. Section 5.2, first line
>>>>>    ----------------old text-----------------
>>>>>    This document defines two new media-level ..
>>>>>    ----------------new text----------------------
>>>>>    This document defines two media-level ...
>>>>>    ----------------end of change 2----------------
>>>>>    Motivation: It was commented that when 
>>>>> the draft is published, this is not new 
>>>>> anymore.
>>>>>    There are three more occasions of "new" 
>>>>> in the document that may be modified as 
>>>>> well.
>>>>
>>>>   OK.
>>>>
>>>>>
>>>>>    3.  5.2 second paragraph
>>>>>    -------------------old text--------------------------------
>>>>>    In an offer, the 'humintlang-send' values indicates the language(s)
>>>>>       the offerer is willing to use when sending using the media, and the
>>>>>       'humintlang-recv' values indicates the language(s) the offerer is
>>>>>       willing to use when receiving using the media.
>>>>>    -----------------new text---------------------------------
>>>>>    In an offer, the 'humintlang-send' values indicate the language(s)
>>>>>    the offerer is willing to select from for use when sending using the
>>>>>    media, and the 'humintlang-recv' values indicate the language(s) the
>>>>>    offerer is willing to receive one of in the media stream.
>>>>>    ----------------end of change----------------------------------
>>>>>    Motivation 1:) change from "indicates" to 
>>>>> "indicate" in two places to match the new 
>>>>> use of plural "values".
>>>>>    Motivation 2:) Be sure to indicate that 
>>>>> we only intend to negotiate one language 
>>>>> per media and direction, so that we do not 
>>>>> end up as unspecified regarding number of 
>>>>> matches required as the sdp "lang" 
>>>>> attribute is.
>>>>
>>>>   Reworded.
>>>>
>>>>>
>>>>>    4.  5.2 Second paragraph
>>>>>    -----------------old text-----------------------
>>>>>    When a media is intended
>>>>>       for use in one direction only
>>>>>    ----------------new text---------------------
>>>>>    When a media is intended
>>>>>       for use for language communication in one direction only
>>>>>    ----------------end of change---------------------------
>>>>>    Motivation: Deletion of a note in this 
>>>>> sentence made it less obvious that we are 
>>>>> only talking about directions of use of 
>>>>> language communication, and not about 
>>>>> establishing asymmetric media connections. 
>>>>> Therefore add this clarification.
>>>>
>>>>   Reworded.
>>>>
>>>>>
>>>>>    5.  5.2 Deleted paragraph 6 before "Clients acting on behalf..."
>>>>>    ----------reinsert modified paragraph----------------------------
>>>>>    While signed language tags are used with a video stream to
>>>>>    indicate sign language, a spoken language tag for a video stream
>>>>>    indicates a request or offer to see the speaker, when that is of
>>>>>    importance for language perception.
>>>>>    -------------end of change-------------------------------------------
>>>>>    Motivation: There was in the LC mail 
>>>>> exchange a discussion about sharpening up 
>>>>> the specification of use of "unusual 
>>>>> combinations".
>>>>>    There was no agreement to delete them 
>>>>> all. The one described in this paragraph is 
>>>>> the main one that has widespread use and 
>>>>> needs to be clearly specified for use by a 
>>>>> large number of hard-of-hearing and deaf 
>>>>> users.
>>>>
>>>>   The text as it is now does not prohibit 
>>>> anything and explicitly mentions negotiating 
>>>> supplemental video by omitting language 
>>>> attributes on a video media.
>>>>
>>>>>
>>>>>    6.  5.2 Sixth paragraph
>>>>>    --------------------current text--------------------
>>>>>    (or for unidirectional streams, one of)
>>>>>    ------------------new text ------------------------
>>>>>    (or for asymmetrical use of languages, one of)
>>>>>    -----------------end of change----------------------
>>>>>    Motivation: We are not primarily talking 
>>>>> about enabled transmission directions of 
>>>>> the streams, but about language use in the 
>>>>> streams. We do not want to limit the media 
>>>>> stream directions just because we do not 
>>>>> specify an initial language to use for that 
>>>>> direction. There are other uses of media, 
>>>>> and there may even be occasional use of 
>>>>> language in the direction, just not worth 
>>>>> mentioning as an initial and preferred use. 
>>>>> The suggested change should make that clear.
>>>>>
>>>>>    7.   5.3 Next to last paragraph
>>>>>    ------------------old text------------------------------
>>>>>    a list of supported languages.
>>>>>    -------------------new text-------------------------
>>>>>    a list of supported languages, media and directions.
>>>>>    -------------------end of change----------------
>>>>>    Motivation: It is not sufficient to know 
>>>>> which languages are supported, it is also 
>>>>> essential to know in which media they are 
>>>>> supported and in which directions. (Media 
>>>>> could be replaced with modality, but the 
>>>>> media can become ambiguous then, so use 
>>>>> media here to be brief.)
>>>>
>>>>   I don't know that we can require this, but 
>>>> I'll add SHOULD list supported languages and 
>>>> media. Demanding direction as well might be 
>>>> too unwieldy.
>>>>
>>>>>
>>>>>    8.      5.3, last line
>>>>>    --------------old text----------------------------------
>>>>>     Supported languages are: es, en"
>>>>>    --------------new text-------------------------------
>>>>>     Supported languages are: es, en 
>>>>> transmission in audio; es, en reception in 
>>>>> audio"
>>>>>    ----------------------------------------------------------
>>>>>    Motivation: Same as for 7.
>>>>
>>>>   Fixed as above.
>>>>
>>>>>
>>>>>    9.  5.4 Undefined combinations
>>>>>    ----------------------------old text--------------------------------------
>>>>>       The behavior when specifying a non-signed language tag for a video
>>>>>       media stream, or a signed language tag for an audio or text media
>>>>>       stream, is not defined.
>>>>>    ---------------------------new text-----------------------------------------
>>>>>    There is no way specified for indicating 
>>>>> use of text based language in a video media 
>>>>> stream.
>>>>>    There is no meaning assigned to 
>>>>> specification of  sign language in an audio 
>>>>> or text media stream.
>>>>>    --------------------------end of change-------------------------------
>>>>>    Motivation: Seeing the speaker in video 
>>>>> is an important combination reinserted 
>>>>> above in section 5.2.
>>>>>    This section therefore needed rewording 
>>>>> to not include that combination.
>>>>
>>>>   The draft explicitly mentions video for supplemental purposes.
>>>>
>>>>>
>>>>>
>>>>>    10.     6.2 Last sentence
>>>>>    -----------------current text---------------------
>>>>>    Supported languages are: [list of supported languages]."
>>>>>    -----------------new text------------------------
>>>>>    Supported languages and media and 
>>>>> transmission directions are:[list of 
>>>>> supported languages and media and 
>>>>> transmission directions.]"
>>>>>    -----------------end of change--------------------------
>>>>>    Motivation: Same as for 7.
>>>>
>>>>   Fixed as above.
>>>>
>>>>>
>>>>>    11.  6.1 MUX Category
>>>>>    ----------old text in two locations-------------------
>>>>>    MUX Category:  normal
>>>>>    ---------new text in same two locations--------------
>>>>>    Mux Category:  NORMAL
>>>>>    ---------end of change-----------------
>>>>>    Motivation: Follow RFC 4566bis and IANA 
>>>>> habits regarding use of capitals
>>>>
>>>>   Fixed.
>>>>
>>>>>
>>>>>    12.  5.3
>>>>>    -------------old text-----------------
>>>>>    5.3 No Language in Common
>>>>>    -------------new text----------------
>>>>>    5.3 Preference parameter
>>>>>    ------------end of change 1 in 5.3---------------
>>>>
>>>>   The section is more than just the asterisk, 
>>>> it also advises use of specific SIP response 
>>>> codes if the call is failed.
>>>>
>>>>>
>>>>>    -------------old text in 5.3, second paragraph-------------------------------
>>>>>    The mechanism for indicating this preference is that, in an offer, if
>>>>>    the last character of any of the 'humintlang-recv' or 'humintlang-
>>>>>    send' values is an asterisk, this 
>>>>> indicates a request to not fail the call.
>>>>>    --------------------------new text-------------------------------
>>>>>    The mechanism for indicating this preference is that, in an offer, if
>>>>>       the last character of any of the 'humintlang-recv' or 'humintlang-
>>>>>       send' values is an asterisk, this 
>>>>> indicates a request to not fail the call.
>>>>>    The asterisk should be attached to attributes with languages of lower
>>>>>    preference to be matched if such difference can be specified. Thereby
>>>>>    the location of the asterisk can be used to support the decision on
>>>>>    which languages to use in the call.
>>>>>    ---------------------------end of change 2 in 5.3--------------------------------------
>>>>>    Motivation: There has not yet been any 
>>>>> conclusion for my proposal no 5 in the IETF 
>>>>> LC comments of Feb 12.
>>>>>    This is a dramatically reduced version 
>>>>> that may be easier to accept at this stage, 
>>>>> still covering one of the missing 
>>>>> functionalities in the draft.
>>>>>    The asterisk is used as a preference 
>>>>> parameter in the attributes, hence the 
>>>>> proposed title change for 5.3.
>>>>>    With this additional rule about where the 
>>>>> asterisk(s) are placed, the answering 
>>>>> parties get good clues about the 
>>>>> preferences between alternatives presented 
>>>>> by the offerer. The chance to set up calls 
>>>>> with satisfied users increases dramatically 
>>>>> compared to letting the answering party 
>>>>> select by chance between alternatives.
>>>>
>>>>   Making the asterisk a purely-advisory hint 
>>>> as to the least-preferred media/language 
>>>> combination seems harmless enough, as it 
>>>> would not be required to support it; 
>>>> however, I'm not sure it provides any 
>>>> benefit: if an offer contains some set of 
>>>> media with language, and the answerer can 
>>>> support all of them, should the answerer 
>>>> only include in its answer those without an 
>>>> asterisk? It seems simpler for the answerer 
>>>> to include everything in the offer that it 
>>>> can support.
>>>>
>>>
>>>   --
>>>   -----------------------------------------
>>>   Gunnar Hellström
>>>   Omnitor
>>>   gunnar.hellstrom@omnitor.se
>>>   +46 708 204 288
>>
>>
>
>  --
>  -----------------------------------------
>  Gunnar Hellström
>  Omnitor
>  gunnar.hellstrom@omnitor.se
>  +46 708 204 288


-- 
Randall Gellens
Opinions are personal;    facts are suspect;    I speak for myself only
-------------- Randomly selected tag: ---------------
Tell the pretty ones they're smart, and tell the smart ones they're pretty.
                                       --Mae West's advice on handling men