Re: [Slim] IETF last call for draft-ietf-slim-negotiating-human-language (Section 5.4)

Randall Gellens <rg+ietf@randy.pensive.org> Thu, 16 February 2017 00:26 UTC

To: Bernard Aboba <bernard.aboba@gmail.com>, Gunnar Hellström <gunnar.hellstrom@omnitor.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/pCvSOrxZ1KNBFN0SE85moWR61ok>
Cc: "slim@ietf.org" <slim@ietf.org>, ietf@ietf.org

At 3:52 PM -0800 2/15/17, Bernard Aboba wrote:

>  Gunnar Hellstrom said:
>
>  "The SDP Lang attribute in RFC 4566, where you 
> (Randall) say it is intended for specifying a 
> set of languages that all must be used in a 
> session, while I say that it is intended for 
> negotiation of at least one initial language."
>
>  [BA] At IETF 96 in Berlin, we had a discussion 
> of the history of the SDP Lang attribute within 
> the MMUSIC WG.
>
>  The Lang attribute was originally specified in 
> RFC 2327, which was published in April 1998, 
> more than four years prior to the publication 
> of Offer/Answer RFC 3264 (June 2002), and three 
> years prior to publication of the initial 
> draft-rosenberg-mmusic-sdp-offer-answer-00 
> (April 26, 2001).
>
>  As a result, the Lang attribute could not have 
> been designed for use in Offer/Answer 
> negotiation, but instead was intended for use 
> in the declarative SDP of multicast 
> conferencing.  Note that the Lang attribute was 
> not mentioned in RFC 3264, and no one at the 
> MMUSIC WG session was aware of a subsequent SIP 
> Offer/Answer implementation of it.

Which is what I was saying: it is descriptive of 
the media, which is very different from 
negotiation.  However, this is all moot now.



>  On Wed, Feb 15, 2017 at 1:41 AM, Gunnar 
> Hellström <gunnar.hellstrom@omnitor.se> wrote:
>
>  On 2017-02-15 at 01:39, Randall Gellens wrote:
>
>  At 4:21 PM -0800 2/14/17, Randy Presuhn wrote:
>
>   Hi -
>
>   On 2/14/2017 2:43 PM, Randall Gellens wrote:
>
>   At 8:59 PM +0100 2/14/17, Gunnar Hellström wrote:
>
>    On 2017-02-14 at 19:05, Randy Presuhn wrote:
>
>    Hi -
>
>    On 2/14/2017 9:40 AM, Randall Gellens wrote:
>
>    At 11:01 AM +0100 2/14/17, Gunnar Hellström wrote:
>
>     My proposal for a reworded section 5.4 is:
>
>     5.4.  Unusual language indications
>
>     It is possible to specify an unusual indication where the language
>     specified may look unexpected for the media type.
>
>     For such cases the following guidance SHALL be applied for the
>    humintlang attributes used in these situations.
>
>     1.    A view of a speaking person in the video stream SHALL, when it
>    has relevance for speech perception, be indicated by a Language-Tag
>    for spoken/written language with the "Zxxx" script subtag to indicate
>    that the content is not written.
>
>     2.    Text captions included in the video stream SHALL be indicated
>    by a Language-Tag for spoken/written language.
>
>     3.    Any approximate representation of sign language or
>    fingerspelling in the text media stream SHALL be indicated by a
>    Language-Tag for a sign language in text media.
>
>     4.    When sign language related audio from a person using sign
>    language is of importance for language communication, this SHALL be
>    indicated by a Language-Tag for a sign language in audio media.
>
>
>    [RG] As I said, I think we should avoid specifying this until we have
>    deployment experience.
>
>    ...
>
>    From a process perspective, it's far easier to remove constraints
>    as a specification advances than it is to add them.
>
>    I agree. It is often better to specify normatively as far as you can
>   imagine, so that interoperability and good functionality are achieved.
>   Stopping halfway and leaving MAY in the specifications creates
>   uncertainty and less useful specifications.
>
>
>   My reading of what Randy says is the opposite of Gunnar's. In my
>   reading, Randy points out that it is easier to remove the SHOULD NOT in
>   the future than it is to change the meaning of the combinations or
>   switch to a different mechanism.
>
>   In my experience, it's better to specify only what we know we need and
>   what we know we understand.  Speculative specifications "as far as you
>   can imagine" more often lead to interoperability problems, unnecessary
>   complexity, limitations on what's needed in the future, and divergent
>   implementations.
>
>
>   I think the difference in your positions comes down to
>
>     (1) your respective notions of "what we know we need and what we
>         know we understand";
>
>     (2) whether you believe that the interoperability and conformance
>         consequences of removing a "SHOULD NOT" could be the same
>         as those of merely retaining a "MUST" or "SHALL" - this determines
>         whether Randy G.'s proposal provides a path for some future
>         revision to mandate (if deployment experience substantiates the
>         need/understanding) the behavior proposed by Gunnar. That path
>         is not at all obvious to me.
>
>
>  The purpose of the draft is to enable the two 
> endpoints of a real-time communication session 
> to agree which languages and media to use for 
> interactive communication.  We have a mechanism 
> of adding language tags to media stream 
> negotiations.  In most cases, the language and 
> media modality are an obvious fit.  There are 
> combinations of media and language where the 
> meaning is not so obvious, specifically, signed 
> language tags with audio or text, and 
> non-signed language tags with video.  My 
> proposal is that we say the offerer SHOULD NOT send 
> such combinations and the answerer MAY ignore the 
> language. This allows future specifications for 
> the underlying uses Gunnar wants (such as 
> real-time subtitles in video and signed 
> equivalents in text).  Such future 
> specifications could define a use for the 
> language and media combinations and remove the 
> SHOULD NOT send and MAY ignore, or could define 
> a new mechanism.  I don't think we know enough 
> now to dictate what the solution should be.
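
The SHOULD NOT send / MAY ignore rule proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented for this example, and the set of sign-language subtags is a tiny hand-picked sample (ase, bfi, swl) standing in for a real lookup in the IANA Language Subtag Registry.

```python
# Hypothetical, partial sample of sign-language primary subtags
# (American, British, Swedish Sign Language). A real implementation
# would consult the IANA Language Subtag Registry instead.
SIGN_LANGUAGE_SUBTAGS = {"ase", "bfi", "swl"}


def is_sign_language(tag: str) -> bool:
    """Return True if the primary subtag is in the sample sign-language set."""
    primary = tag.split("-")[0].lower()
    return primary in SIGN_LANGUAGE_SUBTAGS


def is_unusual(media: str, tag: str) -> bool:
    """Flag the combinations the proposal says an offerer SHOULD NOT send."""
    signed = is_sign_language(tag)
    if media in ("audio", "text"):
        return signed          # signed language tag with audio or text
    if media == "video":
        return not signed      # non-signed language tag with video
    return False               # other media types are not covered here


def answerer_languages(media: str, offered_tags: list[str]) -> list[str]:
    """An answerer exercising 'MAY ignore': drop the unusual tags."""
    return [t for t in offered_tags if not is_unusual(media, t)]
```

Under this sketch, answerer_languages("audio", ["en", "ase"]) keeps only "en". A later specification that assigns a meaning to one of these combinations would only need to relax is_unusual.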
>
>  We have a fresh example from our own 
> discussions in the SLIM group of how unfortunate 
> it is not to be sufficiently explicit in the 
> first edition of a standard: the SDP Lang 
> attribute in RFC 4566, where you (Randall) say 
> it is intended for specifying a set of 
> languages that all must be used in a session, 
> while I say that it is intended for negotiation 
> of at least one initial language. Having 
> that uncertainty in a published specification 
> makes it very hard to sharpen up 
> the specification afterwards because doing so would 
> possibly make some implementations 
> non-conformant. And it makes potential implementors 
> hesitant to use the current specifications, as 
> was the case with the SLIM work.
>
>  For 5.4.
>
>  I am OK with modifying from my latest proposal, but we need to be specific.
>  I am also OK with reducing the SHALLs to SHOULDs as Addison requested.
>
>  The situation is not that we lack knowledge. 
> Here is what we know about the 4 cases of 
> "unusual" indications:
>
>  1. View of the speaker in video. Very important 
> for speech perception. Quality requirements are 
> documented in ITU-T H-series Supplement 1. Of 
> real use only as a complement to the same 
> spoken language in audio. Now that we know 
> about the Zxxx notation for non-written content, we 
> also have a good way of specifying it precisely.
>  This case was also described in section 5.2 already.
>
>  2. Text captions in the video stream.
>  This can be either text merged into video and 
> communicated as a true part of the video image, 
> or it can be a text component of a multimedia 
> system, such as MPEG-4, declared in SDP as m=video.
>  It has been used in some videophone products, 
> but I have not seen it used lately.
>  It is a clearly defined case, and we can 
> specify coding for it, but we do not at the 
> moment know if it will be important to specify 
> it.
>
>  3. Sign language or fingerspelling in the text stream.
>  I have seen a product using it for claimed sign 
> language conversation. It is also in use in the 
> simple text form with words in capitals 
> approximately representing signs between 
> persons involved in preparation of sign 
> language productions and translations. But in 
> that case it is in a session where they agree 
> in other ways to start using the text stream 
> for that purpose. So I think we can say that 
> this is rare, and its use can be agreed by 
> other means between the users. Still it is a 
> clearly defined case.
>
>  4. Audio from signing person related to sign 
> language. This is more vague than the others. 
> It may be a person signing in video and adding 
> spoken words in audio to signing, but 
> influenced by the word order and grammar of 
> sign language with some ambition to make it 
> reasonably understandable for both deaf and 
> hearing participants. There are even some 
> spoken words created from sign language that 
> are commonly used by hearing persons in such 
> situations. Even in that case, though, I think it 
> is better to define the audio part as the 
> spoken language it is derived from, because of 
> its intention to be understandable for hearing 
> persons. All other variants I can imagine are 
> even closer to the spoken language and should 
> be specified with a spoken language tag. If we 
> only want to have the audio stream established 
> to hear the background in the signing 
> situation, then we should not specify language 
> use of the audio stream.
>  Even if we knew what a sign language tag in an audio 
> stream would mean, it may be just as good to 
> leave it undefined.
> 
> ------------------------------------------------------------
>  So, new proposal:
>
>  5.4.  Unusual language indications
>
>     It is possible to specify an unusual indication where the language
>     specified may look unexpected for the media type.
>
>     For such cases the following guidance SHOULD be applied for the
>    humintlang attributes used in these situations.
>
>     1.    A view of a speaking person in the video stream SHOULD, when it
>    has relevance for speech perception, be indicated by a humintlang
>    attribute with a Language-Tag for a spoken/written language with the
>    "Zxxx" script subtag to indicate that the content is not written.
>
>     2.    Text captions included in the video stream SHOULD be indicated
>    by a humintlang attribute with Language-Tag for spoken/written language.
>
>     3.    A Language-Tag for a sign language specified in a humintlang
>    attribute for a text stream MAY be interpreted as use of an approximate
>    representation of sign language or fingerspelling in the text media
>    stream. The use of such representation is rare and usually conveniently
>    agreed by other means between the users during an established session.
>    Common support of this indication SHOULD NOT be assumed or required.
>
>     4.    A Language-Tag for a sign language specified in a humintlang
>    attribute for an audio stream SHOULD NOT be indicated and MAY be ignored
>    on reception. Any use of spoken words or spoken language in the audio
>    stream SHOULD, when it can be of importance for language communication,
>    be indicated by the corresponding Language-Tag for spoken language in a
>    humintlang attribute for the audio stream.
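
To make the four cases in the proposed 5.4 text easier to compare, here is a minimal Python sketch of how a receiver might map a media type and language tag to one of them. It is an illustration only, not text from the draft: the function name, the returned labels, and the small sample of sign-language subtags (ase, bfi, swl) are all assumptions made for this example.

```python
# Hypothetical, partial sample of sign-language primary subtags; a real
# implementation would use the IANA Language Subtag Registry.
SIGN_LANGUAGE_SUBTAGS = {"ase", "bfi", "swl"}


def interpret(media: str, tag: str) -> str:
    """Map (media type, language tag) to a label for the proposed cases.

    The labels are invented for this sketch.
    """
    subtags = tag.lower().split("-")
    signed = subtags[0] in SIGN_LANGUAGE_SUBTAGS
    # "Zxxx" is the script subtag marking content that is not written.
    unwritten = "zxxx" in subtags[1:]

    if media == "video" and not signed:
        # Case 1: view of the speaker; case 2: captions in the video.
        return "view-of-speaker" if unwritten else "captions-in-video"
    if media == "text" and signed:
        # Case 3: MAY be interpreted this way; support not assumed.
        return "sign-approximation-in-text"
    if media == "audio" and signed:
        # Case 4: SHOULD NOT be sent, MAY be ignored on reception.
        return "ignore"
    return "usual"
```

For example, interpret("video", "sv-Zxxx") yields "view-of-speaker", while interpret("video", "sv") yields "captions-in-video".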
>
>
>
>
>  Gunnar
>
>
>  --
>  -----------------------------------------
>  Gunnar Hellström
>  Omnitor
>  gunnar.hellstrom@omnitor.se
>  +46 708 204 288
>
>  _______________________________________________
>  SLIM mailing list
>  SLIM@ietf.org
>  https://www.ietf.org/mailman/listinfo/slim


-- 
Randall Gellens
Opinions are personal;    facts are suspect;    I speak for myself only
-------------- Randomly selected tag: ---------------
Computers are not intelligent.  They only think they are.