[Slim] Requirements for the real-time case

Gunnar Hellström <gunnar.hellstrom@omnitor.se> Sat, 10 June 2017 08:44 UTC

To: "slim@ietf.org" <slim@ietf.org>
From: Gunnar Hellström <gunnar.hellstrom@omnitor.se>
Message-ID: <8c3bb9d1-b446-446a-29ce-40092f3f649d@omnitor.se>
Date: Sat, 10 Jun 2017 10:44:03 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/slim/LFVyATGtPTyZSr86xPZ_ovwNahs>

I have tried to express the requirements for the SLIM real-time case, 
and grade them.

They are expressed as a chapter that can be inserted in the use case document.

I hope it can help to assess what kind of extensibility we need, and
also whether we need to adjust for any missing functionality.

Excuse the length of this mail. I hope to provide this in a document 
form soon.

--------------------------Requirements----------------------------

3.  Requirements

    Requirements are expressed broadly in this document, both within
    scope and out-of-scope.  They are expressed for the whole concept of
    negotiating language, regardless of the current scope of the drafts.
    An indication is provided if the requirement is regarded to be within
    scope for the initial document or of interest for possible
    extensions.

    This indication is shown by an initial capital letter before the
    requirement.  The indications are:

    B = basic, included in initial specification (version -10)

    X = urgent extension

    P = of interest for further extensions

    O = Not foreseen to be of high interest

3.1.  High Level Requirements

    X. The specified mechanism shall enable smooth initiation of a
    conversational call, giving the participants the opportunity to
    start their language expressions in the call in a language and
    modality (or a combination of languages in different modalities) that
    is most likely to be convenient for them to use and for the other
    participants to understand well.

    P. The mechanism shall also make it possible for the participants to be
    informed about the language and modality in which the other
    participants will start their language expression in the call, so
    that they can tune their expectations accordingly.

    B. The mechanism shall also enable the participants or their service
    providers to make decisions about bringing extra resources into the
    call to handle the language combination well, or about assigning the
    most suitable call-taker to the call.  The mechanisms for making
    these decisions and taking these actions are, however, out of scope.

    B. Once the session has started, the participants are free to use
    whatever language and modality they together find useful during the
    conversation.

3.2.  Session Types

    B.  Conversational person-to-person call

    B.  Conversational multi-party call.  (Note: The basic functionality
    is sufficient for providing simple indications suitable for most
    multi-party calls.  It is negotiation that may be complicated, but
    the details about how to negotiate are out of scope.)

    B.  Conversational person-to-multimedia answering-machine call

    B.  Multimedia conference.  (Note: The basic functionality is
    sufficient for providing simple indications suitable for most
    multi-party calls.  It is negotiation that may be complicated, but
    the details about how to negotiate are out of scope.)

    X.  Conversational call supported by captioned telephony relay
    service.

    B.  Conversational call supported by text relay service

    B.  Conversational call supported by video relay service (for sign
    language)

    P.  Conversational call supported by speech-to-speech relay service.
    (no coding defined for service requirements)

    B.  Conversational call supported by spoken language interpretation.
    (translation may be invoked by either party as a result of language
    and modality matching)

    O.  Streaming media session.  (Nothing prevents use of the same
    mechanism for streaming sessions; however, extensions may be needed,
    e.g. for text in video overlay.)

3.3.  Modalities

    B.  Spoken

    B.  Written

    B.  Signed

    P.  Text messaging

    X.  Spoken and written simultaneously in same direction

    X.  Spoken and signed simultaneously in same direction

    X.  Written and spoken simultaneously in same direction

    O.  Written and signed simultaneously in same direction

3.4.  Directions

    B.  Sent

    B.  Received

    O.  Both sent and received in same attribute (for simplifying SDP)

3.5.  Languages

    B.  Languages shall be specified by Language tags of BCP 47 [RFC5646]

    B.  It shall be possible to select from alternative single languages
    per media and direction.

    O.  Range of possible languages per media and direction indicated by
    language tag ranges as specified in BCP 47. (may be of use for
    streaming media with spoken and signed languages simultaneously in
    video, or possibly for more reliable language matching)

    X.  Combination of simultaneous languages in different media and same
    direction.  (Required for captioned telephony and for persons with
    limited language capabilities who need to combine perceptions of two
    modalities)

    B.  Language details by adding subtags wisely, that is, using subtags
    only when they add value for the language selection.  Language
    matching shall be made according to the mechanisms described in
    BCP 47.
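
As an illustration only: BCP 47 includes the matching schemes of RFC 4647.
The sketch below assumes that "basic filtering" (RFC 4647, section 3.3.1) is
the scheme used, which the requirement above does not state. The function
names basic_filter_match and first_match are hypothetical.

```python
from typing import List, Optional


def basic_filter_match(language_range: str, tag: str) -> bool:
    """True if `tag` matches `language_range` under RFC 4647 basic filtering.

    A range matches a tag if they are equal (case-insensitively) or if the
    range is a prefix of the tag and the next character in the tag is "-".
    The wildcard range "*" matches every tag.
    """
    rng = language_range.lower()
    tg = tag.lower()
    if rng == "*":
        return True
    return tg == rng or tg.startswith(rng + "-")


def first_match(preferred: List[str], offered: List[str]) -> Optional[str]:
    """Return the first offered tag that matches, walking `preferred` in order."""
    for rng in preferred:
        for tag in offered:
            if basic_filter_match(rng, tag):
                return tag
    return None
```

With this scheme, a preference for "en" accepts an offered "en-GB", while a
preference for "en-US" does not accept a plain "en". That is one reason to
add subtags only when they matter for the selection.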

3.6.  Media

    B.  Audio

    B.  Text

    B.  Video

    P.  Message

    B.  Multiple alternative media

    B.  Single media streams of each type

    B.  Multiple media streams of same type (may need combination with
    a=Content attribute)

3.7.  Modality and Media Relations

    B.  Written in Text

    B.  Spoken in Audio

    B.  Signed in Video

    P.  Text messaging in Message

    B.  Spoken in video = view of speaker

    O.  Written in video = text overlay in video (may be of interest for
    streaming)
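
As an illustration only, the relations above can be held in a small lookup
table keyed by modality and media, with the grade letter as the value. The
names MODALITY_MEDIA, GRADE_ORDER and media_for are hypothetical, and the
ordering B, X, P, O is assumed to run from most to least urgent.

```python
from typing import List

# Grades in assumed order of urgency: basic, urgent extension,
# further extension, not of high interest.
GRADE_ORDER = ["B", "X", "P", "O"]

# (modality, media) -> grade, copied from the list above.
MODALITY_MEDIA = {
    ("written", "text"): "B",
    ("spoken", "audio"): "B",
    ("signed", "video"): "B",
    ("text messaging", "message"): "P",
    ("spoken", "video"): "B",   # view of the speaker
    ("written", "video"): "O",  # text overlay in video
}


def media_for(modality: str, up_to: str = "B") -> List[str]:
    """List the media that may carry `modality`, for grades up to `up_to`."""
    allowed = GRADE_ORDER[: GRADE_ORDER.index(up_to) + 1]
    return [
        media
        for (mod, media), grade in MODALITY_MEDIA.items()
        if mod == modality and grade in allowed
    ]
```

For example, media_for("written") gives only text, while
media_for("written", "O") also includes video for the text overlay case.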

3.8.  Preferences and capabilities

3.8.1.  Preference by offering party for sending language

    B.  Preference between sending languages in same modality

    X.  Preference between different modalities for sending language

    P.  Preference between languages for sending in different modalities
    (this level of detail is not of high interest)

3.8.2.  Preference by offering party for receiving language

    B.  Preference between languages to receive in same modality

    X.  Preference between different modalities for receiving language

    P.  Preference between languages for receiving in different
    modalities (this level of detail is not of high interest)

3.8.3.  Preference by answering party for sending language

    B.  Preference between languages to send in same modality

    X.  Preference between different modalities for sending language

    P.  Preference between sending language in different modalities (this
    level of detail is not of high interest)

3.8.4.  Preference by answering party for receiving language

    B.  Preference between receiving different languages in same modality

    X.  Preference between different modalities for receiving language

    P.  Preference between different languages for receiving in different
    modalities (this level of detail is not of high interest)

3.8.5.  Preference detail level

    B.  Detailed preference per language, direction and modality - no
    preference between modalities

    P.  Detailed preference per language and direction in different
    modalities

    P.  Coarse preference Hi/Lo per language and direction in different
    modalities

    X.  Coarse preference Hi/Lo per modality and direction

    P.  Preference only for modality, without specifying language at all.
    (May e.g. be of interest for invocation of relay service.)

3.9.  Call Denial

    B.  Preference from caller to get the call denied if no languages
    match.  (May cause unwanted denials because language matching may be
    hard to do reliably)

    B.  Preference of the caller to not get the call denied if no
    languages match
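
As an illustration only, the two preferences above can be modelled as a
single flag set by the caller. The sketch compares tags for simple
case-insensitive equality, which is a simplification of BCP 47 matching,
and the name should_proceed is hypothetical.

```python
from typing import List


def should_proceed(
    caller_langs: List[str],
    callee_langs: List[str],
    deny_if_no_match: bool,
) -> bool:
    """Decide whether the call goes ahead.

    The call proceeds if any language is shared, or if the caller has
    indicated that a missing match should not cause a denial.
    """
    callee = {tag.lower() for tag in callee_langs}
    has_match = any(tag.lower() in callee for tag in caller_langs)
    return has_match or not deny_if_no_match
```

Because matching may be hard to do reliably, a cautious caller would leave
deny_if_no_match unset so that a failed match still results in a call.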

3.10.  Information to users

3.10.1.  Information to answering party about negotiation result

    B.  Information to answering party about selected languages in
    negotiation result

    B.  Information to answering party about both common capabilities and
    selected languages in negotiation result

3.10.2.  Information to offering party about negotiation result

    B.  Information to offering party about selected languages in
    negotiation result

    P.  Information to offering party about both common capabilities and
    selected languages in negotiation result
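
As an illustration only, the difference between "common capabilities" and
"selected languages" can be shown for one media stream and one direction.
The result structure and the name negotiate are hypothetical; nothing here
is taken from the draft's answer format, and tags are compared for simple
case-insensitive equality.

```python
from typing import Dict, List, Optional, Union


def negotiate(
    offer_prefs: List[str], answer_caps: List[str]
) -> Dict[str, Union[List[str], Optional[str]]]:
    """Compute the common languages and the one selected for initial use.

    offer_prefs: tags in the offering party's order of preference.
    answer_caps: tags the answering party can handle.
    """
    caps = {tag.lower() for tag in answer_caps}
    # Common capabilities keep the offering party's preference order.
    common = [tag for tag in offer_prefs if tag.lower() in caps]
    # The selected language is the most preferred common one, if any.
    selected = common[0] if common else None
    return {"common": common, "selected": selected}
```

Reporting only "selected" corresponds to the B-marked items above. Reporting
"common" as well corresponds to the items about both common capabilities and
selected languages.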

3.11.  Technical Base

    B.  The mechanism shall be based on SDP [RFC4566] but may be
    adaptable to other environments.
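
As an illustration only, a media description carrying language indications
per direction could be assembled as below. The attribute names hlang-send
and hlang-recv are an assumption about the draft's syntax and should be
checked against the draft itself. The port, the payload format and the
helper name media_section are placeholders.

```python
from typing import List


def media_section(
    media: str, port: int, fmt: str, send: List[str], recv: List[str]
) -> List[str]:
    """Build the SDP lines for one media stream with language attributes.

    The m= line follows the RFC 4566 layout: media, port, transport, format.
    Language tags are listed in order of preference, separated by spaces.
    """
    lines = [f"m={media} {port} RTP/AVP {fmt}"]
    if send:
        # Assumed attribute name for languages the party offers to send.
        lines.append("a=hlang-send:" + " ".join(send))
    if recv:
        # Assumed attribute name for languages the party is willing to receive.
        lines.append("a=hlang-recv:" + " ".join(recv))
    return lines


if __name__ == "__main__":
    print("\n".join(media_section("audio", 49170, "0", ["sv", "en"], ["sv", "en"])))
```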

------------------------------------------------------------------------------

Evaluation:

The B-marked requirements are as specified in 
draft-ietf-slim-negotiating-human-language-10

The X-marked requirements are the often-discussed, urgently required
extensions for:

     - Preference indication between different modalities. (Note that I
regard it as sufficient to indicate preference per modality, not language.
An indication of preference per language and modality is however also
feasible if that is what we end up with.)

     - Preference for receiving two modalities simultaneously. (Note
that I regard it as sufficient to indicate preference per modality, not
language. An indication of preference per language is however also
feasible if that is what we end up with.)

Among the P- and O-marked requirements, there are some worth considering 
for early inclusion:

     - 3.2 Use in streaming. (No real change is needed, except possibly to
include text overlay in video.)

     - 3.3, 3.6, 3.8 Text messaging (discussed before but dropped)

     - 3.7 Text overlay in video (mainly for streaming, so not in our
main focus)

     - 3.10.2 Information to offering party about both common language
capabilities and the language selected to be used initially in the session.
(The current draft is fuzzy about this: it may provide information about
the language to actually use, or it may provide information about a
range of suitable languages in different modalities. It has no notation
for the difference between the language actually decided on and the
languages possible to use.)

Of these, I feel somewhat strongly only about the information to the
offering party in 3.10.2.

I hope this can be helpful for finalizing the basic draft and the 
extension mechanism.

/Gunnar

-- 
-----------------------------------------
Gunnar Hellström
Omnitor
gunnar.hellstrom@omnitor.se
+46 708 204 288