Re: [MMUSIC] draft-gellens-negotiating-human-language-02

Randall Gellens <> Thu, 14 March 2013 14:41 UTC

Date: Thu, 14 Mar 2013 07:41:37 -0700
To: Flemming Andreasen <>, Gunnar Hellstrom <>
From: Randall Gellens <>

At 9:56 AM -0400 3/14/13, Flemming Andreasen wrote:

>  On 3/11/13 5:44 PM, Gunnar Hellstrom wrote:
>>  Before this discussion got its home in mmusic, 
>> we discussed topics quite similar to the ones 
>> you, Dale, brought up now.
>>  It was about what needed to be expressed by 
>> the parameters, and whether SDP or SIP was the 
>> right place. And in the case of SIP, whether 
>> RFC 3840/3841 could be a suitable mechanism for 
>> routing and decisions on the parameters.
>  I agree more discussion is needed on this. 
> There seem to be two problems considered in 
> the draft:
>  1) Routing of a request to an answerer that has 
> the language capabilities the caller desires.
>  2) Negotiation of the language properties to 
> use on a per-stream basis once the call has 
> been routed to a particular answerer.
>  Problem 1 seems to fall in the RFC 3840/3841 
> space, whereas problem 2 is more of an SDP 
> issue.
>  -- Flemming

Language and media need to be considered together 
when choosing how to process a call. For example, 
a call requesting video with ASL, or text in 
English, may be handled differently than a call 
that requests only voice with English.
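
A minimal sketch of what such an offer could carry (addresses, ports, and payload types are placeholders; 'ase' is used here as an illustrative language tag for American Sign Language; the 'a=lang' attribute is one of the two alternatives the draft discusses):

```
v=0
o=- 20518 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
m=video 49170 RTP/AVP 96
a=rtpmap:96 H264/90000
a=lang:ase
m=text 49172 RTP/AVP 98
a=rtpmap:98 t140/1000
a=lang:en
```

An ESInet or PSAP seeing this offer has both the media types and the languages available when deciding how to handle the call.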

>>  Here is part of that discussion that we need to capture.
>>  I see some complications that might be needed 
>> in order to reflect reality. At least they 
>> should be discussed.
>>  And I am also seeing some different ways to specify it.
>>  The complications to discuss are:
>>  1. Level of preference.
>>  There may be a need for specifying levels of 
>> preference for languages.  I might strongly 
>> prefer to talk English, but have some useful 
>> capability in French. I want to display that 
>> preference and that capability with that 
>> difference, so that I get English whenever 
>> possible, but get the call connected even if 
>> English is not available at all but French is.
>>  I would assume that two levels are sufficient, 
>> but that can be discussed:  Preferred and 
>> capable.
>>>  The draft already proposes that languages be 
>>> listed in order of preference, which should 
>>> handle the example you mention: you list 
>>> English first and French second.  The called 
>>> party selects English if it is capable and 
>>> falls back to French if English is not and 
>>> French is.  This seems much simpler and is a 
>>> common way of handling situations where there 
>>> is a preference.  It would be good to keep 
>>> the mechanism as simple as possible.
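
As a sketch of that preference-order proposal (assuming, per the draft, that repeated language attributes on a media description are read in descending order of preference; port and payload numbers are placeholders):

```
m=audio 49170 RTP/AVP 0
a=lang:en
a=lang:fr
```

The answerer would select English if it is capable of it, and fall back to French otherwise.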
>>>  Yes, I am afraid of complicating this beyond 
>>> the point where users no longer manage to get 
>>> their settings right.
>>>  Still, I do not think that order alone is 
>>> sufficient as a level-of-preference indicator. 
>>> You may want to indicate capability for one 
>>> modality but preference for another (as in my 
>>> example: capability for ASL, but preference 
>>> for talking and reading).
>>  If you have a capability for ASL but 
>> preference for talking and reading, you could 
>> initially offer two media streams: a voice 
>> with English and a text with English.  If 
>> accepted, you have your preferred 
>> communications.  If those are rejected you 
>> could then offer video with ASL.  Would that 
>> handle the case?
>>  No, video is still very valuable for judging 
>> the emergency case, or for seeing a friend. So 
>> if you support it, you want to offer it. But 
>> the decision on languages and modalities may 
>> end up with video not being important for 
>> language communication.
>>>>>>  2. Directionality
>>>>>>  There is a need for a direction of the 
>>>>>> language preference. "Transmit, receive or 
>>>>>> both". or   "Produce, perceive or both". 
>>>>>> That is easy to understand for the relay 
>>>>>> service examples.
>>>>>>  A hard-of-hearing user may declare:
>>>>>>  Text, capable, produce, English
>>>>>>  Text, prefer, perceive, English
>>>>>>  Audio, prefer, produce, English
>>>>>>  Audio, capable, perceive, English 
>>>>>> (tricky: a typical hard-of-hearing user may 
>>>>>> benefit from receiving audio, while it is 
>>>>>> not usable enough for reliable perception. 
>>>>>> I do not want to make this eternally 
>>>>>> complex, but I see a need for refined 
>>>>>> expressions here)
>>>>>>  video, capable, both, ASL 
>>>>>>  This should be understood as: the user 
>>>>>> prefers to speak and get text back, and 
>>>>>> benefits from getting voice in parallel 
>>>>>> with text.  ASL signing can be an 
>>>>>> alternative if the other party has a 
>>>>>> corresponding capability or preference.
>>>>>  The draft does support this (and even 
>>>>> mentions some of these specific uses) 
>>>>> because it proposes an SDP media attribute, 
>>>>> and media can be specified to be send, 
>>>>> receive, or both.
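
A sketch of how direction is already expressed per stream in SDP (placeholders as above): a caller who speaks English but reads text back could mark the streams as

```
m=audio 49170 RTP/AVP 0
a=sendonly
a=lang:en
m=text 49172 RTP/AVP 98
a=recvonly
a=lang:en
```

Note that these attributes govern the media flow itself, which is the limitation Gunnar raises next.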
>>>>  No, that is not the same. You want the media 
>>>> to flow, but by the parameter you want to 
>>>> indicate your preference for how to use it. 
>>>> You do not want to turn off incoming audio 
>>>> just because you prefer to talk but read 
>>>> text.
>>>  Yes, I see, thanks for the clarification. 
>>> Does this need to be part of the session 
>>> setup?  If you establish all media streams 
>>> that you wish to use, can you then just use 
>>> them as you prefer?  I will consult with the 
>>> NENA accessibility committee on this.
>>  No, there are specific services that provide 
>> service in one direction but not the other. 
>> The information is needed to decide what 
>> assisting service to invoke. One such service 
>> is captioned telephony, which adds rapidly 
>> created speech-to-text in parallel with the 
>> voice. They provide just that. A user may have 
>> a very strong preference for getting exactly 
>> that service, but could accept, with much 
>> lower preference, a direct conversation with 
>> the far end in combined text and voice.
>>>>>>  I think it would be useful to move most 
>>>>>> of the introduction to a structured 
>>>>>> use-case chapter and express the different 
>>>>>> cases according to a template. That can 
>>>>>> then be used to test whether proposed 
>>>>>> approaches will work.
>>>>>  I'm not sure I fully understand what you 
>>>>> mean by "structured" in "structured use 
>>>>> case" or "template."  Can you be more 
>>>>> specific?
>>>>  I mean just a simple template for how the 
>>>> use case descriptions are written.
>>>>  E.g.
>>>>  A title indicating what case we have.
>>>>  Description of the calling user and its capabilities and preferences.
>>>>  Description of the answering user and its capabilities and preferences.
>>>>  Description of a possible assisting service 
>>>> and its capabilities and preferences.
>>>>  Description of the calling user's indications.
>>>>  Description of the answering user's indications.
>>>>  The resulting decision and outcome.
>>>>>>  3.  Specify language and modality at the 
>>>>>> SIP media tag level instead.
>>>>>>  There could be some benefits to declaring 
>>>>>> these parameters at the SIP media tag 
>>>>>> level instead of the SDP level.
>>>>>>  A call center can then register its 
>>>>>> capabilities already at SIP REGISTER 
>>>>>> time, and the caller-preferences / 
>>>>>> callee-capabilities mechanism from RFC 
>>>>>> 3840/3841 can be used to select modalities 
>>>>>> and languages and route the call to the 
>>>>>> best capable person, or combination of 
>>>>>> person and assisting interpreting service.
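
As a hedged sketch of that mechanism (hypothetical addresses; most headers omitted; 'ase' again stands in as an illustrative sign-language tag): RFC 3840 defines 'language', 'video', and 'text' media feature tags that an agent can register, and RFC 3841 lets a caller needing ASL steer routing with Accept-Contact:

```
REGISTER sip:example.com SIP/2.0
Contact: <sip:agent@198.51.100.7>
         ;language="en,ase";video;text

INVITE sip:psap@example.com SIP/2.0
Accept-Contact: *;language="ase";video;require;explicit
```

A proxy applying RFC 3841 matching would then prefer registered contacts whose feature tags cover ASL over video.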
>>>>>  Maybe, but one advantage of using SDP is 
>>>>> that the ESInet can take language and media 
>>>>> needs into account during policy-based 
>>>>> routing.  For example, in some European 
>>>>> countries emergency calls placed by a 
>>>>> speaker of language x in country y may be 
>>>>> routed to a PSAP in a country where x is 
>>>>> the native language.  Or, there might be 
>>>>> regional or national text relay or sign 
>>>>> language interpreter services as opposed to 
>>>>> PSAP-level capabilities.
>>>>  Is there a complete specification for how 
>>>> policy-based routing is thought to work? 
>>>> Where?
>>>>  Does it not use RFC 3840/3841?
>>>>  That procedure is already supported by SIP 
>>>> servers. Using SDP requires new SIP server 
>>>> programming.
>>>  NENA has a document under development.  I 
>>> thought it was able to take SDP into account 
>>> but I'll look into it, and I'm sure Brian 
>>> will have something to say.
>>  Yes, I think I have seen that. But it needs to 
>> come into the IETF to be possible to refer to.
>>>>>>  But, on the other hand, then we need a 
>>>>>> separate specification of what modality 
>>>>>> the parameters indicate, because the 
>>>>>> language tags only distinguish between 
>>>>>> signed and other languages, and "other" 
>>>>>> seems to mean either spoken or written 
>>>>>> without any difference.
>>>>>  The SDP media already indicates the type (audio, video, text).
>>>>  Yes, convenient. But there is no knowledge 
>>>> of the parameters until call time. It could 
>>>> be better to know the callee capabilities in 
>>>> advance, if available. Then middleboxes can 
>>>> do the routing instead of the far end. There 
>>>> may be many terminals competing for the 
>>>> call, and the comparison of who should get 
>>>> it should be done by a SIP server instead of 
>>>> an endpoint.
>>>  I think call time is the right time.  For 
>>> emergency calls, it isolates the decision 
>>> making about how to process calls requiring 
>>> text, sign language, foreign language, etc. 
>>> to the ESInet and PSAPs, which is I think the 
>>> right place.  The processing rules in the 
>>> ESInet can then be changed without involving 
>>> any carrier.  The capabilities of an entity 
>>> may vary based on dynamic factors (time of 
>>> day, load, etc.) so the decision as to how to 
>>> support a need may be best made by the ESInet 
>>> or PSAP in the case of an emergency call, or 
>>> called party for non-emergency calls.  For 
>>> example, at some times or under some loads, 
>>> emergency calls may be routed to a specific 
>>> PSAP that is not the geographically indicated 
>>> one.  Likewise, a non-emergency call to a 
>>> call center may be routed to a center in a 
>>> country that has support for the language or 
>>> media needed.
>>  The decision is of course made at call time. 
>> With the RFC 3840/3841 method, the different 
>> agents and services available register their 
>> availability and capabilities when they go on 
>> duty, and unregister when they stop, so that 
>> their information is available at call time.
>>>  Further, it is often the case that the cost 
>>> of relay, interpretation, or translation 
>>> services is affected by which entity invokes 
>>> the service.
>>  Yes, that is a complicating policy issue.
>>>>>>  4. Problem that 3GPP specifies that it is 
>>>>>> the UAs only who specify and act on these 
>>>>>> parameters.
>>>>>>  I think it is a problem that 3GPP inserted 
>>>>>> the restriction that the language and 
>>>>>> modality negotiation shall only bother the 
>>>>>> involved UAs.
>>>>>>  It would be more natural for a service 
>>>>>> provider between them to detect the 
>>>>>> differences and make the decision to 
>>>>>> invoke a relay service in the relay case.
>>>>>>  How do you propose to solve that? Let the 
>>>>>> service provider behave as a B2BUA, who 
>>>>>> then can behave as both a UA and a service 
>>>>>> provider?
>>>>>  What do you mean by "service provider?"  In 
>>>>> the case of a voice service provider such 
>>>>> as a cellular carrier or a VoIP provider, I 
>>>>> think this should be entirely transparent. 
>>>>> The voice service provider knows it is an 
>>>>> emergency call and routes to an ESInet.  It 
>>>>> is then up to the ESInet and the PSAPs to 
>>>>> handle the call as they wish.
>>>>  It can be a service provider for just the 
>>>> function of making advanced call invocation 
>>>> based on language preferences. The same 
>>>> types of decisions, call connections, and 
>>>> assisting-service invocations are needed in 
>>>> everyday calls as in emergency calls. But it 
>>>> can also be a service provider for emergency 
>>>> services with which the user is registered. 
>>>> They can make decisions on the call, e.g., 
>>>> detect that it is an emergency call 
>>>> requiring an interpreter, and therefore 
>>>> connect to both the PSAP and the interpreter 
>>>> at the same time to save time.
>>>  I think it's best to make these decisions at 
>>> the end, not the middle.  In the case of 
>>> emergency calls, the ESInet can route to a 
>>> particular PSAP, the PSAP may bridge in 
>>> translation or interpretation services, etc. 
>>> In the case of non-emergency calls, the call 
>>> center may support some capabilities locally 
>>> at some hours but route to a different call 
>>> center at other times.
>>  The end is not decided until you have 
>> evaluated the alternative possible ends and 
>> decided who has the right capability and 
>> preference.
>>  There is another issue with using SDP for 
>> decisions. SIP MESSAGE is included in the set 
>> of methods to handle in emergency calls in RFC 
>> 6443. It can be used within sessions to carry 
>> text messages if other media are used as well. 
>> It is not the favored way to have text 
>> communication, but it is possible. SIP MESSAGE 
>> has no SDP.  I know that the 3GPP sections 
>> about emergency calling in TS 22.101 point 
>> towards using MSRP for text messaging, so it 
>> should not be an issue for 3GPP. Can we leave 
>> SIP MESSAGE out of the discussion and aim at 
>> solving this only for real-time conversational 
>> media?  I do not urge solving it for SIP 
>> MESSAGE; I just wanted to point out that 
>> consequence of basing the mechanism on SDP.
>>  Will there be a possibility for remote 
>> participation on Thursday? I am sorry I am not 
>> there, but would like to participate if 
>> possible.
>>  /Gunnar
>>  Gunnar Hellström
>>  Omnitor
>>  <>
>>  +46708204288
>>  On 2013-03-11 16:57, Randall Gellens wrote:
>>>  [[[ resending without Cc list ]]]
>>>  Hi Dale,
>>>  At 11:00 AM -0500 2/25/13, Dale R. Worley wrote:
>>>>   (It's not clear to me what the proper mailing list is to discuss this
>>>>   draft.  From the headers of the messages, it appears that the primary
>>>>   list is 
>>>> <>, but the 
>>>> first message in this thread about that
>>>>   draft already has a "Re:" in the subject line, so the discussion
>>>>   started somewhere else.)
>>>  There has been some discussion among those 
>>> listed in the CC header of this message.  I 
>>> think the mmusic list is probably the right 
>>> place to continue the discussion and was 
>>> planning on doing so more formally with the 
>>> next revision of the draft.
>>>  By the way, the draft was updated and is now 
>>> at -02: 
>>> <>
>>>  There is a face-to-face discussion Thursday 
>>> 11:30-1:00 at The Tropicale (the cafe in the 
>>> Caribe Royal).  Please let me know if you can 
>>> make it.
>>>>   (Also, it's not clear why Randall's messages are coming through in
>>>>   HTML.)
>>>  My apologies; I have gotten in the habit when 
>>> replying to messages that have style to allow 
>>> Eudora to send my reply styled as well.
>>>>   But onward to questions of substance:
>>>>   - Why SDP and not SIP?
>>>>   I'd like to see a more thorough exploration of why language
>>>>   negotiation is to be handled in SDP rather than SIP.  (SIP, like HTTP,
>>>>   uses the Content-Language header to specify languages.)  In principle,
>>>>   specifying data that may be used in call-routing should be done in the
>>>>   SIP layer, but it's well-accepted in the SIP world that call routing
>>>>   may be affected by the SDP content as well (e.g., media types).
>>>  I think it fits more naturally in SDP since 
>>> the language is related to the media, e.g., 
>>> English for audio and ASL for video.
>>>>   And some discussion and comparison should be done with the SIP/HTTP
>>>>   Content-Language header (used to specify the language of the
>>>>   communications) and the SIP Accept-Language header (used to specify
>>>>   the language of text components of SIP messages), particularly given
>>>>   that Accept-Language has a different set of language specifiers and a
>>>>   richer syntax for specifying preferences.  In any case, preference
>>>>   should be given to reusing one of the existing syntaxes for specifying
>>>>   language preferences.
>>>  I think the semantics of Content-Language and 
>>> Accept-Language are different from the 
>>> semantics here, especially when setting up a 
>>> session with, as an example, an audio stream 
>>> using English and a video stream using ASL. 
>>> (But I can see clients using a default value 
>>> to set both the SDP language attribute and 
>>> the HTTP Content-Language, unless configured 
>>> differently.)
>>>  As for reusing existing mechanisms, the draft 
>>> does contain two alternative proposals, one 
>>> to re-use the existing 'language' SDP 
>>> attribute, and one to define a new attribute.
>>>>   - Dependency between media descriptions?
>>>>      Another example would be a user who is able to speak but is deaf or
>>>>      hard-of-hearing and requires a voice stream plus a text stream
>>>>      (known as voice carry over).  Making language a media attribute
>>>>      allows the standard session negotiation mechanism to handle this by
>>>>      providing the information and mechanism for the endpoints to make
>>>>      appropriate decisions.
>>>>   This scenario suggests that there might be dependency or interaction
>>>>   between language specifications for different media descriptions.
>>>>   Whether this is needed should be determined and documented.
>>>>   - Specifying preference levels?
>>>>      For example, some users may be able to speak several languages, but
>>>>      have a preference.
>>>>   This might argue for describing degrees of preference using "q"
>>>>   parameters (as in the SIP Accept-Language header).
>>>>   - Expressing multiple languages in answers
>>>>      (While it is true that a conversation among multilingual people
>>>>      often involves multiple languages, it does not seem useful enough
>>>>      as a general facility to warrant complicating the desired semantics
>>>>      of the SDP attribute to allow negotiation of multiple simultaneous
>>>>      languages within an interactive media stream.)
>>>>   Why shouldn't an answer be able to indicate multiple languages?  At
>>>>   the least, this might provide the offerer with useful information.
>>>  You raise good questions that I think need 
>>> more discussion.  I am hoping to keep the 
>>> work as simple as possible and not add 
>>> additional complexity, which argues for not 
>>> solving every aspect of the problem, but only 
>>> those that must be solved immediately.
>>>>   - Reusing a=lang
>>>>   Searching, I can only find these descriptions of the use of
>>>>   "a=lang:...":
>>>>       RFC 4566
>>>>       draft-saintandre-sip-xmpp-chat
>>>>       draft-gellens-negotiating-human-language
>>>>   So it looks like "a=lang:..." is entirely unused at the present and is
>>>>   safe to be redefined.
>>  _______________________________________________
>>  mmusic mailing list
>>  <>
>> <>

Randall Gellens
Opinions are personal;    facts are suspect;    I speak for myself only
-------------- Randomly selected tag: ---------------
A diva who specializes in risque arias is an off-coloratura soprano...