Re: [MMUSIC] draft-gellens-negotiating-human-language-02
Flemming Andreasen <fandreas@cisco.com> Thu, 14 March 2013 13:57 UTC
Message-ID: <5141D725.3030706@cisco.com>
Date: Thu, 14 Mar 2013 09:56:53 -0400
From: Flemming Andreasen <fandreas@cisco.com>
To: Gunnar Hellstrom <gunnar.hellstrom@omnitor.se>
References: <p0624060ecd63af26fe28@dhcp-42ec.meeting.ietf.org> <513E504F.1010209@omnitor.se>
In-Reply-To: <513E504F.1010209@omnitor.se>
Cc: mmusic@ietf.org
Subject: Re: [MMUSIC] draft-gellens-negotiating-human-language-02
On 3/11/13 5:44 PM, Gunnar Hellstrom wrote:

> Before this discussion got its home in mmusic, we discussed topics
> quite similar to the ones you and Dale have brought up now.
>
> It was about what needed to be expressed by the parameters, and
> whether SDP or SIP was the right place. And in the case of SIP,
> whether RFC 3840/3841 could be a suitable mechanism for routing and
> decisions on the parameters.

I agree more discussion is needed on this. There seem to be two
problems considered in the draft:

1) Routing of a request to an answerer that has the language
   capabilities the caller desires.

2) Negotiation of the language properties to use on a per-stream
   basis once the call has been routed to a particular answerer.

Problem 1 seems to fall in the RFC 3840/3841 space, whereas problem 2
is more of an SDP issue.

-- Flemming

> Here is part of that discussion that we need to capture.
>
> I see some complications that might be needed in order to reflect
> reality. At least they should be discussed. I also see some
> different ways to specify it.
>
> The complications to discuss are:
>
> *1. Level of preference.*
>
> There may be a need for specifying levels of preference for
> languages. I might strongly prefer to talk English, but have some
> useful capability in French. I want to indicate that preference and
> that capability, with that difference, so that I get English
> whenever possible, but get the call connected even if English is
> not available at all and French is.
>
> I would assume that two levels are sufficient, but that can be
> discussed: preferred and capable.
>
>> The draft already proposes that languages be listed in order of
>> preference, which should handle the example you mention: you list
>> English first and French second. The called party selects English
>> if it is capable and falls back to French if English is not
>> supported and French is. This seems much simpler and is a common
>> way of handling situations where there is a preference. It would
>> be good to keep the mechanism as simple as possible.
>>
>> Yes, I am afraid of complicating this beyond the point where users
>> no longer manage to get their settings right. Still, I do not
>> think that order is sufficient as a level-of-preference indicator.
>> You may want to indicate capability for one modality but
>> preference for another (as in my example: capability for ASL, but
>> preference for talking and reading).
>
> If you have a capability for ASL but a preference for talking and
> reading, you could initially offer two media streams: voice with
> English and text with English. If accepted, you have your preferred
> communications. If those are rejected, you could then offer video
> with ASL. Would that handle the case?
>
> No, video is still very valuable for judging the emergency case. Or
> seeing a friend. So, if you support it, you want to offer it. But
> the decision on languages and modalities may end up with video not
> being important for language communication.
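For concreteness, under the draft's alternative of reusing the
existing 'lang' attribute, the English-preferred/French-fallback case
above might appear in an offer as follows (a minimal sketch; RFC 4566
already defines repeated lang attributes as ordered from most to
least important, while the negotiation semantics are what the draft
would add):

    m=audio 49170 RTP/AVP 0
    a=lang:en
    a=lang:fr

The answerer would pick the first listed language it supports. That
handles simple preference ordering, but, as noted above, it cannot by
itself express "capable but not preferred" for one modality alongside
"preferred" for another.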
>>>>> *2. Directionality*
>>>>>
>>>>> There is a need for a direction of the language preference:
>>>>> "transmit, receive, or both", or "produce, perceive, or both".
>>>>> That is easy to understand for the relay service examples.
>>>>> A hard-of-hearing user may declare:
>>>>>
>>>>> Text, capable, produce, English
>>>>> Text, prefer, perceive, English
>>>>> Audio, prefer, produce, English
>>>>> Audio, capable, perceive, English (tricky: a typical
>>>>> hard-of-hearing user may benefit from receiving audio, while it
>>>>> is not usable enough for reliable perception. I do not want to
>>>>> make this eternally complex, but I see a need for refined
>>>>> expressions here)
>>>>> Video, capable, both, ASL
>>>>>
>>>>> This should be understood as: the user prefers to speak and get
>>>>> text back, and benefits from getting voice in parallel with
>>>>> text. ASL signing can be an alternative if the other party has
>>>>> a corresponding capability or preference.
>>>>
>>>> The draft does support this (and even mentions some of these
>>>> specific uses) because it proposes an SDP media attribute, and
>>>> media can be specified to be send, receive, or both.
>>>
>>> No, that is not the same. You want the media to flow, but with
>>> the parameter you want to indicate your preference for how to use
>>> it. You do not want to turn off incoming audio just because you
>>> prefer to talk but read text.
>>
>> Yes, I see, thanks for the clarification. Does this need to be
>> part of the session setup? If you establish all the media streams
>> that you wish to use, can you then just use them as you prefer? I
>> will consult with the NENA accessibility committee on this.
>
> No, there are specific services that provide service in one
> direction but not the other. The information is needed to decide
> what assisting service to invoke. One such service is captioned
> telephony, which adds rapidly created speech-to-text in parallel
> with the voice. They provide just that. A user will have a very
> strong preference for getting exactly that service, but could
> accept, with much lower preference, a direct conversation with the
> far end in combined text and voice.
>
>>>>> I think it would be useful to move most of the introduction to
>>>>> a structured use case chapter and express the different cases
>>>>> according to a template. That can then be used to test whether
>>>>> proposed approaches will work.
>>>>
>>>> I'm not sure I fully understand what you mean by "structured" in
>>>> "structured use case" or "template." Can you be more specific?
>>>
>>> I mean just a simple template for how the use case descriptions
>>> are written. E.g.:
>>>
>>> A title indicating what case we have.
>>> Description of the calling user and its capabilities and
>>> preferences.
>>> Description of the answering user and its capabilities and
>>> preferences.
>>> Description of a possible assisting service and its capabilities
>>> and preferences.
>>> Description of the calling user's indications.
>>> Description of the answering user's indications.
>>> The resulting decision and outcome.
>>>
>>>>> *3. Specify language and modality at SIP media tag level
>>>>> instead.*
>>>>>
>>>>> There could be some benefits to declaring these parameters at
>>>>> the SIP media tag level instead of the SDP level. A call center
>>>>> can then register its capabilities already at SIP REGISTER
>>>>> time, and the caller preferences / callee capabilities
>>>>> mechanism from RFC 3840/3841 can be used to select modalities
>>>>> and languages and route the call to the best capable person, or
>>>>> combination of person and assisting interpreting.
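As a sketch of how Gunnar's point 3 might look on the wire, assuming
the 'language' media feature tag (RFC 2987) were used with the RFC
3840/3841 machinery (the "+" prefix is how RFC 3840 encodes feature
tags from outside the base SIP set; whether that tag is the right
vehicle is exactly the open question here):

    REGISTER sip:callcenter.example.com SIP/2.0
    Contact: <sip:agent7@198.51.100.7>
      ;audio;video;text;+language="en,ase"

    INVITE sip:help@callcenter.example.com SIP/2.0
    Accept-Contact: *;+language="ase";require;explicit

A proxy implementing RFC 3841 could then route the INVITE to a
registered agent whose declared capabilities match, without
inspecting any SDP.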
>>>> Maybe, but one advantage of using SDP is that the ESInet can
>>>> take language and media needs into account during policy-based
>>>> routing. For example, in some European countries, emergency
>>>> calls placed by a speaker of language x in country y may be
>>>> routed to a PSAP in a country where x is the native language.
>>>> Or, there might be regional or national text relay or sign
>>>> language interpreter services, as opposed to PSAP-level
>>>> capabilities.
>>>
>>> Is there a complete specification for how policy-based routing is
>>> thought to work? Where? Does it not use RFC 3840/3841? That
>>> procedure is already supported by SIP servers; using SDP requires
>>> new SIP server programming.
>>
>> NENA has a document under development. I thought it was able to
>> take SDP into account, but I'll look into it, and I'm sure Brian
>> will have something to say.
>
> Yes, I think I have seen that. But it needs to come into the IETF
> to be possible to refer to.
>
>>>>> But, on the other hand, then we need a separate specification
>>>>> of what modality the parameters indicate, because the language
>>>>> tags only distinguish between signed and other languages, and
>>>>> "other" seems to mean either spoken or written, without any
>>>>> difference.
>>>>
>>>> The SDP media already indicates the type (audio, video, text).
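In other words, the modality that the language tag alone cannot
convey comes from the m= line. A fragment combining the two (a sketch
only; "ase" is the ISO 639-3 tag for American Sign Language):

    m=audio 49170 RTP/AVP 0
    a=lang:en
    m=video 51372 RTP/AVP 31
    a=lang:ase
    m=text 49172 RTP/AVP 98
    a=rtpmap:98 t140/1000
    a=lang:en

Spoken and written English are both just "en"; they are distinguished
only by appearing under m=audio versus m=text.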
>>> Yes, convenient. But there is no knowledge about the parameters
>>> until call time. It could be better to know the callee
>>> capabilities in advance, if available. Then middleboxes can do
>>> the routing instead of the far end. There may be many terminals
>>> competing for the call, and the comparison of who should get it
>>> should be done by a SIP server instead of an endpoint.
>>
>> I think call time is the right time. For emergency calls, it
>> isolates the decision making about how to process calls requiring
>> text, sign language, foreign language, etc. to the ESInet and
>> PSAPs, which I think is the right place. The processing rules in
>> the ESInet can then be changed without involving any carrier. The
>> capabilities of an entity may vary based on dynamic factors (time
>> of day, load, etc.), so the decision as to how to support a need
>> may best be made by the ESInet or PSAP in the case of an emergency
>> call, or by the called party for non-emergency calls. For example,
>> at some times or under some loads, emergency calls may be routed
>> to a specific PSAP that is not the geographically indicated one.
>> Likewise, a non-emergency call to a call center may be routed to a
>> center in a country that has support for the language or media
>> needed.
>
> The decision is of course made at call time. With the RFC 3840/3841
> method, the different agents and services available register their
> availability and capabilities when they go on duty, and unregister
> when they stop, so that their information is available at call
> time.
>
>> Further, it is often the case that the cost of relay,
>> interpretation, or translation services is affected by which
>> entity invokes the service.
>
> Yes, that is a complicating policy issue.
>
>>>>> *4. Problem: 3GPP specifies that it is only the UAs who specify
>>>>> and act on these parameters.*
>>>>>
>>>>> I think it is a problem that 3GPP inserted the restriction that
>>>>> the language and modality negotiation shall only involve the
>>>>> participating UAs. It would be more natural for a service
>>>>> provider between them to detect the differences and make the
>>>>> decision to invoke a relay service for the relay case.
>>>>> How do you propose to solve that? Let the service provider
>>>>> behave as a B2BUA, which can then act as both a UA and a
>>>>> service provider?
>>>>
>>>> What do you mean by "service provider"? In the case of a voice
>>>> service provider such as a cellular carrier or a VoIP provider,
>>>> I think this should be entirely transparent. The voice service
>>>> provider knows it is an emergency call and routes to an ESInet.
>>>> It is then up to the ESInet and the PSAPs to handle the call as
>>>> they wish.
>>>
>>> It can be a service provider for just the function of making
>>> advanced call invocation based on language preferences. The same
>>> types of decisions, call connections, and assisting service
>>> invocations are needed in everyday calls as in emergency calls.
>>> But it can also be a service provider for emergency services with
>>> whom the user is registered. They can make decisions on the call,
>>> e.g., detect that it is an emergency call requiring an
>>> interpreter, and therefore connect to both the PSAP and the
>>> interpreter at the same time, to save time.
>>
>> I think it's best to make these decisions at the ends, not in the
>> middle. In the case of emergency calls, the ESInet can route to a
>> particular PSAP, the PSAP may bridge in translation or
>> interpretation services, etc. In the case of non-emergency calls,
>> the call center may support some capabilities locally at some
>> hours but route to a different call center at other times.
>
> The end is not decided until you have evaluated the alternative
> possible ends and decided who has the right capability and
> preference.
>
> There is another issue with using SDP for decisions. SIP MESSAGE is
> included in the set of methods to handle in emergency calls in RFC
> 6443. It can be used within sessions to carry text messages if
> other media are used as well. It is not a favored way to have text
> communication, but it is possible. SIP MESSAGE has no SDP. I know
> that the 3GPP sections about emergency calling in TS 22.101 point
> towards using MSRP for text messaging, so it should not be an issue
> for 3GPP. Can we exclude SIP MESSAGE from the discussion and aim at
> solving this only for real-time conversational media? I am not
> urging that it be solved for SIP MESSAGE; I just wanted to point
> out that consequence of basing the mechanism on SDP.
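To make that gap concrete: a MESSAGE request (RFC 3428) carries its
text directly, with no SDP at all, so the only standard language
indication available would be a header describing the body, such as
Content-Language (a sketch; the PSAP URI is illustrative):

    MESSAGE sip:psap@example.net SIP/2.0
    Content-Type: text/plain
    Content-Language: en

    Help is needed.

An SDP-based negotiation attribute simply never comes into play for
this method.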
> Will there be a possibility for remote participation on Thursday?
> I am sorry I am not there, but would like to participate if
> possible.
>
> /Gunnar
>
> ------------------------------------------------------------------------
> Gunnar Hellström
> Omnitor
> gunnar.hellstrom@omnitor.se
> +46708204288
>
> On 2013-03-11 16:57, Randall Gellens wrote:
>
>> [[[ resending without Cc list ]]]
>>
>> Hi Dale,
>>
>> At 11:00 AM -0500 2/25/13, Dale R. Worley wrote:
>>
>>> (It's not clear to me what the proper mailing list is to discuss
>>> this draft. From the headers of the messages, it appears that the
>>> primary list is ietf@ietf.org, but the first message in this
>>> thread about the draft already has a "Re:" in the subject line,
>>> so the discussion started somewhere else.)
>>
>> There has been some discussion among those listed in the Cc header
>> of this message. I think the mmusic list is probably the right
>> place to continue the discussion, and I was planning on doing so
>> more formally with the next revision of the draft.
>>
>> By the way, the draft was updated and is now at -02:
>> http://www.ietf.org/internet-drafts/draft-gellens-negotiating-human-language-02.txt
>>
>> There is a face-to-face discussion Thursday 11:30-1:00 at The
>> Tropicale (the cafe in the Caribe Royale). Please let me know if
>> you can make it.
>>
>>> (Also, it's not clear why Randall's messages are coming through
>>> in HTML.)
>>
>> My apologies; I have gotten into the habit, when replying to
>> messages that have style, of allowing Eudora to send my reply
>> styled as well.
>>
>>> But onward to questions of substance:
>>>
>>> - Why SDP and not SIP?
>>>
>>> I'd like to see a more thorough exploration of why language
>>> negotiation is to be handled in SDP rather than SIP. (SIP, like
>>> HTTP, uses the Content-Language header to specify languages.) In
>>> principle, specifying data that may be used in call routing
>>> should be done at the SIP layer, but it is well accepted in the
>>> SIP world that call routing may be affected by the SDP content as
>>> well (e.g., media types).
>>
>> I think it fits more naturally in SDP, since the language is
>> related to the media, e.g., English for audio and ASL for video.
>>
>>> And some discussion and comparison should be done with the
>>> SIP/HTTP Content-Language header (used to specify the language of
>>> the communications) and the SIP Accept-Language header (used to
>>> specify the language of text components of SIP messages),
>>> particularly given that Accept-Language has a different set of
>>> language specifiers and a richer syntax for specifying
>>> preferences. In any case, preference should be given to reusing
>>> one of the existing syntaxes for specifying language preferences.
>>
>> I think the semantics of Content-Language and Accept-Language are
>> different from the semantics here, especially when setting up a
>> session with, as an example, an audio stream using English and a
>> video stream using ASL. (But I can see clients using a default
>> value to set both the SDP language attribute and the HTTP
>> Content-Language, unless configured differently.)
>>
>> As for reusing existing mechanisms, the draft does contain two
>> alternative proposals: one to reuse the existing 'lang' SDP
>> attribute, and one to define a new attribute.
>>
>>> - Dependency between media descriptions?
>>>
>>> Another example would be a user who is able to speak but is deaf
>>> or hard-of-hearing and requires a voice stream plus a text stream
>>> (known as voice carry over). Making language a media attribute
>>> allows the standard session negotiation mechanism to handle this
>>> by providing the information and mechanism for the endpoints to
>>> make appropriate decisions.
>>>
>>> This scenario suggests that there might be dependency or
>>> interaction between language specifications for different media
>>> descriptions. Whether this is needed should be determined and
>>> documented.
>>>
>>> - Specifying preference levels?
>>>
>>> For example, some users may be able to speak several languages,
>>> but have a preference.
>>>
>>> This might argue for describing degrees of preference using "q"
>>> parameters (as in the SIP Accept-Language header).
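For comparison, the Accept-Language syntax Dale refers to carries
explicit weights (q-values between 0 and 1, with 1 as the default),
rather than relying on listing order alone:

    Accept-Language: en, fr;q=0.3

This expresses "English at full preference, French accepted at much
lower preference", which is richer than, though mappable onto, the
ordered list in the SDP proposal.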
>>> - Expressing multiple languages in answers
>>>
>>> (While it is true that a conversation among multilingual people
>>> often involves multiple languages, it does not seem useful enough
>>> as a general facility to warrant complicating the desired
>>> semantics of the SDP attribute to allow negotiation of multiple
>>> simultaneous languages within an interactive media stream.)
>>>
>>> Why shouldn't an answer be able to indicate multiple languages?
>>> At the least, this might provide the offerer with useful
>>> information.
>>
>> You raise good questions that I think need more discussion. I am
>> hoping to keep the work as simple as possible and not add
>> additional complexity, which argues for not solving every aspect
>> of the problem, but only those that must be solved immediately.
>>
>>> - Reusing a=lang
>>>
>>> Searching, I can only find these descriptions of the use of
>>> "a=lang:...":
>>>
>>> RFC 4566
>>> draft-saintandre-sip-xmpp-chat
>>> draft-gellens-negotiating-human-language
>>>
>>> So it looks like "a=lang:..." is entirely unused at present and
>>> is safe to be redefined.

_______________________________________________
mmusic mailing list
mmusic@ietf.org
https://www.ietf.org/mailman/listinfo/mmusic