Re: [MMUSIC] draft-gellens-mmusic-negotiating-human-language-01

On 2013-07-24 02:05, Paul Kyzivat wrote:
> On 7/23/13 6:35 PM, Gunnar Hellstrom wrote:
>>
>> On 2013-07-23 20:00, Paul Kyzivat wrote:
>>> I just reviewed this new version. I have a number of comments:
>> Good comments Paul,
>>>
>>> * Section 5:
>>>
>>> You don't go into the details of the O/A rules. But I infer you assume
>>> that the answer will be a subset of the offer, with the priorities
>>> possibly changed. This is a common strategy with SDP O/A, but it is
>>> limiting because it doesn't allow the answerer to indicate
>>> capabilities it has that weren't mentioned by the offerer. Knowing
>>> that could be useful, especially if a mismatch is going to result in
>>> recruiting a relay service.
>>>
>>> I suggest that the answer allow the answerer to indicate all of its
>>> capabilities and preferences, and at the same time indicate choices
>>> made. (If a choice was made.)
>> <GH>Yes, very useful to indicate all.
>>>
>>> * Section 7.1:
>>>
>>> Instead of humanintlang-send and humanintlang-recv, how about just
>>> giving the lang tag and a send/recv/sendrecv parameter with it? This
>>> would be more concise when sendrecv can be used.
>>>
>>> Suggest using a parameter (q) to indicate relative preference for
>>> languages, rather than ordering. This can show degree of preference,
>>> and can show when multiple languages have the same preference.
>>>
>> <GH>Have you seen my request for an indication of level of preference
>> between the different modalities? A possibility to indicate, for example,
>> that British Sign Language is highly preferred, but English text is
>> possible.
>> The q-values could be used to code such a relative level of
>> preference even between modalities: q=1 for BSL in video, and q=0.1
>> for English in text. Right?
>
> That would work within a single medium.
>
> But I don't think we can use that mechanism to trade off between 
> different media. For that I think some other, more complex, mechanism 
> would be needed.
>
> Or else we would need a more complex definition of which things are 
> being prioritized against one another in order to use the q-values.
I proposed a simpler mechanism in a recent mail: append a single
character to the humintlang tag, "!" for highly preferred, nothing for
manageable, and "#" for a non-preferred but possible fall-back language
and modality.
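
As a rough illustration of that three-level marker, a receiver could split
the trailing character off the tag as sketched below. This is only a sketch
under assumptions: the function name split_preference and the level names
are made up for the example, and the exact attribute syntax is not settled.

```python
from typing import Tuple

# Hypothetical sketch: split a trailing preference marker off a language tag.
# "!" = highly preferred, no marker = manageable, "#" = fall-back.
# The level names and the function name are assumptions for illustration.
MARKERS = {"!": "preferred", "#": "fallback"}


def split_preference(tag: str) -> Tuple[str, str]:
    """Return (language_tag, level) for a tag with an optional marker."""
    tag = tag.strip()
    if tag and tag[-1] in MARKERS:
        return tag[:-1], MARKERS[tag[-1]]
    # No marker means the language is manageable.
    return tag, "manageable"
```

For example, "bfi!" (British Sign Language) would come out as highly
preferred and "en#" as a fall-back.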

I think it will be very important for most users to have a chance to 
indicate this three-level preference order between different media as 
well as between different languages within one medium.

I am a bit hesitant about the type of coding I proposed, and your 
proposal of q-values looks cleaner. When coding, you could make 
sure that the last-resort medium gets q-values under 0.4, the strongly 
preferred ones between 0.7 and 1, and the manageable ones between 0.4 and 0.6.
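
A minimal sketch of how those q-value bands could be mapped back to the
three levels. The function name and level names are assumptions for the
example. The bands above leave the range between 0.6 and 0.7 open, so the
sketch reports it as "unspecified" rather than guessing.

```python
def preference_level(q: float) -> str:
    """Map a q-value to one of the three preference levels.

    Bands follow the suggestion in this mail: under 0.4 is last resort,
    0.4 to 0.6 is manageable, 0.7 to 1 is strongly preferred.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if q < 0.4:
        return "fallback"
    if q <= 0.6:
        return "manageable"
    if q < 0.7:
        # Not covered by the proposed bands; an assumption of this sketch.
        return "unspecified"
    return "preferred"
```

With this, q=1 for BSL in video maps to "preferred" and q=0.1 for English
in text maps to "fallback".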

You can see the earlier motivation for the preference between modalities 
in the archive at:
http://www.ietf.org/mail-archive/web/mmusic/current/msg11871.html

>
>>> Example:
>>>
>>> a=humanintlang:EN;sendrecv;q=1,ES;recv;q=.5
>>>
>>> To accommodate my comment about section 5, perhaps the answer can
>>> "mark" the alternative(s) that have been chosen for use, if any. (I'm
>>> not sure I like this, but for now, let's assume a "*" suffix on the
>>> direction in the answer.) E.g., assuming the offer above, then an
>>> answer like:
>>>
>>> a=humanintlang:EN;sendrecv*;q=1,FR;recv;q=.5
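
The example syntax is simple enough to parse mechanically. A minimal
sketch, assuming comma-separated entries of the form tag;direction;q=value
and a "*" suffix on the direction to mark a chosen alternative. The names
LangEntry and parse_humanintlang are hypothetical, and the default of q=1
when no q parameter is present is an assumption, not something stated in
this thread.

```python
from typing import List, NamedTuple


class LangEntry(NamedTuple):
    tag: str         # language tag, e.g. "EN"
    direction: str   # "send", "recv" or "sendrecv"
    q: float         # relative preference
    chosen: bool     # True if the direction carried a "*" suffix


def parse_humanintlang(line: str) -> List[LangEntry]:
    """Parse one 'a=humanintlang:' line in the syntax suggested here."""
    prefix = "a=humanintlang:"
    if not line.startswith(prefix):
        raise ValueError("not a humanintlang attribute line")
    entries = []
    for item in line[len(prefix):].split(","):
        parts = [p.strip() for p in item.split(";")]
        if len(parts) < 2:
            raise ValueError("entry needs a tag and a direction: " + item)
        tag, direction = parts[0], parts[1]
        chosen = direction.endswith("*")
        direction = direction.rstrip("*")
        q = 1.0  # assumed default when no q parameter is given
        for param in parts[2:]:
            if param.startswith("q="):
                q = float(param[2:])
        entries.append(LangEntry(tag, direction, q, chosen))
    return entries
```

Parsing the offer example gives EN (sendrecv, q=1) and ES (recv, q=0.5),
neither marked as chosen. An answer entry such as EN;sendrecv*;q=1 comes
back with chosen set.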
>>>
>>> Also in this section, the following:
>>>
>>>    Note that while signed language tags are used with a video stream to
>>>    indicate sign language, a spoken language tag for a video stream in
>>>    parallel with an audio stream with the same spoken language tag
>>>    indicates a request for a supplemental video stream to see the
>>>    speaker.
>>>
>>> might be ambiguous in the presence of many streams. It might be good
>>> to define a grouping framework (RFC 5888) usage to group an audio
>>> stream with a video stream that contains the speaker(s) of the audio.
>> <GH>This is both a threat and a solution. There are already other
>> reasons to group media in SDP. We risk ending up with very
>> complex coding in SDP to get this correctly specified in an environment
>> where there are also other reasons to group m-lines. This is why I am
>> still unsure whether the whole thing would not fit better as tags at the
>> SIP level.
>
> I notice Henning just asked about this today, in a different forum.
>
> Perhaps some things are better done at the sip level, and others at 
> the SDP level.
>
> It's also possible that correlations between different media streams 
> should be left to other mechanisms. E.g., CLUE could signal better 
> correlations between audio and video.
Yes, but here we have three media: video, audio and text. CLUE needs to 
take text on board.

But you are right, if this mechanism is moved to the SIP level, then we get 
the complex task of pointing out which m-lines of a given medium the 
humintlang tags for that medium relate to. Simplifications need to 
be made, so that we do not end up specifying another CLUE.

I think that part of the background for this draft is that the mechanism 
should be usable in 3GPP Multimedia Telephony. That might be a reason 
why it ended up being specified in SDP.

Gunnar
>
>     Thanks,
>     Paul
>
>>>
>>> Also in this section, the following:
>>>
>>>    Clients acting on behalf of end users are expected to set one or both
>>>    'humintlang-send' and 'humintlang-recv' attributes on each media
>>>    stream in an offer when placing an outgoing session but ignore the
>>>    attributes when receiving incoming calls.  Systems acting on behalf
>>>    of call centers and PSAPs are expected to take into account the
>>>    values when processing inbound calls.
>>>
>>> Are you intending to exclude clients taking a more active role? (E.g.,
>>> Isn't it possible that a client receiving an incoming call in a
>>> language it doesn't support might bridge in a relay service?)
>> <GH>I have asked for deletion of this limiting paragraph.
>>>
>>> Should a client that can't do anything useful with the language info
>>> still indicate its language preferences in the answer? I would think
>>> this would still be helpful. It might allow the caller, or a
>>> middlebox, to compensate in some way. What I suggest above allows that.
>> <GH>Yes, I prefer that view. The answering party shall answer if it can.
>> Middleboxes or the offering party may invoke services to
>> cover the gaps.
>>>
>>> * Section 7.2 (Advisory vs Required):
>>>
>>> Rather than a "*" suffix, it would be cleaner to have a separate indicator.
>>> And it seems like this should not be on a particular language tag. If
>>> you speak both Spanish and English, you may want the call to succeed
>>> if the callee can do one or the other, and otherwise fail. A tag per
>>> language doesn't do that.
>>
>> <GH>Yes, tricky.  I am getting a feeling that we should start a draft
>> describing usage in parallel to this one, so that use cases can be
>> verified.
>>
>>>
>>> Need more discussion for how to denote this.
>>>
>>>     Thanks,
>>>     Paul
>>>
>> Gunnar
>>> _______________________________________________
>>> mmusic mailing list
>>> mmusic@ietf.org
>>> https://www.ietf.org/mailman/listinfo/mmusic