Re: [MMUSIC] Last Call: <draft-ietf-mmusic-rfc4566bis-34.txt> (SDP: Session Description Protocol) to Proposed Standard

Paul Kyzivat <pkyzivat@alum.mit.edu> Mon, 15 April 2019 23:15 UTC


On 4/15/19 4:02 AM, Magnus Westerlund wrote:
> Hi Paul,
> 
> If I attempt to interpret what you are saying as a conclusion, it has
> several implications, or am I misunderstanding you?
> 
> 1. Obsolete the a=charset attribute. The only other option would be to
> define a small set of attributes to which it applies, like "i=".

That is what I am suggesting. I can't see how it ever could have worked 
*in general* for arbitrary charset values. And at this point I don't see 
a need.

Alternatively, restrict the charsets which can work to those which have 
some benign values.

IMO the most troublesome charsets are those that are wider than one 
byte, like UTF-16 or UTF-32, since they would have to coexist with the 
rest of the SDP, which is in the ASCII subset of UTF-8. Also troublesome 
are charsets not even vaguely related to ASCII, like the variants of EBCDIC.
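
To make the coexistence problem concrete, here is a minimal Python sketch 
(my own illustration, not anything from the draft; the helper name 
forbidden_bytes is hypothetical). It encodes a text value in a given 
charset and reports which of the bytes that SDP prohibits (0x00, 0x0A, 
0x0D) show up in the encoded result:

```python
# Illustrative sketch only: which SDP-prohibited bytes appear when a text
# value is encoded in a given charset? The helper name is hypothetical.
FORBIDDEN = {0x00, 0x0A, 0x0D}  # NUL, LF, CR


def forbidden_bytes(text: str, charset: str) -> set:
    """Return the prohibited byte values present in `text` encoded as `charset`."""
    return set(text.encode(charset)) & FORBIDDEN


if __name__ == "__main__":
    for cs in ("utf-8", "iso-8859-1", "utf-16-be", "utf-32-be"):
        hits = sorted(forbidden_bytes("Seminar", cs))
        print(cs, [hex(b) for b in hits])
```

With the wide encodings, even plain ASCII text produces NUL bytes, and a 
character such as U+010A encodes in UTF-16 to a byte pair that contains 
0x0A although the character itself is not a line feed.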

> 2. Clarify in Section 5 that using UTF-8 characters without NUL, CR, LF in
> attribute values and textual fields is okay.

That would amount to limiting charset to UTF-8, which is the same as 
deprecating it.
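
As a rough sketch of what such a UTF-8-only rule would mean for a 
consumer (an assumption on my part about how it could be checked; the 
function name is hypothetical and this is not text from the draft), a 
value is acceptable if it decodes as UTF-8 and contains none of the 
three prohibited bytes:

```python
# Illustrative sketch only: accept a text field or attribute value if it is
# well-formed UTF-8 and free of NUL, LF, and CR. Function name is hypothetical.
FORBIDDEN = (0x00, 0x0A, 0x0D)  # NUL, LF, CR


def is_acceptable_utf8_value(raw: bytes) -> bool:
    """Return True if `raw` is valid UTF-8 with no NUL, LF, or CR bytes."""
    try:
        raw.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return not any(b in FORBIDDEN for b in raw)
```

In UTF-8 those three byte values only ever occur as the characters NUL, 
LF, and CR themselves, so a byte-level check is sufficient here. That 
property does not hold for arbitrary registered charsets.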

> 3. Define how attribute values that may contain NUL, CR and LF use a
> specified escaping mechanism

Right now it says that the charset must define an escaping mechanism, 
without saying how that might happen. AFAIK *none* of the charsets 
define an escaping mechanism.

But I also don't see how one could define such a mechanism independent 
of the charset since it must presumably reserve some character(s) from 
that charset to introduce the escape.
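
To illustrate the point about reserving a character, here is a purely 
hypothetical percent-style escape for UTF-8 values. Neither the draft nor 
any charset defines this; the choice of '%' and the function names are 
my assumptions for the sake of the example:

```python
# Hypothetical escape scheme, for illustration only. It reserves '%' (0x25)
# as the escape introducer, so '%' itself must also be escaped.
ESC = 0x25  # '%', an assumed choice of introducer
RESERVED = {0x00, 0x0A, 0x0D, ESC}  # NUL, LF, CR, and the introducer


def escape(raw: bytes) -> bytes:
    """Replace each reserved byte with '%' followed by two hex digits."""
    out = bytearray()
    for b in raw:
        if b in RESERVED:
            out += b"%%%02X" % b
        else:
            out.append(b)
    return bytes(out)


def unescape(escaped: bytes) -> bytes:
    """Reverse escape(); assumes the input is well formed."""
    out = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == ESC:
            out.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return bytes(out)
```

This only works because '%' is a single byte with a fixed meaning in 
UTF-8 and other ASCII-compatible charsets. In UTF-16 or an EBCDIC 
variant the byte 0x25 means something else entirely, which is why the 
mechanism cannot be specified independently of the charset.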

> 4. Define an escaping mechanism that applies to SDP and UTF-8 strings
> 
> 5. Legacy warning about attempting to use an escaping mechanism in old
> attributes.

Not sure what to say about 4 & 5.

> Is this a fair summary?

Whatever.

I'm hoping I'm missing something about how this was intended to work in 
the first place.

	Thanks,
	Paul

> Cheers
> 
> Magnus
> 
> 
> 
> On 2019-04-12 17:45, Paul Kyzivat wrote:
>> On 4/12/19 4:59 AM, Magnus Westerlund wrote:
>>> Hi,
>>>
>>> To be clear, I am not proposing that any existing attribute would be
>>> forced to suddenly accept UTF-8 strings with no charset restriction.
>>> Simply that new attribute's values can be defined to be UTF-8 in
>>> general. I think the important distinction here is that a parser must be
>>> ready to accept any UTF-8 character as the sender can't know if a
>>> particular charset limitation applies to a specific consumer. However,
>>> in form I don't see a problem of the SDP itself informing the consumer
>>> that there is a charset limitation applied in this document.
>>>
>>> My interpretation of the current situation is that we hesitate to use
>>> UTF-8 fields due to the uncertainty in the requirements on consumers of
>>> SDP.
>> After re-reviewing the text, I think we may have a can of worms here
>> that has existed a long time, perhaps from the beginning. Let me explain:
>>
>> Section 5 says:
>>
>>      An SDP description is entirely textual.  SDP field names and
>>      attribute names use only the US-ASCII subset of UTF-8, but textual
>>      fields and attribute values MAY use the full ISO 10646 character set
>>      in UTF-8 encoding, or some other character set defined by the
>>      "a=charset:" attribute.
>>
>> Section 6.10 (charset attribute) says:
>>
>>      The charset specified MUST be one of those registered in the IANA
>>      Character Sets registry (http://www.iana.org/assignments/character-
>>      sets), such as ISO-8859-1.
>>      ...
>>      Note that a character set specified MUST still prohibit the use of
>>      bytes 0x00 (Nul), 0x0A (LF), and 0x0d (CR).  Character sets requiring
>>      the use of these characters MUST define a quoting mechanism that
>>      prevents these bytes from appearing within text fields.
>>
>> The charsets registry has a *lot* of entries, including a lot of ancient
>> obsolete ones not even vaguely related to ASCII. (E.g., EBCDIC-related ones.)
>> Many of these refer to RFC1345, which seems to be a pre-Unicode attempt
>> to rationalize character sets. The charsets registry also includes
>> UTF-16 and UTF-32. It is really hard to understand how certain parts of
>> an SDP body might be in UTF-8 while other parts are in UTF-32.
>>
>> I can't make any sense of:
>>
>>      Character sets requiring
>>      the use of these characters MUST define a quoting mechanism that
>>      prevents these bytes from appearing within text fields.
>>
>> AFAIK character sets don't define quoting mechanisms. Also, I don't know
>> what it means for a charset to require use of particular characters. And
>> this text is very sloppy in muddling the use of "character" and "byte".
>>
>> The bottom line is that this is a mess. I'm not sure if it ever could
>> have worked. Nor do I understand what usage it was trying to
>> accommodate. (Note that most of this stuff is largely unchanged since
>> RFC2327.)
>>
>> Perhaps all the charset-dependent stuff should be obsoleted.
>>
>> 	Thanks,
>> 	Paul