Re: [MMUSIC] Last Call: <draft-ietf-mmusic-rfc4566bis-34.txt> (SDP: Session Description Protocol) to Proposed Standard

Paul Kyzivat <pkyzivat@alum.mit.edu> Fri, 12 April 2019 15:45 UTC

To: mmusic@ietf.org
References: <155326592304.23020.8256337045285295468.idtracker@ietfa.amsl.com> <HE1PR0701MB2522795AFC19AB4A73D65ABA952F0@HE1PR0701MB2522.eurprd07.prod.outlook.com> <6E58094ECC8D8344914996DAD28F1CCD18CDB636@dggemm526-mbx.china.huawei.com> <HE1PR0701MB2522339DBB98ACE3399DB7B295280@HE1PR0701MB2522.eurprd07.prod.outlook.com>
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
Message-ID: <c0859cc2-dd12-d876-6ea6-415c928d90d3@alum.mit.edu>
Date: Fri, 12 Apr 2019 11:45:24 -0400
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:60.0) Gecko/20100101 Thunderbird/60.6.1
MIME-Version: 1.0
In-Reply-To: <HE1PR0701MB2522339DBB98ACE3399DB7B295280@HE1PR0701MB2522.eurprd07.prod.outlook.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/mmusic/-8tMey4ZCXJ6fpoziHVbvUeanXk>

On 4/12/19 4:59 AM, Magnus Westerlund wrote:
> Hi,
> 
> To be clear, I am not proposing that any existing attribute would be 
> forced to suddenly accept UTF-8 strings with no charset restriction. 
> Simply that new attribute's values can be defined to be UTF-8 in 
> general. I think the important distinction here is that a parser must be 
> ready to accept any UTF-8 character as the sender can't know if a 
> particular charset limitation applies to a specific consumer. However, 
> in form I don't see a problem of the SDP itself informing the consumer 
> that there is a charset limitation applied in this document.
> 
> My interpretation of the current situation is that we hesitate to use 
> UTF-8 fields due to the uncertainty in the requirements on consumers of 
> SDP.

After re-reviewing the text, I think we may have a can of worms here 
that has existed for a long time, perhaps from the beginning. Let me explain:

Section 5 says:

    An SDP description is entirely textual.  SDP field names and
    attribute names use only the US-ASCII subset of UTF-8, but textual
    fields and attribute values MAY use the full ISO 10646 character set
    in UTF-8 encoding, or some other character set defined by the
    "a=charset:" attribute.

Section 6.10 (charset attribute) says:

    The charset specified MUST be one of those registered in the IANA
    Character Sets registry (http://www.iana.org/assignments/character-
    sets), such as ISO-8859-1.
    ...
    Note that a character set specified MUST still prohibit the use of
    bytes 0x00 (Nul), 0x0A (LF), and 0x0d (CR).  Character sets requiring
    the use of these characters MUST define a quoting mechanism that
    prevents these bytes from appearing within text fields.

The charsets registry has a *lot* of entries, including many ancient, 
obsolete ones not even vaguely related to ASCII (e.g., the EBCDIC-related 
ones). Many of these refer to RFC1345, which seems to be a pre-Unicode 
attempt to rationalize character sets. The charsets registry also includes 
UTF-16 and UTF-32. It is really hard to understand how certain parts of 
an SDP body might be in UTF-8 while other parts are in UTF-32.
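
To make the UTF-16/UTF-32 problem concrete, here is a minimal Python 
sketch. It is only an illustration, not anything from the draft: the 
function name forbidden_bytes and the sample value "Lecture" are made up. 
It reports which of the bytes prohibited by Section 6.10 appear when a 
plain ASCII value is encoded in a few registered charsets.

```python
# Bytes that Section 6.10 says must not appear within text fields.
FORBIDDEN = {0x00: "NUL", 0x0A: "LF", 0x0D: "CR"}


def forbidden_bytes(text, charset):
    """Return names of the prohibited bytes present in text encoded as charset."""
    encoded = text.encode(charset)
    return sorted({FORBIDDEN[b] for b in encoded if b in FORBIDDEN})


value = "Lecture"  # hypothetical attribute value, plain ASCII only
for charset in ("utf-8", "iso-8859-1", "utf-16-be", "utf-32-be"):
    print(charset, forbidden_bytes(value, charset))
```

Under these assumptions, UTF-8 and ISO-8859-1 produce none of the 
prohibited bytes, while UTF-16 and UTF-32 produce NUL bytes for every 
ASCII character, so even an ordinary value would violate the rule.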

I can't make any sense of:

    Character sets requiring
    the use of these characters MUST define a quoting mechanism that
    prevents these bytes from appearing within text fields.

AFAIK character sets don't define quoting mechanisms. Also, I don't know 
what it means for a charset to require use of particular characters. And 
this text is very sloppy in muddling the use of "character" and "byte".
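
The character/byte distinction can be shown with a small Python sketch. 
Again this is only an illustration under my own assumptions: the helper 
name has_line_break_bytes is made up, and U+0D0A (a Malayalam letter) is 
chosen simply because its code point happens to spell out the CR and LF 
byte values.

```python
def has_line_break_bytes(text, charset):
    """True if the encoded form contains a 0x0A or 0x0D byte."""
    encoded = text.encode(charset)
    return 0x0A in encoded or 0x0D in encoded


ch = "\u0d0a"  # one character; it is neither CR nor LF

# No CR or LF *character* is present in the string.
assert "\r" not in ch and "\n" not in ch

# Yet in UTF-16 (big-endian) the *bytes* are exactly 0x0D 0x0A.
print(ch.encode("utf-16-be").hex(), has_line_break_bytes(ch, "utf-16-be"))

# In UTF-8 the same character encodes without either byte.
print(ch.encode("utf-8").hex(), has_line_break_bytes(ch, "utf-8"))
```

So a rule phrased in terms of bytes and a rule phrased in terms of 
characters give different answers for the same text, depending on the 
charset in use.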

The bottom line is that this is a mess. I'm not sure if it ever could 
have worked. Nor do I understand what usage it was trying to 
accommodate. (Note that most of this stuff is largely unchanged since 
RFC2327.)

Perhaps all the charset-dependent stuff should be obsoleted.

	Thanks,
	Paul