Re: [MMUSIC] Last Call: <draft-ietf-mmusic-rfc4566bis-34.txt> (SDP: Session Description Protocol) to Proposed Standard

Paul Kyzivat <> Mon, 15 April 2019 23:15 UTC


On 4/15/19 4:02 AM, Magnus Westerlund wrote:
> Hi Paul,
> If I attempt to interpret what you are saying as a conclusion, it has
> several implications, or am I misunderstanding you?
> 1. Obsolete the a=charset attribute. The only other option would be to
> define a small set of attributes to which it applies, like "i=".

That is what I am suggesting. I can't see how it ever could have worked 
*in general* for arbitrary charset values. And at this point I don't see 
a need.

Alternatively, restrict the charsets which can work to those which have 
some benign values.

IMO the most troublesome charsets are those that are wider than one 
byte, like UTF-16 or UTF-32, since they would have to coexist with the 
rest of the SDP being the ASCII subset of UTF-8. Also troublesome are 
charsets not vaguely related to ASCII, like the variants of EBCDIC.
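
To make the concern concrete, here is a minimal Python sketch (the 
helper name forbidden_bytes and the sample strings are mine, chosen only 
for illustration; cp037 is one of Python's built-in EBCDIC codecs). It 
encodes short strings in several charsets and reports which of the 
bytes 0x00, 0x0A and 0x0D turn up in the result.

```python
# Byte values that SDP text fields must not contain: NUL, LF, CR.
FORBIDDEN = {0x00, 0x0A, 0x0D}

def forbidden_bytes(text: str, charset: str) -> set:
    """Return the forbidden byte values present in text encoded as charset."""
    return FORBIDDEN & set(text.encode(charset))

# Plain ASCII text is harmless in UTF-8 ...
print(forbidden_bytes("Hi", "utf-8"))
# ... but every ASCII character carries NUL bytes in UTF-16 and UTF-32.
print(forbidden_bytes("Hi", "utf-16-be"))
print(forbidden_bytes("Hi", "utf-32-be"))
# A single non-ASCII character (U+0D0A) encodes to the bytes 0x0D 0x0A
# in big-endian UTF-16, which a byte-oriented parser sees as CR LF.
print("\u0d0a".encode("utf-16-be") == b"\r\n")
# An EBCDIC variant does not even share the byte values of "a=".
print("a=".encode("cp037") == b"a=")
```

A parser that splits lines on CR/LF before it knows the charset would 
therefore break such a value apart before any charset-aware decoding 
could happen.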

> 2. Clarify in section 5 that using UTF-8 characters other than NUL, CR, LF in
> attribute values and textual fields is okay.

That would amount to limiting charset to UTF-8, which is the same as 
deprecating it.
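
For what it is worth, a UTF-8-only rule is easy to check at the byte 
level, because in UTF-8 the bytes 0x00, 0x0A and 0x0D only ever occur as 
the characters NUL, LF and CR themselves (every byte of a multi-byte 
sequence is 0x80 or above). A rough Python sketch of such a check 
follows; the function name is_acceptable_value is hypothetical and not 
taken from the draft.

```python
RESERVED = (0x00, 0x0A, 0x0D)  # NUL, LF, CR

def is_acceptable_value(raw: bytes) -> bool:
    """True if raw is well-formed UTF-8 and contains no NUL, LF or CR."""
    try:
        raw.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return not any(b in RESERVED for b in raw)

print(is_acceptable_value("Séance vidéo".encode("utf-8")))
print(is_acceptable_value(b"line1\r\nline2"))
print(is_acceptable_value(b"\xff\xfe"))
```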

> 3. Define how attribute values that may contain NUL, CR and LF use a
> specified escaping mechanism

Right now it says that the charset must define an escaping mechanism, 
without saying how that might happen. AFAIK *none* of the charsets 
define an escaping mechanism.

But I also don't see how one could define such a mechanism independent 
of the charset since it must presumably reserve some character(s) from 
that charset to introduce the escape.

> 4. Define an escaping mechanism that applies to SDP and UTF-8 strings
> 5. Legacy warning in attempting to use escaping mechanism in old
> attributes.

Not sure what to say about 4 & 5.

> Is this a fair summary?


I'm hoping I'm missing something about how this was intended to work in 
the first place.


> Cheers
> Magnus
> On 2019-04-12 17:45, Paul Kyzivat wrote:
>> On 4/12/19 4:59 AM, Magnus Westerlund wrote:
>>> Hi,
>>> To be clear, I am not proposing that any existing attribute would be
>>> forced to suddenly accept UTF-8 strings with no charset restriction.
>>> Simply that new attribute's values can be defined to be UTF-8 in
>>> general. I think the important distinction here is that a parser must be
>>> ready to accept any UTF-8 character as the sender can't know if a
>>> particular charset limitation applies to a specific consumer. However,
>>> in form I don't see a problem of the SDP itself informing the consumer
>>> that there is a charset limitation applied in this document.
>>> My interpretation of the current situation is that we hesitate to use
>>> UTF-8 fields due to the uncertainty in the requirements on consumers of
>>> SDP.
>> After re-reviewing the text, I think we may have a can of worms here
>> that has existed a long time, perhaps from the beginning. Let me explain:
>> Section 5 says:
>>      An SDP description is entirely textual.  SDP field names and
>>      attribute names use only the US-ASCII subset of UTF-8, but textual
>>      fields and attribute values MAY use the full ISO 10646 character set
>>      in UTF-8 encoding, or some other character set defined by the
>>      "a=charset:" attribute.
>> Section 6.10 (charset attribute) says:
>>      The charset specified MUST be one of those registered in the IANA
>>      Character Sets registry (
>>      sets), such as ISO-8859-1.
>>      ...
>>      Note that a character set specified MUST still prohibit the use of
>>      bytes 0x00 (Nul), 0x0A (LF), and 0x0d (CR).  Character sets requiring
>>      the use of these characters MUST define a quoting mechanism that
>>      prevents these bytes from appearing within text fields.
>> The charsets registry has a *lot* of entries, including a lot of ancient
>> obsolete ones not vaguely related to ASCII. (E.g., EBCDIC-related ones.)
>> Many of these refer to RFC1345, which seems to be a pre-Unicode attempt
>> to rationalize character sets. The charsets registry also includes
>> UTF-16 and UTF-32. It is really hard to understand how certain parts of
>> an SDP body might be in UTF-8 while other parts are in UTF-32.
>> I can't make any sense of:
>>      Character sets requiring
>>      the use of these characters MUST define a quoting mechanism that
>>      prevents these bytes from appearing within text fields.
>> AFAIK character sets don't define quoting mechanisms. Also, I don't know
>> what it means for a charset to require use of particular characters. And
>> this text is very sloppy in muddling the use of "character" and "byte".
>> The bottom line is that this is a mess. I'm not sure if it ever could
>> have worked. Nor do I understand what usage it was trying to
>> accommodate. (Note that most of this stuff is largely unchanged since
>> RFC2327.)
>> Perhaps all the charset-dependent stuff should be obsoleted.
>> 	Thanks,
>> 	Paul