Re: [MMUSIC] Last Call: <draft-ietf-mmusic-rfc4566bis-34.txt> (SDP: Session Description Protocol) to Proposed Standard

Paul Kyzivat <> Fri, 12 April 2019 15:45 UTC

On 4/12/19 4:59 AM, Magnus Westerlund wrote:
> Hi,
> To be clear, I am not proposing that any existing attribute would be 
> forced to suddenly accept UTF-8 strings with no charset restriction. 
> Simply that new attribute's values can be defined to be UTF-8 in 
> general. I think the important distinction here is that a parser must be 
> ready to accept any UTF-8 character as the sender can't know if a 
> particular charset limitation applies to a specific consumer. However, 
> in form I don't see a problem of the SDP itself informing the consumer 
> that there is a charset limitation applied in this document.
> My interpretation of the current situation is that we hesitate to use 
> UTF-8 fields due to the uncertainty in the requirements on consumers of 
> SDP.

After re-reviewing the text, I think we may have a can of worms here 
that has existed for a long time, perhaps from the beginning. Let me explain:

Section 5 says:

    An SDP description is entirely textual.  SDP field names and
    attribute names use only the US-ASCII subset of UTF-8, but textual
    fields and attribute values MAY use the full ISO 10646 character set
    in UTF-8 encoding, or some other character set defined by the
    "a=charset:" attribute.

Section 6.10 (charset attribute) says:

    The charset specified MUST be one of those registered in the IANA
    Character Sets registry, such as ISO-8859-1.
    Note that a character set specified MUST still prohibit the use of
    bytes 0x00 (Nul), 0x0A (LF), and 0x0d (CR).  Character sets requiring
    the use of these characters MUST define a quoting mechanism that
    prevents these bytes from appearing within text fields.

The charsets registry has a *lot* of entries, including many ancient, 
obsolete ones not even vaguely related to ASCII (e.g., the EBCDIC-related 
ones). Many of these refer to RFC 1345, which seems to be a pre-Unicode 
attempt to rationalize character sets. The charsets registry also includes 
UTF-16 and UTF-32. It is really hard to understand how certain parts of 
an SDP body might be in UTF-8 while other parts are in UTF-32.
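
To make the problem concrete, here is a rough Python sketch that encodes one text value under a few registered charsets. The session name is made up, `cp037` is simply one EBCDIC code page that Python happens to ship, and the helper names are mine, not anything from the draft.

```python
# Sketch only: the sample value and helper names are hypothetical.
text = "SDP Seminar"  # made-up s= value containing only ASCII characters

# Byte values the draft says must not appear in text fields.
FORBIDDEN = {0x00, 0x0A, 0x0D}

def forbidden_bytes(value: str, charset: str) -> set:
    """Return the forbidden byte values present in the encoded form."""
    return FORBIDDEN & set(value.encode(charset))

def ascii_compatible(value: str, charset: str) -> bool:
    """True if the value encodes to the same bytes as it does in US-ASCII."""
    return value.encode(charset) == value.encode("ascii")

if __name__ == "__main__":
    for charset in ("utf-8", "iso-8859-1", "utf-16-be", "utf-32-be", "cp037"):
        print(charset,
              sorted(forbidden_bytes(text, charset)),
              ascii_compatible(text, charset))
```

With UTF-16 or UTF-32, even plain ASCII characters produce 0x00 bytes, and with an EBCDIC code page the same characters no longer match the US-ASCII bytes used by the field names around them.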

I can't make any sense of:

    Character sets requiring
    the use of these characters MUST define a quoting mechanism that
    prevents these bytes from appearing within text fields.

AFAIK character sets don't define quoting mechanisms. Also, I don't know 
what it means for a charset to require use of particular characters. And 
this text is very sloppy in muddling the use of "character" and "byte".
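
The character/byte distinction matters in practice. As a small Python sketch (the function names are mine, and U+0D0A is just a convenient example code point): a value can contain no CR or LF *character* and still contain the *bytes* 0x0D and 0x0A once encoded.

```python
# Sketch only: helper names are hypothetical.
value = "\u0d0a"  # a single character, U+0D0A; not CR and not LF

def has_line_break_chars(s: str) -> bool:
    """Check for CR or LF as characters."""
    return "\r" in s or "\n" in s

def has_line_break_bytes(s: str, charset: str) -> bool:
    """Check for the byte values 0x0D or 0x0A in the encoded form."""
    encoded = s.encode(charset)
    return 0x0D in encoded or 0x0A in encoded

if __name__ == "__main__":
    print(has_line_break_chars(value))               # character-level test
    print(has_line_break_bytes(value, "utf-16-be"))  # byte-level test
    print(value.encode("utf-16-be"))                 # raw bytes on the wire
```

A byte-oriented parser that splits lines on 0x0D 0x0A would cut this value in half under UTF-16, while the same character in UTF-8 causes no trouble.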

The bottom line is that this is a mess. I'm not sure if it ever could 
have worked. Nor do I understand what usage it was trying to 
accommodate. (Note that most of this stuff is largely unchanged since 

Perhaps all the charset-dependent stuff should be obsoleted.