Re: [apps-discuss] draft-ietf-appsawg-xml-mediatypes vs. JSON and BOM and UTF-8

Larry Masinter <masinter@adobe.com> Wed, 22 January 2014 20:31 UTC

From: Larry Masinter <masinter@adobe.com>
To: Ned Freed <ned.freed@mrochek.com>
Date: Wed, 22 Jan 2014 20:31:30 +0000
Cc: IETF Apps Discuss <apps-discuss@ietf.org>

> > It cannot be right that everyone specifying a text-based media type should
> > have to go through the process of deciding, for themselves, independently, ...


> It may not be "right" but it is the reality of the situation.

Agree it's the reality.

> > If the future is UTF-8, UTF-8, UTF-8, then the two documents should say so,
> > right at the beginning.
> 
> I believe the future is UTF-8, including but not limited to its use in XML, and
> we should do what we can to promote it. But beliefs about the future don't
> necessarily belong in an RFC.

We have many documents that give guidelines. 

> Moreover, is this the right place and the right organization to make such a
> statement about XML? The IETF doesn't own the XML specification, the W3C does.
> And this is a document about how to register XML media types, not about how
> to use XML.

Organization: yes. We're giving guidelines for communicating text using MIME,
and the interaction of the 'charset' parameter in the metadata with other
sources of encoding information.

Place: Partly.  Guidelines for future text media types belong in a MIME BCP.

> Mind you, given my own beliefs I don't personally object to such a statement if
> there is consensus to include it. I just wonder if it is appropriate.

Although the general policy doesn't belong in appsawg-xml-mediatypes,
it would be appropriate to advise senders of text/xml and application/xml
to send UTF-8 and to not include a BOM, and to recommend whether or not
to use a charset="UTF-8" parameter and whether or not to include
an internal charset declaration, even when receivers recognize and interpret
other encodings.
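
As a rough illustration of that sender-side advice (a sketch only, not
something the draft specifies), a sender could do something like the
following in Python. The function name prepare_xml_body and the
label_charset flag are hypothetical; whether to send the charset
parameter is exactly the open question, so it is left as a switch.

```python
def prepare_xml_body(text, label_charset=True):
    """Encode XML text as UTF-8 without a BOM and build a Content-Type value.

    Hypothetical helper: the name and the label_charset switch are
    assumptions made for illustration only.
    """
    # Drop a leading U+FEFF so the encoded bytes do not begin with a BOM.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Python's "utf-8" codec does not add a BOM of its own.
    body = text.encode("utf-8")
    content_type = "application/xml"
    if label_charset:
        content_type += '; charset="UTF-8"'
    return content_type, body
```

Calling prepare_xml_body(doc, label_charset=False) yields the same bytes
with a bare "application/xml" label, which is the other option a
recommendation could pick.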

This is an interoperability consideration for a revised specification of
a widely deployed protocol, based on the belief that UTF-8 is increasingly
the default in text-based MIME types.

Forward-interoperability requirements of this kind seem to be
common in protocols: a SHOULD introduces a new policy to
correct past interoperability difficulties.

Larry
--
http://larry.masinter.net