Re: [xml2rfc-dev] When is @ascii required?

Henrik Levkowetz <henrik@levkowetz.com> Sun, 27 October 2019 18:50 UTC

To: Carsten Bormann <cabo@tzi.org>, xml2rfc-dev@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc-dev/jQ2CEiQDagLGQSQT2xNy0lGfW8E>

Hi Carsten,

On 2019-10-27 18:51, Carsten Bormann wrote:
> (Looking at the source of the message in preptool.py/check_ascii_text, I also don’t understand this:
> 
>                 if self.tree.docinfo.encoding.lower() in ['us-ascii', ]:
>                     self.die(c, "Found non-ascii content in a document with xml encoding declared as %s" % self.tree.docinfo.encoding)
> 
> I don’t understand what the source encoding has to do with this; XML
> allows me to have beyond-ascii characters in documents the xml
> encoding of which uses us-ascii (or koi8-r, for that matter).  I
> think that any access to self.tree.docinfo.encoding at this point in
> processing is a layer violation.)

The test can very likely be improved, but it was added in response to a
hard-to-diagnose failure in which an XML file declared us-ascii encoding
but contained non-ascii characters.  The RPC could not pin down what was
triggering the failures, and it also took me a while to understand what
was going on.
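
To make the two cases in this exchange concrete, here is a minimal sketch.
It uses Python's standard-library xml.etree.ElementTree parser, not the
lxml-based code in preptool.py, so lxml's behaviour may differ in detail.
The <name> element and the sample text are made up for illustration.  The
first document carries a non-ascii character as a numeric character
reference, which XML permits under a us-ascii declaration; the second
carries the same character as raw UTF-8 bytes, which contradicts the
declaration.

```python
import xml.etree.ElementTree as ET

DECL = b'<?xml version="1.0" encoding="us-ascii"?>'

# Non-ascii character written as a numeric character reference:
# every byte in the file is ascii, so the declaration is truthful.
escaped = DECL + b"<name>Bj&#246;rn</name>"

# The same character written as raw UTF-8 bytes: the file now
# contradicts its own encoding declaration.
raw = DECL + "<name>Björn</name>".encode("utf-8")


def try_parse(data):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


def has_non_ascii(text):
    """True if any character in the parsed text lies outside the ascii range."""
    return any(ord(ch) > 127 for ch in text)


escaped_text = try_parse(escaped)
raw_text = try_parse(raw)

print("escaped form parsed:", escaped_text is not None)
print("raw form parsed:    ", raw_text is not None)
```

In this sketch the escaped form parses and its text does contain a
non-ascii character, so finding non-ascii content in the parsed tree does
not by itself show that the file disagrees with its declared encoding.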


	Henrik