Re: [Tools-discuss] boilerplate BCP14 and UTF8

Michael Richardson <mcr@sandelman.ca> Wed, 04 December 2019 18:37 UTC

From: Michael Richardson <mcr@sandelman.ca>
To: Carsten Bormann <cabo@tzi.org>
cc: tools-discuss@ietf.org
In-Reply-To: <4292ED41-D66B-4150-A086-B2247FA565D9@tzi.org>
References: <20931.1575481656@localhost> <21302.1575481756@localhost> <4292ED41-D66B-4150-A086-B2247FA565D9@tzi.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Date: Wed, 04 Dec 2019 13:37:08 -0500
Message-ID: <32136.1575484628@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/Nhv7uRcdpsNYJDdqXRvHg8jv5xo>

Carsten Bormann <cabo@tzi.org> wrote:
    >>
    >> Michael Richardson <mcr@sandelman.ca> wrote:
    >>> Kramdown gives me this error in some documents:
    >>
    >>> /usr/local/rvm/gems/ruby-2.6.5/gems/kramdown-rfc2629-1.2.12/bin/kramdown-rfc2629:333:in
    >>> `encode': U+00A0 from UTF-8 to US-ASCII (Encoding::UndefinedConversionError)
    >>
    >>> when I include {::boilerplate bcp14}
    >>
    >>> I don't have a BOM in my markdown. I don't think I should.
    >>> It feels like the output ought to be in UTF-8, but isn't somehow.
    >>
    >> I found:
    >> coding: us-ascii

    > How did that get there?

    >> in my markdown, and removed it.  Maybe we can make the error more meaningful?

    > Probably.  Now https://github.com/cabo/kramdown-rfc2629/issues/65

    >> Or maybe the template that is copied in could be US-ASCII only?

    > I’ll need to check how to do the NBSP in “BCP 14” in markdown without
    > leaving ASCII encoding.  But maybe the whole idea of allowing “coding:
    > us-ascii” is not so useful now.

Maybe:

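# "coding" stands for whatever value was read from the markdown header.
# warn writes to stderr, so the XML on stdout stays clean.
if coding == "us-ascii"
  warn "WARNING: all sorts of things break if you insist on XML output being US-ASCII."
  warn "See https://github.com/cabo/kramdown-rfc2629/issues/65"
end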
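
For anyone who wants to see the failure outside kramdown-rfc2629, here is a
minimal sketch. It assumes the boilerplate contains a U+00A0 (NBSP) between
"BCP" and "14", as discussed above. The constant and both helper names are
made up for illustration and are not part of kramdown-rfc2629. The second
helper shows one possible way to stay in US-ASCII, by writing the NBSP as an
XML numeric character reference. Whether the tool should do that is a
separate question.

```ruby
# Text shaped like the BCP 14 boilerplate: U+00A0 (NBSP) between "BCP" and "14".
BCP14_LABEL = "BCP\u00A014"

# Hypothetical helper: true if a plain transcode to US-ASCII would raise
# Encoding::UndefinedConversionError, the error quoted in this thread.
def breaks_us_ascii?(text)
  text.encode("US-ASCII")
  false
rescue Encoding::UndefinedConversionError
  true
end

# Hypothetical helper: transcode to US-ASCII, writing each character that
# has no US-ASCII equivalent as an XML numeric character reference.
def ascii_with_char_refs(text)
  text.encode("US-ASCII", fallback: ->(ch) { format("&#x%X;", ch.ord) })
end
```

With these definitions, breaks_us_ascii?(BCP14_LABEL) is true, and
ascii_with_char_refs(BCP14_LABEL) gives "BCP&#xA0;14", which an XML parser
reads back as the same NBSP.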