[rfc-i] entities and unicode

Miek Gieben <miek@miek.nl> Fri, 03 December 2021 10:18 UTC

Hello all,

The FAQ at https://www.rfc-editor.org/materials/FAQ-xml2rfcv3.html says I need to wrap
Unicode characters in <u> tags (which is already a bit confusing, see
https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/205).

While chasing an unrelated bug, I was testing (HTML) entities, and if I put:

     <t>this is some dashes &#x2011;</t>

in the XML, xml2rfc --text renders the hyphen (the proper Unicode one, U+2011, not an
ASCII "-") in the text document, which puzzles me.
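
A possible explanation, sketched with Python's standard xml.etree.ElementTree rather than
xml2rfc itself (so it says nothing certain about what xml2rfc does internally): an XML
parser expands a numeric character reference such as &#x2011; while parsing, so whatever
consumes the parsed tree only sees the resulting Unicode character. The helper name
last_char_info is made up for this illustration.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Same fragment as in the message; the parser expands the character reference.
SNIPPET = "<t>this is some dashes &#x2011;</t>"


def last_char_info(xml_text):
    """Parse xml_text and describe the last character of the root element's text.

    Returns a (code point, Unicode name) tuple. Hypothetical helper, used
    only for this illustration.
    """
    text = ET.fromstring(xml_text).text
    ch = text[-1]
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


if __name__ == "__main__":
    print(last_char_info(SNIPPET))
```

If that holds for xml2rfc's parser as well, the reference and the literal character would
be indistinguishable by the time the text renderer runs.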

Is <u> really needed? Or are entities not allowed? Or something else that I'm not seeing?

Thanks for any insight.


/Miek

--
Miek Gieben