Re: [Rfc-markdown] [xml2rfc] [irsg] character sets, was UPDATE regarding <u>

Martin Thomson <mt@lowentropy.net> Mon, 06 March 2023 00:02 UTC

Message-Id: <15c105fb-7f9e-45ab-ac00-d161a51fa2d1@betaapp.fastmail.com>
In-Reply-To: <940B4C2A-9253-4E05-AF01-0BA123BAE072@tzi.org>
References: <20230304190316.05346A51F3D2@ary.qy> <5081F069-705D-4707-85EB-DBA11D594D19@tzi.org> <a39b8c32-6f4f-4caf-8400-1846ea25faa2@betaapp.fastmail.com> <940B4C2A-9253-4E05-AF01-0BA123BAE072@tzi.org>
Date: Mon, 06 Mar 2023 11:01:58 +1100
From: Martin Thomson <mt@lowentropy.net>
To: Carsten Bormann <cabo@tzi.org>
Cc: rfc-markdown@ietf.org
Content-Type: text/plain
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-markdown/iErZUXY-MJqcUUrJFA3KrPWGoCs>
Subject: Re: [Rfc-markdown] [xml2rfc] [irsg] character sets, was UPDATE regarding <u>

On Mon, Mar 6, 2023, at 10:55, Carsten Bormann wrote:
> I think it would be interesting to create a variety of targeted text 
> extractors for RFCXML.

xmllint --xpath tends to work pretty well for that use case, at least in my experience.
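
As a rough illustration of that kind of targeted extraction, here is a minimal Python sketch using only the standard library's ElementTree, which supports a limited XPath subset. It is not xmllint and not the API of any existing tool; the sample document, the extract_text helper, and the chosen paths are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed RFCXML-style fragment, for illustration only.
SAMPLE = """<rfc>
  <front><title>Example Document</title></front>
  <middle>
    <section anchor="intro">
      <name>Introduction</name>
      <t>First paragraph.</t>
      <sourcecode type="abnf">rule = "a" / "b"</sourcecode>
    </section>
    <section anchor="details">
      <name>Details</name>
      <t>Second <em>paragraph</em>.</t>
    </section>
  </middle>
</rfc>"""


def extract_text(root, path):
    """Return the flattened text of every element matching an ElementTree path.

    itertext() walks nested inline markup (such as <em>), so each match is
    returned as plain text with the tags dropped.
    """
    return ["".join(el.itertext()).strip() for el in root.findall(path)]


root = ET.fromstring(SAMPLE)

# Each "targeted extractor" is just a different path expression.
section_names = extract_text(root, ".//section/name")
abnf_blocks = extract_text(root, ".//sourcecode[@type='abnf']")
paragraphs = extract_text(root, ".//t")
```

With xmllint, a comparable query would be an XPath expression such as //section/name/text() passed to --xpath; full XPath 1.0 is more expressive than the subset ElementTree accepts.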

I also have a Python library that I had unrealized plans to use: https://github.com/martinthomson/rfc-extract (it also supports Markdown).