Re: [rfc-i] looking for a volunteer to write a simple script

Joe Touch <touch@strayalpha.com> Fri, 12 July 2019 18:08 UTC

From: Joe Touch <touch@strayalpha.com>
X-Mailer: iPhone Mail (16F203)
In-Reply-To: <A8576629-E352-4016-ACA7-1B7160370E26@tzi.org>
Date: Fri, 12 Jul 2019 11:07:58 -0700
Message-Id: <31E57D31-967D-42AD-937F-31327B9F931D@strayalpha.com>
References: <62c8413d-c735-4ec3-8b22-eb0fa5356636@Spark> <38d0704f-348c-4ec0-9d94-340747960201@Spark> <e86b8894-4d7a-4c9d-3476-0221a94c9eb0@gmx.de> <13A89BE6-8654-49C4-9FBA-2F709EE0BA1B@rfc-editor.org> <0504f606252c476f66804e338fa460b4@strayalpha.com> <A8576629-E352-4016-ACA7-1B7160370E26@tzi.org>
To: Carsten Bormann <cabo@tzi.org>
Subject: Re: [rfc-i] looking for a volunteer to write a simple script
Cc: Julian Reschke <julian.reschke@gmx.de>, RFC Interest <rfc-interest@rfc-editor.org>, Heather Flanagan <rse@rfc-editor.org>


> On Jul 12, 2019, at 10:52 AM, Carsten Bormann <cabo@tzi.org> wrote:
> 
> I would spend a couple of \b (or \< and \>; I don't remember Perl well enough), so MARSHALL and MAYNARD aren't hit. (Theresa is hopeless, but we knew that.)

Ah yes - fixed. 
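
For readers following along, here is a minimal sketch of what the word-boundary fix might look like. The revised one-liner is not shown in this message, so the \b placement is an assumption, and the sketch is written in Python rather than Perl; the names BCP14 and tag_bcp14 are made up for illustration.

```python
import re

# Assumption: \b on both sides of the whole alternation is one plausible
# reading of "fixed"; the actual revised Perl one-liner is not shown here.
BCP14 = re.compile(
    r"\b(((MUST|SHOULD|SHALL)(\s+NOT)?)|((NOT\s+)?RECOMMENDED)"
    r"|MAY|OPTIONAL|REQUIRED)\b"
)


def tag_bcp14(text: str) -> str:
    """Wrap each whole-word requirement keyword in <bcp14> tags."""
    return BCP14.sub(r"<bcp14>\1</bcp14>", text)


print(tag_bcp14("MARSHALL MAYNARD MUST NOT forward it; Theresa MAY."))
# MARSHALL MAYNARD <bcp14>MUST NOT</bcp14> forward it; Theresa <bcp14>MAY</bcp14>.
```

With the anchors in place, the SHALL inside MARSHALL and the MAY inside MAYNARD are left alone, while a standalone MAY after a name is still tagged, which is the case that cannot be fixed by pattern matching alone.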

> 
> Sent from mobile
> 
>> On 12. Jul 2019, at 19:43, Joe Touch <touch@strayalpha.com> wrote:
>> 
>> This will do the trick:
>> 
>> 
>> 
>> perl -0777 -pe "s/(((MUST|SHOULD|SHALL)(\s+NOT)?)|((NOT\s+)?RECOMMENDED)|MAY|OPTIONAL|REQUIRED)/<bcp14>\$1<\/bcp14>/g" INFILE.xml > OUTFILE.xml
>> 
>> (replace INFILE.xml and OUTFILE.xml with your filenames)
>> 
>> If you want it to edit in place (risky, but simpler if you work on a copy anyway):
>> 
>> perl -0777 -i -pe "s/(((MUST|SHOULD|SHALL)(\s+NOT)?)|((NOT\s+)?RECOMMENDED)|MAY|OPTIONAL|REQUIRED)/<bcp14>\$1<\/bcp14>/g" INFILE.xml
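
Since in-place editing is described as risky, a rough sketch of a safer variant may help: copy the file aside first, then rewrite the original. This is a Python rendering under stated assumptions, not the command from this thread; the function name tag_in_place and the ".bak" suffix are hypothetical. The pattern is the one quoted above, applied to raw bytes so that line endings and encoding pass through untouched.

```python
import re
import shutil
from pathlib import Path

# Same alternation as the quoted one-liner, compiled as a bytes pattern.
BCP14 = re.compile(
    rb"(((MUST|SHOULD|SHALL)(\s+NOT)?)|((NOT\s+)?RECOMMENDED)"
    rb"|MAY|OPTIONAL|REQUIRED)"
)


def tag_in_place(path: str, backup_suffix: str = ".bak") -> Path:
    """Tag keywords in `path`, keeping the original at path + backup_suffix."""
    src = Path(path)
    backup = src.with_name(src.name + backup_suffix)
    shutil.copy2(src, backup)  # untouched copy, made before any rewrite
    data = src.read_bytes()    # whole file at once, like perl -0777
    src.write_bytes(BCP14.sub(rb"<bcp14>\1</bcp14>", data))
    return backup
```

Calling tag_in_place("INFILE.xml") would leave the tagged text in INFILE.xml and the original in INFILE.xml.bak, roughly what perl's -i.bak form does.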
>> 
>>  
>> 
>> 
>>> On 2019-07-12 10:26, Heather Flanagan wrote:
>>> 
>>> 
>>> 
>>>> On Jul 12, 2019, at 10:23 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
>>>> 
>>>> On 12.07.2019 18:55, Heather Flanagan wrote:
>>>> Hello everyone!
>>>> 
>>>> The RFC Editor has the need for a comparatively simple script that would
>>>> automatically add <bcp14></bcp14> tags to requirement language in v3 RFCs.
>>>> 
>>>> Specifically, this would take a v3 XML input file, and create a v3 XML
>>>> output file with <bcp14></bcp14> added around each instance of a 2119
>>>> keyword in the file. (MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT,
>>>> SHOULD, SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL)
>>>> 
>>>> Anyone up for helping us out with that?
>>>> 
>>>> Thanks! Heather
>>>> ...
>>>> 
>>>> The tricky part is to find the right instances. For instance, what if it
>>>> appears in a quote, or in artwork? Or if "SHALL NOT" is across a line
>>>> break...
>>>> 
>>>> So the output will require sanity checking.
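
Two of these cases can be handled without an XML parser, at least roughly. The sketch below (Python, not from this thread) matches whole <artwork>, <sourcecode>, and already-tagged <bcp14> elements first and passes them through unchanged, and it relies on \s+ to let "SHALL NOT" span a line break. The names TOKEN and tag_outside_artwork are hypothetical, and the choice of elements to skip is an assumption. Keywords inside quotations are not detected.

```python
import re

# One pass, two alternatives: a protected element (copied through as-is)
# or a whole-word keyword (wrapped). Assumes protected elements do not nest.
TOKEN = re.compile(
    r"(?P<skip><artwork\b[^>]*(?<!/)>.*?</artwork>"
    r"|<sourcecode\b[^>]*(?<!/)>.*?</sourcecode>"
    r"|<bcp14>.*?</bcp14>)"
    r"|\b(?P<kw>(?:MUST|SHOULD|SHALL)(?:\s+NOT)?"
    r"|(?:NOT\s+)?RECOMMENDED|MAY|OPTIONAL|REQUIRED)\b",
    re.DOTALL,
)


def tag_outside_artwork(text: str) -> str:
    """Tag keywords except inside artwork, sourcecode, or existing bcp14."""
    def repl(m: re.Match) -> str:
        if m.group("skip") is not None:
            return m.group("skip")  # leave protected spans untouched
        return f"<bcp14>{m.group('kw')}</bcp14>"
    return TOKEN.sub(repl, text)


sample = "<t>It SHALL\n  NOT fail.</t><artwork>MUST NOT touch</artwork>"
print(tag_outside_artwork(sample))
```

Because existing <bcp14> elements are skipped, running the function twice gives the same result as running it once. When a keyword pair spans a line break, the tag spans it too, so the original line break is kept inside the element.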
>>>  
>>> Well, yes, of course. We're aiming for a rough pass to catch maybe 80% of the situations. Everything will still need to be reviewed.
>>> 
>>>> 
>>>> I assume that the tool is supposed to preserve whitespace, line breaks
>>>> etc? This essentially rules out running the input through an XML parser...
>>> 
>>> Seriously, we're not aiming for that robust right now. It doesn't have to be perfect, it just has to help.
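
A plain text substitution already preserves whitespace and line breaks, and that property is cheap to check as part of the review. The sketch below (Python, with hypothetical names tag_bcp14, strip_bcp14, and only_tags_added) strips the <bcp14> tags back out of the result and compares it with the input; if they match, the script added tags and changed nothing else. The word-boundary anchors in the pattern are an assumption.

```python
import re

BCP14 = re.compile(
    r"\b(((MUST|SHOULD|SHALL)(\s+NOT)?)|((NOT\s+)?RECOMMENDED)"
    r"|MAY|OPTIONAL|REQUIRED)\b"
)


def tag_bcp14(text: str) -> str:
    """Wrap each whole-word requirement keyword in <bcp14> tags."""
    return BCP14.sub(r"<bcp14>\1</bcp14>", text)


def strip_bcp14(text: str) -> str:
    """Remove <bcp14> and </bcp14>, leaving the enclosed text in place."""
    return re.sub(r"</?bcp14>", "", text)


def only_tags_added(original: str, tagged: str) -> bool:
    """True if the two texts are identical once bcp14 tags are ignored."""
    return strip_bcp14(tagged) == strip_bcp14(original)


source = "<t>A sender SHALL\n   NOT  reorder.</t>\n"
print(only_tags_added(source, tag_bcp14(source)))
```

This only confirms that nothing besides tags was touched. Whether each tag sits on a real requirement keyword still needs a human reviewer.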
>>>  
>>> -Heather
>>>  
>>> 
>>> 
>>> _______________________________________________
>>> rfc-interest mailing list
>>> rfc-interest@rfc-editor.org
>>> https://www.rfc-editor.org/mailman/listinfo/rfc-interest