[OPSEC] [IETF Successes and Failures] #3 (component1): xml2rfc: hyphen not escaped in unicode.py
opsec issue tracker <trac@tools.ietf.org> Mon, 04 January 2021 12:00 UTC
Archived-At: <https://mailarchive.ietf.org/arch/msg/opsec/LredhjJvSN3RfFtPnWLy-NzDKLE>
#3: xml2rfc: hyphen not escaped in unicode.py

 Debian Stretch, uname -a reports:

   Linux maria 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1 (2017-12-23) x86_64 GNU/Linux

 Command python3 reports:

   Python 3.8.1 (default, Feb 22 2020, 11:56:23)
   [GCC 6.3.0 20170516] on linux

 When I enter command "xml2rfc -h" repeatedly, it fails half the time with
 error message:

   Traceback (most recent call last):
     File "/usr/bin/xml2rfc", line 7, in <module>
       from xml2rfc.run import main
     File "/mnt/home/rprice/.local/lib/python3.5/site-packages/xml2rfc/__init__.py", line 14, in <module>
       from xml2rfc.parser import XmlRfcError, CachingResolver, XmlRfcParser, XmlRfc
     File "/mnt/home/rprice/.local/lib/python3.5/site-packages/xml2rfc/parser.py", line 20, in <module>
       from xml2rfc.writers import base
     File "/mnt/home/rprice/.local/lib/python3.5/site-packages/xml2rfc/writers/__init__.py", line 2, in <module>
       from xml2rfc.writers.base import RfcWriterError
     File "/mnt/home/rprice/.local/lib/python3.5/site-packages/xml2rfc/writers/base.py", line 30, in <module>
       from xml2rfc.util.unicode import ( punctuation, unicode_replacements, unicode_content_tags, bare_unicode_tags,
     File "/mnt/home/rprice/.local/lib/python3.5/site-packages/xml2rfc/util/unicode.py", line 260, in <module>
       punctuation_re = re.compile(r'[%s]'%''.join(list(punctuation.keys())))
     File "/usr/lib/python3.5/re.py", line 224, in compile
       return _compile(pattern, flags)
     File "/usr/lib/python3.5/re.py", line 293, in _compile
       p = sre_compile.compile(pattern, flags)
     File "/usr/lib/python3.5/sre_compile.py", line 536, in compile
       p = sre_parse.parse(p, flags)
     File "/usr/lib/python3.5/sre_parse.py", line 829, in parse
       p = _parse_sub(source, pattern, 0)
     File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
       itemsappend(_parse(source, state))
     File "/usr/lib/python3.5/sre_parse.py", line 575, in _parse
       raise source.error(msg, len(this) + 1 + len(that))
   sre_constants.error: bad character range −-“ at position 3

 At line 260 in .../xml2rfc/util/unicode.py I inserted two lines to display
 the value of punctuation.keys()

   259-punctuation.update(unicode_quote_replacements)
   260-import sys
   261-print("unicode.py: list(punctuation.keys()) {}" .format(list(punctuation.keys())),file=sys.stderr)
   262-punctuation_re = re.compile(r'[%s]'%''.join(list(punctuation.keys())))

 When xml2rfc succeeded, I saw

   unicode.py: list(punctuation.keys()) = ['\u2002', '-', '‐', '′', '–', '´', '…', '’', '−', '„', '—', '\u2009', '‚', '‘', '”', '“', '\u2003']
   unicode.py: list(punctuation.keys()) = ['´', '„', '\u2003', '‚', '−', '“', '’', '‘', '-', '…', '\u2009', '—', '–', '”', '′', '‐', '\u2002']

 When xml2rfc failed, I saw

   unicode.py: list(punctuation.keys()) = ['´', '\u2002', '−', '-', '“', '„', '‘', '′', '‚', '–', '…', '’', '‐', '”', '—', '\u2009', '\u2003']
   unicode.py: list(punctuation.keys()) = ['‐', '\u2003', '„', '\u2002', '\u2009', '‚', '—', '’', '−', '…', '‘', '′', '-', '“', '”', '´', '–']

 It looks as if the character "-" is being wrongly interpreted by re as a
 range indicator. Perhaps it should be escaped.

 My apologies for the wretched formatting of this message.

 Roger

-- 
----------------------------------+----------------------
 Reporter:  roger@rogerprice.org  |      Owner:  somebody
     Type:  defect                |     Status:  new
 Priority:  major                 |  Milestone:
Component:  component1            |    Version:
 Keywords:  re escape hyphen      |
----------------------------------+----------------------

Ticket URL: <https://trac.tools.ietf.org/misc/outcomes/ticket/3>
IETF Successes and Failures <http://tools.ietf.org/misc/outcomes/>
IETF Successes and Failures
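
The diagnosis in the report (an unescaped "-" being read as a range operator inside a character class) can be reproduced outside xml2rfc. The following is only a minimal sketch, not the xml2rfc fix: `failing_keys`, `build_unescaped` and `build_escaped` are hypothetical names, and the key list is a five-character subset of the first failing order quoted in the report. It assumes that passing each key through `re.escape()` is an acceptable way to build the class.

```python
import re

# Hypothetical subset of punctuation.keys(), in the first failing order
# quoted in the report: ACUTE ACCENT, EN SPACE, MINUS SIGN, '-', LEFT QUOTE.
failing_keys = ['\u00b4', '\u2002', '\u2212', '-', '\u201c']


def build_unescaped(chars):
    # Same expression as the failing line in unicode.py.
    return re.compile(r'[%s]' % ''.join(chars))


def build_escaped(chars):
    # re.escape() turns '-' into '\-', so it is a literal in any position.
    return re.compile('[%s]' % ''.join(re.escape(c) for c in chars))


# '-' sits between U+2212 and U+201C: a descending range, so re rejects it.
try:
    build_unescaped(failing_keys)
    unescaped_ok = True
except re.error as exc:
    unescaped_ok = False
    print("unescaped pattern failed:", exc)

# The escaped pattern compiles whatever order the keys arrive in.
punctuation_re = build_escaped(failing_keys)
print("matches a plain hyphen:", punctuation_re.search('a-b') is not None)

# An ascending neighbour pair compiles, but as a range: '\u2002-\u2010'
# also matches U+2005, which is not one of the keys.
silent_range = build_unescaped(['\u2002', '-', '\u2010'])
print("matches U+2005:", silent_range.search('\u2005') is not None)
```

The last check suggests that the runs reported as successful may not be fully correct either: when "-" happens to fall between two keys in ascending code point order, the pattern compiles but matches every character in between. The run-to-run variation is consistent with string hash randomization changing the dict key order under Python 3.5, where dicts do not preserve insertion order.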