[yaco-idsubmit-tool] Testing Notes / Henrik / 17 March
Henrik Levkowetz <henrik@levkowetz.com> Sat, 19 March 2011 22:06 UTC
Message-ID: <4D852938.80708@levkowetz.com>
Date: Sat, 19 Mar 2011 23:07:52 +0100
From: Henrik Levkowetz <henrik@levkowetz.com>
To: yaco-idsubmit-tool@ietf.org
Hi,

Here are some additional testing notes.  I've already covered some of
these with Emilio (the developer) on jabber, but I'm also sending them
to the list for info and for the record.

This testing covers author extraction from 191 drafts, including all of
the drafts which weren't accepted for automatic posting by the current
submission tool during the period leading up to the Prague meeting
posting cut-off.

Already covered:

  * The author extraction code supplied to Yaco handles obfuscated email
    addresses such as "joe (at) example.com", but does not handle the
    case "joe at example.com", which it probably should.

  * The upload form should also catch possible exceptions from the
    author extraction code, to let people know there's a problem in a
    better format than a 'Server 500' error.  Pointed out by Emilio.

  * All -00 submissions were treated the same way, requiring WG Chair
    permissions to post -- but this should not apply to individual
    submissions, only to new WG documents.

New:

(For the issues below which affect the ietf/utils/draft.py module, a
patch file is enclosed.)

  * Extraction of titles which don't have the draft name on a separate
    page fails.  See for instance this example:

      http://www.ietf.org/staging/draft-ma-cdni-publisher-use-cases-00.txt

    The regex should maybe be updated to permit but not require a
    newline before the draft filename:

      '(?:\n\s*\n\s*)((.+\n){1,2}(.+\n?))(\s+<?draft-\S+\s*\n)\s*\n'

    Fixed in patch.

  * If there are blank lines before the start of the author list on the
    first page, the author extraction will fail.  This sometimes happens
    when there's junk at the start of a draft; see for instance
    http://www.ietf.org/id/draft-ietf-mpls-tp-process-00.txt .
    Fixed in patch.

  * Sometimes the Authors' Addresses section lists authors with the same
    workplace address on the same line: "Sam Spade and Joe Smith".  This
    needs a fix in the author extraction code.  Provided in the patch.

  * Sometimes the order of first name and surname is different on the
    first page and in the author list, and sometimes the surname is
    uppercase in one place, but not in the other.  This also needs a fix
    in the author extraction code.  Provided in the patch.

  * The header stripping code had a bug, where multiple blank lines
    could be replaced by a single blank line in the stripped text, which
    could mess up title extraction.  Fixed in the patch.

  * Title space normalization should also be done for titles from the
    'unusual title format' code branch of the title extraction code.
    Fix provided in the patch.

  * Company names on the first page are sometimes rendered with
    different case than in the Authors' Addresses section.  Fixed in the
    patch.

  * Some drafts list the draft filename _before_ the title, rather than
    after the title.  Permit this too.  Covered in the patch.

  * Spanish names can be shown as either

      <given_name> <fathers_first_surname> <mothers_first_surname>

    or less formally as

      <given_name> <fathers_first_surname>

    If the first form is used in the Authors' Addresses section, but the
    second form (with the given name possibly abbreviated to its first
    letter) is used on the first page, the author extraction will fail.
    Fix provided in patch.

  * Drafts containing tabs will be caught by idnits during I-D
    submission, but in case the draft.py module is used independently
    from idnits, convert tabs to spaces in order for the author
    extraction and other methods to work as expected.  Example: the
    recently submitted draft draft-bergeron-payload-rtpfec-rs-00.txt.
    Fix provided in patch.

  * Found a draft with a previously unhandled header/footer format:
    draft-fang-mpls-tp-oam-toolset-01.txt.  Tweak needed for
    header/footer stripping.  Fix provided in patch.

The patch also includes code to extract lists of references used in the
document.  This is not expected to be of use for the submission tool,
but may be useful later in other parts of the datatracker.

Best,

	Henrik
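
Illustration (not taken from the enclosed patch): a few of the
normalisations described in the notes above could be sketched roughly
as follows.  All function names here are hypothetical and are not the
ietf/utils/draft.py API; the regex is a deliberately simple assumption
and may produce false positives on ordinary prose such as "meet at
example.com", so it should only be applied to text already believed to
hold an address.

```python
import re

# Hypothetical helpers for illustration only; not the code in the patch.

# "joe (at) example.com" or "joe at example.com" -> local part, domain.
_OBFUSCATED = re.compile(
    r"([\w.+-]+)\s*(?:\(at\)|\s+at\s+)\s*([\w-]+(?:\.[\w-]+)+)",
    re.IGNORECASE,
)


def deobfuscate_email(text):
    """Rewrite '(at)' / ' at ' obfuscation into a plain '@' address."""
    return _OBFUSCATED.sub(r"\1@\2", text)


def expand_tabs(text):
    """Convert tabs to spaces so column-based parsing sees stable offsets."""
    return text.expandtabs()


def split_authors(line):
    """Split an address-section line such as 'Sam Spade and Joe Smith'."""
    return re.split(r"\s+and\s+", line.strip())


def same_name(a, b):
    """Compare two names, ignoring case and the order of their parts."""
    def key(name):
        return sorted(part.casefold() for part in name.replace(",", " ").split())
    return key(a) == key(b)
```

For instance, same_name("Joe SMITH", "Smith Joe") treats an uppercase,
reordered surname as the same author, which is the kind of mismatch
between the first page and the Authors' Addresses section noted above.
It does not handle abbreviated given names or dropped second surnames.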