Re: [Tools-discuss] RFCmarkup v1.28
Henrik Levkowetz <henrik@levkowetz.com> Thu, 27 July 2006 13:08 UTC
Message-ID: <44C8BAA2.8000404@levkowetz.com>
Date: Thu, 27 Jul 2006 15:07:46 +0200
From: Henrik Levkowetz <henrik@levkowetz.com>
User-Agent: Thunderbird 1.5.0.4 (Macintosh/20060530)
MIME-Version: 1.0
To: Elwyn Davies <elwynd@dial.pipex.com>
Subject: Re: [Tools-discuss] RFCmarkup v1.28
References: <44C78E71.9050003@levkowetz.com> <44C7B93E.7020105@dial.pipex.com> <44C7C471.9020908@levkowetz.com> <44C7D035.9000209@dial.pipex.com> <44C7F662.3050803@levkowetz.com> <44C88FCF.2070801@dial.pipex.com> <44C8991B.6040509@levkowetz.com> <44C8AA9B.6060109@dial.pipex.com>
In-Reply-To: <44C8AA9B.6060109@dial.pipex.com>
Cc: Tools Team Discussion <tools-discuss@ietf.org>
Hi Elwyn,

on 2006-07-27 13:59 Elwyn Davies said the following:
> Hi Henrik.
>
> A couple of thoughts:
>
> Without doing major parsing you could use re capabilities to separate
> off the header part and only apply certain rules to the parts.
> Thus:
> Use findall to find all the blank lines.
> Identify the start of the title as the first group of blank lines inside
> the document (i.e., ignoring any blank lines at the beginning) - use
> group and start to get positions.
> Chop the data up and apply re's as required, then resplice.

Yes, that could be a possibility.

> Some more below...
>
> /Elwyn
>
> Henrik Levkowetz wrote:
>> Hi Elwyn,
>>
>> Thanks for more feedback;
>>
>> on 2006-07-27 12:05 Elwyn Davies said the following:
>>
>>> IE now looks fine - printing on both Firefox and IE looks good. BTW I
>>> realized that the 75% on IE is not how it scales the printing but is a
>>> way of zooming the on screen display of the preview. Doh!
>>>
>>> The product of a paranoid's breakfast:
>>>
>>> 1. http://www1.tools.ietf.org/html/rfc3410 : The second item (2.2) that
>>> claims to be on page 4 in the ToC doesn't get a link (something to do
>>> with longish title and only one period in the leader?)
>>>
>>
>> Right. Won't fix.
>>
> One possibility if you did want to try would be to identify the leader
> end/number/end of line pattern from the early part of the ToC and then
> apply it throughout ToC.

Right. As soon as you move from a straight overall re approach to
something more stateful, such as identifying and splitting off doc
header, title, toc etc, you can do better handling. I'll consider
this for a later version, but I think it should be a major re-write
that changes the approach, rather than an attempt to tweak this in
here and there.

> Not a big deal.
>>
>>> 2.
>>> http://www1.tools.ietf.org/html/draft-aoun-middlebox-token-authentication-00:
>>> The section headers are now <h2> but the title is still body text. This
>>> one has 'Expires on'
>>>
>>
>> Mmm. Right. Won't fix now (not trivial), but maybe later.
>>
> See above.
>>
>>> 3. http://www1.tools.ietf.org/html/draft-ietf-ipngwg-icmp-v3-07: (no
>>> 'Expires:' at all) - how about not looking for the title etc until after
>>> the second group of totally blank lines (or the first group that isn't
>>> at the start of the document)?
>>>
>>
>> Can you put that in a regexp ;-) ?
>>
>> The boldfaced 11 July is taken to be a section - I tried to require the
>> section numbers to contain a period, but had to revert that as too many
>> documents have major section numbers without a period.
>>
>> Currently the title is the first group of lines which are preceded by
>> a line which begins with "Category:" or ends in a year, then a blank
>> line. I'll look at changing that to the pattern you suggest, for the
>> next version.
>>
> See above.

Agreed. This might be an easier fix than the rest.

>>> 4. http://www1.tools.ietf.org/html/draft-aoun-mgcp-nat-package-02: This
>>> is a very badly formatted draft.. you fixed the link in the ToC problem
>>> but it has the same problem as #2 above and thereafter the markup of
>>> section headers is semi-random. Sections 1, 2 and 3 miss out; the first
>>> three non-empty body text lines on p3 become a header. Sections 3.x are
>>> found but not s4 onwards. s4.x you would have difficulty with as they
>>> are indented. Horrible! I think I owe you a beer if you can canonicalize
>>> this one!
>>>
>>
>> Actually, it seems you're looking at an old cached copy - after refreshing
>> here, this one looks pretty good to me, on all 3 servers.
>>
>>
> Right.. this is pretty much OK apart from the s4.x which are
> additionally indented. ... let's not bother too much.

Right.

Regards,

	Henrik

_______________________________________________
Tools-discuss mailing list
Tools-discuss@ietf.org
https://www1.ietf.org/mailman/listinfo/tools-discuss
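
For illustration only, here is a minimal sketch of the split-and-resplice idea Elwyn describes in the message above: find the runs of blank lines, treat the first run that is not at the very start of the document as the end of the header, apply a regexp only to what follows, and join the pieces again. This is not rfcmarkup's code. The names BLANK_RUN, split_header and markup_body_only are hypothetical, and re.finditer is used in place of findall because it yields match objects with start() and end() positions.

```python
import re

# One or more consecutive blank (whitespace-only) lines.
BLANK_RUN = re.compile(r"(?:^[ \t]*\n)+", re.MULTILINE)


def split_header(text):
    """Return (header, rest), split after the first run of blank lines
    that has some non-blank text in front of it. Blank lines at the very
    start of the document are ignored, as suggested in the thread."""
    for m in BLANK_RUN.finditer(text):
        if text[:m.start()].strip():
            return text[:m.end()], text[m.end():]
    return "", text  # no usable blank-line run: treat everything as body


def markup_body_only(text, rule, repl):
    """Apply one compiled regexp to the part after the header only,
    then resplice header and rewritten remainder."""
    header, rest = split_header(text)
    return header + rule.sub(repl, rest)


if __name__ == "__main__":
    # Hypothetical section rule: a number, optional period, then a title.
    section = re.compile(r"^(\d+\.?) +(\S.*)$", re.MULTILINE)
    sample = (
        "\n\nNetwork Working Group\n11 July 2006\n\n\n"
        "Some Title\n\n1. Introduction\n"
    )
    print(markup_body_only(sample, section, r"<h2>\1 \2</h2>"))
```

With a split like this, a date line such as "11 July 2006" in the document header is never offered to the section rule, so the rule itself can keep accepting section numbers without a period.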