Re: [Tools-discuss] RFCmarkup v1.28

Elwyn Davies <elwynd@dial.pipex.com> Thu, 27 July 2006 11:56 UTC

Message-ID: <44C8AA9B.6060109@dial.pipex.com>
Date: Thu, 27 Jul 2006 12:59:23 +0100
From: Elwyn Davies <elwynd@dial.pipex.com>
User-Agent: Thunderbird 1.5.0.4 (Windows/20060516)
MIME-Version: 1.0
To: Henrik Levkowetz <henrik@levkowetz.com>
Subject: Re: [Tools-discuss] RFCmarkup v1.28
References: <44C78E71.9050003@levkowetz.com> <44C7B93E.7020105@dial.pipex.com> <44C7C471.9020908@levkowetz.com> <44C7D035.9000209@dial.pipex.com> <44C7F662.3050803@levkowetz.com> <44C88FCF.2070801@dial.pipex.com> <44C8991B.6040509@levkowetz.com>
In-Reply-To: <44C8991B.6040509@levkowetz.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: Tools Team Discussion <tools-discuss@ietf.org>

Hi Henrik.

A couple of thoughts:

Without doing any major parsing, you could use the re module to split 
off the header part and then apply certain rules only to the parts 
where they belong.
Thus:
Use findall (or finditer, which gives you match objects) to find all 
the blank lines.
Identify the start of the title as the first group of blank lines inside 
the document (i.e., ignoring any blank lines at the beginning) - use 
group and start on the match to get positions.
Chop the data up, apply the re's as required, then resplice.
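
A minimal sketch of the steps above, in Python. The function names, the 
blank-line pattern and the sample rule are my own assumptions for 
illustration, not anything taken from rfcmarkup. It uses search on a 
compiled pattern rather than findall, since findall returns only the 
matched strings and no positions.

```python
import re

# A newline ending a non-blank line, followed by one or more blank
# (or whitespace-only) lines. Hypothetical pattern, not from rfcmarkup.
BLANK_GROUP = re.compile(r"\n(?:[ \t]*\n)+")
LEADING_BLANKS = re.compile(r"(?:[ \t]*\n)*")


def split_at_first_blank_group(text):
    """Split text into (header, gap, body) at the first group of blank
    lines that is not at the very start of the document."""
    # Ignore any blank lines at the beginning of the document.
    lead = LEADING_BLANKS.match(text).end()
    m = BLANK_GROUP.search(text, lead)
    if m is None:
        return text, "", ""
    # m.start() is the newline that ends the last header line.
    cut = m.start() + 1
    return text[:cut], text[cut:m.end()], text[m.end():]


def markup_body_only(text, rule, repl):
    """Apply one regexp rule to the body only, then resplice."""
    header, gap, body = split_at_first_blank_group(text)
    return header + gap + rule.sub(repl, body)
```

With a section rule such as re.compile(r"^(\d+) (.*)$", re.M), a date 
line like "11 July 2006" in the header part is left alone, while 
"1 Intro" in the body still gets marked up.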

Some more below...

/Elwyn
 
Henrik Levkowetz wrote:
> Hi Elwyn,
>
> Thanks for more feedback;
>
> on 2006-07-27 12:05 Elwyn Davies said the following:
>   
>> IE now looks fine - printing on both Firefox and IE looks good. BTW I 
>> realized that the 75% on IE is not how it scales the printing but is a 
>> way of zooming the on screen display of the preview. Doh!
>>
>> The product of a paranoid's breakfast:
>>
>> 1. http://www1.tools.ietf.org/html/rfc3410 : The second item (2.2) that 
>> claims to be on page 4 in the ToC doesn't get a link (something to do 
>> with longish title and only one period in the leader?)
>>     
>
> Right. Won't fix.
>   
One possibility, if you did want to try, would be to identify the 
pattern of leader end, page number and end of line from the early part 
of the ToC and then apply it throughout the ToC.
Not a big deal.
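
One way this could look, as a rough Python sketch. Everything here is 
an assumption on my part: the two patterns, the idea of learning the 
right-hand column of the page number from the first few well-formed 
entries, and the names. It is one reading of the suggestion above, not 
how rfcmarkup works.

```python
import re
from collections import Counter

# Strict: section number, title, at least two leader dots, page number.
STRICT = re.compile(r"^\s*(\S+)\s+(.*?)\s*\.{2,}\s*(\d+)$")
# Loose: accepts any mix of spaces and dots before the page number.
LOOSE = re.compile(r"^\s*(\S+)\s+(.*?)[\s.]+(\d+)$")


def toc_entries(lines, sample=5):
    """Return (number, title, page) tuples for ToC lines.

    The line width is learned from the first `sample` lines that match
    the strict pattern; later lines with a short leader are accepted
    only if their page number ends in that same column.
    """
    stripped = [line.rstrip() for line in lines]
    widths = Counter(len(s) for s in stripped[:sample] if STRICT.match(s))
    if not widths:
        return []
    col = widths.most_common(1)[0][0]
    entries = []
    for s in stripped:
        m = STRICT.match(s)
        if m is None and len(s) == col:
            m = LOOSE.match(s)
        if m:
            entries.append((m.group(1), m.group(2), int(m.group(3))))
    return entries
```

The column check is only there to keep the loose pattern from picking 
up ordinary text lines that happen to end in a number; a line of the 
same width would still slip through.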
>   
>> 2. 
>> http://www1.tools.ietf.org/html/draft-aoun-middlebox-token-authentication-00: 
>> The section headers are now <h2> but the title is still body text. This 
>> one has 'Expires on'
>>     
>
> Mmm.  Right.  Won't fix now (not trivial), but maybe later.
>   
See above.
>   
>> 3. http://www1.tools.ietf.org/html/draft-ietf-ipngwg-icmp-v3-07: (no 
>> 'Expires:' at all) - how about not looking for the title etc until after 
>> the second group of totally blank lines (or the first group that isn't 
>> at the start of the document)?
>>     
>
> Can you put that in a regexp ;-) ?
>
> The boldfaced 11 July is taken to be a section - I tried to require the
> section numbers to contain a period, but had to revert that as too many
> documents have major section numbers without a period.
>
> Currently the title is the first group of lines which are preceded by
> a line which begins with "Category:" or ends in a year, then a blank
> line.  I'll look at changing that to the pattern you suggest, for the
> next version.
>   
See above.
>   
>> 4. http://www1.tools.ietf.org/html/draft-aoun-mgcp-nat-package-02: This 
>> is a very badly formatted draft.. you fixed the link in the ToC problem 
>> but it has the same problem as #2 above and thereafter the markup of 
>> section headers is semi-random. Sections 1, 2 and 3 miss out; the first 
>> three non-empty body text lines on p3 become a header.  Sections 3.x are 
>> found but not s4 onwards.  s4.x you would have difficulty with as they 
>> are indented. Horrible! I think I owe you a beer if you can canonicalize 
>> this one!
>>     
>
> Actually, it seems you're looking at an old cached copy - after refreshing
> here, this one looks pretty good to me, on all 3 servers.
>
>   
Right, this is pretty much OK apart from the s4.x headers, which are 
additionally indented. Let's not bother too much.

<<snip>>


