Re: [Tools-discuss] Bibxml7: anchor normalization, xml2rfc: caching with anchors

Henrik Levkowetz <henrik@levkowetz.com> Wed, 15 May 2019 11:19 UTC

To: Carsten Bormann <cabo@tzi.org>, tools-discuss <tools-discuss@ietf.org>
References: <2FA9E921-3C89-4400-B47B-5889AA18A140@tzi.org>
Cc: Tony Hansen <tony@att.com>
From: Henrik Levkowetz <henrik@levkowetz.com>
Message-ID: <72f07a93-3979-1707-3fbc-0e2f562b1555@levkowetz.com>
Date: Wed, 15 May 2019 13:19:21 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/IsGxWqMLh7BoH1ALOnqFqQHl1Bc>

Hi Carsten,

I just checked who wrote the bibxml7 script, and from what I can see
it's Tony's.  Adding Tony on Cc.

Regards,

	Henrik

On 2019-05-15 07:28, Carsten Bormann wrote:
> I just tried to work around a problem in kramdown-rfc’s “stand_alone: false” mode (this is where kramdown-rfc doesn’t do the reference handling by itself but instead generates entity declarations as if it were 1986).
> 
> I noticed that
> 
> https://xml2rfc.tools.ietf.org/public/rfc/bibxml7/reference.DOI.10.1145_1282427.1282421.xml?anchor=foo
> 
> generates
> 
> <reference anchor="FOO">
> 
> Hmm, ID/IDREF are case-sensitive in XML, so this doesn’t quite work for an <xref target="foo"/>.
> ➔ bug1
> 
> Independent of what anchor I specify, xml2rfc creates a cache file called
> 
> reference.DOI.10.1145_1282427.1282421.xml
> 
> So if I have another document that does 
> 
> https://xml2rfc.tools.ietf.org/public/rfc/bibxml7/reference.DOI.10.1145_1282427.1282421.xml?anchor=bar
> 
> Xml2rfc still uses the cache entry with 
> 
> <reference anchor="FOO">
> 
> ➔ bug2
> 
> Bug 1 actually has another interesting facet:
> 
> https://xml2rfc.tools.ietf.org/public/rfc/bibxml7/reference.DOI.10.1145_1282427.1282421.xml?anchor=DOI.10.1145_637201.637236
> 
> creates
> 
> <reference anchor='DOI101145_637201637236' >
> 
> Whoa, what happened to my dots?
> 
> ➔ bug1b
> 
> So can we make anchor normalization on the bibxml server less aggressive?
> No upcasing (bug1), no removal of valid ID/IDREF characters (bug1b).
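
To make the request concrete, here is a minimal sketch of a less aggressive normalization. It is not the bibxml7 script's code; the function name normalize_anchor is hypothetical, and it assumes anchors are restricted to the ASCII subset of the XML Name characters (letters, digits, ".", "-", "_"), which is narrower than what XML itself allows for ID/IDREF values. Case and dots are preserved, and only characters outside that set are replaced.

```python
import re

# ASCII subset of the characters XML allows in an ID/IDREF value (assumption:
# the server does not need to support non-ASCII anchors or colons).
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9._-]")
_VALID_START = re.compile(r"[A-Za-z_]")


def normalize_anchor(raw):
    """Hypothetical normalizer: keep case and dots, replace only invalid characters."""
    cleaned = _INVALID_CHARS.sub("_", raw)
    # An XML ID must not start with a digit, "." or "-", so prefix an underscore.
    if not cleaned or not _VALID_START.match(cleaned):
        cleaned = "_" + cleaned
    return cleaned
```

With this sketch, "foo" stays "foo" and "DOI.10.1145_637201.637236" comes back unchanged, so an <xref target="foo"/> would resolve against the generated anchor.
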
> 
> Also, we’d need to fix the caching in xml2rfc (bug2).
> (Kramdown-rfc’s [1.2.12, unreleased] cache file names for URLs with anchors currently look like this:
> reference.DOI.10.1145_1282427.1282421--anchor=foo.xml
> )
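
For illustration, a cache file name of that shape could be derived from the URL roughly as follows. This is a sketch under assumptions, not xml2rfc's or kramdown-rfc's actual code: the function name cache_filename is hypothetical, and the only grounded part is the "--anchor=foo" name pattern quoted above. The point is that the query string becomes part of the cache key, so ?anchor=foo and ?anchor=bar no longer share one cache entry.

```python
import posixpath
from urllib.parse import urlsplit


def cache_filename(url):
    """Hypothetical cache key: file name from the URL path plus its query string."""
    parts = urlsplit(url)
    name = posixpath.basename(parts.path)
    if not parts.query:
        # No query string: the plain file name is already unique.
        return name
    stem, ext = posixpath.splitext(name)
    # Keep the query readable but make sure it cannot introduce a path separator.
    safe_query = parts.query.replace("/", "_")
    return f"{stem}--{safe_query}{ext}"
```

Two documents requesting the same reference with different anchors would then get two distinct cache files, while a request without a query string keeps the name used today.
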
> 
> Grüße, Carsten
> 