Re: [Xml-sg-cmt] Odd <u> display in PDF: Re: AUTH48: RFC-to-be 9290 <draft-ietf-core-problem-details-08> for your review

Kesara Rathnayake <kesara@staff.ietf.org> Mon, 29 August 2022 04:18 UTC

Message-ID: <7be220ff-cec7-c7a1-9bd0-ccb828b701ad@staff.ietf.org>
Date: Mon, 29 Aug 2022 16:18:30 +1200
To: Sandy Ginoza <sginoza@amsl.com>, Carsten Bormann <cabo@tzi.org>, xml-sg-cmt@ietf.org
Cc: Megan Ferguson <mferguson@amsl.com>, Thomas Fossati <Thomas.Fossati@arm.com>, RFC Editor <rfc-editor@rfc-editor.org>, "core-ads@ietf.org" <core-ads@ietf.org>, "core-chairs@ietf.org" <core-chairs@ietf.org>, Jaime Jiménez <jaime@iki.fi>, "auth48archive@rfc-editor.org" <auth48archive@rfc-editor.org>, Jay Daley <jay@staff.ietf.org>
References: <20220804195913.906BF55ECC@rfcpa.amsl.com> <557D1A94-9729-4D7E-90B4-D53B6A0DEDEE@tzi.org> <DB9PR08MB6524313A6D026E0F63B483AF9C719@DB9PR08MB6524.eurprd08.prod.outlook.com> <629C2E8C-A79C-4CBF-AE49-CEC9C8C0B5F2@amsl.com> <69949BE3-B08B-4780-9FE5-ABA415DFBECA@tzi.org> <45032CE1-D56C-4F1F-8EFE-66B149407D35@amsl.com>
From: Kesara Rathnayake <kesara@staff.ietf.org>
Organization: IETF Administration LLC
In-Reply-To: <45032CE1-D56C-4F1F-8EFE-66B149407D35@amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml-sg-cmt/pPB47yUiujv0t_Kh99_f_Vin4ws>


On 27/08/22 11:05 am, Sandy Ginoza wrote:
> Authors, CMT,
> 
> Carsten and Thomas approved publication pending "repairing the PDF 
> glitch and the typo" where the "PDF glitch" is #873 
> <https://github.com/ietf-tools/xml2rfc/issues/873>.
> Is there some escalation path or alternate fix for this, as it seems as 
> though WeasyPrint won’t fix this (issue 1711 
> <https://github.com/Kozea/WeasyPrint/issues/1711> was closed as "not 
> planned").  Any thoughts on how to proceed?
> 

WeasyPrint seems to render RTL content correctly when it is not wrapped
in a `span` tag, so there's a fix in xml2rfc for the current issue [1].
I have attached the PDF file generated with the fix.

[1] https://github.com/ietf-tools/xml2rfc/pull/884
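
For anyone who wants to sanity-check text copied out of a generated PDF, here is a minimal sketch using only Python's standard `unicodedata` module. It is not the change in the PR; the helper names (`has_rtl`, `describe`, `clean_extracted`) are made up for illustration. It detects strongly right-to-left characters, lists the scalars and names quoted in the example, and strips the embedding controls (U+202B, U+202C) and private-use characters reported earlier in this thread.

```python
import unicodedata

# Bidi classes of strongly right-to-left characters (Hebrew is "R", Arabic is "AL").
RTL_CLASSES = {"R", "AL"}
# RIGHT-TO-LEFT EMBEDDING and POP DIRECTIONAL FORMATTING, as seen in pdftotext output.
BIDI_CONTROLS = {"\u202b", "\u202c"}


def has_rtl(text):
    """Return True if any character in text is strongly right-to-left."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def describe(text):
    """Return (scalar, name) pairs, e.g. ("U+05E9", "HEBREW LETTER SHIN")."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def clean_extracted(text):
    """Drop bidi controls and private-use (category "Co") characters."""
    return "".join(
        ch
        for ch in text
        if ch not in BIDI_CONTROLS and unicodedata.category(ch) != "Co"
    )


if __name__ == "__main__":
    shalom = "\u05e9\u05dc\u05d5\u05dd"
    print(has_rtl(shalom))
    for scalar, name in describe(shalom):
        print(scalar, name)
```

Comparing `clean_extracted()` of the PDF text against the TXT output should show whether only the invisible characters differ or whether the Latin text was actually reordered.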

   --Kesara

> Thanks,
> Sandy
> 
> 
> 
>> On Aug 22, 2022, at 1:55 PM, Carsten Bormann <cabo@tzi.org> wrote:
>>
>> Hi Megan,
>>
>> I’m in the middle of my full reread (3 totally optional nits so far), 
>> but now I’m running into a major glitch with the PDF generation, which 
>> I found in the RFC-editor’s output and can reproduce when generating 
>> the PDF locally (with both xml2rfc 3.13.1 and 3.14.1 [weasyprint-56.1]).
>>
>> Correct in HTML:
>>
>>> The following example shows how the Hebrew-language string "שלום" 
>>> (HEBREW LETTER SHIN, HEBREW LETTER LAMED, HEBREW LETTER VAV, HEBREW 
>>> LETTER FINAL MEM, U+05E9 U+05DC U+05D5 U+05DD) is represented. Note 
>>> the rtl direction expressed by setting the third element in the array 
>>> to "true".
>>
>> Correct in TXT:
>>
>>>   The following example shows how the Hebrew-language string "שלום"
>>>   (HEBREW LETTER SHIN, HEBREW LETTER LAMED, HEBREW LETTER VAV, HEBREW
>>>   LETTER FINAL MEM, U+05E9 U+05DC U+05D5 U+05DD) is represented.  Note
>>>   the rtl direction expressed by setting the third element in the array
>>>   to "true".
>>
>> Glitch in PDF:
>>
>>> The following example shows how the Hebrew-language string ,HEBREW 
>>> LETTER SHIN) "􏰀􏰁􏰂שלום" HEBREW LETTER LAMED, HEBREW LETTER VAV, HEBREW 
>>> LETTER FINAL MEM, U+05E9 U+05DC U+05D5 U+05DD) is represented. Note 
>>> the rtl direction expressed by setting the third element in the array 
>>> to "true".
>>
>> Please note that, after a copy-paste from PDF to text, I get the three 
>> private-use characters before the שלום.
>> For comparison, poppler's pdftotext just finds a U+202B RIGHT-TO-LEFT 
>> EMBEDDING before and a U+202C POP DIRECTIONAL FORMATTING after the 
>> Hebrew text (including the terminating quote) both for the running 
>> text and the artwork example that follows.
>>
>> So if the Latin text for (HEBREW LETTER SHIN, weren't surprisingly 
>> reordered in the PDF, all would be fine.
>>
>> (I'd like to reconfirm that (except for the PDF glitch) the
>> presentation of the example with שלום in Appendix A looks exactly like
>> we intended -- a more beautiful rendering of the Unicode names and
>> scalars would certainly be possible, but is outside the scope of what
>> we want to achieve here.)
>>
>>
>> Grüße, Carsten
>>
> 
> 

-- 
Kesara Rathnayake
Senior Software Development Engineer - IETF LLC
kesara@staff.ietf.org