Re: [precis] Applying the rules three times to get a stable output string?
William Fisher <william.w.fisher@gmail.com> Sat, 09 December 2017 22:09 UTC
In-Reply-To: <C31DFCC1-31BB-49E4-A9BD-071BF5AC6C02@gmx.de>
References: <C64B78C6-8109-4F36-BB76-EA8AB229FCE2@gmx.de> <CAHVjMKGmZK1DQJmbM-4Gb6W8NUbzG-qQXnXBScr6Yh+o==wxuw@mail.gmail.com> <C31DFCC1-31BB-49E4-A9BD-071BF5AC6C02@gmx.de>
From: William Fisher <william.w.fisher@gmail.com>
Date: Sat, 09 Dec 2017 15:09:20 -0700
Message-ID: <CAHVjMKEEndoJhMvMEQPPvvCS+t_4vkpp61iFoKrXNksrCB6ohA@mail.gmail.com>
To: Christian Schudt <christian.schudt@gmx.de>
Cc: precis@ietf.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/precis/hVKk3369xkUd7mJBDI5Z9t5O6Tg>
Subject: Re: [precis] Applying the rules three times to get a stable output string?
List-Id: Preparation and Comparison of Internationalized Strings <precis.ietf.org>

I did not come across any code points where IdentifierClass/Usernames
required multiple passes to reach a stable (idempotent) result. Only the
Nickname profile is affected, due to the interaction between NFKC and the
case/space rules.

My implementation applies an extra iteration for the Nickname profile. The
other profiles verify that the result is idempotent and raise a
DISALLOWED/not_idempotent error if this is violated. I do not believe there
are legal inputs for Usernames which violate the idempotency requirement, so
this check is purely defensive.

On Sat, Dec 9, 2017 at 2:27 PM, Christian Schudt <christian.schudt@gmx.de> wrote:
> Great, thanks! These code points revealed some bugs :-). They should have
> been included in the Examples.
>
> Are there any known code points for the IdentifierClass / Usernames as well?
> Seems like all these code points are disallowed anyway.
>
> If not, implementations could save 1-2 iterations and only apply the
> „3-times“-rule for FreeformClass.
>
>> On 09.12.2017 at 20:34, William Fisher <william.w.fisher@gmail.com> wrote:
>>
>> Where it makes a difference for NicknameCaseMapped:
>>
>> "\u210c"
>> "\u20a8"
>>
>> Where it makes a difference for Nickname due to spaces:
>>
>> "\u00a8"
>> "\u02dc"
>>
>>
>> On Sat, Dec 9, 2017 at 8:37 AM, Christian Schudt
>> <christian.schudt@gmx.de> wrote:
>>> Hi,
>>>
>>> RFC 8264 introduced these new sentences:
>>>
>>>    under certain circumstances, such as when Unicode
>>>    Normalization Form KC is used, performing Unicode normalization after
>>>    case mapping can still yield uppercase characters for certain code
>>>    points
>>>
>>>    Therefore, an implementation SHOULD apply the rules
>>>    repeatedly until the output string is stable
>>>
>>> I could imagine these sentences refer to code points of the „Unstable“
>>> category, but this category is unused.
>>>
>>> Are there any concrete code points or input strings which show this
>>> unstable behaviour?
>>> I am asking for some test vectors, i.e. an input string which doesn’t
>>> have the expected output string after the first rule application, but
>>> only after the second one.
>>>
>>> Thanks,
>>> -- Christian
>>> _______________________________________________
>>> precis mailing list
>>> precis@ietf.org
>>> https://www.ietf.org/mailman/listinfo/precis
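
Editorial illustration (not part of the original message): the effect of the four code points quoted above can be reproduced with a rough Python sketch. It assumes a simplified Nickname-style pass (map Zs spaces to U+0020, trim and collapse spaces, lowercase, then NFKC) and skips the FreeformClass validity checks entirely. The names `nickname_pass` and `apply_until_stable`, and the bound of three passes, are hypothetical and are not taken from either poster's implementation.

```python
import unicodedata


def nickname_pass(s: str) -> str:
    """One simplified pass of Nickname-style rules (no class validation)."""
    # Additional mapping: map every Unicode space separator (Zs) to U+0020.
    s = "".join(" " if unicodedata.category(c) == "Zs" else c for c in s)
    # Strip leading/trailing spaces and collapse runs of spaces.
    s = " ".join(part for part in s.split(" ") if part)
    # Case mapping (assumed here: Python's str.lower()).
    s = s.lower()
    # Normalization: NFKC may reintroduce uppercase letters or spaces.
    return unicodedata.normalize("NFKC", s)


def apply_until_stable(s: str, rule, max_passes: int = 3):
    """Apply `rule` until the output stops changing.

    Returns (stable_string, number_of_passes_that_changed_the_string).
    Raises ValueError if the string is still changing after max_passes.
    """
    changes = 0
    for _ in range(max_passes):
        out = rule(s)
        if out == s:
            return s, changes
        s = out
        changes += 1
    raise ValueError("not idempotent within max_passes")


if __name__ == "__main__":
    for text in ["\u210c", "\u20a8", "\u00a8", "\u02dc", "alice"]:
        result, changes = apply_until_stable(text, nickname_pass)
        print(ascii(text), "->", ascii(result), "changing passes:", changes)
```

Under these assumptions, U+210C and U+20A8 come out of the first pass as "H" and "Rs" because NFKC runs after lowercasing, and U+00A8 and U+02DC come out with a leading space that is only trimmed on the second pass. A single pass followed by an equality check, as described for the other profiles, would correspond to calling `apply_until_stable` with `max_passes=1` and treating the `ValueError` as the not-idempotent error.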