Re: [Cbor] Implementing float->int numeric reduction

"lgl island-resort.com" <lgl@island-resort.com> Fri, 18 August 2023 17:20 UTC

From: "lgl island-resort.com" <lgl@island-resort.com>
To: Thiago Macieira <thiago.macieira@intel.com>
CC: "cbor@ietf.org" <cbor@ietf.org>
Date: Fri, 18 Aug 2023 17:20:07 +0000
Message-ID: <1CDC99B2-28E3-41EE-940D-9B5B9E87EB13@island-resort.com>
References: <7F396D3A-6411-44FA-B642-DAF6FF1F0742@island-resort.com> <2167834.Icojqenx9y@tjmaciei-mobl5> <E6997462-A05E-4CA3-B4EC-ACD6CB92A1BA@tzi.org> <2650753.X9hSmTKtgW@tjmaciei-mobl5>
In-Reply-To: <2650753.X9hSmTKtgW@tjmaciei-mobl5>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/XwV4Fe-iQk18oBbP1db5SKz2Npg>
Subject: Re: [Cbor] Implementing float->int numeric reduction

Hi Thiago,

Appreciate your comments here and the ref to the C/C++ standard.

It sounds to me like those of us who want completely bullet-proof code for every compiler and platform, and who want to be sure of passing static analyzers and the like, should put the range check in. The check is probably unnecessary for the popular compilers and CPUs, but there might be one somewhere for which it is needed.

It would be interesting to see what the Rust definition says. Maybe it’s not UB in Rust (in which case the compiler must be doing something to define the out-of-range result).

LL



> On Aug 17, 2023, at 9:46 AM, Thiago Macieira <thiago.macieira@intel.com> wrote:
> 
> On Thursday, 17 August 2023 08:52:06 PDT you wrote:
>>>> If the value of the integral part cannot be represented by the integer
>>>> type, the behavior is undefined.
>> 
>> OK, officially it is undefined in the sense that it could burn down the
>> computer. In reality, a value will be returned by the conversion to
>> integer, however weird that value will be. Now that value is then checked
>> against the floating point version (is there a need for the cast, BTW?), so
>> if the value is weird, the reduction won’t happen.
> 
> There are two problems with UB here. The first is the reason why this is UB in 
> the first place: different architectures produce different values for the 
> conversion if it falls outside the range. Some may saturate to the maximum and 
> minimum, others may set to a specific value (SSE and x87 do this), others may 
> discard the value above the integral limit and produce essentially garbage. 
> This problem is mitigated by converting back and comparing to the original 
> value, but it can still lead to problems.
> 
> The second is that compilers often optimise based on the assumption you never 
> trigger UB, so if they see you performing a conversion from FP to integral, 
> they are allowed to assume that the value was in range in the first place. And 
> if it does eliminate the cases where the FP input was out of range, the 
> comparison failing can only be the result of there being a fractional 
> component. If you combine that with it somehow knowing the absolute value of 
> the input was above 2^53 (meaning a double-precision can't have a fraction), 
> the compiler could optimise the entire check out of existence.
> 
> In any case, this is an implementation detail for those implementations that 
> use languages with undefined behaviour (like mine). The wording is correct.
> 
>> (To avoid checking against both int64 and uint64, I would probably extract
>> the sign first.)
> 
> Do note the conversion to uint64_t is more expensive due to the lack of such 
> an instruction for x86 until AVX512 came along.
> 
>>> I ran into this issue before, in both TinyCBOR and Qt's wrapper around it.
>> 
>> Can you help me understand this issue better by explaining how it manifested
>> there?
> 
> In our particular case, it was just the UBSan complaining. I don't remember it 
> manifesting in real life. I also noticed this because I was mostly trying to 
> suppress the -Wfloat-equal warning.
> 
> Here's the original implementation:
> https://codebrowser.dev/qt5/qtbase/src/corelib/global/qnumeric_p.h.html#205
> 
> I ended up writing an optimised version with intrinsics because then I am 
> allowed to assume certain things. You can see it in the Qt6 update to the 
> above:
> 
> https://codebrowser.dev/qt6/qtbase/src/corelib/global/qnumeric_p.h.html#154
> 
>>> PS: I haven't yet verified whether QCborValue already generates dCBOR
>>> without even trying. It's possible it does. But it doesn't support
>>> integers above 2^63-1.
>> 
>> If your application doesn’t use (generate and/or is expected to consume)
>> uint64\int64, i.e., [2**63..2**64-1], that may not actually matter.  (But
>> it is a restriction that is not that widely found in CBOR implementations.)
> 
> QCborValue is a library class, meaning lots of applications and other 
> libraries could use it to implement their decoding of existing protocols.
> 
> Strictly speaking, QCborValue supports those values by aliasing them to 
> negative ones in int64_t. If the application knows the payload was unsigned, 
> it can convert back to unsigned and retrieve the original losslessly (and 
> without UB).
> 
> But it's impossible to tell what the payload actually was. So one 
> recommendation I'd make is that protocols choose *either* [0..2^64-1] or 
> [-2^63..2^63-1], but never [-2^63..2^64-1]. The entire range is representable 
> in CBOR and dCBOR, but protocols should avoid making use of that.
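
A small sketch of the aliasing described above, in C. The helper names alias_as_signed and recover_unsigned are hypothetical and are not QCborValue API; this only illustrates the round trip and the ambiguity, not how Qt implements it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store an unsigned 64-bit payload in an int64_t
 * by copying the bit pattern. Values >= 2^63 come out negative. */
static int64_t alias_as_signed(uint64_t u)
{
    int64_t i;
    memcpy(&i, &u, sizeof i);   /* no out-of-range conversion involved */
    return i;
}

/* Hypothetical helper: recover the unsigned payload. Conversion to an
 * unsigned type is defined as reduction modulo 2^64, so this is lossless
 * and free of UB. */
static uint64_t recover_unsigned(int64_t i)
{
    return (uint64_t)i;
}
```

Under this scheme, 2^64-1 and -1 are stored as the same int64_t value, which is why the stored value alone cannot tell you which of the two the payload was.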
> 
> -- 
> Thiago Macieira - thiago.macieira (AT) intel.com
>  Cloud Software Architect - Intel DCAI Cloud Engineering