Re: [Cbor] NaN payload notation (Re: 7049bis: Diagnostic notation gaps)

Thiago Macieira <thiago.macieira@intel.com> Wed, 16 September 2020 16:58 UTC

From: Thiago Macieira <thiago.macieira@intel.com>
To: <cbor@ietf.org>, Carsten Bormann <cabo@tzi.org>
Date: Wed, 16 Sep 2020 09:58:37 -0700
Message-ID: <1648968.enHRFnnXMp@tjmaciei-mobl1>
Organization: Intel Corporation
In-Reply-To: <B5903EB7-8030-4A79-B73B-AF96B4F8E342@tzi.org>
References: <2766F4E6-0E67-472B-8BFA-75C529F4EE80@tzi.org> <1686854.WtuvBSIOmm@tjmaciei-mobl1> <B5903EB7-8030-4A79-B73B-AF96B4F8E342@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/b6lyzIgWh3bQJOdcX0MA_Ds1FrU>

On Wednesday, 16 September 2020 09:17:23 PDT Carsten Bormann wrote:
> > As for the all-bits-set... I would prefer not to. The current 7049 says
> > that the preferred forms are 7e00, 7fc00000, 7ff8000000000000, which
> > match the quiet NaN that CPUs like x86 generate when they need to create
> > a NaN out of non-NaN parameters. The other way around is important and is
> > detailed on the Intel manual: the SIMD all-bits-set value has the NaN's
> > quiet bit set.
> Some diagnostic notation writers will put out 0xffffffff as NaN (as defined
> in RFC 7049).  Some may make use of new notation we come up with.

It is *a* NaN, but not *the* NaN. That value should produce the letters "NaN" 
somewhere, but it will need more details to disambiguate.

> > Personally, I'd prefer never to generate that. Printing as hexfloat isn't
> > too difficult, so whenever possible I'd like to have my own code do that.
> > For this reason, I'd like an alternative for NaN that has the letters
> > "nan" somewhere in them, so I can see from the dump output that it is a
> > NaN.
> 
> OK, sticking with your proposal:  NaN'7e00'_1 (or maybe NaN'7e00' as we
> don't really have leading zero bytes in a NaN?) This of course means
> NaN'7c00' (or NaN'3c00') would be invalid, as these are not NaNs.

Indeed. I think that's acceptable as it allows code to detect that the FP 
number is a NaN and then simply dump the entire thing, without having to 
understand which bits are mantissa and which ones aren't.
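
A minimal sketch of that "detect and dump" approach, for single precision only. 
The helper name format_nan_f32 is made up for illustration, and the NaN'...' 
output follows the notation proposed in this thread, which is not an agreed 
syntax. The code only asks whether the value is a NaN and then prints all four 
bytes; it never separates sign, exponent and mantissa.

```c
#include <assert.h>
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if v is a NaN, write NaN'<8 hex digits>' into out and
 * return 1. If v is not a NaN, leave out untouched and return 0. */
static int format_nan_f32(float v, char *out, size_t outlen)
{
    uint32_t bits;

    if (!isnan(v))
        return 0;

    /* Room for NaN'xxxxxxxx' plus the terminating NUL. */
    assert(outlen >= sizeof("NaN'00000000'"));

    /* Copy the raw bit pattern; no field decoding is needed. */
    memcpy(&bits, &v, sizeof bits);
    snprintf(out, outlen, "NaN'%08" PRIx32 "'", bits);
    return 1;
}
```

A caller would try this first and fall back to its normal decimal or hexfloat 
printing when the function returns 0.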

GCC's intrinsic __builtin_nanf16("0x1") produces NaN'7e01', but fortunately it 
also generates the same thing for __builtin_nanf16("0x7e01"). It just masks off 
the high bits, including the sign bit, so to generate NaN'fe00', you'd write 
-__builtin_nanf16("").
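
To make the masking concrete, here is a bit-level model of the behaviour 
described above. It is a sketch and not GCC's implementation: the function 
names are invented, the argument is the already-parsed integer rather than the 
string, and it assumes the binary16 layout (1 sign bit, 5 exponent bits, 10 
mantissa bits with the quiet bit at 0x0200).

```c
#include <assert.h>
#include <stdint.h>

/* Model of the described behaviour: keep only the 10 mantissa bits of the
 * parsed value, then force exponent all-ones and the quiet bit (0x7e00).
 * The sign bit is always clear in the result. */
static uint16_t model_nanf16_bits(uint32_t parsed)
{
    return (uint16_t)(0x7e00u | (parsed & 0x03ffu));
}

/* Unary minus on a NaN flips only the sign bit. */
static uint16_t negate_half_bits(uint16_t h)
{
    return (uint16_t)(h ^ 0x8000u);
}

/* The three cases mentioned in the mail. */
static void check_mail_examples(void)
{
    assert(model_nanf16_bits(0x1) == 0x7e01);     /* "0x1"    -> NaN'7e01' */
    assert(model_nanf16_bits(0x7e01) == 0x7e01);  /* "0x7e01" -> NaN'7e01' */
    assert(negate_half_bits(model_nanf16_bits(0)) == 0xfe00); /* -nan("") */
}
```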

To confirm I understand you: are the notations on each line below equivalent 
to one another?

  NaN_1   NaN'7e00'               NaN
  NaN_2   NaN'7fc00000'
  NaN_3   NaN'7ff8000000000000'

(the bit patterns are the QNaN, unless I made a mistake; I'm not trying to 
trick you)
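
For reference, a small sketch showing how the three bit patterns relate: 
widening the half-precision QNaN by keeping the sign, setting the exponent to 
all ones and padding the mantissa with zero bits on the right gives exactly the 
single and double patterns listed. The function names are hypothetical, and 
this only demonstrates the bit-level relationship; whether the notation should 
treat the three spellings as equivalent is the question being asked.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a binary16 NaN bit pattern to binary32, preserving sign and payload. */
static uint32_t half_nan_to_single_bits(uint16_t h)
{
    /* Input must be a NaN: exponent all ones, mantissa non-zero. */
    assert((h & 0x7c00u) == 0x7c00u && (h & 0x03ffu) != 0);

    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;  /* bit 15 -> bit 31 */
    uint32_t mant = (uint32_t)(h & 0x03ffu) << 13;  /* 10 -> 23 mantissa bits */
    return sign | 0x7f800000u | mant;
}

/* Widen a binary16 NaN bit pattern to binary64, preserving sign and payload. */
static uint64_t half_nan_to_double_bits(uint16_t h)
{
    assert((h & 0x7c00u) == 0x7c00u && (h & 0x03ffu) != 0);

    uint64_t sign = (uint64_t)(h & 0x8000u) << 48;  /* bit 15 -> bit 63 */
    uint64_t mant = (uint64_t)(h & 0x03ffu) << 42;  /* 10 -> 52 mantissa bits */
    return sign | UINT64_C(0x7ff0000000000000) | mant;
}
```

With h = 0x7e00 the two functions return 0x7fc00000 and 0x7ff8000000000000, the 
values on the NaN_2 and NaN_3 lines.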

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering