Re: New Version Notification for draft-kamp-httpbis-structure-01.txt (fwd)

Willy Tarreau <w@1wt.eu> Thu, 17 November 2016 05:58 UTC

Date: Thu, 17 Nov 2016 06:53:45 +0100
From: Willy Tarreau <w@1wt.eu>
To: Kazuho Oku <kazuhooku@gmail.com>
Cc: Poul-Henning Kamp <phk@critter.freebsd.dk>, HTTP Working Group <ietf-http-wg@w3.org>

Hi Kazuho,

On Thu, Nov 17, 2016 at 10:06:04AM +0900, Kazuho Oku wrote:
> Hi,
> 
> Thank you for writing the draft.
> 
> Regarding the numbers, could we either exclude floating point from the
> specification or state that an integral number MUST be encoded without
> using a dot?
> 
> The reason I ask is because it is hard to correctly implement a parser
> for floating point numbers, and a bug in the parser would likely lead
> to a vulnerability [1]. Note that in some (if not most) of the
> programming languages you would need to implement your own number
> parser to meet the needs. For example, you cannot use sscanf in C,
> because depending on the locale the function allows use of decimal
> points other than '.'.
> 
> If we could exclude floating point numbers from the specification
> entirely or have a restriction something like above, parser
> implementors can refrain from implementing their own floating point
> number parsers until the specification in which they are interested in
> actually start using the notation.
> 
> Non-integral numbers are rarely used in the HTTP headers. The only one
> I can recall is the q value of Accept-Encoding, but it is not a
> floating-point but actually a fixed-point number (of three decimals
> below the point), which could have been represented by using integral
> numbers between 0 to 1000.
> 
>      weight = OWS ";" OWS "q=" qvalue
>      qvalue = ( "0" [ "." 0*3DIGIT ] )
>             / ( "1" [ "." 0*3("0") ] )

I'd like to avoid FP as well. However, it's important to note that fixed
point numbers are not exempt from similar issues, since everyone will store
them in floats/doubles; the error is just bounded by the mantissa precision.
For example, 64-bit doubles have a 53-bit mantissa, so a difference can
easily show up in the lowest bits. Example:

  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
        double f;

        if (argc < 2)
                return 1;       /* need one numeric argument */
        f = atof(argv[1]);      /* parsed to the nearest double */
        printf("input=%s float=%f\n", argv[1], f);
        return 0;
  }

  $ ./a.out $((1<<32)).000001
  input=4294967296.000001 float=4294967296.000001
  $ ./a.out $((1<<33)).000001
  input=8589934592.000001 float=8589934592.000002

In my opinion we don't care here. And maybe we can document the expected
minimal precision (e.g. a minimum of 53 bits, to be able to store a 32-bit
integral range with 1/1000000 fractional precision).
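
The 53-bit figure can be checked with a little arithmetic. Assuming IEEE 754
binary64 doubles, adjacent values in [2^k, 2^(k+1)) are 2^(k-52) apart, so
the spacing is about 0.00000095 just above 2^32 and about 0.0000019 just
above 2^33, which matches the output above. A minimal sketch of that check
follows; the helper names spacing_at_pow2() and micro_steps_fit() are made up
for illustration and are not part of any draft or library.

```c
#include <assert.h>
#include <float.h>

/* Spacing between adjacent doubles in [2^k, 2^(k+1)).
 * Assumes IEEE 754 binary64, where DBL_EPSILON is 2^-52. */
static double spacing_at_pow2(int k)
{
        double x = 1.0;

        assert(k >= 0);
        while (k-- > 0)
                x *= 2.0;       /* exact: powers of two are representable */
        return x * DBL_EPSILON; /* 2^k * 2^-52 */
}

/* Returns 1 if the spacing just above 2^k does not exceed 1/1000000,
 * 0 otherwise. */
static int micro_steps_fit(int k)
{
        return spacing_at_pow2(k) <= 1e-6;
}
```

With these helpers, micro_steps_fit(32) is 1 and micro_steps_fit(33) is 0,
which is where the second run above starts printing .000002 instead of
.000001.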

Also it's pretty certain that developers will use atof() on fixed point
numbers, but at least the input can be sanitized easily by ensuring that
only digits, dot and - are allowed in it.
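
Such a sanitizing step could look like the sketch below. It is only an
illustration: is_plain_decimal() and its max_frac parameter are hypothetical
names, and requiring at least one digit on each side of the dot is an
assumption of this sketch, not something the draft states. It rejects the
other forms atof() would otherwise accept (leading whitespace, exponents,
hex, "inf", "nan").

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s has the form [-]DIGITS[.DIGITS]
 * with at most max_frac fractional digits, 0 otherwise.
 * Assumption of this sketch: a leading or trailing dot is rejected. */
static int is_plain_decimal(const char *s, size_t max_frac)
{
        size_t n = 0;

        assert(s != NULL);
        if (*s == '-')
                s++;

        /* integer part: at least one digit */
        while (*s >= '0' && *s <= '9') {
                s++;
                n++;
        }
        if (n == 0)
                return 0;
        if (*s == '\0')
                return 1;       /* plain integer */
        if (*s != '.')
                return 0;       /* anything else: 'e', 'x', ',', space... */
        s++;

        /* fractional part: 1..max_frac digits, then end of string */
        n = 0;
        while (*s >= '0' && *s <= '9') {
                s++;
                n++;
        }
        return *s == '\0' && n >= 1 && n <= max_frac;
}
```

A caller would run is_plain_decimal(value, 6) first and only pass the string
to atof() when it returns 1. Note that atof() still follows the current
locale's decimal point, so this check is only sufficient when the program
runs with the "C" numeric locale.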

Regards,
Willy