Re: Very large values (Re: Call For Adoption Live Byte Ranges)

Craig Pratt <craig@ecaspia.com> Tue, 03 January 2017 09:40 UTC

To: Martin Thomson <martin.thomson@gmail.com>
References: <CABkgnnVj6yKkdr=QZiZpqMDsbYDpO39bbYThi6bPOcOx7Sdi5Q@mail.gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
From: Craig Pratt <craig@ecaspia.com>
Organization: Caspia Consulting
Message-ID: <f2cba1d4-246b-ce06-cbcf-67a42b5835c1@ecaspia.com>
Date: Tue, 03 Jan 2017 01:36:43 -0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:45.0) Gecko/20100101 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <CABkgnnVj6yKkdr=QZiZpqMDsbYDpO39bbYThi6bPOcOx7Sdi5Q@mail.gmail.com>
Subject: Re: Very large values (Re: Call For Adoption Live Byte Ranges)
Archived-At: <http://www.w3.org/mid/f2cba1d4-246b-ce06-cbcf-67a42b5835c1@ecaspia.com>

Thanks, Martin, for the feedback. I'm glad to get it before revising,
since you bring up some good points. Replies in-line.

On 1/2/17 4:37 PM, Martin Thomson wrote:
> On 3 January 2017 at 10:00, Craig Pratt <craig@ecaspia.com> wrote:
>> 2^63 is 9223372036854775808 (decimal). I've defined a smaller value to avoid
>> potential conflicts and to make the value more easily identifiable:
>> 9222999999999999999.
>>
>> I think having a clearly-defined Very Large Value such as this to represent
>> the indeterminate end of content will be more deterministic/easily
>> implemented than having a Server try to establish a VLV in each HTTP
>> exchange. But I'd appreciate any thoughts prior to revising the draft.
> I think that any value you choose will be OK-ish.  The question is
> whether you think that there is a response that will exceed that size.
> If there is, then no single value you choose will be enough.  If that
> is possible, then you don't want a single fixed value at all, just a
> recommendation to pick a big number that far exceeds the size you
> want/expect.
On that, I think we're OK - based on the use cases I'm concerned about,
2^63 would be more than enough.
> I guess the other concern is that 9222999999999999999 (which I had to
> copy because I go cross-eyed counting those nines), is too big for
> some numeric formats.  Javascript has trouble with that number, which
> it reads as 9223000000000000000 instead, a problem that starts with
> 9007199254740993 (just paste that into your browser console and see
> what comes back). That suggests a smaller value might be safer, but
> then you have more problems with overflow.
Yeah - my bad for not having researched that. I've done much in Java,
but not JavaScript (yet).

It looks like ECMAScript 6 uses an IEEE 754 number format. And
JavaScript defines Number.MAX_SAFE_INTEGER as 2^53 - 1
(9007199254740991). So yeah - any HTTP request using a JS number
as a range value isn't going to be able to (accurately) represent numbers
beyond that, as you've observed. This is definitely an issue, IMHO.
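
The rounding described above can be reproduced outside a browser. This is a minimal sketch in Python, whose float type is the same IEEE 754 double format; the helper name survives_double is made up for illustration and is not part of any draft.

```python
# Python floats are IEEE 754 doubles, the same format as a JavaScript Number.
MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991


def survives_double(n: int) -> bool:
    """Return True if n is unchanged by a round trip through a 64-bit float.

    Hypothetical helper, named here only for illustration.
    """
    return int(float(n)) == n


print(survives_double(MAX_SAFE_INTEGER))   # True
print(survives_double(9007199254740993))   # False: rounds to 9007199254740992
print(int(float(9222999999999999999)))     # 9223000000000000000
```

Between 2^62 and 2^63 adjacent doubles are 1024 apart, so the proposed value lands on the nearest representable neighbour rather than on itself.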
> Note that whatever value you pick has to be safe for a great many
> implementations, even if those implementations never need that space.
> They still have to parse the value properly, preferably without
> resorting to use of bignums.
>
> If you believe it to be possible to pick a safe value that will never
> be exceeded, then ignore the rest of my mail :)
I think what concerns me now is (a) any other language-specific and
machine-specific gotchas, and (b) the fact that this value could
represent a real limit for some high-rate application someone dreams up.
E.g., a limit of 2^53 causes my 1Gb/s example to go from a
quite-comfortable 2339 years of content to a less-comfortable 2 years...
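
The arithmetic behind those two figures can be checked with a short sketch. It assumes that 1Gb/s means 10^9 bits per second and that a year is 365 days; both are assumptions chosen because they reproduce the numbers quoted above.

```python
BYTES_PER_SECOND = 1_000_000_000 // 8   # assumption: 1 Gb/s = 10**9 bits per second
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: 365-day years


def years_of_content(limit_bytes: int) -> float:
    """Years of streaming at 1 Gb/s before the byte offset reaches limit_bytes."""
    return limit_bytes / BYTES_PER_SECOND / SECONDS_PER_YEAR


print(int(years_of_content(2**63)))  # 2339
print(int(years_of_content(2**53)))  # 2
```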
> The risk in specifying a single value is that implementations will
> hard-code checks around that value like (end == VLV) or if things are
> done poorly (end >= VLV).  Implementations that have that check will
> assume indefinite ranges, even if there isn't an indefinite range and
> might get caught with bugs, like infinite loops:
>
> 10: I have up to <VLV>, I need more bytes
> 20: ask for a range from current end to <VLV> (i.e., VLV-VLV)
> 30: get a zero-length range back
> 40: if need more bytes, goto 20
>
> That leads to problems: implementations won't be able to send
> responses of exactly the size you choose (however unlikely that is),
> or in the bad case, you won't ever be able to exceed that value.
>
> You can get the same effect if major implementations pick the same value.
Agreed. No one should use floating point representations for this stuff,
IMHO. But since it can't be avoided (JS isn't going away), all the typical
rules about equality checks and floating point become necessary. And
therein lie many potential bugs.
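
The stall in the quoted 10/20/30/40 loop can be simulated with a toy model. This is only a sketch: fetch and download are hypothetical names, the "server" is a single function, and ranges are modelled as half-open offsets so that a request from VLV to VLV yields nothing, as in the quoted pseudocode.

```python
VLV = 9222999999999999999  # the fixed value floated earlier in the thread


def fetch(start: int, end: int, available: int) -> int:
    """Toy server: bytes returned for [start, end), capped by what exists."""
    return max(0, min(end, available) - start)


def download(available: int, max_requests: int = 5):
    """Request 'current end to VLV' until done, stalled, or out of attempts."""
    have = 0
    for attempt in range(1, max_requests + 1):
        got = fetch(have, VLV, available)
        have += got
        if have >= available:
            return have, attempt, "done"
        if got == 0:
            # A naive client without this check would repeat step 20 forever.
            return have, attempt, "stalled"
    return have, max_requests, "gave up"


print(download(10))          # (10, 1, 'done')
print(download(VLV + 1000))  # (9222999999999999999, 2, 'stalled')
```

In this model a resource larger than the fixed value can never be read past it, which is the "bad case" described in the quoted text.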
> On the other hand, a client can just pick an arbitrary stupidly large
> value (ASLV).  This can be an increment on what the client already
> has, and should probably include some randomness.  If there is that
> much still remaining, then they just have to make a new request.
I didn't think of this as a common use case, but who am I to say?

BTW, I'd like to consider the term "AASLUVE" (Absurdly Arbitrary
Stupidly Large Unrepresentable Value Encoding) for the Definitions
section.
> Thus, clients can pick a minimum increment that won't cause too much
> pain for them.  2^32 might be enough for clients that don't mind
> making a request every 4Gb or so, and it might make sense to start
> with "smaller" increments like that to avoid triggering
> incompatibility problems.
Yeah - I'd thought about 32-bit values in the initial draft - esp for
limited-resource devices that wish to avoid bigint math/comparisons.
> Adding some amount of randomness will provide greater surety that the
> server has read and understood the request.  e.g.,
>
> aslv = lastByte + 2**32 + random(2**32)
> request.setHeader('Content-Range', 'bytes %d-%d/*' % (lastByte, aslv))
I hope using randomness isn't necessary. But I see what you're getting at.
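
For reference, a runnable variant of the quoted snippet might look like the following. It is a sketch under assumptions: pick_aslv and range_value are hypothetical names, secrets.randbelow stands in for the unspecified random() call, and the string format simply mirrors the quoted line rather than any header syntax fixed by the draft.

```python
import secrets


def pick_aslv(last_byte: int) -> int:
    """Pick an arbitrary large end value: at least 2**32 past what we have,
    plus up to 2**32 - 1 more chosen at random."""
    return last_byte + 2**32 + secrets.randbelow(2**32)


def range_value(last_byte: int, aslv: int) -> str:
    """Format the value the same way as the quoted pseudocode."""
    return "bytes %d-%d/*" % (last_byte, aslv)


last_byte = 1_230_000  # illustrative offset, not taken from the thread
aslv = pick_aslv(last_byte)
print(range_value(last_byte, aslv))
```

With these increments the end value stays below last_byte + 2^33, so a client that starts below 2^53 - 2^33 never produces a number a double cannot hold exactly.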

I think the points you bring up convince me that we should stick with
the mechanism defined in the current Live Bytes draft. And while the
flexibility afforded the Client requires a bit more description and
server-side logic than defining a single VLMV (Very Large Magic Value),
I think it's still sufficiently simple and covers the bases. (And it's
already written.)

I do think I should expand upon the Security Issues section a bit to
describe the additional issues with Very Large Values that you've
highlighted. Servers really need to be able to handle VLVs. And returning
the same range end value provided by the Client (which may be a VLV)
is critical for the operation of the Live Bytes mechanism as defined. So
look for a revision that incorporates this (and some of Poul-Henning's
corrections).
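
As a rough illustration of echoing the Client's range end exactly, the sketch below parses a single-range request value with arbitrary-precision integers and returns it unchanged in the response value. The function name content_range_for is hypothetical, the syntax handled is only the simple bytes=first-last form, and this is not text from the draft.

```python
import re

# Assumption: only the single-range "bytes=first-last" form is handled here.
RANGE_RE = re.compile(r"bytes=(\d+)-(\d+)")


def content_range_for(range_header: str) -> str:
    """Build a Content-Range value that echoes the client's range end.

    Python ints are arbitrary precision, so a Very Large Value is never
    rounded the way it would be in a 64-bit float.
    """
    match = RANGE_RE.fullmatch(range_header.strip())
    if match is None:
        raise ValueError("unsupported range: %r" % range_header)
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError("range end precedes range start")
    return "bytes %d-%d/*" % (first, last)


print(content_range_for("bytes=1230000-9222999999999999999"))
# bytes 1230000-9222999999999999999/*
```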

Thanks again,

cp

-- 

craig pratt

Caspia Consulting

craig@ecaspia.com

503.746.8008