Re: [dhcwg] Load Balancing for DHCPv6

"Bernie Volz (volz)" <volz@cisco.com> Fri, 21 September 2012 17:04 UTC

Return-Path: <volz@cisco.com>
X-Original-To: dhcwg@ietfa.amsl.com
Delivered-To: dhcwg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6E3E721F8726 for <dhcwg@ietfa.amsl.com>; Fri, 21 Sep 2012 10:04:51 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.639
X-Spam-Level:
X-Spam-Status: No, score=-9.639 tagged_above=-999 required=5 tests=[AWL=-0.840, BAYES_00=-2.599, J_CHICKENPOX_31=0.6, J_CHICKENPOX_33=0.6, J_CHICKENPOX_73=0.6, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id o+o3G5a8e4rw for <dhcwg@ietfa.amsl.com>; Fri, 21 Sep 2012 10:04:50 -0700 (PDT)
Received: from rcdn-iport-9.cisco.com (rcdn-iport-9.cisco.com [173.37.86.80]) by ietfa.amsl.com (Postfix) with ESMTP id 3EC0E21F871C for <dhcwg@ietf.org>; Fri, 21 Sep 2012 10:04:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=@cisco.com; l=5210; q=dns/txt; s=iport; t=1348247090; x=1349456690; h=from:to:cc:subject:date:message-id:references: in-reply-to:content-transfer-encoding:mime-version; bh=PouoMTspm8FyPz8NsAxfMUijTCK+fTq2RQMraMY+yf0=; b=SqQakyni1yCgdQfygGrFKnHOgCKbexYvBn9WcoPkE5+BVEfJH85q00xz IBQPr53fwpi5wCUmD5gEr6QYiVl7iJuz9MxCp1GeG2X63T0QMAK41Cz/+ gYXgS4n5U0jo0TzjarrO/r2O2Ekz3tfOOY/kJk3reJSHNKF0E3T0aVKfz s=;
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AgAFAAedXFCtJXG9/2dsb2JhbABCA74YgQiCIAEBAQQBAQEPASc0CwwEAgEIDgMEAQEBCgoKCQchBgsUCQcBAgEDDgUIARUEh1EDDwuZF5Y/DYlTijpiFAEQglyCRWADlA+CaooDgyGBaYIaTYFaATw
X-IronPort-AV: E=Sophos;i="4.80,463,1344211200"; d="scan'208";a="121048913"
Received: from rcdn-core2-2.cisco.com ([173.37.113.189]) by rcdn-iport-9.cisco.com with ESMTP; 21 Sep 2012 17:04:49 +0000
Received: from xhc-rcd-x05.cisco.com (xhc-rcd-x05.cisco.com [173.37.183.79]) by rcdn-core2-2.cisco.com (8.14.5/8.14.5) with ESMTP id q8LH4n2X011788 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=FAIL); Fri, 21 Sep 2012 17:04:49 GMT
Received: from xmb-rcd-x04.cisco.com ([169.254.8.159]) by xhc-rcd-x05.cisco.com ([173.37.183.79]) with mapi id 14.02.0318.001; Fri, 21 Sep 2012 12:04:49 -0500
From: "Bernie Volz (volz)" <volz@cisco.com>
To: Andre Kostur <akostur@incognito.com>
Thread-Topic: [dhcwg] Load Balancing for DHCPv6
Thread-Index: AQHNlvCUaJ+khP2NkUiXQs2wUqwOeJeTIoPtgAI0BoD//68foA==
Date: Fri, 21 Sep 2012 17:04:48 +0000
Message-ID: <489D13FBFA9B3E41812EA89F188F018E0F5033D6@xmb-rcd-x04.cisco.com>
References: <CAL10_Bqa4ftiVhyyf0ezwKR7mzAEOLNi_K3EJFPFUzPnz7QGPw@mail.gmail.com> <489D13FBFA9B3E41812EA89F188F018E0F4F3093@xmb-rcd-x04.cisco.com> <CAL10_Br=OcWZuar1fDOopevTy_W-3g9TsYqo61rOWm4tKkuYgg@mail.gmail.com> <2D09D61DDFA73D4C884805CC7865E61118003F@GAALPA1MSGUSR9N.ITServices.sbc.com> <CAL10_BpXdx03WfV1PeMKg1zYc1dAFKe1CDNdrcNf45+_EVCBPg@mail.gmail.com> <CDDB9016-BE11-489A-9361-0172D96A464C@nominum.com> <CAOpJ=k2CJS=FuUvFwOq=s2m871_qfo=xROsW=fx0E48w2wxAqQ@mail.gmail.com> <CAL10_BoLsdppYKNSfHheYrZg+SfaggoynQf2X11HEdy=ELFUiQ@mail.gmail.com> <5049C317.7090603@gmail.com> <94FA926F-2432-4AE7-BC20-AE7458AB40D9@cisco.com> <CAF4+nEHqRFHbz9qfQuOqpLCNeZqkT=+f53_eCboECfWX8QCt6A@mail.gmail.com> <71F17433-B2D9-4366-9B32-F0E4D294EDB5@nominum.com> <CAF4+nEE6pbmO_ss+3UEpRG1kh2YD20P2KD4CFQ7LwoRZ6gyxsg@mail.gmail.com> <A563FEA0-0A26-431D-BDAA-AD897F691754@nominum.com> <CAF4+nEFqLW6p1wE9BWE6c2CqGDwzUQTYom9K4icV1_NjM-nTXg@mail.gmail.com> <1BB5EDFF-BBA9-4335-BE55-058FACEE30EB@cisco.com> <CAL10_Bo_DE1Lq4BuXRaMmBs9Fah=4TxroBgUEQ8V_6VC2NnTzQ@mail.gmail.com>
In-Reply-To: <CAL10_Bo_DE1Lq4BuXRaMmBs9Fah=4TxroBgUEQ8V_6VC2NnTzQ@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.86.250.189]
x-tm-as-product-ver: SMEX-10.2.0.1135-7.000.1014-19200.001
x-tm-as-result: No--62.518600-8.000000-31
x-tm-as-user-approved-sender: No
x-tm-as-user-blocked-sender: No
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dhcwg@ietf.org" <dhcwg@ietf.org>, Ted Lemon <Ted.Lemon@nominum.com>
Subject: Re: [dhcwg] Load Balancing for DHCPv6
X-BeenThere: dhcwg@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: <dhcwg.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dhcwg>, <mailto:dhcwg-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/dhcwg>
List-Post: <mailto:dhcwg@ietf.org>
List-Help: <mailto:dhcwg-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dhcwg>, <mailto:dhcwg-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 17:04:51 -0000

Thanks Andre.

While it would make no difference, it is always a mystery whether DUID will use a hardware type of 1 or 6. Per http://www.iana.org/assignments/arp-parameters/arp-parameters.xml, type 1 is for Ethernet whereas 6 is for IEEE 802.3.

It may well be for a more limited data set (i.e. ASCII strings), FNV1a could be better. I think Pearson was optimized for binary data.

My feeling is we should stick with Pearson and use the entire DUID (no length limit).

- Bernie

-----Original Message-----
From: Andre Kostur [mailto:akostur@incognito.com] 
Sent: Friday, September 21, 2012 12:44 PM
To: Bernie Volz (volz)
Cc: Donald Eastlake; dhcwg@ietf.org; Ted Lemon
Subject: Re: [dhcwg] Load Balancing for DHCPv6

I've crafted up a test rig for Pearson vs. FNV1a   (Pearson code from
the RFC, FNV1a from
http://isthe.com/chongo/tech/comp/fnv/#FNV-reference-source, with the XOR-folding at http://www.isthe.com/chongo/tech/comp/fnv/index.html#xor-fold).

I generated my DUIDs as a set of DUID-LLs (so the first four bytes are 00, 03, 00, 06).  The next three were taken randomly from IEEE's list of registered OUIs.  The remaining three are rand().

Running those through both Pearson and FNV1a 5 times each (on a set of
2,560,000 DUID-LLs) resulted in:

Pearson:
Min: 9637 - 9750, Max: 10220 - 10318, StdDev: 92.91 - 102.28

FNV1a:
Min: 9691 - 9754, Max: 10221 - 10379, Stdev: 93.27 - 107.66


This seems to suggest that Pearson is better than FNV1a over DUID-LLs.


On Thu, Sep 20, 2012 at 5:05 AM, Bernie Volz (volz) <volz@cisco.com> wrote:
> Bad test - This is a very limited character set.
>
> The data usually used for dhcp is binary data.
>
> Sent from my iPad
>
> On Sep 20, 2012, at 1:27 AM, "Donald Eastlake" <d3e3e3@gmail.com> wrote:
>
>> Well, I have some more statistics. One my co-authors on the draft 
>> whipped up some code that I then extended. The data hashed consisted 
>> of 2,560,000 strings calculated as follows:
>>        #define RUN 10000
>>        for(i = 0; i < 256*RUN; ++i)
>> {
>>                sprintf(str, "%08d^^%08d", i, i);
>>                {do hashes, etc}
>> }
>>
>> Each string was hashed with Pearson and with 32-bit FNV-1a. Three 
>> arrays of 256 counters were incremented based on the Pearson hash, 
>> based on the bottom byte of the 32-bit FNV after one 16-bit fold (
>> hash2 = (hash&0xffff) ^ (hash >> 16) ), and based on the bottom byte 
>> of the 32-bit FNV after an additional 8-bit fold (hash3 = 
>> (hash2&0xff) ^ (hash2 >> 8); the equivalent of xor'ing all four bytes together).
>> The expected number of hits on each bucket with perfect hashing would 
>> be 10,000. Here are the minimum and maximum number of hits and the 
>> standard deviation:
>>
>> Pearson: min=9678 max=19938 std-dev=630.72
>> FNV-32 1-fold: min=9699 max=10331 std-dev=103.06
>> FNV-32 2-folds: min=9662 max=10247 std-dev=90.78
>>
>> So you now have Brian Carpenter's results, showing FNV to be much 
>> better than a variety of other methods, but not Pearson, on fairly 
>> realistic data. And you have these results showing FNV to be much 
>> better than Pearson on somewhat unrealistic data. ...
>>
>> Thanks,
>> Donald
>> =============================
>> Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>> 155 Beaver Street, Milford, MA 01757 USA d3e3e3@gmail.com
>>
>>
>> On Fri, Sep 7, 2012 at 6:29 PM, Ted Lemon <Ted.Lemon@nominum.com> wrote:
>>> On Sep 7, 2012, at 5:37 PM, Donald Eastlake <d3e3e3@gmail.com> wrote:
>>>> FNV takes a little more effort than Pearson. It also has a smaller
>>>> footprint in memory. And I believe it will produce statistically
>>>> better results. What are the relative weights of these or other
>>>> factors in this case?
>>>
>>> Unfortunately, as far as I can tell neither of the studies you cite compares FNV to Pearson, although the results from Brian Carpenter's study are certainly pretty impressive.    I think more work is worse, smaller footprint probably doesn't matter (this algorithm is being run in DHCP servers, which typically aren't seriously memory-constrained), and statistically better results only matter if it's a pretty big difference, so it's frustrating that we don't have a comparison to the existing algorithm.
>>>
>>> To me, the biggest win of using FNV would be if there were an RFC to refer to, but it looks like this is not quite there yet.   So my conclusion is that I certainly don't oppose using FNV instead of Pearson, but I don't see a strong reason to support it either.
>>>
>> _______________________________________________
>> dhcwg mailing list
>> dhcwg@ietf.org
>> https://www.ietf.org/mailman/listinfo/dhcwg
> _______________________________________________
> dhcwg mailing list
> dhcwg@ietf.org
> https://www.ietf.org/mailman/listinfo/dhcwg



-- 
Andre Kostur
Distinguished Software Architect & Engineer
P: 604-678-2864
E: akostur@incognito.com
Toll-Free: 1-800-877-1856


F: 604-678-2864
VoIP: sip:864@sip.incognito.com