Re: [dhcwg] Load Balancing for DHCPv6

Andre Kostur <akostur@incognito.com> Fri, 21 September 2012 16:44 UTC

Return-Path: <akostur@incognito.com>
X-Original-To: dhcwg@ietfa.amsl.com
Delivered-To: dhcwg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id ADA0F21F87D5 for <dhcwg@ietfa.amsl.com>; Fri, 21 Sep 2012 09:44:02 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -5.077
X-Spam-Level:
X-Spam-Status: No, score=-5.077 tagged_above=-999 required=5 tests=[AWL=-0.900, BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, J_CHICKENPOX_31=0.6, J_CHICKENPOX_33=0.6, J_CHICKENPOX_73=0.6, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HFaPduhmFvln for <dhcwg@ietfa.amsl.com>; Fri, 21 Sep 2012 09:44:01 -0700 (PDT)
Received: from na3sys010aog108.obsmtp.com (na3sys010aog108.obsmtp.com [74.125.245.84]) by ietfa.amsl.com (Postfix) with SMTP id 6C6BB21F87E6 for <dhcwg@ietf.org>; Fri, 21 Sep 2012 09:44:00 -0700 (PDT)
Received: from mail-ie0-f172.google.com ([209.85.223.172]) (using TLSv1) by na3sys010aob108.postini.com ([74.125.244.12]) with SMTP ID DSNKUFyZUGsq/1Oh79a51yXLqWnFyTgigHJq@postini.com; Fri, 21 Sep 2012 09:44:01 PDT
Received: by iec9 with SMTP id 9so6776429iec.31 for <dhcwg@ietf.org>; Fri, 21 Sep 2012 09:44:00 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding:x-gm-message-state; bh=z0LRJ4gNpYglArXvEvdG1om5QmHavf1USOjFsHBfMu0=; b=HpHTjtFZ+V62ayUSGFV78OdgzfErZIHV86eCzgfmXcu8pGqWZ1azGDLc/xlafWElH2 yv0hrFSdbGoG6hdFIfOry3jTaaEz+PgXWU+c3zOWOB41SLBRa0jZVpoRa5kvqHTR/sP5 JNwm+dWB00lpbUyrFBtb86V087JkJcnyuAhuBTuzZi5mRWVl+twnqW7agSXF12hLz+d+ MMTsJetl4TLaiMBtSIohizB8KN9l+lcxoPzeOuPLNutdLYGxJWbAf69R9Pu4t4Kurl/J EUp7aCXneNUBOsjp0MZiaQyV14Iho4ZmnH7qT6Hm28d3p6Q6A+hJmpxqPEgLhEtm6jyh 9g/g==
MIME-Version: 1.0
Received: by 10.50.152.137 with SMTP id uy9mr1517150igb.62.1348245839597; Fri, 21 Sep 2012 09:43:59 -0700 (PDT)
Received: by 10.64.100.137 with HTTP; Fri, 21 Sep 2012 09:43:59 -0700 (PDT)
In-Reply-To: <1BB5EDFF-BBA9-4335-BE55-058FACEE30EB@cisco.com>
References: <CAL10_Bqa4ftiVhyyf0ezwKR7mzAEOLNi_K3EJFPFUzPnz7QGPw@mail.gmail.com> <489D13FBFA9B3E41812EA89F188F018E0F4F3093@xmb-rcd-x04.cisco.com> <CAL10_Br=OcWZuar1fDOopevTy_W-3g9TsYqo61rOWm4tKkuYgg@mail.gmail.com> <2D09D61DDFA73D4C884805CC7865E61118003F@GAALPA1MSGUSR9N.ITServices.sbc.com> <CAL10_BpXdx03WfV1PeMKg1zYc1dAFKe1CDNdrcNf45+_EVCBPg@mail.gmail.com> <CDDB9016-BE11-489A-9361-0172D96A464C@nominum.com> <CAOpJ=k2CJS=FuUvFwOq=s2m871_qfo=xROsW=fx0E48w2wxAqQ@mail.gmail.com> <CAL10_BoLsdppYKNSfHheYrZg+SfaggoynQf2X11HEdy=ELFUiQ@mail.gmail.com> <5049C317.7090603@gmail.com> <94FA926F-2432-4AE7-BC20-AE7458AB40D9@cisco.com> <CAF4+nEHqRFHbz9qfQuOqpLCNeZqkT=+f53_eCboECfWX8QCt6A@mail.gmail.com> <71F17433-B2D9-4366-9B32-F0E4D294EDB5@nominum.com> <CAF4+nEE6pbmO_ss+3UEpRG1kh2YD20P2KD4CFQ7LwoRZ6gyxsg@mail.gmail.com> <A563FEA0-0A26-431D-BDAA-AD897F691754@nominum.com> <CAF4+nEFqLW6p1wE9BWE6c2CqGDwzUQTYom9K4icV1_NjM-nTXg@mail.gmail.com> <1BB5EDFF-BBA9-4335-BE55-058FACEE30EB@cisco.com>
Date: Fri, 21 Sep 2012 09:43:59 -0700
Message-ID: <CAL10_Bo_DE1Lq4BuXRaMmBs9Fah=4TxroBgUEQ8V_6VC2NnTzQ@mail.gmail.com>
From: Andre Kostur <akostur@incognito.com>
To: "Bernie Volz (volz)" <volz@cisco.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Gm-Message-State: ALoCoQnSnYeL+UOV8EdmjfRqQAmKxze7GC2uhHcKP6J6tb89XiobRreC7Phu1Cq8+X0PDW9uFdpj
Cc: "dhcwg@ietf.org" <dhcwg@ietf.org>, Ted Lemon <Ted.Lemon@nominum.com>
Subject: Re: [dhcwg] Load Balancing for DHCPv6
X-BeenThere: dhcwg@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: <dhcwg.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dhcwg>, <mailto:dhcwg-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/dhcwg>
List-Post: <mailto:dhcwg@ietf.org>
List-Help: <mailto:dhcwg-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dhcwg>, <mailto:dhcwg-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 16:44:02 -0000

I've crafted up a test rig for Pearson vs. FNV1a   (Pearson code from
the RFC, FNV1a from
http://isthe.com/chongo/tech/comp/fnv/#FNV-reference-source, with the
XOR-folding at http://www.isthe.com/chongo/tech/comp/fnv/index.html#xor-fold).

I generated my DUIDs as a set of DUID-LLs (so the first four bytes are
00, 03, 00, 06).  The next three were taken randomly from IEEE's list
of registered OUIs.  The remaining three are rand().

Running those through both Pearson and FNV1a 5 times each (on a set of
2,560,000 DUID-LLs) resulted in:

Pearson:
Min: 9637 - 9750, Max: 10220 - 10318, StdDev: 92.91 - 102.28

FNV1a:
Min: 9691 - 9754, Max: 10221 - 10379, Stdev: 93.27 - 107.66


This seems to suggest that Pearson is better than FNV1a over DUID-LLs.


On Thu, Sep 20, 2012 at 5:05 AM, Bernie Volz (volz) <volz@cisco.com> wrote:
> Bad test - This is a very limited character set.
>
> The data usually used for dhcp is binary data.
>
> Sent from my iPad
>
> On Sep 20, 2012, at 1:27 AM, "Donald Eastlake" <d3e3e3@gmail.com> wrote:
>
>> Well, I have some more statistics. One my co-authors on the draft
>> whipped up some code that I then extended. The data hashed consisted
>> of 2,560,000 strings calculated as follows:
>>        #define RUN 10000
>>        for(i = 0; i < 256*RUN; ++i)
>> {
>>                sprintf(str, "%08d^^%08d", i, i);
>>                {do hashes, etc}
>> }
>>
>> Each string was hashed with Pearson and with 32-bit FNV-1a. Three
>> arrays of 256 counters were incremented based on the Pearson hash,
>> based on the bottom byte of the 32-bit FNV after one 16-bit fold (
>> hash2 = (hash&0xffff) ^ (hash >> 16) ), and based on the bottom byte
>> of the 32-bit FNV after an additional 8-bit fold (hash3 = (hash2&0xff)
>> ^ (hash2 >> 8); the equivalent of xor'ing all four bytes together).
>> The expected number of hits on each bucket with perfect hashing would
>> be 10,000. Here are the minimum and maximum number of hits and the
>> standard deviation:
>>
>> Pearson: min=9678 max=19938 std-dev=630.72
>> FNV-32 1-fold: min=9699 max=10331 std-dev=103.06
>> FNV-32 2-folds: min=9662 max=10247 std-dev=90.78
>>
>> So you now have Brian Carpenter's results, showing FNV to be much
>> better than a variety of other methods, but not Pearson, on fairly
>> realistic data. And you have these results showing FNV to be much
>> better than Pearson on somewhat unrealistic data. ...
>>
>> Thanks,
>> Donald
>> =============================
>> Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>> 155 Beaver Street, Milford, MA 01757 USA
>> d3e3e3@gmail.com
>>
>>
>> On Fri, Sep 7, 2012 at 6:29 PM, Ted Lemon <Ted.Lemon@nominum.com> wrote:
>>> On Sep 7, 2012, at 5:37 PM, Donald Eastlake <d3e3e3@gmail.com> wrote:
>>>> FNV takes a little more effort than Pearson. It also has a smaller
>>>> footprint in memory. And I believe it will produce statistically
>>>> better results. What are the relative weights of these or other
>>>> factors in this case?
>>>
>>> Unfortunately, as far as I can tell neither of the studies you cite compares FNV to Pearson, although the results from Brian Carpenter's study are certainly pretty impressive.    I think more work is worse, smaller footprint probably doesn't matter (this algorithm is being run in DHCP servers, which typically aren't seriously memory-constrained), and statistically better results only matter if it's a pretty big difference, so it's frustrating that we don't have a comparison to the existing algorithm.
>>>
>>> To me, the biggest win of using FNV would be if there were an RFC to refer to, but it looks like this is not quite there yet.   So my conclusion is that I certainly don't oppose using FNV instead of Pearson, but I don't see a strong reason to support it either.
>>>
>> _______________________________________________
>> dhcwg mailing list
>> dhcwg@ietf.org
>> https://www.ietf.org/mailman/listinfo/dhcwg
> _______________________________________________
> dhcwg mailing list
> dhcwg@ietf.org
> https://www.ietf.org/mailman/listinfo/dhcwg



-- 
Andre Kostur
Distinguished Software Architect & Engineer
P: 604-678-2864
E: akostur@incognito.com
Toll-Free: 1-800-877-1856


F: 604-678-2864
VoIP: sip:864@sip.incognito.com