Re: [icnrg] New Version Notification for draft-hong-icnrg-bloomfilterbased-name-resolution-01.txt

Petri Jokela <petri.jokela@ericsson.com> Sat, 27 September 2014 07:17 UTC

From: Petri Jokela <petri.jokela@ericsson.com>
In-Reply-To: <F8EFC212DF9A004DA18AA8FB011E42331D611152@SMTP1.etri.info>
Date: Sat, 27 Sep 2014 10:17:48 +0300
Message-ID: <3F24F5B9-BEDB-4C25-AFAB-F4DA368789A0@ericsson.com>
References: <20140917002758.31365.13695.idtracker@ietfa.amsl.com> <F8EFC212DF9A004DA18AA8FB011E42331D610CB8@SMTP1.etri.info>, <8A63F6C6-4361-4105-B1AD-04B3DCE99F6E@ericsson.com> <F8EFC212DF9A004DA18AA8FB011E42331D611152@SMTP1.etri.info>
To: 홍정하 <jhong@etri.re.kr>
Archived-At: http://mailarchive.ietf.org/arch/msg/icnrg/93gZ7-oldh86ku4MylDkEXWaSY0
Cc: "icnrg@irtf.org" <icnrg@irtf.org>
Subject: Re: [icnrg] New Version Notification for draft-hong-icnrg-bloomfilterbased-name-resolution-01.txt
Precedence: list

Hi, thanks for the clarifications. Unfortunately, I am not attending the Saturday meeting. I hope you have fruitful discussions there. See my reply in-line: 

On 27 Sep 2014, at 00:15, 홍정하 <jhong@etri.re.kr> wrote:

> Hi,
>  
> We can take advantage of the fact that the bloom filter size can be adjusted to the maximum number of names registered to a single NRS server, keeping the FPP below a desired value.

> For example, a bloom filter of size 2 MB for 10^6 entries can keep the FPP below 0.00046. If we assume that the number of such NRS servers on the top level is 10^6 and that they are fully peered, then the maximum memory required at each server is 2 TB. To deal with such a size, we are considering hardware support using GPU or TCAM, and we are working on it.

If you want to merge BFs at a higher level (by ORing them together), you must use filters of the same size on each level. Filters with different sizes cannot be merged. Just to clarify: are you assuming different filter sizes in different places in the network, or one-size filters that must be optimised for the worst case? If you have filters of different sizes, you have to collect the lower-layer filters at the higher-layer servers, and when a higher-level B-NRS receives a request, it must match the item against each of the 10^6 BFs separately.

On the other hand, filters of different sizes on different B-NRS nodes would each need a different set of hash functions. This is not a major issue, but it is inefficient: to test for the existence of a data item on a single server that holds multiple BFs from other B-NRSs, you have to do more hash calculations over the same item name to get the correct values for the different filters.

If you are using filters of the same size on every layer so that they can be ORed together, the filters must be huge, because they must also accommodate the worst case (i.e. the maximum number of information items that may be included in a single filter).
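
To illustrate the merging constraint discussed above, here is a minimal sketch in Python. It is not taken from the draft: the `BloomFilter` class, the salted SHA-256 hashing, and the toy parameters (1024 bits, 4 hashes) are assumptions chosen only for illustration. It shows that ORing two filters with identical size and hash functions gives the same bits as inserting all names into one filter, and that a size mismatch has to be rejected.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: m bits kept in a Python int, k positions per name."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = 0

    def _positions(self, name):
        # Hypothetical hash scheme: salt SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, name):
        for pos in self._positions(name):
            self.bits |= 1 << pos

    def __contains__(self, name):
        return all((self.bits >> pos) & 1 for pos in self._positions(name))

    def merge(self, other):
        # Bitwise OR is only meaningful when size and hash set are identical.
        if (self.m, self.k) != (other.m, other.k):
            raise ValueError("cannot OR filters with different parameters")
        merged = BloomFilter(self.m, self.k)
        merged.bits = self.bits | other.bits
        return merged


child_a = BloomFilter(1024, 4)
child_b = BloomFilter(1024, 4)
child_a.add("name-1")
child_b.add("name-2")

parent = child_a.merge(child_b)
print("name-1" in parent, "name-2" in parent)  # no false negatives after OR
```

With filters of different sizes, a parent would instead have to keep each child filter separately and test a name against every one of them, which is the per-filter matching cost mentioned above.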

> 
> As the draft [3] mentions, there are 10^9 nodes in the current Internet. The number of addressable ICN objects will be several orders of magnitude higher. This means that there will be, on average, tens of thousands of ICN objects per node. How many of the objects will be in the NRS is not known. It is possible that some of the objects exist only inside a single node while others are globally available. Also, depending on the role of a node, the number of objects may be lower or higher.
>  
> In B-NRS [3], it is not assumed that all names are globally visible. We consider the maximum number of globally visible names to be 10^9, and the maximum number of B-NRS servers fully peered on the top level, with 10^6 names each, to be 10^3 ~ 10^6, which is the upper bound.

Now I must ask: are you really talking about information items, not node names? Considering the network today and what is available there, and taking into account new developments such as IoT with lots of information, I guess that the number of information items will be larger. If we now have 10^9 nodes in the network, your assumption would mean one information item published per node (on average) globally, which is pretty low. Of course, not everything is globally visible, but the number still looks low.

There is also a scalability issue. What happens if the number of items keeps growing in the future and the filter sizes must be adjusted accordingly? If increasing the size is enough, it is probably not a big problem. But we should also prepare ourselves for something that we cannot envision now.
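
To make the scalability concern concrete, the following sketch uses the standard approximation FPP = (1 - e^(-kn/m))^k to show how a filter of fixed size degrades once the number of names exceeds what it was dimensioned for. The filter size follows the 2 MB figure quoted above, taking 1 MB as 10^6 bytes; the hash count `K = 11` and the growth steps are assumptions for illustration, not values from the draft.

```python
import math


def false_positive_rate(m_bits, n_items, k_hashes):
    """Standard Bloom filter approximation: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


M_BITS = 16_000_000  # a 2 MB filter, assuming 1 MB = 10^6 bytes
K = 11               # assumed hash count, near-optimal for 16 bits per name

# Keep the filter size fixed and let the number of registered names grow.
for n in (10**6, 2 * 10**6, 4 * 10**6, 8 * 10**6):
    fpp = false_positive_rate(M_BITS, n, K)
    print(f"names = {n:>9,d}   FPP ~ {fpp:.6f}")
```

Under these assumptions the rate stays below 0.001 at the design point of 10^6 names but climbs quickly when the population doubles, so growth would have to be absorbed by resizing (and rebuilding) the filters rather than by continuing to insert.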

> 

> On the third level, the BFs from the ten 2nd level NRS are merged. The merged BF contains information of 2 000 000 000 objects. In theory, the 3rd level NRS can use this merged BF to check if a certain ICN object can be found in any of the 1st level NRSes in this tree. 
>  
> This should be corrected: the 3rd level NRS checks for the 2nd level, and the 2nd level checks for the 1st level.

Yes, true. What is assumed to be the highest level of NRSs? Are three levels enough?

> 
> - What is the acceptable rate for false positives?
> We are considering that FPP is less than 0.001 (i.e., 0.1%, 1000 out of 1 M entries).

Reasonable. However, this changes my original calculations, where I assumed slightly worse performance. 
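
As a rough check of what an FPP target of 0.001 implies for filter size, the sketch below applies the textbook sizing formula m/n = -ln(p) / (ln 2)^2, which assumes an optimal number of hash functions. The function names are illustrative only, and the two populations (10^6 names per server, 2 000 000 000 objects in a merged third-level filter) are the figures mentioned in this thread.

```python
import math


def bits_per_name(target_fpp):
    """Bits needed per stored name for a target FPP, assuming optimal k."""
    return -math.log(target_fpp) / (math.log(2) ** 2)


def fpp_with_optimal_k(bits_per_item):
    """FPP reached with optimal k = (m/n) * ln 2, i.e. 0.5 ** k."""
    return 0.5 ** (bits_per_item * math.log(2))


def filter_size_bytes(n_names, target_fpp):
    return n_names * bits_per_name(target_fpp) / 8


print(f"bits per name for FPP 0.001: {bits_per_name(0.001):.2f}")
print(f"FPP at 16 bits per name:     {fpp_with_optimal_k(16):.6f}")

for n in (10**6, 2 * 10**9):
    size_mb = filter_size_bytes(n, 0.001) / 10**6
    print(f"{n:>13,d} names -> about {size_mb:,.1f} MB")
```

With these formulas, 16 bits per name (2 MB for 10^6 names) lands close to the quoted 0.00046, and a target of 0.001 needs a little over 14 bits per name, so a merged filter covering 2 000 000 000 objects would be in the gigabyte range.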

> 
> - How big BFs are feasible to send and receive in the network? 
> Initially building the bloom filters at each server will take a long time. However, once they are established, each insertion involves only the new name, not the whole bloom filter.

> For bloom filter updates due to deletions, we are considering a refresh using shadow memory; the refresh frequency can range from several hours to several days.

Inserting items into a BF is not a problem. The problem comes when items must be removed and the whole BF must be recalculated, as you say. The update frequency obviously depends on many things.
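
The asymmetry between insertion and deletion can be sketched as follows. This is only one possible reading of a "refresh by shadow memory": the `RefreshingFilter` class, its method names, and the toy parameters are assumptions, not the mechanism specified in the draft. Insertions set bits in the active filter directly, while a deletion only takes effect when a shadow filter is rebuilt from the current set of names and swapped in.

```python
import hashlib

M_BITS, K = 4096, 4  # toy parameters, far smaller than the sizes discussed


def positions(name):
    # Hypothetical hash scheme: salt SHA-256 with the hash index.
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % M_BITS


def build_filter(names):
    bits = 0
    for name in names:
        for pos in positions(name):
            bits |= 1 << pos
    return bits


class RefreshingFilter:
    def __init__(self):
        self.names = set()  # authoritative list of registered names
        self.active = 0     # filter currently used for lookups

    def register(self, name):
        # Insertion is incremental: only the new name's bits are set.
        self.names.add(name)
        for pos in positions(name):
            self.active |= 1 << pos

    def unregister(self, name):
        # Bits cannot be cleared safely; they stay set until the next refresh.
        self.names.discard(name)

    def refresh(self):
        # Rebuild a shadow filter from scratch, then swap it in.
        shadow = build_filter(self.names)
        self.active = shadow

    def lookup(self, name):
        return all((self.active >> pos) & 1 for pos in positions(name))


nrs = RefreshingFilter()
nrs.register("name-a")
nrs.register("name-b")
nrs.unregister("name-a")
print(nrs.lookup("name-a"))  # True: stale positive until the next refresh
nrs.refresh()
print(nrs.lookup("name-b"))  # True: remaining names survive the rebuild
```

Between refreshes, removed names show up as additional false positives, so the refresh interval trades rebuild cost against lookup accuracy.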

> - The verification process should be fast, how can we efficiently match publications to filters with gigabit size?
> With a H/W-assisted implementation using GPU, the chunks containing the bits that the hash values map to can be loaded into GPU memory, and a child can be selected by bit-wise AND.

Ok, thanks! Although this probably requires some processing to find these chunks before matching can be done. But, as said, I don’t know much about the optimisations that you can do on lower layers. 

BR, Petri

> 
> [1] Bellovin S., “Using Bloom Filters for Authenticated Yes/No Answers in the DNS”, draft-bellovin-dnsext-bloomfilt-00.txt, December 2001, Expired
> [2] http://pages.cs.wisc.edu/~cao/papers/summary-cache/node8.html
> [3] Hong, J., Chun, W., Jung, H., “Bloom Filter-based Flat Name Resolution System for ICN”, draft-hong-icnrg-bloomfilterbased-name-resolution-01.txt

-- 
Petri Jokela
Senior researcher
NomadicLab, Ericsson Research
Oy L M Ericsson Ab                  

E-mail: petri.jokela@ericsson.com
Mobile: +358 44 299 2413

Attachment: smime.p7s
