[P2PSIP] Review of RELOAD base draft
Jouni Mäenpää <jouni.maenpaa@ericsson.com> Fri, 27 November 2009 10:49 UTC
Return-Path: <jouni.maenpaa@ericsson.com>
X-Original-To: p2psip@core3.amsl.com
Delivered-To: p2psip@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 8D2433A6A16 for <p2psip@core3.amsl.com>; Fri, 27 Nov 2009 02:49:56 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.648
X-Spam-Level:
X-Spam-Status: No, score=-2.648 tagged_above=-999 required=5 tests=[BAYES_05=-1.11, HELO_EQ_SE=0.35, MIME_8BIT_HEADER=0.3, RCVD_IN_DNSWL_MED=-4, SARE_SPEC_REPLICA_OBFU=1.812]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Gcd2yWrENeok for <p2psip@core3.amsl.com>; Fri, 27 Nov 2009 02:49:54 -0800 (PST)
Received: from mailgw4.ericsson.se (mailgw4.ericsson.se [193.180.251.62]) by core3.amsl.com (Postfix) with ESMTP id 870063A6A15 for <p2psip@ietf.org>; Fri, 27 Nov 2009 02:49:53 -0800 (PST)
X-AuditID: c1b4fb3e-b7b33ae0000045f9-44-4b0faeca10d9
Received: from esealmw128.eemea.ericsson.se (Unknown_Domain [153.88.253.125]) by mailgw4.ericsson.se (Symantec Mail Security) with SMTP id 49.75.17913.ACEAF0B4; Fri, 27 Nov 2009 11:49:46 +0100 (CET)
Received: from esealmw126.eemea.ericsson.se ([153.88.254.170]) by esealmw128.eemea.ericsson.se with Microsoft SMTPSVC(6.0.3790.3959); Fri, 27 Nov 2009 11:49:17 +0100
Received: from mail.lmf.ericsson.se ([131.160.11.50]) by esealmw126.eemea.ericsson.se with Microsoft SMTPSVC(6.0.3790.3959); Fri, 27 Nov 2009 11:49:16 +0100
Received: from nomadiclab.lmf.ericsson.se (nomadiclab.lmf.ericsson.se [131.160.33.3]) by mail.lmf.ericsson.se (Postfix) with ESMTP id A1AA624FF; Fri, 27 Nov 2009 12:49:16 +0200 (EET)
Received: from nomadiclab.lmf.ericsson.se (localhost [127.0.0.1]) by nomadiclab.lmf.ericsson.se (Postfix) with ESMTP id 67C8221A21; Fri, 27 Nov 2009 12:49:16 +0200 (EET)
Received: from [127.0.0.1] (localhost [127.0.0.1]) by nomadiclab.lmf.ericsson.se (Postfix) with ESMTP id B6B5B21A20; Fri, 27 Nov 2009 12:49:15 +0200 (EET)
Message-ID: <4B0FAEAC.40104@ericsson.com>
Date: Fri, 27 Nov 2009 12:49:16 +0200
From: Jouni Mäenpää <jouni.maenpaa@ericsson.com>
User-Agent: Thunderbird 2.0.0.17 (X11/20080925)
MIME-Version: 1.0
To: "David A. Bryan" <dbryan@ethernot.org>, Brian Rosen <br@brianrosen.net>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: ClamAV using ClamSMTP
X-OriginalArrivalTime: 27 Nov 2009 10:49:17.0026 (UTC) FILETIME=[442B2C20:01CA6F4F]
X-Brightmail-Tracker: AAAAAA==
Cc: "p2psip@ietf.org" <p2psip@ietf.org>
Subject: [P2PSIP] Review of RELOAD base draft
X-BeenThere: p2psip@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Peer-to-Peer SIP working group discussion list <p2psip.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/p2psip>, <mailto:p2psip-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/p2psip>
List-Post: <mailto:p2psip@ietf.org>
List-Help: <mailto:p2psip-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/p2psip>, <mailto:p2psip-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Nov 2009 10:49:56 -0000
Hi, I volunteered to review the RELOAD base draft in the P2PSIP session in IETF 76. My collegue Ari Keranen also volunteered after the IETF meeting. Since it's a long draft, we decided to do the review together. Our comments are listed below. Cheers, Jouni Maenpaa Ari Keranen Page 13, Section “1.2.1. Usage Layer” The text talks about the abstract Message Transport API. Could the functionality that the API provides be described somewhere in more detail? Page 17, Section “2. Terminology” (1) There are some terms marked with a (*) that are new and “perhaps should be added to the concepts document”. I suppose a decision needs to be made whether to have the terms permanently in the base draft or whether to move them to the concepts draft. (2) Should the terms Kind and Kind-ID be defined here? Page 22, Section “3.2.1. Client Routing” (1) “Note that clients that choose this option MUST process Update messages from the peer. Those updates can indicate that the peer no longer owns the Client's Node-ID.” Should the content of these Update requests be defined somewhere in the draft or are the regular Updates enough for doing this? (2) If I have understood correctly, a peer does not know whether a particular node in its connection table is a peer or a client. Therefore, it has to send stabilization messages (i.e., Updates) to all nodes in its connection table. The cost of this can be very high in overlays experiencing churn. I have more comments on this topic later (in the comments I have for section 9). If a peer could know which nodes in the connection table are clients and which are peers, Updates could be sent only to those nodes (i.e., clients) that need to be informed immediately (peers are informed through the periodic stabilization process so there is no need to send a reactive Update). (3) The description of the second routing option by which a client may be located in an overlay contains the following text: “The Destination List required to reach it [the client] must be learnable via other mechanisms, such as being stored in the overlay by a usage...” Since this alternative is given in the base draft, should the base draft also describe the usage for storing Destination Lists? Without such a usage, the second routing option is not really usable. (4) Should the overlay configuration document contain information about which client routing mode is used in the overlay? Page 22, Section “3.2.2. Minimum Functionality Requirements for Clients” “Implement RELOAD's connection-management connections” What are connection-management connections? Page 23, Section “3.3. Routing” “Client promotion: RELOAD must support clients that become peers at a later point as determined by the overlay algorithm and deployment.” Should client promotion be described somewhere in the base draft? Page 27, step (4) of the second operation required to join the overlay “AP does Updates to JP and to other peers...” Doing this is probably ok in a small and low-churn network (i.e., in cases when it is ok to use reactive stablization). However, in a larger, higher-churn overlay (in which periodic stabilization would be used instead of reactive stabilization), sending Updates to all peers in the AP's connection (or routing/neighbor) table might be too expensive. Such Updates may not be needed since the other peers will learn about the JP through the periodic stablization process. Page 32, Section “4.3. Application Connectivity” “Each usage specifies which types of connections can be initiated using AppAttach.” Should this requirement be listed also in Section “4.1.2. Usages”? Page 36, third paragraph (1) “When an originating node transmits a requests it MUST set a 3 second timer” Should the value of the timer be configurable? (2) Should there be a back-off mechanism for the retransmissions? (3) “If a response has not been received when the timer fires, the request is retransmitted...” How is this related to the procedures in Section “5.6.2. Reliability for Unreliable Links”? Is it intentional that both the originator of the request and the intermediate peers retransmit the messages? (4) “The request MAY be retransmitted up to 4 times...” Should the maximum number of retransmissions be configurable? When should one use less than four retransmissions? (5) The P2PP protocol had a pretty nice way of describing the transaction state machines, including the various timers and retransmissions (see http://tools.ietf.org/id/draft-baset-p2psip-p2pp-01.txt, figures 8-11). I would like to see similar diagrams in the RELOAD base draft as well. This would be very helpful when implementing RELOAD. P2PP also specified different state machines for reliable and unreliable transports. This could be done in the RELOAD base draft as well. Page 36, fourth paragraph “the maximum request lifetime (15s)” Should the maximum request lifetime be configurable? Page 38, first paragraph There is an open question: “We consider it an open question what the final protocol definition method and encodings use.” I suppose this open question can be removed. Page 38 (1) “Note: the use of "typedef" here is an extension to the TLS language, but its meaning should be relatively obvious” Perhaps it would be a good idea to explain the meaning anyway. (2) Is an opaque string of bytes always preceded by one or more length bytes like in case of ResourceID? Could you clarify this in the draft (it is currently not very clear)? Page 41, “version” field in the forwarding header Just wanted to verify that it is intentional to still use version 0.1. Page 42, Section “5.3.2.1. Processing Configuration Sequence Numbers” (1) Is the Config_Update request sent on a direct connection or across the overlay? If it is sent across the overlay via multiple hops, then it can cause additional load at intermediate peers. The Config_Update request may also be considerably larger than other requests sent in the overlay, because the configuration document can be rather large.. I would like to see some additional discussion on the traffic load the proposed mechanism can cause. If the configuration documents propagate very quickly throughout the system, then there might be a considerable short-term traffic peak. Perhaps there is a need for a mechanism that can limit the additional traffic caused by Config_Update requests. (2) As an optimization, intermediate peers could potentially inspect the configuration documents in the Config_Update messages they forward and update their own configuration settings if they see a newer document. However, I am not sure whether this would reduce the load enough. Page 46 Where is the format of the MessageCode defined? Page 48, definition of error_info What is the format of the text string? UTF-8? Page 52, Section “5.4.1. Topology Plugin Requirements” (1) “All overlay algorithms MUST specify maintenance procedures that send Updates to all members of the Connection Table whenever the range of Ids for which the peer is responsible changes” Doesn't this mean that all overlay algorithms are mandated to use reactive stablization? I would rephrase this to something like “All overlay algorithms MUST specify maintenance procedures.”, that is, not mandate reactive Updates. (2) Further, sending to all members of the Connection Table might be an overkill. Wouldn't it be enough to send only to the peers in the Neighbor set (provided that the overlay algorithm that is used maintains such a set)? Page 54, Section “5.4.2.3. Update” (1) “In general, peers MUST send Update messages to all their adjacencies whenever they detect a topology shift” Again, the draft is mandating the use of reactive stablization. So I guess this sentence should be removed. Alternatively, the MUST could be changed to MAY. (2) “The UpdateAns response is expected to be either success or an error” I think that in some cases, it might be beneficial to have some content in the UpdateAns response (e.g., routing table entries). Therefore, I think the draft should make it possible for UpdateAns responses to contain overlay-specific data (in the same way it allows this for UpdateReq requests). Page 59, AttachReqAns struct Is the “role” encoded as a string? Wouldn't an enum be sufficient? Page 60, definition of “rel_addr_port” “corresponds to the rel-addr and rel-port productions. Only present for type "relay".” Should this be present also for srflx and prflx types as in the struct on the previous page? Page 61, Section “5.5.1.3. Using ICE With RELOAD” “Because RELOAD always tries to use TCP and then UDP as a fallback [...]” Is this normative? Can't a peer prefer UDP? Page 70, Section “5.5.6.2. Response Definition” “If the Config_UpdateReq is of type “kind” it MUST only be processed if it is correctly digitally signed by an acceptable kind signer.” Could you define what an “acceptable kind signer” is here? Perhaps I missed something, but this was unclear to me. Page 73, Section “5.6.2.2. Retransmission and Flow Control” This section mentions multiple alternatives for the retransmission and congestion control scheme. Should all the peers in the overlay use the same retransmission and congestion control scheme? In that case, the scheme to use should probably be specified in the overlay configuration document. Page 82, text about replica numbers “If the replica number is zero, then the peer MUST check that it is responsible for the resource and, if not, reject the request. If the replica number is nonzero, then the peer MUST check that it expects to be a replica for the resource and that the request sender is consistent with being the responsible node (i.e., that the receiving peer does not know of a better node) and, if not, reject the request.” This text may assume too much about the replication strategy that is being used. As far as I know, in some replication strategies (e.g., multi publication key replication), a peer storing a replica may not be able to check whether it is supposed to store a replica (in the successor replication strategy that RELOAD uses, this can be done by checking the predecessor list). Therefore, I think this text should be made more generic so that it allows the use of other replication strategies. Page 96, Section “8. TURN Server Usage” Perhaps there could be some additional text in this section stating that the overlay MAY use an alternative, more advanced service discovery mechanism for locating TURN servers if such a mechanism is available. Pages 97-98, Section “9. Chord Algorithm” (1) One additional major difference to the original Chord algorithm that should be mentioned here is that chord-reload can optionally use reactive stabilization. The original Chord algorithm does not use reactive stablization but only periodic stabilization. One could also explain why reactive stabilization is allowed in chord-reload (i.e., that it has been shown to work well in small, low-churn overlays). One should also add some information about previous results, which have shown that reactive recovery can result in degraded performance or even congestion collapse in large-scale (1000 peers or more) overlays experiencing moderate or high levels of churn. Results proving that reactive recovery is not optimal in such networks are available in [1]. (2) It should also be mentioned in here that reactive stabilization MAY be disabled in chord-reload (the “if using reactive recovery” statements in the text actually allow this). In this case, only periodic stabilization is used. Use of periodic stabilization is more appropriate in large-scale (1000 peers or more) overlays experiencing churn. Page 98, Section “9.1. Overview” (1) “The neighbor table typically contains the three peers before this peer and the three peers after it in the DHT ring” The draft should state here that the successor list and the predecessor list can contain more entries if necessary. The Chord paper [2] states that O(log2(N)) successors should be maintained to increase robustness. Maintaining three successors is sufficient in very small overlays. However, in a Chord ring with for instance 1000 peers, on the order of 10 successors should be maintained to make the system more robust. Perhaps the number of predecessors and successors to maintain should be made configurable? (2) “There may not be three entreis...” => “There may not be three *entries*...” Page 99, Section “9.3. Redundancy” (1) “When a peer receives a Store request for Resource-ID k, and it is responsible for Resource-ID k, it stores the data and returns a success response. It then sends a Store request to its successor in the neighbor table and to that peer's successor.” I would expect that in large-scale overlays experiencing churn, it may be necessary to store more than 2 replica copies. Perhaps the number of replicas to store should be configurable (of course the size of the neighbor table limits the number of replicas that can be stored). (2) This section is missing details on when replicas can be removed and when the replicating peer needs to store further replicas. Those details are specified in the end of Section “9.6.3. Receiving Updates”. Could those details be moved to Section 9.3? Page 99, Section “9.4. Joining” (1) “3. JP SHOULD send Attach requests to initiate connections to each of the peers in the neighbor table as well as to the desired 16 finger table entries.” Should it be mentioned here that also finger tables having more than 16 entries are allowed? Perhaps the number of fingers to maintain should be made configurable. (2) “8. AP MUST send an Update to all its neighbors with the new values of its neighbor set (including JP).” Informing all the neighbors of the AP may be unnecessary if periodic recovery is being used. Could this sentence be modified as follows: “8. If using reactive recovery, the AP MUST send an Update...” If periodic recovery is used, then peridic stabilization will take care of informing the neighbors of the AP that a new peer has joined. (3) “In order to set up its neighbor table entry for peer i...” I guess this should be “In order to set up its finger table entry for peer i...” Page 102, Section “9.6.1. Handling Neighbor Failures” (1) This section and many other subsections contain statements like this: “If using reactive recovery...” I think it should be stated explicitly somewhere (in the beginning of Section 9, as I am suggesting above) that reactive recovery can be disabled, in which case only periodic recovery is used. (2) Should the overlay configuration document indicate whether reactive recovery is used or not in the overlay? Page 104, second paragraph “If peer N which is responsible for a Resource-ID R discovers that the replica set for R (the next two nodes in its successor set)...” Same comment as before: perhaps replica sets consisting of more than two nodes shoud be allowed. Page 104, Section “9.6.4.1. Updating Neighbor Table” (1) “A peer MUST periodically send an Update request to every peer in its Connection Table” Should “Connection Table” be rather “Neighbor Table”? Sending to every peer in the Connection Table means that an Update is sent also to the fingers (i.e., at the very minimum to 3+3+16=22 peers, to all the clients attached to the peer, and also to all of the peers that are in the connection table but not in the routing table; such as peers that are using this peer as a finger). This may cause too much traffic. Note that the original Chord algorithm sends only an Update to the first predecessor and to the first successor. I would rephrase this to something like: “A peer MUST periodically send an Update request to all or a subset of the peers in its neighbor table” (2) “The default time is about every ten minutes.” This is not sufficient in large-scale networks experiencing churn. However, it may be sufficient in small-scale, low-churn overlays in which reactive recovery is used. Therefore, I would modify this as follows: “If reactive recovery is used, the default time is about every ten minutes.” Page 104-105, Section “9.6.4.2. Refreshing finger table” (1) “A peer SHOULD NOT send Ping requests looking for new finger table entries more often than the configuration element “chord-reload-ping-interval”, which defaults to 3600 seconds (one per hour)”. This is not sufficient in large-scale networks experiencing churn. However, it may be sufficient in small-scale, low-churn overlays in which reactive recovery is used. Therefore, I would modify this as follows: “A peer SHOULD NOT send Ping requests looking for new finger table entries more often than the configuration element “chord-reload-ping-interval”. If reactive recovery is used, this defaults to 3600 seconds (one per hour).” (2) Paragraph 4 on page 105 describing Alternative 2: I would add a warning here stating that Alternative 2 should only be used in small-scale, low-churn overlays because it can cause much more traffic than Alternative 1. Alternative 1 is more appropriate for large-scale overlays experiencing churn. Page 112, last paragraph of Section 10.1. “When a node receives a new configuration file, it MUST change its configuration to meet the new requirements. This may require the node to exit the DHT and re-join.” Exiting and re-joining as a result of receiving a new configuration file is potentially a huge problem. If the configuration file spreads rapidly in the overlay, there is a short period of time during which all the peers will exit and re-join. The resulting churn will most probably be so high that the network will collapse. Page 138, Section “15.2. Informative References” [I-D.maenpaa-p2psip-self-tuning]: there is a newer version of the draft available: [I-D.maenpaa-p2psip-self-tuning] Maenpaa, J., Camarillo, G., and J. Hautakorpi, "A Self- tuning Distributed Hash Table (DHT) for REsource Location And Discovery (RELOAD)", draft-maenpaa-p2psip-self-tuning-01 (work in progress), October 2009. ==== Nits ==== Node-ID is sometimes spelled as “Node-Id” and sometimes as “Node-ID”. Further, sometimes "Peer-ID" or "peer ID" is used. Page 10, Section “1.2. Architecture” “RELOAD is fundamentally an overlay network” I guess it would be better to say “The RELOAD network is fundamentally an overlay network” to be consistent with the text in the previous subsection. Page 12, last paragraph “the roles of various layer” => “the roles of various *layers*” Page 13, the figure There seem to be some indentation issues in the figure. Page 25 “RELOAD does not support an intermediate peer returning a response that it will not recursively route a normal request” Is something missing from this sentence? Page 35, Section “5.2.1 Request Origination” “The simplest such destination list is a single entry containing the peer or Resource-ID.” => “The simplest such destination list is a single entry containing the *Node*-ID or Resource-ID.” Page 43, Section 5.3.2.2 enum DestinationType uses term “node”, whereas the definition of “type” on the following page uses “peer”. Page 44, definition of “type” Could you open the acronym PDU? Page 58, Section “5.5.1. Attach”, first paragraph “A node sends an Attach request when it wishes to establish a direct TCP or UDP connection...” Should this be “a direct TLS or DTLS connection...”? Page 66, row 6 “Check the certificate receive in...” => “Check the certificate *received* in...” Page 74, first paragraph “...if it is known that all nodes are are...” => “...if it is known that all nodes *are*...” Page 102, the description of the “fingers” field: “The finger table if the Updating peer...” => “The finger table *of* the updating peer...” Page 111, definition of “multicast-bootstrap” “This element represents the address of a multicast, boradcast...” => “This element represents the address of a multicast, *broadcast*...” Page 116, second last paragraph of Section 10.3. “...and have appropriate HTTP stats codes.” => “...and have appropriate HTTP *status* codes.” Page 124, Section “12.4. Certificate-based Security” (1) “The user is also assigned one or more Node-IDs by the central enrollment authority. Both the name and the peer ID are placed in the certificate,...” => “The user is also assigned one or more Node-IDs by the central enrollment authority. Both the name and the *Node-ID* are placed in the certificate,...” (2) “o As a overlay peer with the peer ID(s)...” => “o As a overlay peer with the *Node-ID(s)*...” Page 137, Section “14. Acknowledgments” “Hautakorp” => “Hautakorpi” Page 143, “Appendix B. AIMD Retransmission Scheme” “...and is based on the theAIMD...” => “...and is based on *the AIMD*...” [1] Rhea, S., Geels, D., Roscoe, T., and J. Kubiatowicz, "Handling churn in a DHT", In Proc. of the USENIX Annual Techincal Conference June 2004. [2] Stoica, I., Morris, R., Liben-Nowell, D., Karger, D., Kaashoek, M., Dabek, F., and H. Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications", IEEE/ACM Transactions on Networking Volume 11, Issue 1, 17-32, Feb 2003.
- [P2PSIP] Review of RELOAD base draft Jouni Mäenpää
- Re: [P2PSIP] Review of RELOAD base draft Ari Keranen
- Re: [P2PSIP] Review of RELOAD base draft Bruce Lowekamp
- Re: [P2PSIP] Review of RELOAD base draft Ari Keranen
- Re: [P2PSIP] Review of RELOAD base draft Jouni Mäenpää