Re: [P2PSIP] Revisions to draft-ietf-p2psip-base-24 made in response to 2nd IETF LC

"Bless, Roland (TM)" <roland.bless@kit.edu> Mon, 25 February 2013 16:16 UTC

Message-ID: <512B8E5A.7060400@kit.edu>
Date: Mon, 25 Feb 2013 17:16:26 +0100
From: "Bless, Roland (TM)" <roland.bless@kit.edu>
Organization: Institute of Telematics, Karlsruhe Institute of Technology
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2
MIME-Version: 1.0
To: Dean Willis <dean.willis@softarmor.com>
References: <CAOHm=4tKTzqsKmpXAO65_WveuZ=ayKByt6K1vjB4t910Zv_a+Q@mail.gmail.com>
In-Reply-To: <CAOHm=4tKTzqsKmpXAO65_WveuZ=ayKByt6K1vjB4t910Zv_a+Q@mail.gmail.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Cc: Cullen Jennings <fluffy@cisco.com>, "Goltsman, Polina" <polina.goltsman@student.kit.edu>, p2psip@ietf.org, Marc Petit-Huguenin <marc@petit-huguenin.org>, p2psip-chairs@tools.ietf.org
Subject: Re: [P2PSIP] Revisions to draft-ietf-p2psip-base-24 made in response to 2nd IETF LC
Precedence: list

Hi,

Thanks for reviewing our long list of comments (just to clarify:
there was no clear distinction between Polina's comments and comments
made by me, i.e. it just was a common list, some issues found by her
some by me - not sure how you made a distinction :-). However, we still
are not fully happy with the current version and want to propose some
clearer wording:

> 1) PG Major issue 1: sec. 6.3.2:
> length field in Forwarding Header covers what exactly? it is not
> really clear whether the length field counts the whole message or in
> case of fragmentation only the fragment size. When reading Sec 6.7 is
> seems that the former is meant, but the definition could and should be
> clearer. Similarly, sec. 6.7 should be clear about this, e.g.,
> describing that all Forwarding Headers are identical for fragments of
> the same message with exception of the fragment field.
>
> Revised:

Our point was, that according to the version 24 of the draft,
interpretations of ForwardingHeader::Length field as 1) total length
of the message or 2) length of a fragment were equally valid, and
that the decision on which interpretation to use, couldn't be left
to the implementation. We actually interpreted it as possibility 1)
while it is now 2)...
In versions 25/26, the exact interpretation of length as being the
length of fragment is now present but we think it could be more precise
nevertheless:

Sec 6.3.2.
==========
old:
        length: The count in bytes of the size of the message including
        the header, after the eventual fragmentation.
new:
        length: The count in bytes of the size of the message including
        the header. If the message is fragmented, this is the size of
        the current fragment (including the header), not the total
        message length before fragmentation.

Sec 6.7
=======
old:

If these elements cannot fit within the MTU of the underlying datagram
protocol, RELOAD fragmentation is not performed and IP-layer
fragmentation is allowed to occur. The length field MUST contain the
size of the message after fragmentation. When a message MUST be
fragmented, it SHOULD be split into equal-sized fragments that are no
larger than the PMTU of the next overlay link minus 32 bytes. This is to
allow the via list to grow before further fragmentation is required.

new:
If these elements cannot fit within the MTU of the underlying datagram
protocol, RELOAD fragmentation is not performed and IP-layer
fragmentation is allowed to occur. When a message MUST be fragmented, it
SHOULD be split into equal-sized fragments that are no larger than the
PMTU of the next overlay link minus 32 bytes. This is to allow the via
list to grow before further fragmentation is required. The length field
in the ForwardingHeader MUST contain the size of the fragment.

> 2) PG Major issue 2: sec. 7.4.1.1: replica number handling is unclear
> it is unclear how
> replica numbers are incremented (e.g., per peer) and what receiving peers
> should actually do with this number. Is it important to store the value or
> would a boolean be sufficient so that the peer knows that it's a
> replica?"
>
> Revised:
>
> Added text explaining that replica numbers are allocated and
> interpreted by the topology plugin.
>
> Explained how CHORD-RELOAD interprets the replica numbers.

Unfortunately, we don't consider this issue as being resolved.

1. Does the _Storage_ have to save replica numbers or are they
discarded, after processing the StoreReq?

The draft describes, how individual data elements are stored in
StoredDataValue structure, but it doesn't explicitly state,
what other values MUST be stored. It is obvious, that the generation
counter is stored, but we can't figure out, whether we
should store replica numbers. Moreover, a reminder to store
certificates for data values would be nice.

2. Why do you require incrementing the numbers, when it is only
required, that they are non-zero?
If the value has to be incremented, how should it be assigned when the
StoreReq is initiated by a topology shift, rather than the original
store. Please review the example from our previous e-mail:

> # Let's assume responsible peer X stores replicas at neighbors
> # A and C: does A get replica number 1 and C replica number 2?
> # if node B replaces neighbor C, does B get replica number 3 or number 2?
> # if the responsible peer X is replaced by a new node Y, Y will get
> # get the data from peer X, but how should X change its replica number?
> # Can Y simply start sending replicas beginning at 1?

3. (minor)

old:
     replica_number
          The number of this replica.  When a storing peer saves replicas to
          other peers each peer is assigned a replica number starting
from 1[*]
          and sent in the Store message.  This field is set to 0 when a node
          is storing its own data.[**]  This allows peers to distinguish
replica
          writes from original writes.  Different topologies may choose to
          allocate or interpret the replica number differently (see [***]
          Section 10.4).

new:
     replica_number
          The number of this replica. This field is set to 0 when a node
is storing
          its own data. When a storing peer saves replicas to other
peers, the replica
          number must be set to a non-zero value. This allows peers to
distinguish replica
          writes from original writes.  Different topologies may choose
to allocate or
          interpret the replica number differently (for CHORD-RELOAD see
Section 10.4).

# [*] 'starting from 1' still implies the sequential order (1 2 3 ...).
The Topology Plugin shouldn't be forced to use it.
# [**] sentences number 2 and 3 are reordered.
# [***] CHORD-RELOAD is only one option for topology plugin, the
reference should mention this.

Unfortunately, one further editorial issue was missing from our original
list:

Sec 7.

 There is one subtle point about signature computation on arrays.  If
   the storing node uses the append feature (where the
   index=0xffffffff), then the index in the StoredData that is returned
   will not match that used by the storing node, which would break the
   signature.  In order to avoid this issue, the index value in the
   array is set to zero before the signature is computed.  This implies
   that malicious storing nodes can reorder array entries without being
   detected.

This text is written in Sec. 7.4.2.2 [Fetch] Response Definition, but
it rather belongs to section 7.1. At the place where it is now, it can
be easily overlooked, and it is really important that this is
implemented (especially, since certificate storage is an array.)

Furthermore, we would be really grateful if you can
consider our other 3 major issues:

[major issue nr. 3].

> 3) sec. 6.5.2: transport protocol set for AppAttach. Applications may
> require use of other transport protocols than those defined in Overlay-
> LinkType (TLS/DTLS, what about plain UDP, SCTP, DCCP, etc.), but
> currently, this seems to be not possible (do the considerations
> of sec. 6.5.1.6 apply here?).

[major issue nr. 4].

> 4) sec. 6.3.4:
> How can a different signature algorithm be used if not all implementations
> support it? There is no possibility to provide the feedback, that the
signature
> algorithm is not acceptable at a particular node.

According to section 6.3.4, implementations MUST support RSA signatures
with SHA256 hashes. Does this imply that they must support only these
values, or may they choose to support other values from IANA TLS
Registry
(http://www.iana.org/assignments/tls-parameters/tls-parameters.xml#tls-parameters-15)?
If they may, the supported values are not possibly not homogeneously
supported across nodes in one overlay. Currently, there is no way for
one node to determine, if its valid signature was rejected by another
node, because the latter didn't support the used signature algorithm.

The possible solutions include:
1. unless specified by an extension, support only RSA signatures with
SHA256 hashes.
2. add a configuration parameter, on what signature and hash algorithms
must be supported and may be used. If one node can't support some of the
values, it may not join the overlay.
3. have a dedicated error message for this situation.

[major issue nr. 5].

5) sec. 10.5: handling of parallel JOIN requests and use of peer_ready
   it is not clear what should happen if two JNs try to join at the same
   time at the same AP (can they be processed in parallel or should they
   be processed in sequence).
   Furthermore, when MUST/SHOULD a JN send peer_ready Update - in step 9?

On 21.02.2013 06:03, Dean Willis wrote:
> Roland Bless
> --------------------
>
> 1) Under 10.5 Joining, add forward reference to 11.4
>
> Revised:
>
> JN MUST connect to its chosen bootstrap node as specified in Section 11.4.

OK.

> 2)  in 10.5 element 2, change "Acquire the routing table for" to
> "Acquire the routing table of".
>
> Done.

OK.

> 3) in 10.5 element 3:
> JP SHOULD send Attach requests to initiate connections to each of
> the peers in the neighbor table as well as to the desired finger
> table entries. Note that this does not populate their routing
> tables, but only their connection tables, so JP will not get
> messages that it is expected to route to other nodes."
> here it is unclear that the Attach requests are sent via the AP
> and what "desired" finger table entries means, e.g., in contrast to
> all?"
>
> Revised to:
>
>    3.  JN SHOULD send Attach requests to initiate connections to each of
>        the peers in the Neighbor Table as well as to the desired Finger
>        Table entries.  Note that this does not populate their Routing
>        Tables, but only their Connection Tables, so JN will not get
>        messages that it is expected to route to other nodes.

This fixes only upper/lower case, but doesn't answer my questions.

> 4) Sec. 10.7.2.:
> "If a finger table entry is found to have failed,"
> how is it determined to have "failed"? 10.7.1 is only
> related to neighbor failures...does the same definition
> apply here?"
>
> Copied definition from 10.7.1 to 10.7.2:;
>
> 10.7.2.  Handling Finger Table Entry Failure
>
>    If a Finger Table entry is found to have failed (as determined by
>    connectivity pings or the failure of some request), all references to
>    the failed peer MUST be removed from the Finger Table and replaced
>    with the closest preceding peer from the Finger Table or Neighbor
>    Table.

OK.

> 5)  Sec: 10.7.4.2.:
> "A peer SHOULD NOT send Ping requests looking for new finger table
> entries more often than the configuration element "chord-ping-
> interval", which defaults to 3600 seconds (one per hour)."
> This paragraph should probably moved some paragraphs down as it
> has been stated yet that a peer should actually send Ping requests
> at all (this is explained in the subsequent paragraphs)."
>
> Restructured as suggested.

OK.

> 6) Sec 12:
> throughout the figures: "Update" should be "UpdateReq"
> and Attach should be AttachReq?"
>
> Revised as suggested.

OK. Figure 6 may also indicate use of the flag peer_ready? What does
TP stand for?

> 7) Inconsistent capitalization of Neighbor Table and Connection Table
>
> Reviewed for consistency.

Ok.

> Polina Goltsman
> ------------------------
> 
> 1) PG Major issue 1: sec. 6.3.2:
> length field in Forwarding Header covers what exactly? it is not
> really clear whether the length field counts the whole message or in
> case of fragmentation only the fragment size. When reading Sec 6.7 is
> seems that the former is meant, but the definition could and should be
> clearer. Similarly, sec. 6.7 should be clear about this, e.g.,
> describing that all Forwarding Headers are identical for fragments of
> the same message with exception of the fragment field.
> 
> Revised:
> 
>    Any node along the path can fragment the message but only the final
>    destination reassembles the fragments.  When a node takes a packet
>    and fragments it, each fragment has a full copy of the Forwarding
>    Header but the data after the Forwarding Header is broken up in
>    appropriate sized chunks.  The size of the payload chunks needs to
>    take into account space to allow the via and destination lists to
>    grow.  Each fragment MUST contain a full copy of the via list,
>    destination list, and ForwardingOptions and MUST contain at least 256
>    bytes of the message body.  If these elements cannot fit within the
>    MTU of the underlying datagram protocol, RELOAD fragmentation is not
>    performed and IP-layer fragmentation is allowed to occur.  The length
>    field MUST contain the size of the message after fragmentation.  When
>    a message MUST be fragmented, it SHOULD be split into equal-sized
>    fragments that are no larger than the PMTU of the next overlay link
>    minus 32 bytes.  This is to allow the via list to grow before further
>    fragmentation is required.

See above.

> 2) PG Major issue 2: sec. 7.4.1.1: replica number handling is unclear
> it is unclear how
> replica numbers are incremented (e.g., per peer) and what receiving peers
> should actually do with this number. Is it important to store the value or
> would a boolean be sufficient so that the peer knows that it's a
> replica?"
> 
> Revised:
> 
> Added text explaining that replica numbers are allocated and
> interpreted by the topology plugin.
> 
> Explained how CHORD-RELOAD interprets the replica numbers.

See above.

> 3) PG Minor issue 1: Resource-ID used before introduction
> 
> Added explicative text in 1.1:
> 
>    The RELOAD network is not only a messaging network.  It is also a
>    storage network, albeit one designed for small-scale transient
>    storage rather than for bulk storage of large objects.  Records are
>    stored under numeric addresses, called Resource-IDs, which occupy the
>    same space as node identifiers.  Peers are responsible for storing
>    the data associated with some set of addresses as determined by their
>    Node-ID.  For instance, we might say that every peer is responsible
>    for storing any data value which has an address less than or equal to
>    its own Node-ID, but greater than the next lowest Node-ID.  Thus,
>    Node-20 would be responsible for storing values 11-20

Ok.

> 4) PG Minor Issue 2: Language in Forwarding and Link Management Layer
> definition clumsy.
> 
> Revised as suggested to:
> 
> Forwarding and Link Management Layer:  Stores and implements the
>       Routing Table by providing packet forwarding services between
>       nodes.  It also handles establishing new links between nodes,
>       including setting up connections for overlay links across NATs
>       using ICE.

Still, I don't think that "Routing Table" is correct here as this
belongs to the Topology Plugin. Must be "Connection Table".

> 5) PG Minor Issue 3: Change "link layer" to "overlay link layer" in
> definition of "overlay link layer".
> 
> Changed as suggested.

OK.

> 6) PG Minor Issue 4: Add "may" to last para of 1.2
> 
> Changed "nodes communicate with a central provisioning infrastructure"
> to "nodes may communicate with a central provisioning infrastructure"

OK.

> 7) PG Minor Issue 5: in 1.3 change TLS-PSK to TLS-PSK/TLS-SRP
> 
> Done.

OK.

> 8) PG Minor Issue 6: Change in Terminology definition
> 
> Changed "Terms in used this document are defined inline when used" to
> "Terms in this document are defined inline when used"

OK.

> 9) PG Minor Issue 7: Definition of Bootstrap node
> 
> Was:
> 
> Bootstrap Node:  A network node used by Joining Nodes to help locate
>       the Admitting Peer
> 
> Suggested change:
> 
> Bootstrap Node: A network node used by Joining Nodes to help
>   accessing the overlay by forwarding messages to peers
> 
> Retained original definition following review between editors and authors.

I still think that "locate" is not the best wording here.

> 10) PG Minor Issue 8: Clarify definition of Connection Table
> 
> Changed as suggested to:
> 
>    Connection Table:  Contains connection information for the set of
>       nodes to which a node is directly connected, which include nodes
>       that are not yet available for routing.

OK.

> 11) PG Minor Issue 9: Clarify "invalid" in definition of Node-OD
> 
> New text:
> 
>    Node-ID:  A value of fixed but configurable length that uniquely
>       identifies a node.  Node-IDs of all 0s and all 1s are reserved; a
>       value of zero is not used in the wire protocol but can be used to
>       indicate an invalid node in implementations and APIs; the Node-ID
>       of all 1s is used on the wire protocol as a wildcard.

OK.

> 12) PG Minor Issue 10: Change "plugin algorithm" to "topology plugin
> algorithm" in definition of Responsible Peer.
> 
> Changed as suggested.

OK.

> 13) PG Minor Issue 11: Change "An Usage" to "A Usage" in definition of Usage
> 
> Changed as suggested.

OK.

Regards,
 Roland and Polina

[P2PSIP] Revisions to draft-ietf-p2psip-base-24 m… Dean Willis
Re: [P2PSIP] Revisions to draft-ietf-p2psip-base-… Bless, Roland (TM)