[ppsp] AD review of draft-ietf-ppsp-peer-protocol-06

Martin Stiemerling <martin.stiemerling@neclab.eu> Tue, 30 April 2013 09:19 UTC

Return-Path: <Martin.Stiemerling@neclab.eu>
X-Original-To: ppsp@ietfa.amsl.com
Delivered-To: ppsp@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A49FB21F9B56 for <ppsp@ietfa.amsl.com>; Tue, 30 Apr 2013 02:19:03 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.699
X-Spam-Level:
X-Spam-Status: No, score=-102.699 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, GB_I_LETTER=-2, J_CHICKENPOX_32=0.6, MANGLED_LIST=2.3, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id N6U3A55ZtuG1 for <ppsp@ietfa.amsl.com>; Tue, 30 Apr 2013 02:18:58 -0700 (PDT)
Received: from mailer1.neclab.eu (mailer1.neclab.eu [195.37.70.40]) by ietfa.amsl.com (Postfix) with ESMTP id E2C6921F9AB5 for <ppsp@ietf.org>; Tue, 30 Apr 2013 02:18:57 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by mailer1.neclab.eu (Postfix) with ESMTP id D202A103F06 for <ppsp@ietf.org>; Tue, 30 Apr 2013 11:18:56 +0200 (CEST)
X-Virus-Scanned: Amavisd on Debian GNU/Linux (netlab.nec.de)
Received: from mailer1.neclab.eu ([127.0.0.1]) by localhost (atlas-a.office.hd [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rl89ZrdX7dcj for <ppsp@ietf.org>; Tue, 30 Apr 2013 11:18:56 +0200 (CEST)
Received: from METHONE.office.hd (methone.office.hd [192.168.24.54]) by mailer1.neclab.eu (Postfix) with ESMTP id B2149103F05 for <ppsp@ietf.org>; Tue, 30 Apr 2013 11:18:51 +0200 (CEST)
Received: from [10.1.1.190] (10.1.1.190) by skoll.office.hd (192.168.125.11) with Microsoft SMTP Server (TLS) id 14.1.323.3; Tue, 30 Apr 2013 11:18:51 +0200
Message-ID: <517F8C7B.20106@neclab.eu>
Date: Tue, 30 Apr 2013 11:18:51 +0200
From: Martin Stiemerling <martin.stiemerling@neclab.eu>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130404 Thunderbird/17.0.5
MIME-Version: 1.0
To: ppsp <ppsp@ietf.org>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.1.1.190]
Subject: [ppsp] AD review of draft-ietf-ppsp-peer-protocol-06
X-BeenThere: ppsp@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: discussing to draw up peer to peer streaming protocol <ppsp.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ppsp>, <mailto:ppsp-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/ppsp>
List-Post: <mailto:ppsp@ietf.org>
List-Help: <mailto:ppsp-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ppsp>, <mailto:ppsp-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 30 Apr 2013 09:19:03 -0000

Dear all,

Please find here the review of your Area Director for 
draft-ietf-ppsp-peer-protocol-06. This includes all sections except 12 
and 13.

General Summary:
The draft is in its current form not suitable as Standards Track document.
A Standards Track RFCs provide a stable protocol specification which can 
be implemented in an interoperable way, but the current PPSP peer 
protocol specification lacks information so that multiple independent 
implementations can interoperate.

However, the draft is usable and I can see that it is a protocol 
specification that is on its way to be implementable in an interoperable 
way, but this requires work cycles from the WG.

But, the WG has to put energy in the draft to fix the issues below. I 
will return the draft to the WG for further processing.

The review is separated in:
- High-level issues
- Technical issues
- Editoral issues


***High-level issues:
1) Merkle Hash Trees
I have found the document very confusing on whether Merkle Hash Trees 
(MHTs) and the for the MHT required bin numbering scheme are now 
optional or mandatory. Parts of the draft make the impression that 
either of them or both or optional (mainly in the beginning of the 
document), while Section 5 and later Sections are relying heavily on MHTs.

My naive reading of the current draft is that you could rely on 
start-end ranges for chunk addressing and MHTs for content protection. 
However, I do know that this combination is not working.

If MHTs are really optional, including the bin numbering, the document 
should really state this and make clear what the operations of the 
protocol are with the mandatory to implement (MTI) mechanisms. The MHT, 
bins, and all the protocol handling should go in an appendix.

There is a call to make for the WG:
I do know that MHTs were considered by some as burden and they have 
called for a leaner way, i.e., the start-end ranges.
The call for the leaner way has been implemented in the document but not 
fully.

2) LEDBAT as congestion control vs. PPSPP
The PPSP peer protocol is intended for the Standards Track and relies in 
a normative manner on LEDBAT (RFC 6817). LEDBAT as such is an 
**experimental** delay-based congestion control algorithm.
A Standards Track protocol cannot normatively rely on an Experimental 
congestion control mechanism (or RFC in general).
There are ways out of this situation:
i) Do not use ledbat: this would call for another congestion control 
mechanism to be described in the PPSPP draft.
ii) Work on an 'upgrade' of the LEDBAT specification to Standards Track: 
Possible, but a very long way.
iii) Agree on having PPSPP also as Experimental protocol.

I'm currently leaning towards option iii), but this is my pure personal 
opinion as an individual in the IETF.

3) No formal protocol message definition
Section 7 and more specific Section 8 describe the protocol syntax of 
the protocol options and the messages, though Section 8 is talking about 
UDP encapsulation.
Section 7 is hard to digest if someone should implement the options, see 
also later, but Section 8 is almost impossible to understand by somebody 
who has not been involved in the PPSP working group. See also further 
down for a more detailed review of the sections.
To give an example out of Section 8.4:
This section describes the HANDSHAKE message and gives examples how such 
a HANDSHAKE message could look like.
But no formal definition of the message is given leaving a number of 
thins unclear, such as what the local channel number and what's the 
remote channel number is. This is implicitly defined, but that is not a 
good way of writing Standards Track drafts.

4) Implicit use of default values
There are a number of places all over the draft where default values are 
defined. Many of those default values are used when there are no values 
explicitly signaled, e.g., the default chunk size of 1 Kbyte in Section 
8.4 or Section Section 7.5. with the default for the Content Integrity 
Protection Method.
I have the feeling that the protocol and the surroundings (e.g., what 
comes in via the 'tracker') are over-optimized, e.g., always providing 
the Content Integrity Protection Method as part of the Protocol options 
will not waste more than 2 bytes in a HANDSHAKE message.
Further, I do not see the need to define a default chunk size in the 
base protocol specification, as this default can look very different, 
depending on who is deploying the protocol and in what context. This 
calls for a more dynamic way of handling the system chunk size, either 
as part of an external mechanisms (e.g. via the tracker) or in the 
HANDSHAKE message.

5) Concept of channels
The concept of channels is good but it is introduced too late in the 
draft, namely in Section 8.3, and it is introduced with very few words. 
Why isn't this introduced as part of Section 2 or Section 3, also in the 
relationship to the used transport protocol?
I.e., the intention is to keep only one transport 'connection' between 
two distinct peers and to allow to run multiple swarm instances at the 
same time over the same transport.
And how do swarms and channels correlate?




***Technicals:
- Section 2.1, 2nd paragraph, about the tracker:
I haven't seen a single place where the interaction with a tracker is 
discussed or where the tracker less operation is discussed in contrast. 
It is further unclear what type of information is really required from a 
tracker.
A tracker (or a resource directory) would need to provide more then IP 
address & port, e.g., the used transport protocol for the protocol 
exchange (given that other transports are allowed), used chunk size, 
chunk addressing scheme, etc

- Section 2.3, the 1st paragraph, 'close-channel':
This has been the first time where I stumbled over the channel without 
knowing the concept.

- Section 3.1: ordering of messages
The 1st sentence implies that ordering of messages in a datagram matters 
a lot. This is outlined later in the document, but I would add this as 
part of 3., i.e., the messages are processed in the strict order or 
something along this line.

- Section 3.1, 1st paragraph, options to include
I would not say anything about 'SHOULD include options' here, as this is 
anyhow described in Section 8.

- Section 3.1, 2nd paragraph:
"Datagrams exchanged MAY also contain some minor payload, e.g.  HAVE 
messages to indicate the current progress of a peer or a REQUEST (see 
Section 3.7)."
to be added, just to make it clear IMHO: ", but MUST NOT include any 
DATA message".

- Section 3.2, 2nd paragraph:
"In particular, whenever a receiving peer has successfully checked the 
integrity of a chunk or interval of chunks it MUST send a HAVE   message 
to all peers it wants to interact with in the near future."
This looks like a place where a lot of traffic can be send out of a 
peer, i.e., whenever a chunk arrives a HAVE message must be sent.
I don't believe that this should be mandated by the protocol 
specification, but there should guidance on when to send this, e.g., 
peers might be also able to wait for a short period of time to gather 
more chunks to be reported in HAVE. Or should in this case a single UDP 
datagram contain multiple HAVEs?

- Section 3.4 on ACKs
This section looks pretty weak, as ACKs may be sent but on the other 
hand MUST be sent if ledbat is used. I would simply say:
- ACK MUST be sent if an unreliable transport protocol is used
- ACK MAY be sent if a reliable transport protocol is used
- keep clarification about ledbat.

- Section 3.5:
Give text where INTEGERITY is described at least for the MTI scheme.

- Section 3.7, 2nd paragraph
  - all 'MAY' are actually not right here. Please remove or replace them 
with lower letters if appropriate.
  - It is not clear what the 'sequentially' means exactly. Is it in the 
received order?

- Section 3.8:
Please replace 'MAY' by can, as those are not normative behaviors but 
more the fact that peers can, for instance, request urgent data.

- Section 3.9
Same comment as for the Section 3.8 just above this comment.

- Section 3.9 waiting for responses
OLD
" When peer B receives a CHOKE message from A it MUST NOT send new 
REQUEST messages and SHOULD NOT expect answers to any outstanding ones."
NEW
" When peer B receives a CHOKE message from A it MUST NOT send new 
REQUEST messages and it cannot expect answers to any outstanding ones, 
as the transfer of chunks is choked."

- Section 3.10.2
This whole section about PEX hole punching reads very, very 
experimental. The STUN method is ok, but PEX isn't.
First of all, the safe behavior for a peer when it receives unsolicited 
PEX messages, is to discard those messages. Second, this unsolicited PEX 
messages trigger some behavior which may open an attack vector.
The best way, but this needs more discussion, is to include to some 
token in the messages that are exchanged in order to make avoid any 
blind attacks here. However, this will need more and detailed 
discussions of the purpose of this.

- Section 3.11
I don't see the 'MUST send keep-alive' as a mandatory requirement, as 
peers might have good reasons not to send any keep alive. Why not saying 
'A peer can send a keep-alive' and it 'MUST use the simple datagram...' 
as already described. Though there is also no really need to say MUST.

- Section 4
The syntax definition for each of the chunk addressing schemes is 
missing. This is not suitable for any specification that aims at 
interoperable implementations.

- Section 4.3.2
PPSPP peers MUST use the ACK message if an unreliable transport protocol 
is used.

- Section 4.4
Has been tested in an implementation?
I would like to understand the need for such a section, as in my 
understanding a peer implementation should chose one scheme and support 
this and there shouldn't be the need to convert between the different 
schemes.

- Section 5
This reads that MHTs are mandatory to implement while the document makes 
the impression that MHTs are optional.

- Section 5.3
"  so each datagram SHOULD be processed separately and a loss of one
    datagram MUST NOT disrupt the flow"

The MUST NOT is not a protocol specification requirement, but more an 
informative part saying that a lost message shouldn't impact the 
protocol machinery, but it can impact the overall operation.
What is the flow here in that sentence?

- Section 5.6.2.
An illustrative example explaining how the automatic size detection 
works is required here.

- Section 6.1, 4th paragraph:
Where do I find the 1 byte algorithm field in the swarm ID? The swarm ID 
is not really defined in a single place.

- Section 7.3
The described min/max versioning relies on the fact that there are major 
and minor version numbers. I cannot find any major and minor version 
number scheme in the draft.

- Section 7.4, Length field
It is not clear what the 'Length' field is referring to.
Further, it is not clear of the swam IDs are concatenated in one swarm 
ID option, of each swarm ID must be placed in a separate swam ID option.

- Section 7.6
MHTs are mandatory to support though MHTs are optional?

- Section 7.7
'key size ... derived from the swarm ID'. This relates to my high level 
comment no 4. on the use of implicit information. Either it is clearly 
specified how this information is derived or there is a protocol 
field/information about the size.

- Section 7.8
I would recommend to say that the default MUST be supported, but the 
peer must always signal what method it is supporting or at least using.

- Section 7.10
I have not understood how the 'Lenght' field relates to the message 
bitmap and how long the message bitmap can grow. The figure looks like a 
maximum of 16 bits?

- Section 8
I do not see the value of the text in the preface of Section 8. I would 
say that this text should say what is mandatory and what's not, i.e., 
MUST use UDP and MUST use LEDBAT.
Potentially saying that future protocol versions can also run over other 
transport protocols.

- Section 8.1 about Maximum Transfer Unit (MTU)
The text is discussing that a Ethernet can carry 1500 bytes. This is 
true, but the Ethernet payload is not the normative MTU across all of 
the Internet. For IPv6 the min MTU is 1280 bytes and for IPv4 it is 576 
bytes, though for IPv4 it can be theoretically much lower at 64 bytes.
It would move the definition of the default chunk size to a 
recommendation with text saying that this size has a high likelihood to 
travel end-to-end in the Internet without any fragmentation. 
Fragmentation might increase the loss of complete chunks, as one lost 
fragment will cause the loss of a complete chunk.
One way of getting an informed decision on whether chunks can travel in 
their size is to use the Don't Fragment (DF) bit in IPv4 and also to 
watch for ICMP error messages. However, ICMP error messages are not a 
reliable indication, but they can be some indication.

- Section 8.1 Definition of the default chunk size
There is no need to define a default chunk size, if the chunk size would 
be always signaled per swarm. This is another default/implicit value 
places that is unnecessary.

- Section 8.3: see also my comment no 3.
The concept of channels is introduced very late and with few words. A 
figure to explain  the concept will help a lot and also more formal text 
on what a channel is and how they are identified. Also what the init 
channel is.

- Section 8 in general:
There is no formal definition of the messages, just bit pattern examples.

- Section 8.4 (as example for the other Sections in 8.x):
  i) What is the '(CHANNEL' paramter? Is it actually a parameter?
  ii) it is implicit that the first channel no (0000000) is the remote 
peer's channel and that the second channel no (00000011) is the local 
peer's channel, right?
This isn't clear from the text, but my guess.

- Section 8.5
Can HAVE messages multiple bin specs in one message or do I have to make 
a HAVE message for each bin?

- Section 8.6
What is the formal defintion of a DATA message? That's completely 
missing or I have not understood it.

- Section 8.7
looks just underspecified, especially as this is the link to LEDBAT.

- Section 8.11
How are the chunks specified here? The formal syntax definition or 
reference to one is missing.

- Section 8.13
I'm lost on this section, as I haven't fully understood the concept of 
the PEX in this document. Especially not why there is the PEX_REScert.

- Section 11
The RFC required for protocol extensions of a standards track protocol 
looks odd. This must be at least IETF Review or Standards Action.



***Editorials:
- Abstract (and probably also other places), 1st sentence of,
PPSPP is not a transport protocol, just a protocol

- Section 1.1, 4th paragraph:
I would remove the reference to rmcat, as it is not yet clear what the 
outcome of the rmcat wg will be

- Section 1.3, on page 8, about seeding/leeching:
I would break it in to sub-bullets.

- Section 2.1 and following:
These are examples, isn'it? If so, this should be mentioned or clarified.

- Section 2.1: What is the PPSP Url?

- Section 2.3,  the 1st paragraph, detection of dead peers:
It would be good to say where this detection is described in the 
remainder of the draft. Just for completeness.

- Section 2.2, the very last paragraph, 'Peer A MAY also':
This 'MAY' is not useful here. I would just write 'Peer A can also', as 
there is nothing normative described here.

- Section 3.2, last paragraph:
What is the latter confinement? This is not clear to me.

- Section 3.9, last sentence
I am not sure to what the reference to Section 3.7 is pointing in this 
respect.

- Section 3.10.1 about PEX messages
The text says 'PPSPP optionally features...'. I have not understood if 
this optionally refers to mandatory to implement but optionally to use, 
or if the PEX messages are optionally to implement.

- Section 3.12
I'm not sure what this section is telling exactly. Isn't just saying 
that PPSPP as such does not care how chunks are stored locally, as this 
is implementation dependent?--
martin.stiemerling@neclab.eu

- Section 4.2, page 15, 1st paragraph:
OLD
'A PPSPP peer MAY support'
NEW
'The support for this scheme is OPTIONAL'

- Section 6.1.1
This section is not describing sign-all, but rather a justification why 
it may still work. This doesn't help at all.

- Section 7, 1st paragraph
Why is there a reference to RFC 2132?

- Section 7 in general
i) It is common to give bit positions in the figures where the syntax of 
options is described. This allows to count how many bits are used for a 
protocol field more easily and also way more reliable.
ii) Please add also Figure labels to the syntax definitions of the 
options. This makes it easier to reference them later on if needed.

- Section 8.1
1 kibibyte is 1 kbyte?

- Section 8.2, last paragraph
i ) "All messages are idempotent" in what respect?
ii) "or recognizable as duplicates" but how are the recognized as 
duplicates?

- Section 8.5, last sentence in brackets:
What is this last sentence about?

- Section 8.13
" If sender of
    the PEX_REQ message does not have a private or link-local address,
    then the PEX_RES* messages MUST NOT contain such addresses
    [RFC1918][RFC4291]."
What is this text saying? Do not include what you do not have anyway?

- Section 8.14
There is no single place where all the constants are collected and also 
documented what the default values or the recommended values. For 
instance in this Section 8.14 where the dead peer time out is set to 3 
minutes and also the number of datagrams that should have sent. I would 
make a section or subsection to discuss dead peers and how they are 
detected and just link to the keep-alive mechanism in Section 8.14.

- Section 11
This section needs to be overhauled once the document is ready for the 
IESG. The section is not wrong but can be improved to help IANA.

Regards,

   Martin

NEC Laboratories Europe
NEC Europe Limited
Registered Office:
Athene, Odyssey Business Park, West End  Road, London, HA4 6QE, GB
Registered in England 2832014