Re: [P2PSIP] AD evaluation: draft-ietf-p2psip-diagnostics-17

Wanted to check if a rev of this document is forthcoming?

Thanks,
Alissa

> On Sep 4, 2015, at 3:51 PM, Alissa Cooper <alissa@cooperw.in> wrote:
> 
> Thanks for the reply. Comments are inline.
> 
>> On Aug 20, 2015, at 8:22 PM, Songhaibin (A) <haibin.song@huawei.com <mailto:haibin.song@huawei.com>> wrote:
>> 
>> Hi Alissa,
>> 
>> Thank you for the comments. Please find my explanations in line with [haibin].
>> 
>>> Thanks. I did have one further thought for Section 8. It strikes me that the combination of diagnostic elements that this spec makes available could easily be used to fingerprint a peer in an overlay where peers may otherwise be attempting to remain anonymous. Although defenses against such fingerprinting are probably hard to come by (short of disallowing access to the diagnostics via the config file), the threat is worth noting at least.
>> 
>> [haibin]Actually we imply your concern with the sentence "It should also be noted that attackers can use diagnostics to analyze overlay information to attack certain key peers." But we could elaborate more in this section. 
>> 
> 
> Yes, further elaboration would be appreciated.
> 
>> 
>> From: Alissa Cooper [mailto:alissa@cooperw.in <mailto:alissa@cooperw.in>] 
>> Sent: Wednesday, August 19, 2015 12:04 AM
>> To: David Bryan
>> Cc: p2psip; draft-ietf-p2psip-diagnostics@tools.ietf.org <mailto:draft-ietf-p2psip-diagnostics@tools.ietf.org>
>> Subject: Re: [P2PSIP] AD evaluation: draft-ietf-p2psip-diagnostics-17
>> 
>> On Aug 17, 2015, at 8:54 AM, David Bryan <dbryan@ethernot.org <mailto:dbryan@ethernot.org>> wrote:
>> 
>> 
>> Thanks for the comments. I'll circle back with the team and will revise.
>> 
>> Thanks. I did have one further thought for Section 8. It strikes me that the combination of diagnostic elements that this spec makes available could easily be used to fingerprint a peer in an overlay where peers may otherwise be attempting to remain anonymous. Although defenses against such fingerprinting are probably hard to come by (short of disallowing access to the diagnostics via the config file), the threat is worth noting at least.
>> 
>> 
>> 
>> I have to check, but I actually think the first few versions of this (when it was individual) *may* be pre-5378, but will double check.
>> 
>> Ah ok.
>> 
>> [haibin] It is true that the first version of the individual draft is pre-5378.
> 
> Ok, my mistake then.
> 
>> 
>> Alissa
>> 
>> 
>> 
>> On Fri, Aug 14, 2015 at 6:33 PM, Alissa Cooper <alissa@cooperw.in <mailto:alissa@cooperw.in>> wrote:
>> I have reviewed this document in preparation for IETF last call. Before proceeding to last call, I have some comments and questions that I'd like to discuss. I've also included a list of nits that should be resolved together with any last call comments.
>> 
>> 
>> == Substantive comments and questions ==
>> 
>> = General 
>> 
>> I think this document should be listed as updating RFC 6940 in the document header.
>> 
>> The document contains a disclaimer for pre-RFC5378 work, which I assume is in error. If using xml2rfc to generate the text, using the most up-to-date version of xml2rfc to create the next version should fix this.
>> 
>> = Section 4.3:
>> 
>> "There have been proposals that RouteQuery and a series of Fetch
>>   requests can be used to replace the PathTrack mechanism, but in the
>>   presence of churn such an operation would not, strictly speaking,
>>   provide identical results, as the path may change between RouteQuery
>>   and Fetch operations. (although obviously the path could change
>>   between steps of PathTrack as well)."
>> 
>> This text is confusing. It doesn't explain why PathTrack is being defined rather than peers using RouteQuery plus Fetch. Furthermore, the document does not explain how peers are supposed to interpret the results of the PathTrack method in the event that the route does change in between steps. This seems like a substantial omission.
>> 
>> [haibin] This is a good question. As the route change is a problem (with little probability), we would like to retrieve diagnostics info in an efficient way, so one simple message is preferred. And you are right, it does not change the truth that route can change  in between steps. We need re-wording the section. The implication of route change is that, the result of the diagnostics cannot be trusted with one hundred percent, which should be noted to the users that perform the diagnostics.
> 
> Ok, new text would be appreciated.
> 
>> 
>> In general, this document provides very little information about how peers are expected to interpret and use the diagnostic information. I understand that peers can do whatever they want with it, but in several places the fact that this is not explained makes the reasoning behind the protocol design decisions opaque. I would suggest providing a little more context, at the very least by moving the examples into the body of the draft somewhere. 
>> 
>> [haibin] We can add some more explanations with diagnostics info interpretation. 
> 
> Ok, good.
> 
>> 
>> = Section 4.4:
>> 
>> I'm a little confused about the values given for the error codes. Do you want IANA to chose these specific values? Or are they just there as placeholders? The customary way of doing placeholders would be as follows:
>> 
>> Code Value         Error Code Name
>>      [TBD1]               Underlay Destination Unreachable
>>      [TBD2]               Underlay Time exceeded
>>      [TBD3]               Message Expired
>>      [TBD4]               Upstream Misrouting
>>      [TBD5]               Loop detected
>>      [TBD6]               TTL hops exceeded
>> 
>> These placeholders would then get used throughout the document.
>> 
>> [haibin] they are place holders. 
> 
> Ok, please clarify it in the document in that case.
> 
>> 
>> This section also says:
>> 
>> "In addition, this document introduces several types of error information in the error_info field in the case of Code 0x65. ... Here are some examples for the error info. ... The error_info field values of the Code 0x66 to 0x70 are to be application specific and defined by the particular overlay."
>> 
>> 
>> I'm not sure what is meant by all of this text. As defined in RFC 6940, error_info is an optional implementation-specific byte string. If you are intending to require implementations to include specific text in the error_info for the first error code, then you need to explain under what conditions the specific text needs to be included for each string. And the text about the other codes seems redundant with how error_info is already defined in RFC 6940.
>> 
>> 
>> [haibin] if a peer receives a diagnostic request found that it cannot reach its next hop due to underlay problem, it needs to explain with error_info what happened. I do not know what you mean by redundant with RFC 6940. They are different errors.
> 
> Maybe it’s just a language thing, but the first sentence saying “this document introduces …” makes it sound like the list of strings is more than just a list of examples. If these are just examples that implementations can choose to use or not, I would suggest the following re-phrasing:
> 
> OLD
> In addition, this document introduces several types of error
>    information in the error_info field in the case of Code 0x65.  These
>    are represented as an opaque UTF-8 text string.  Here are some
>    examples for the error info.
> 
> 
> NEW
> Here are some examples of errors that might be expressed using the error_info field in the case of Code [TBD1]:
> 
> 
>> 
>> = Section 5.2:
>> 
>> 
>> "ext_length : the length of the returned DiagnosticInfo information
>>      in bytes.  If the value is greater than or equal to 1, then some
>>      extended diagnostic information is requested."
>> 
>> This is defining the diagnostic response - why would the extension be requesting something? And if it is possible to use an extension in that way, the expected behavior from the initiator when it receives such an extension needs to be described, or at least it needs to be noted that a diagnostic response could provoke a request back to the initiator.
>> 
>> [haibin] it is possible to reword it as "....then some extended diagnostic information is responded.”
>> 
> 
> How about “is being sent in the response”?
> 
>> 
>> Also, in the struct the diagnostic info is called "diagnostic_info_contents" but in the text it is called "diagnostic_information." These should be the same I think.
>> 
>> [haibin] Yes. It will be corrected.
> 
> Ok
> 
>> 
>> = Section 5.3:
>> 
>> I'm concerned about the way STATUS_INFO is defined. First, why are there 16 different levels of congestion status? What are they supposed to signify? If there is no standardized way of measuring congestion defined, how is a node supposed to interpret STATUS_INFO received from different peers? That is, couldn't one peer's level 7 actually mean it is less congested than another peer's level 3, and won't that make the information somewhat meaningless to the initiator? Maybe this would be clearer if you could explain what you expect an initiator to do with the information about 16 different levels of congestion.
>> 
>> [haibin] Your concern makes sense. Actually it is intended to leave it for overlay implementers or a future standard to define the congestion level algorithms . This draft just defines the format and reserves the bits for their use. As the section says " This document does not provide a specific method for congestion, leaving this decision to each node.  One possible option for a node would be to take its CPU/memory/bandwidth usage percentage in the past 600 seconds and normalize the highest value to the range [0x00, 0x0F].  A future draft may define an objective measure or specific algorithm for this." I think we should change the node decision to overlay decision. And an overlay implementer may choose to not use all that 16 values.
> 
> Ok, further clarification would help.
> 
>> 
>> SOFTWARE_VERSION doesn't really seem like diagnostic information. While all of the other fields may be constantly changing and therefore it would make sense to inquire about them in a diagnostic fashion, SOFTWARE_VERSION seems more like data that nodes could register or exchange when they join the overlay. Why is it included as a diagnostic?
>> 
>> [haibin] Diagnostics usually take place when there are failures or abnormal behaviors. Software version incompatibility might be a cause.
> 
> Ok, fair enough. Still seems like an odd choice to me.
> 
>> 
>> Also, doesn't the SOFTWARE_VERSION type need a length or a specified delimiter that indicates the end of the string? Otherwise, how is the recipient supposed to know when to stop parsing it?
>> 
>> [haibin] Will add it.
> 
> Ok
> 
>> 
>> The definitions of EWMA_BYTES_SENT and EWMA_BYTES_RCVD seem problematic.
>> 
>> sent = alpha x sent_present + (1 - alpha) x sent
>> rcvd = alpha x rcvd_present + (1 - alpha) x rcvd
>> 
>> As written these equations are not right because sent/rcvd appear on both sides. It would be clearer to use last_sent and last_rcvd or some such on the right-hand side of these equations. But this begs some bigger questions:
>> 
>> - Does this place a requirement on all nodes implementing this specification to have to calculate these values every 5 seconds?
>> - How are the values calculated the first time?
>> - How was the value of 5 seconds chosen?
>> 
>> [Haibin] This is incorporated from the early RELOAD document. Maybe some author of the RELOAD protocol can give a better explanation.
>> 
>> Finally, I have trouble understanding why only some of the types are subject to access control as described in Section 7. It seems to me that in some situations all of the fields defined here could be considered sensitive by certain nodes or on certain networks. What is the justification for only making some of them subject to access control?
>> 
>> [haibin] Actually it says different level of access control can be made. For example, diagnostic information A can be accessed by node 1 and 2, but diagnostic information B can only be accessed by node 2. We can add some explanations as it seems not clear to readers.
> 
> Ok, will await the revision.
> 
>> 
>> 
>> = Section 6.1:
>> 
>> "The destination field MUST be set to the desired destination, which MAY be either a NodeId or ResourceID but SHOULD NOT be the broadcast NodeID."
>> 
>> What is the expected behavior if a Ping or a PathTrack does get sent to the broadcast NodeID? Or should these requirements really be MUST NOTs?
>> 
>> [haibin] I guess they should be MUST NOTs. 
> 
> Ok
> 
>> 
>> = Section 6.4:
>> 
>> "However, for a single hop
>>   measurement, the traditional measurement methods MUST be used instead
>>   of the overlay layer diagnostics methods."
>> 
>> What is meant by the traditional measurement methods?
>> 
>> [haibin] It mainly means IP layer ping.
> 
> Would be good to specify.
> 
> Thanks,
> Alissa
> 
>> 
>> = Section 9:
>> 
>> Please only use either RFC-XXXX or RFC-AAAA but not both as placeholders.
>> 
>> [haibin] RFC-AAAA is RFC 6940. RFC-XXXX is the placeholder. They will be corrected.
>> 
>> Please also use [TBDX] for values you expect to be allocated by IANA.
>> 
>> [haibin] ok.
>> 
>> == Nits ==
>> 
>> 
>> = General
>> 
>> 
>> The document seems to repeatedly explain that it is doing things "as specified in RFC6940." I don't think this necessary in many cases. It's worth doing a pass through and removing the places where it's obvious that this spec is building on RFC 6940.
>> 
>> [haibin] ok.
>> 
>> = Section 4.1:
>> 
>> 
>> OLD
>> The extensions strictly follow RELOAD
>>   specification on the messages routing, transport, NAT traversal etc.
>> 
>> NEW
>> The extensions strictly follow how RELOAD specifies message routing, transport, NAT traversal, and other RELOAD protocol features.
>> 
>> = Section 4.2.1:
>> 
>> 
>> s/as specified in Table 4 and specified in the RELOAD/as specified in Table 4/
>> 
>> s/new requests of the the/new requests of the/
>> 
>> = Section 4.3:
>> 
>> OLD
>> We define a simple PathTrack method for retrieving diagnostic
>>   information iteratively.  The mechanism defined in this document
>>   follows the RELOAD specification, the new request and response
>>   message use the message format specified in RELOAD messages.  Please
>>   refer to the RELOAD [RFC6940] for details of the protocol.
>> 
>>   The operation of this request is shown below in Figure 2.
>> 
>> 
>> NEW
>> We define a simple PathTrack method for retrieving diagnostic
>>   information iteratively.  The operation of this request is shown below in Figure 2.
>> 
>> = Section 4.3.1.2:
>> 
>> s/after that/after receiving a PathTrackAns where the next_hop node ID equals the responding node ID/
>> 
>> = Section 4.3.2.1:
>> 
>> s/PathTrack Response/PathTrack response/
>> 
>> = Section 4.4:
>> 
>> s/code. and we define/code. We define/
>> 
>> = Section 5:
>> 
>> s/Path_track methods/PathTrack methods/
>> 
>> = Section 5.1:
>> 
>> s/Note that is NOT/Note that it is not/
>> 
>> = Section 5.2:
>> 
>> s/Note that is NOT/Note that it is not/
>> 
>> s/consists one or more/consists of one or more/
>> 
>> = Section 6.1:
>> 
>> s/`MAY/MAY/
>> 
>> = Section 6.2:
>> 
>> s/return the response the initiator node/return the response to the initiator node/
>> 
>> s/The peer SHOULD/The intermediate peer SHOULD/
>> 
>> [haibin] Thanks again.
>> 
>> BR,
>> -Haibin Song
>> 
>> 
>> _______________________________________________
>> P2PSIP mailing list
>> P2PSIP@ietf.org <mailto:P2PSIP@ietf.org>
>> https://www.ietf.org/mailman/listinfo/p2psip
>> 
>> 
> 
> _______________________________________________
> P2PSIP mailing list
> P2PSIP@ietf.org
> https://www.ietf.org/mailman/listinfo/p2psip