Re: [secdir] Secdir review of draft-ietf-ccamp-dpm-06

Weiqiang Sun <sun.weiqiang@gmail.com> Thu, 16 August 2012 05:39 UTC

User-Agent: Microsoft-MacOutlook/14.2.3.120616
Date: Thu, 16 Aug 2012 13:39:07 +0800
From: Weiqiang Sun <sun.weiqiang@gmail.com>
To: Klaas Wierenga <klaas@cisco.com>, The IESG <iesg@ietf.org>, secdir@ietf.org, draft-ietf-ccamp-dpm.all@tools.ietf.org
Message-ID: <CC529784.22760%sun.weiqiang@gmail.com>
Thread-Topic: Secdir review of draft-ietf-ccamp-dpm-06
In-Reply-To: <21E590DF-6035-4790-9171-9B46F0A849E7@cisco.com>
Subject: Re: [secdir] Secdir review of draft-ietf-ccamp-dpm-06
Precedence: list

Hello Klaas,

Thanks for the careful review. Please find our responses inline.

Best regards,
Weiqiang

On 8/15/12 7:56 PM, "Klaas Wierenga" <klaas@cisco.com> wrote:

>Generic:
>
>The central goal of the document appears to be to remedy the fact that
>applications think that the data path is available whereas it is not. It
>is to me completely unclear why:
>
>- the signalling process can not be changed to terminate only upon
>availability of a data path or even easier
According to the existing RSVP-TE specifications (RFC 2205, 3209, 3473...),
the signaling process is required to act upon certain data path status
(e.g., successful setup/removal of a cross connection in the data plane,
etc.). In an implementation that is fully conformant to the specifications
and in which everything works as expected, the data path should be in its
desired status (i.e., consistent with the control plane status) and no dpm
measurement is necessary. However, many things may cause inconsistency,
and this cannot be avoided by observing the control plane messages or by
changing the behavior of the control plane. For example, in one
implementation, an intermediate node may propagate signaling messages to
the next hop before the cross-connection configuration has been completed.
(Vendors may have very good reasons to do this when control plane
performance becomes an important metric to report.)

>
>- verifying the availability of the data path (send roundtrip packet, if
>successful ok)
>
>I can see that the metrics that you are defining may be of use in the
>control plane, but I fail to see how they may help the application, I
>find it hard to believe that any application would make use of these
>metrics beyond the binary value of "is data path available". And that is
>what you cite as the reason for this draft. So please expand the goal or
>argue why you need this set of metrics.

Often, applications use signaling messages as indications of the
availability of data paths. With a non-conformant implementation, the
measurement tells applications when it is safe to start sending data.

>
>Abstract:
>
>This was very hard for me to parse. For starters, "this delay" doesn't
>seem to refer to any delay previously mentioned. In fact it took me a
>while to understand that it is referring to the time lapsed between
>finishing signalling and data path becoming available, I suggest
>rewording to make that more clear.
Good point here. I suggest the following change:
Before:
	The existence of this delay and the possible failure of cross connection
programming, if not properly treated, will result in data loss or even
application failure.

After:
	The existence of the inconsistency between the signaling messages and
cross connection programming, and the possible failure of cross connection
programming, if not properly treated, will result in data loss or even
application failure.

>
>Section  3 (overview):
>
>please expand RRFD, RSRD, PRFD, PSFD, PSRD on first use

In Section 3, we already have:

   o  RRFD - the delay between RESV message received by ingress node and
      forward data path becomes ready for use.

   o  RSRD - the delay between RESV message sent by egress node and
      reverse data path becomes ready for use.

   o  PRFD - the delay between PATH message received by egress node and
      forward data path becomes ready for use.

   o  PSFD - the delay between PATH message sent by ingress and forward
      data path becomes ready for use.

   o  PSRD - the delay between PATH message sent by ingress and reverse
      data path becomes ready for use.


Are you looking for something else?
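
To make the five definitions concrete, here is a minimal sketch of how each
metric reduces to "data path ready time minus signaling event time". The
event names and the dictionary layout are my own assumptions for
illustration and are not defined in the draft. The sketch also assumes that
all timestamps in one record come from the same clock, in seconds.

```python
from typing import Dict, Optional

# Hypothetical event names (not from the draft): each metric pairs one
# signaling event with one data-path-ready event, following the Section 3 list.
METRICS = {
    "RRFD": ("resv_received_at_ingress", "forward_path_ready"),
    "RSRD": ("resv_sent_by_egress", "reverse_path_ready"),
    "PRFD": ("path_received_at_egress", "forward_path_ready"),
    "PSFD": ("path_sent_by_ingress", "forward_path_ready"),
    "PSRD": ("path_sent_by_ingress", "reverse_path_ready"),
}


def delay(metric: str, events: Dict[str, float]) -> Optional[float]:
    """Return ready time minus signaling time, or None if either is missing."""
    signal_event, ready_event = METRICS[metric]
    t_signal = events.get(signal_event)
    t_ready = events.get(ready_event)
    if t_signal is None or t_ready is None:
        # Treated as "undefined" in this sketch.
        return None
    return t_ready - t_signal
```

For example, with a RESV received at 10.0 s and the forward data path ready
at 10.25 s, delay("RRFD", ...) gives 0.25 s, while a metric whose events
were not recorded comes back as undefined (None).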

>
>Section 5.3 (RRFD metric parameters):
>
>What is the unit of T? UTC?
In fact we do not impose any requirement on the selection of T, as is also
the case in the IPPM documents (RFC 2679, 2681). To me, machine local
time is preferable, but using UTC may also be an option.

>
>Section 5.6 (RRFD discussion):
>
>It seems to me that the delays that are introduced by the time to
>propagate the signal, the delay introduced by the interfaces etc. are
>well in the various milliseconds range, doesn't that invalidate any
>measurements in that range? Or can you argue that those introduce a fixed
>delay so that variations are due to what you are trying to measure.
>The same holds for the other metrics
Exactly. On a set of programmed cross-connections (an LSP), we get a fixed
delay and hence it can be separated from the measured delay. The current
document has the following discussion. I hope it addresses your concern.
	o  The accuracy of RRFD is also dependent on the time needed to
	   propagate the error free signal from the ingress node to the
	   egress node.  A typical value of propagating the error free signal
	   from the ingress node to the egress node under the same
	   measurement setup MAY be reported.  The methodology to obtain such
	   values is outside the scope of this document.

	o  The accuracy of this metric is also dependent on the physical
	   layer serialization/de-serialization of the test signal for
	   certain data path technologies.  For instance, for an LSP between
	   a pair of low speed Ethernet interfaces, the time needed to
	   serialize/deserialize a large frame may not be negligible.  In
	   this case, it is RECOMMENDED that the ingress node uses small
	   frames.  The average length of the frame MAY be reported.
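
As a rough illustration of the two points above, the sketch below computes
the serialization time of a frame and subtracts the fixed components from a
measured delay. The function names are hypothetical, and subtracting the
fixed delay is only one possible way to "separate" it; the draft itself
only says that typical values MAY be reported.

```python
def serialization_delay_s(frame_bytes: int, link_bps: float) -> float:
    """Time in seconds to clock one frame onto a link of the given bit rate."""
    return frame_bytes * 8 / link_bps


def corrected_delay_s(measured_s: float, propagation_s: float,
                      frame_bytes: int, link_bps: float) -> float:
    """Measured delay minus the fixed propagation and serialization parts.

    Assumption: both fixed components were obtained under the same
    measurement setup, as the quoted text suggests.
    """
    return measured_s - propagation_s - serialization_delay_s(frame_bytes, link_bps)
```

On a 10 Mbit/s Ethernet interface, a 1500-byte frame takes 1.2 ms to
serialize while a 64-byte frame takes about 51 microseconds, which is why
small frames are recommended on low speed interfaces.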



>
>
>
>Section 11.2 (median of metric):
>
>It seems to me that the median is of little value if the majority of the
>values are undefined, is that why you have the failure count? If so,
>please say so.
The median definition still holds even when the majority of the values
are undefined (undefined values are not counted in). To address your
concern, I suggest the following text:
	When the number of defined values in the given sample is small, the
metric median may not be typical and SHOULD be used carefully.
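
A small sketch of this reading of the median, assuming undefined values are
represented as None (my own convention, not the draft's). Undefined values
are left out of the median and reported separately as a failure count, so a
reader can see how many defined values the median actually rests on.

```python
import statistics
from typing import Optional, Sequence, Tuple


def summarize(sample: Sequence[Optional[float]]) -> Tuple[Optional[float], int, int]:
    """Return (median of defined values, number defined, failure count)."""
    defined = [v for v in sample if v is not None]
    failures = len(sample) - len(defined)
    # With no defined values at all, the median itself is undefined.
    median = statistics.median(defined) if defined else None
    return median, len(defined), failures
```

For a sample of six measurements in which three are undefined, the median
is taken over the remaining three values only, and the failure count of 3
signals that it should be used carefully.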

>
>Section 12 (Security considerations):
>
>It is unclear to me what the result of an evil MitM manipulating values
>of the metrics could do. Can they for example introduce a denial of
>service by reporting high delays? Can they prioritise their own traffic
>by making competitors for bandwidth think the data path is not there yet?
Measurements defined in this document are expected to be carried out in
controlled network environments, e.g., in laboratories, so IMHO a MitM
attack is really out of scope. :)

>
>Hope this helps,
Thanks again! Let us know in case you have more concerns.

>
>Klaas
