Re: [Pce] Last call comments on draft-ietf-pce-monitoring-02.txt

JP Vasseur <jvasseur@cisco.com> Mon, 19 January 2009 09:20 UTC

Hi Adrian,

Many Thanks for the review. See below.

On Dec 19, 2008, at 12:01 AM, Adrian Farrel wrote:

> Hi,
>
> A few nitty comments on this draft.
>
> Adrian
>
> ===
>
> Abstract. Second sentence.
>
> "In PCE-based environments, it is thus critical..."
> I don't think this follows as a conclusion from the previous sentence. In
> particular, there is no definition of a "path computation chain" and no
> explanation of the process involved in PCE that leads to the conclusion that
> certain things need to be monitored or troubleshooted.
>
> I suggest inserting a new second sentence to give some background. Something
> like...
>
>  Path Computation Clients (PCCs) send computation requests to PCEs,
>  and these may forward the requests to and cooperate with other PCEs
>  forming a "path computation chain".
>
> You can then put in a paragraph break and continue with your text.
>

Added.

> ===
>
> Section 1
>
> You need to expand "PCC" and "TE LSP" on first usage as the  
> Terminology section comes later.
>

Done.

> ===
>
> Section 3
>
> Please include a reference to draft-farrel-rtg-common-bnf-07.txt.
> At this stage you should (and can safely) make this a normative  
> reference. The RBNF draft will go to IETF last call in early January  
> and will be ahead of this draft in the processing queue.
>

Added.

> ===
>
> Section 3
>
> s/Computation Monitoring request/Computation Monitoring Request/
>

Fixed.

> ===
>
> Section 3.1
>
> <request> is shown to optionally contain the RRO.
>
> The text that follows refers to the ERO.
>
> I think the text is wrong.
>

Fixed.

> ===
>
> Section 3.1
>
> s/metric(s)/metrics/
>

Fixed

> ===
>
> Section 3.1
>
> Examples 1 and 3 should also include a note that the requested  
> information is returned in a PCMonRep.
>

Indeed, the text has been expanded and clarified: "In all of the
examples above, a PCRep (in-band request) or PCMonRep (out-of-band
request) message is sent in response to the request that reports the
computed metrics."

> ===
>
> Section 3.2
>
> Are you sure that you want to send an Error message in response to a  
> bad PCMonRep?
>
> Sending an Error in response to a PCMonReq is useful because it  
> cancels the request and means the sender does not continue to wait  
> for a PCMonRep. But sending an Error in response to a PCMonRep does  
> not seem to have any use except to load the network. For example,  
> what is the state of the outstanding PCMonReq in this situation?  
> What is the sender of the PCMonRep supposed to do when it receives  
> the Error message?
>
> More appropriate would be for the receiver of the bad PCMonRep to  
> raise an alert to the operator and to put the sender of the bad  
> message on a blacklist so that it never sends it another PCMonReq.
>

Well yes, but ... error messages are useful not only for the PCC to
clean up state but also for troubleshooting on both sides, especially on
the PCC side, where the absence of a reply can be interpreted as a
lost request, an overloaded PCE, a malformed request, .... Furthermore, it
is in line with the general approach taken by PCEP when a message is
received that does not carry a mandatory object.

Fair enough ?

> ===
>
> Section 3.2
>
> It would be useful to include references for the objects listed in  
> the BNF as you do in Section 3.1.
>

Text added.

> ===
>
> Section 3.2
>
> I'm slightly confused by the BNF.
>
>  <metric-pce>::=[<PCE-ID>]
>                 [<PROC-TIME>]
>                 [<CONGESTION>]
>
> This means that <metric-pce> can be completely empty.
> Surely <PCE-ID> is mandatory within <metric-pce> ?

oops ! [] removed, thanks. It should read

<metric-pce>::=<PCE-ID>.

It would not help not knowing which PCE the metrics apply to ;-)
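
To illustrate the point, here is a minimal Python sketch of a receiver-side check for <metric-pce>. It assumes that only the brackets around <PCE-ID> are removed and that PROC-TIME and CONGESTION stay optional and ordered as in the quoted BNF. The function name, the MetricPce class, and the (name, value) tuple representation are all hypothetical; this is not a PCEP wire parser.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class MetricPce:
    pce_id: Any                       # mandatory: which PCE the metrics apply to
    proc_time: Optional[Any] = None   # optional, as in the quoted BNF
    congestion: Optional[Any] = None  # optional, as in the quoted BNF


def parse_metric_pce(objects: List[Tuple[str, Any]]) -> MetricPce:
    """Validate a list of (object-name, value) pairs against <metric-pce>."""
    names = [name for name, _ in objects]
    if not names or names[0] != "PCE-ID":
        raise ValueError("<metric-pce> must start with a PCE-ID object")
    rest = names[1:]
    # Each optional object may appear at most once, in BNF order.
    expected = [n for n in ("PROC-TIME", "CONGESTION") if n in rest]
    if rest != expected:
        raise ValueError("unexpected, repeated or misordered object")
    values = dict(objects)
    return MetricPce(values["PCE-ID"],
                     values.get("PROC-TIME"),
                     values.get("CONGESTION"))
```

With this check, a list that carries PROC-TIME without a leading PCE-ID is rejected instead of being silently accepted as metrics for an unknown PCE.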

>
>
> ===
>
> Section 4
>
> You state that the P and I flags in the new objects are cleared.
>
> I think you mean "SHOULD be set to zero and MUST be ignored"

Yes, fixed.

>
>
> ===
>
> Section 4
> Second paragraph.
>
> The case of "in band" monitoring seems to be described twice.
>

Fixed.

> ===
>
> Section 4.2
>
> The object has previously been referred to as the PCE-ID object.  
> That seems like a better name. Can you check the whole I-D to make  
> sure this is consistent.
>

Fixed: PCE-ID used everywhere.

> ===
>
> Section 4.3
>
>  If allowed by policy, the PCE includes a PROC-TIME object within a
>  PCMonRep or a PCRep message if the P bit of the MONITORING object
>
> This conflicts with Section 4.1
>
>  P (Processing Time) - 1 bit: the P bit of the MONITORING object
>  carried in a PCMonReq or a PCReq message is set to indicate that the
>  processing times is a metric of interest, in which case a PROC-TIME
>  object MUST be inserted in the corresponding PCMonRep or PCRep
>  message.
>
> I prefer the "if allowed by policy" which means you need to change  
> the text Section 4.1.
>

Right, new text:

P (Processing Time) - 1 bit: the P bit of the MONITORING object
carried in a PCMonReq or a PCReq message is set to indicate that the
processing time is a metric of interest. If allowed by policy, a
PROC-TIME object MUST be inserted in the corresponding PCMonRep or PCRep
message. The P bit MUST always be ignored in a PCMonRep or PCRep
message.
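
As a rough illustration of how the P bit rule could be read by an implementer, here is a small Python sketch. The function name and its boolean inputs are hypothetical; how policy is evaluated is left entirely to the implementation.

```python
REQUEST_MESSAGES = {"PCMonReq", "PCReq"}


def proc_time_required(received_msg_type: str, p_bit: bool,
                       policy_allows: bool) -> bool:
    """Return True if the reply to the received message must carry PROC-TIME."""
    if received_msg_type not in REQUEST_MESSAGES:
        # The P bit is ignored when it arrives in a PCMonRep or PCRep.
        return False
    # In a request, PROC-TIME is inserted only if asked for and allowed by policy.
    return p_bit and policy_allows
```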

> ===
>
> Section 4.3
> s/algorithm(s)/algorithms/  (twice)
>

Fixed.

> ===
>
> Section 4.3
>
>  Flags: 18 bits - No Flags are currently defined:
>
> Looks like 16 bits, and one flag is defined!
>

fixed, thanks.

> ===
>
> Section 4.3
> s/computation(s)/computations/
>

Fixed

> ===
>
> Section 4.3
>
>  Unassigned bits are considered as reserved and MUST be set to zero on
>  transmission.
>
> Move this text to immediately after the definition of the E bit
>

Done.

> ===
>
> Section 4.3
>
>  More granularity may be introduced in further revision of this
>  document to get a monitoring metric for a general request of a
>  particular class (e.g. all PCReq of priority X).
>
> I guess you decided not to do this. Delete the text.

Deleted.

>
>
> ===
>
> Section 4.4
>
> - Rather than one bit and a reserved field, shouldn't you have a
> flags field (with one bit assigned) and a reserved field.

Yes, the Flags field is now 8 bits (one allocated) + Reserved (8 bits) +
Congestion Duration field (16 bits).

>
> - You need to explain that setting the C bit and a congestion
>  duration of zero has meaning. If it has no meaning, you don't
>  need the C bit.

It is stated in the text: "When cleared this indicates that the PCE is  
not congested and the "Congestion Duration" field MUST be set to  
0x0000".
C=0 means "no congestion on the PCE".
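
For readers who want to see the layout concretely, here is a minimal Python sketch of the CONGESTION object body as described above: 8 bits of flags, 8 reserved bits, 16 bits of congestion duration, in that order. The position of the C bit inside the Flags field (least significant bit here) is an assumption made for the example, as are the function names. The PCEP common object header is not covered.

```python
import struct

C_FLAG = 0x01  # assumed position of the C bit within the 8-bit Flags field


def encode_congestion(congested: bool, duration: int) -> bytes:
    """Pack Flags (8 bits), Reserved (8 bits), Congestion Duration (16 bits)."""
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    if not congested and duration != 0:
        # When the C bit is cleared, the duration field must be 0x0000.
        raise ValueError("duration must be 0 when the C bit is cleared")
    flags = C_FLAG if congested else 0
    return struct.pack("!BBH", flags, 0, duration)


def decode_congestion(body: bytes):
    """Return (congested, duration) from a 4-byte object body."""
    flags, _reserved, duration = struct.unpack("!BBH", body)
    return bool(flags & C_FLAG), duration
```

For instance, encode_congestion(True, 300) yields the four bytes 01 00 01 2c, and decoding them gives back (True, 300).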

>
> - Although it verges on telling the implementer how to write code
> I think some advice is needed on what to do when a
> CONGESTION object is received with the C bit set. There
> are two aspects...
> - How long has the CONGESTION object taken to propagate
>    from the reporting PCE to the PCC (might not be one hop)?
> - What should the receiver do with the Congestion Duration
>    value?

You already know what I think about this ;-))) It is in my opinion
very much implementation dependent, don't you think?
First of all, specifying what the PCC should do in this document is a
bit awkward. Indeed, the monitoring could be for pure management
purposes (out of band) or in the context of a particular path
computation request (e.g. after a PCReq local timeout). Even
in the latter case, the PCC may decide to back off and retry or
select another PCE, depending on the urgency of the request,
which is usually context dependent (the PCC may decide to select
another PCE for an already retransmitted request if the
priority is > X, ....). Furthermore, the PCC's behavior may also be a
function of the reported congestion value, ...

I would argue that documenting this behavior is out of scope (not a
monitoring issue) and very much implementation/application specific.

>
> - It may also be worth noting whether a PCE in the middle of a
>  chain is allowed to look at the CONGESTION information
>  that it receives from a downstream PCE and plans to pass on
>  to an upstream PCE or PCC.

Interesting observation. Text added:

"It is worth noting that a PCE along a PCE chain involved in the
monitoring request may decide to learn from the congestion information
received from one of the downstream PCEs in the chain."

>
>
> ===
>
> Section 6
>
>  Reception of a PCMonReq message: upon receiving a PCMonReq message:
>
> Can you sort out the over-use of colons?

Fixed: "Upon receiving a PCMonReq message:"

>
>
> ===
>
> Section 6, case 1)
>
> I think you should refer explicitly to Section 6.9 of [PCEP] because  
> that includes not only the error message, but some important back- 
> off and shutdown procedures.
>

100% agree, very good catch, thanks.

OLD:

1) As specified in [I-D.ietf-pce-pcep], if the PCE does not support
    the PCMonReq message, the PCE peer must send a PCErr message with
    Error-value=2 (capability not supported).
NEW:
1) As specified in [I-D.ietf-pce-pcep], if the PCE does not support
    the PCMonReq message, the PCE peer MUST send a PCErr message with
    Error-value=2 (capability not supported). According to the procedure
    defined in section 6.9 of [I-D.ietf-pce-pcep], if a PCC/PCE receives
    unrecognized messages at a rate equal to or greater than a specified
    rate, the PCC/PCE must send a PCEP CLOSE message with close
    value="Reception of an unacceptable number of unknown PCEP message".
    In this case, the PCC/PCE must also close the TCP session and must
    not send any further PCEP messages on the PCEP session.

Note that we may want to also use this procedure to handle DoS  
attacks. More below.
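
To make the "rate equal to or greater than a specified rate" check concrete, here is a small Python sketch using a sliding time window. The class name, the max_unknown and window_s parameters, and the use of caller-supplied timestamps are all assumptions for illustration; the actual threshold and its units are whatever [I-D.ietf-pce-pcep] specifies.

```python
from collections import deque


class UnknownMessageMonitor:
    """Counts unrecognized messages seen within a sliding time window."""

    def __init__(self, max_unknown: int, window_s: float):
        self.max_unknown = max_unknown  # hypothetical threshold
        self.window_s = window_s        # hypothetical window length, seconds
        self._times = deque()

    def record_unknown(self, now: float) -> bool:
        """Record one unrecognized message.

        Returns True when the count inside the window reaches the
        threshold, i.e. when a CLOSE should be sent and the session
        torn down.
        """
        self._times.append(now)
        # Drop timestamps that have fallen out of the window.
        while self._times and now - self._times[0] >= self.window_s:
            self._times.popleft()
        return len(self._times) >= self.max_unknown
```

A caller would send the PCErr for each unsupported message as usual and, once record_unknown() returns True, send the CLOSE message and stop using the session.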

> ===
>
> Section 6, case 2)
>
> You should refer back to Section 5.

Yes.

>
> But note also that you have an explicit case (section 4.3) where  
> policy causes the message to be processed but an object to be left  
> out of the reply.
>

Not sure which case you're referring to?

> ===
>
> Section 6
>
> You need a similar set of text for the in band case.
>

Right, text added:

    Upon receiving a PCReq message that carries a MONITORING and
    potentially other monitoring objects (e.g.  PCE-ID object):

    1) As specified in [I-D.ietf-pce-pcep], if the PCE does not support
    (in band) monitoring, the PCE peer MUST send a PCErr message with
    Error-value=2 (capability not supported).  According to the procedure
    defined in section 6.9 of [I-D.ietf-pce-pcep], if a PCC/PCE receives
    unrecognized messages at a rate equal to or greater than a specified
    rate, the PCC/PCE must send a PCEP CLOSE message with close
    value="Reception of an unacceptable number of unknown PCEP message".
    In this case, the PCC/PCE must also close the TCP session and must
    not send any further PCEP messages on the PCEP session.

    2) If the PCE supports the monitoring request but the monitoring
    request is prohibited by policy, the PCE must follow the procedure
    specified in section 5.

    3) If the PCE supports the monitoring request and that request is not
    prohibited by policy, the receiving PCE MUST first determine whether
    it is the last PCE of the path computation chain.  If the PCE is not
    the last element of the path computation chain, the PCReq message
    (with the MONITORING object and potentially other monitoring objects
    such as the PCE-ID) is relayed to the next hop PCE: such next-hop may
    either be specified by means of a PCE-ID object present in the PCReq
    message or dynamically determined by means of a procedure outside of
    the scope of this document.  Conversely, if the PCE is the last PCE
    of the path computation chain, the PCE originates a PCRep message
    that contains the requested objects according to the set of requested
    PCE states metrics listed in the MONITORING and potentially other
    monitoring objects carried in the corresponding PCReq message.

+

Upon receiving a PCRep message that carries monitoring data, the
    message is processed, additional monitoring data is added according
    to this specification and the message is forwarded upstream to the
    requesting PCE or PCC.
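
The three cases for an in-band request can be summarized as a simple decision function. The Python sketch below is only an illustration: the Action names and the three boolean inputs are hypothetical, and how a PCE determines support, policy, or its position in the chain is outside what the sketch covers.

```python
from enum import Enum


class Action(Enum):
    SEND_PCERR = "send PCErr, Error-value=2 (capability not supported)"
    FOLLOW_SECTION_5 = "apply the policy procedure of section 5"
    RELAY_PCREQ = "relay the PCReq to the next-hop PCE"
    ORIGINATE_PCREP = "originate a PCRep with the requested metrics"


def handle_monitoring_pcreq(supports_monitoring: bool,
                            allowed_by_policy: bool,
                            is_last_pce: bool) -> Action:
    """Pick the action for a PCReq that carries a MONITORING object."""
    if not supports_monitoring:
        return Action.SEND_PCERR           # case 1
    if not allowed_by_policy:
        return Action.FOLLOW_SECTION_5     # case 2
    if not is_last_pce:
        return Action.RELAY_PCREQ          # case 3, intermediate PCE
    return Action.ORIGINATE_PCREP          # case 3, last PCE of the chain
```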

> ===
>
> Section 7.2
> s/MAY/may/
>

fixed

> ===
>
> Section 7.6
>
>  An implementation SHOULD allow handling PCReq messages with
>  a higher priority than PCMonReq messages.
>
> And when the monitoring is in band?
>

Yes, text added "An implementation SHOULD allow the configuration of a  
second limit for the PCReq message requesting monitoring data. "

> ===
>
> Section 8
>
> It would be helpful to IANA if you split this into subsections for  
> each registry/subregistry.
> Also nice if you could try to cut and paste from the registry naming  
> used in [PCEP].
>

Done.

> ===
>
> Section 8
>
> The IANA section doesn't seem to be complete.
>
> For example:
> - The flags field in the Monitoring Object
>  (be careful to get the bit numbering correct :-)
> - The flags field in the PROC-TIME object
> - The flags field in the CONGESTION object
>

Yes ! Text Added.

> ===
>
> Section 9
>
> Does the information gathered by monitoring represent any additional  
> vulnerability? Could an attacker gain interesting information by  
> snooping this?
>
> I think the Congestion object is a good and lightweight way to  
> attack a PCE deployment.

Yes indeed, but this is similar to regular PCReq messages, for which we
have suggested security mechanisms.

> You should note this and suggest that this makes the use of security  
> mechanisms important. You may also need to note that there is a  
> chain of trust model here such that even if one PCE ensures that it  
> uses security, the information it supplies can be changed on a hop  
> further upstream. Therefore, a consistent security model across all  
> cooperating PCEs is desirable.

Right, but again this is true for regular PCRep messages.

Text Added (let me know if you think that this is sufficient):

9.  Security Considerations

    Monitoring data can be used for various attacks such as
    denial of service attacks (for example by setting the C bit and
    congestion duration field of the CONGESTION object to stop PCCs from
    using a PCE).  Thus it is recommended to make use of the security
    mechanisms discussed in [I-D.ietf-pce-pcep] to secure a PCEP session
    (authenticity, integrity, privacy, DoS protection, etc) to secure the
    PCMonReq, PCMonRep messages and PCE state metric objects defined in
    this document.  An implementation SHOULD allow limiting the rate at
    which PCMonReq or PCReq messages carrying monitoring requests
    received from a specific peer are processed (input shaping), or from
    another domain (see also section 7.6).
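
One possible way to implement the per-peer rate limit is a token bucket. The Python sketch below is one option among many, not something the draft mandates: the class name, the rate_per_s and burst parameters, and the use of caller-supplied timestamps are all assumptions made for the example.

```python
class PeerRateLimiter:
    """Token bucket per peer for messages that carry monitoring requests."""

    def __init__(self, rate_per_s: float, burst: float):
        self.rate = rate_per_s  # tokens added per second (hypothetical knob)
        self.burst = burst      # bucket capacity (hypothetical knob)
        self._state = {}        # peer -> (tokens, time of last update)

    def allow(self, peer: str, now: float) -> bool:
        """Return True if a monitoring request from this peer may be processed."""
        tokens, last = self._state.get(peer, (self.burst, now))
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._state[peer] = (tokens - 1.0, now)
            return True
        self._state[peer] = (tokens, now)
        return False
```

The same structure could be keyed by domain instead of by peer to cover the inter-domain case.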

>
>
> ===
>
> Section 11.2
>
> [I-D.ietf-pce-disco-proto-isis] is not referenced (did you run I-D  
> nits?)

Reference deleted.

>
>
> [I-D.ietf-pce-of] and [I-D.ietf-pce-pcep-xro] are normative since  
> they define objects that you cite explicitly.
>
>


Diffs attached. New boilerplate template not yet included.

Many Thanks.

JP.

> _______________________________________________
> Pce mailing list
> Pce@ietf.org
> https://www.ietf.org/mailman/listinfo/pce
