[Pce] Preliminary feedback on I.-D. draft-pouyllau-pce-enhanced-errors-02.txt
Ramon Casellas <ramon.casellas@cttc.es> Thu, 22 July 2010 10:30 UTC
> We have submitted an updated version of draft http://www.ietf.org/id/draft-pouyllau-pce-enhanced-errors-02.txt
> The major update is - as mentioned during IETF 77 meeting - a new section describing some potential scenarios of usages of the error and notification types specified by the draft.

Dear Draft authors, all

As requested, please find below some preliminary feedback on the draft. I am not aware of previous comments, so apologies in advance if some of the questions I raise have already been addressed. I considered that some early feedback may be appropriate in view of Maastricht, but some of the points may not be valid due to my still limited comprehension of the draft. Some comments also reflect my subjective views :)

As a short executive summary
===========================

I would agree that extending PCEP to convey error / notification processing indications (fatal, not fatal, transient, and to be forwarded) and, secondarily, related information (including a list of potential targets) could be useful, but I see a few open issues in its current form, as detailed below. I also agree that the fact that different PCEs have different support for errors is a good use case for the applicability of the proposed extensions.

* First, the error type/value hierarchy found in RFC5440 and other normative documents is no longer maintained. In those documents, the error type indicates a "family" of errors and the value is a "refinement". Instead, in the proposed I.-D. a given error type indicates a "processing indication" rather than the actual error cause, which is delegated to the error value alone. Did I miss something? (see the point in Section 4)

* Second, I would think (as seems to be the case for other errors) that the notion of "unrecoverable", and whether an error should cause the shutdown of the connection, is an attribute of every particular concrete error pair, and belongs to its description.
For example, receiving a message with protocol version ">1" triggers an error and causes the PCEP connection to close. I would also think that closing the TCP/PCEP connection is a local decision, and I am not sure about the idea of "this is a serious error and _you_ should close the connection", rather than "this is a serious error and _I_'ll be closing the connection". (On a side note, I always thought that the implication that the endpoint receiving a CLOSE message should close the connection does not prevent the PCE from closing it itself.)

* Third, the DLO seems to be a means to indicate the set of destinations for an error/notification, but it raises questions on how to map data (or even control) plane resources to the PCEs responsible for them, and about the fact that, in a chain of requests, the history is lost. I still fail to see concrete cases for the DLO other than "all your other peers", which is still not clear to me and could lead to flooding and more complex procedures.

In short, in my particular humble view, giving "context" to errors, where "context" means "processing indications", "destination targets" or "more detailed, fine-grained information", could use, for example, TLVs rather than blocking several error types for this purpose (as an illustrative example, we often add an ASCII_TLV to PCEP_ERROR objects to add some descriptive error cause). Ideally, the actual error (x/y) should be descriptive enough to state whether a) the error needs to be forwarded as is or not, b) the error should cause the connection to be shut down, and c) the error/notification could, eventually, be interpreted as a warning without implying that the request has been canceled.

Detailed review
=========================

Section 2.2
----------------

* Section 2.2 is, in my opinion, a bit misleading and confusing when discussing the dead-timer (although it does not impact the rest of the I.-D).
The dead-timer is not related to the latency to get a response to a request (PCRep message), but to the ability to detect a dead connection, which can be kept alive by means of KeepAlives. It is possible to have path computation latencies higher than the dead-timer, with KeepAlives used accordingly. Thus, the text "If the PCC does not receive any reply before the dead timer is out" and the text “it supposes that the deadtimer is long enough to support end to end distributed path computation” are, imho, not right. The latency to get a Reply is implementation defined. These are two orthogonal aspects: the dead-timer is local to a given adjacency and can be very low, yet the end-to-end path computation time can be notably higher.

* I would suggest replacing PCEReq and PCERep with PCReq and PCRep (which seems to be the common notation).

* Rather than stating “there are two types of Notifications”, I would suggest “[RFC5440] defines two types of notifications”.

Section 2.3
----------------

* Also related to a bullet in Section 4: note that the RP (rplist) bound to the PCEP_ERROR object is also optional, as the I-D says is the case for notifications.

Section 4
------------

* “PCE errors are always request specific”. Why are you assuming this? The RP list is optional, and nothing prevents you from adding new error types and new error values for, for example, PCE status. Clearly, RFC5440 states “The PCErr message is sent…in response… or unsolicited manner” (for what it is worth, the parsing and treatment of PCNtf and PCErr messages is quite similar in our case).

* Minor detail: Error Types 16 / 17 are already assigned to P2MP, so it could be worth adding an IANA TBD.

* Typo: must backward -> must send backwards / forwards

* One of my main concerns about using error types 16-19 is the following: in all other cases the error type conveys some kind of meaning/semantics. Error “X” means that e.g. “P2MP” has failed. Error types are a “family of errors” and error values are a “refinement”.
In OOP terms, one could conceive a hierarchy with a base class, error types inheriting from the base class, and error type/value pairs (“concrete errors”) as a further refinement. In the proposed approach this is no longer the case. Personally, I would be in favour of extending PCEP to support “status quo” / “propagation” as proposed in the draft, but by means of TLVs. Since PCEP_ERROR objects support optional TLVs, errors can be tagged with TLVs specifying error processing rules (which, in turn, could be applied to existing error pairs). In other words: processing indications can be conveyed by other means, while leaving error type/value pairs to indicate concrete errors (with a two-level hierarchy).

* If an error / notification is a "warning" and a response is still to be expected, I wonder whether an alternative could be to extend the RBNF of PCRep to allow such a warning to be "embedded" in the response, rather than sent beforehand in a separate PCErr message. Likewise, I would think that whether sending an error or a notification "cancels" the request ("no response is to be expected") or, on the contrary, "a response is to be expected" is either a property of a concrete error or notification pair or, in a stricter manner, PCEP maintains the motto: "a request causes a response or an error or a notification". The response can be positive (1+ paths) or negative (no path). An error always implies no response (and, as far as I know, so does a notification bound to an RP), and warnings are embedded in the response (as when adding an object with the I flag, which means "warning, I was not able to take this into account").

Diffusion List object (DLO)
------------------------------

The purpose of a DLO seems to be to indicate a generic set of "destinations" for a given error / notification. I am still not sure of this (I may need more time to digest the use cases).
At first sight, I would say that this is a good example of the need to generalize the use of PCE-ID identifiers (using IPv4 / IPv6 addresses) that refer to PCEs regardless of their actual IP addresses (or the addresses used in the actual TCP connection), playing a role similar to router addresses / node IDs / router IDs / loopbacks, etc. A DLO could indeed be, in its simplest form, the list of PCEs to be notified (something similar to the MONITORING document).

The I.-D. proposes using ERO-like objects for the DLO, which raises two issues: a) it mixes data and control plane entities, and b) it requires mapping data plane resources (e.g. an unnumbered interface ID) to the one (or more) PCEs responsible for the domain where that resource is found.

Another of my main concerns is that I still fail to see the use cases where a given PCE could send an error including a DLO with a set of peers for which it has no visibility (typically, a given PCE involved in a PCE or domain chain does not have the information of the “history” of the end-to-end request; at most, one could keep the original endpoints, but that’s all).

Finally, I am not sure of the procedure regarding the DLO when a deployment opts for “non persistent” connections, in which the TCP/PCEP connection is closed after every request. Should the reception of a DLO including a peer for which there is no PCEP session in the UP state: a) trigger the establishment of the connection for the sole purpose of forwarding an error, or b) trigger an error to the peer sending the ERROR with the DLO? (The latter, imho, should not be the case, to avoid an error on an error.)

How do you plan to recognize whether the remote endpoint of a TCP/PCEP connection is a PCC or a PCE? (This is implied by the use of the TT field in the DLO.) PCEP does not easily provide a means to recognize it unless somehow pre-configured.
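
The non-persistent connection question above can be made concrete with a minimal sketch. Everything in it is a hypothetical illustration and is not defined by the draft or by RFC5440: the `Dlo` class models a DLO in its simplest form (a plain list of PCE identifiers), and `NoSessionPolicy` names two possible local choices for a listed peer with no session in the UP state. Replying with an error to the sender is deliberately not offered as a policy, in line with the "no error on error" remark.

```python
from dataclasses import dataclass
from enum import Enum


class SessionState(Enum):
    UP = "up"
    DOWN = "down"


class NoSessionPolicy(Enum):
    # Option (a) in the text: open a session only to forward the error.
    OPEN_SESSION = "open_session"
    # Alternative local choice: skip the peer without answering with an error.
    DROP_SILENTLY = "drop_silently"


@dataclass(frozen=True)
class Dlo:
    """Hypothetical DLO: a list of PCE identifiers to be notified."""
    pce_ids: tuple


def plan_forwarding(dlo, sessions, policy):
    """Split DLO targets into (forward_now, open_first, dropped).

    `sessions` maps a PCE identifier to its SessionState; identifiers
    that are missing from the map are treated as having no session.
    """
    forward_now, open_first, dropped = [], [], []
    for pce_id in dlo.pce_ids:
        if sessions.get(pce_id) is SessionState.UP:
            forward_now.append(pce_id)
        elif policy is NoSessionPolicy.OPEN_SESSION:
            open_first.append(pce_id)
        else:
            dropped.append(pce_id)
    return forward_now, open_first, dropped


if __name__ == "__main__":
    sessions = {"10.0.0.1": SessionState.UP, "10.0.0.2": SessionState.DOWN}
    dlo = Dlo(pce_ids=("10.0.0.1", "10.0.0.2", "10.0.0.3"))
    print(plan_forwarding(dlo, sessions, NoSessionPolicy.DROP_SILENTLY))
```

Whichever policy is chosen, the sketch shows that the receiving PCE needs a local rule for peers it has no session with, which the I.-D. would have to specify.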

Section 5
---------------

* Section 5.1: In the description of Error Type 16, it is said that Error type 16, not critical, implies that the PCEP session need not be shut down. I tend to think this is a property of every concrete error pair. Also, I don’t agree with the example: a Metric with a bound value of -1 is an encoding error; it is a “contract” a PCE may not be able to fulfill (especially if the processing bit is set to 1) and must trigger an error; it should not be allowed to “expect a response”. In general there are two different aspects: whether a TCP/PCEP connection must be shut down after a given error (which is somehow related to the “severity” of the error: a malformed message, a different PCEP version, etc.) and whether the error is a “warning” or not. I would agree with the extension of “warnings”, meaning that the response “may follow”, even though I don’t agree with the example. However, warnings could also be conveyed by (yet to be defined) notifications.

* Section 5.2: As mentioned before, the ability to “propagate upstream” an error is a good addition, but it may be a property of a given error (modified by the policies of the PCEs in the chain).

Typos
----------

* Section 2.1.1: maybe the word “throw” is too exception-specific, I would suggest “returns”
* If PCE… (unfinished sentence/typo)
* … and sends back / and send back (could send back)
* in details -> in detail

I hope this is somehow useful, and I am open to discussion.

Thanks for reading and best regards
Ramon.

--
Ramon Casellas, Ph.D.
Research Associate - Optical Networking Area -- http://wikiona.cttc.es
CTTC - Centre Tecnològic de Telecomunicacions de Catalunya, PMT Ed B4
Av. Carl Friedrich Gauss 7
08860 Castelldefels (Barcelona) - Spain
Tel.: +34 93 645 29 16 -- Fax. +34 93 645 29 01