Re: [Idr] I-D Action: draft-ietf-idr-error-handling-03.txt
"Chris Hall" <chris.hall@highwayman.com> Tue, 11 December 2012 11:37 UTC
From: Chris Hall <chris.hall@highwayman.com>
To: idr@ietf.org
References: <20121121191321.6164.6887.idtracker@ietfa.amsl.com> <50AD2986.90705@cisco.com> <058b01cdd3b4$9f5193b0$ddf4bb10$@highwayman.com> <8ED5B0B0F5B4854A912480C1521F973A0F4940@xmb-rcd-x13.cisco.com> <94913EE5-2864-4EE2-B474-9631430B1E22@ericsson.com> <068701cdd478$2cf01cf0$86d056d0$@highwayman.com> <CAEGVVtBy-zdLz8hVajLnuAqgzfgQHrseK4r-N9=pOZGtqV7LbA@mail.gmail.com>, <074d01cdd536$173f5830$45be0890$@highwayman.com> <9474D8DC-30FF-4C52-9504-15CBCC47E7D8@ericsson.com> <07df01cdd661$f28ef7c0$d7ace740$@highwayman.com> <36E98AE5-3EF8-4738-9982-42B9CA0BAAF5@rob.sh>, <005001cdd6da$099f1e90$1cdd5bb0$@highwayman.com> <828AAFF5-0260-4AA6-BBDC-6C1F69919837@ericsson.com> <009001cdd6ff$1c982530$55c86f90$@highwayman.com> <2F3EBB88EC3A454AAB08915FBF0B8C7E10DD99@eusaamb109.ericsson.se>
In-Reply-To: <2F3EBB88EC3A454AAB08915FBF0B8C7E10DD99@eusaamb109.ericsson.se>
Date: Tue, 11 Dec 2012 11:37:07 -0000
Organization: Highwayman
Message-ID: <013301cdd793$dcac5be0$960513a0$@highwayman.com>
Jakob Heitz wrote (on Mon 10-Dec-2012 at 18:09 +0000):
> On Monday, December 10, 2012 9:52 AM, Chris Hall wrote:
....
> Once you have a malformed update, NOTHING is certain.
> We limit the damage and call for human intervention.

OK. That's the "mission statement".

As we get into the detail I think I see where my disconnect is.

The draft gets very picky about NLRI and NLRI attributes. The moment
it sees any malformation in those, it throws up its hands and hits the
session-reset button. The draft allows repeat attributes of the same
type, except for NLRI attributes.

The draft uses a fair amount of pencil explaining why simply
discarding UPDATE messages is unsafe, which suggests that
"treat-as-withdraw" is the minimum requirement if session-reset is to
be avoided.

[BTW, why are repeat NLRI attributes deemed a cardinal sin
(session-reset, already) ? If they are valid, the semantics are
entirely clear... here are some more prefixes to be
updated/withdrawn.]

The introduction to the draft states:

  "The goal ... is to minimise the impact on routing ... while
   maintaining protocol correctness .... removing the routes carried
   in the malformed UPDATE from the routing system."

From all that, I have been working on the assumption that "limit the
damage" implies, at a minimum, not proceeding unless all NLRI have
been identified, and can therefore be "treated-as-withdraw" -- or, at
least, not proceeding unless reasonable steps (TBD) have been taken to
identify all NLRI most of the time (also TBD).

As you say, with a malformed update "NOTHING is certain", so this is
Tricky, and the receiver has to take some view on what are reasonable
steps.

I have suggested verifying the attribute "framing" as a minimum
requirement for that. It has also been suggested that, having found a
malformed attribute, the receiver should stop stepping through
attributes by attribute length, and scan octet-wise looking for NLRI
attribute(s).
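
For illustration only, the "framing" check mentioned above might look
something like the sketch below. It is not taken from the draft. It
assumes the RFC 4271 attribute header layout (Flags octet, Type Code
octet, then a one-octet length, or a two-octet length when the
Extended Length flag 0x10 is set), and it assumes the caller passes in
exactly 'Total Path Attribute Length' octets. The name check_framing
and the shape of its return value are hypothetical.

```python
from typing import List, Tuple

# Extended Length bit in the Attribute Flags octet (RFC 4271, 4.3).
EXTENDED_LENGTH = 0x10


def check_framing(attrs: bytes) -> Tuple[bool, List[Tuple[int, int, bytes]]]:
    """Step through the attributes, taking each length at face value.

    Returns (framing_ok, [(flags, type_code, value), ...]).
    framing_ok is False if an attribute header is incomplete or an
    attribute value overruns the end of the attributes field. The
    contents of each value are NOT examined here.
    """
    found: List[Tuple[int, int, bytes]] = []
    total = len(attrs)
    pos = 0
    while pos < total:
        # Smallest possible header: flags, type code, one length octet.
        if total - pos < 3:
            return False, found
        flags, type_code = attrs[pos], attrs[pos + 1]
        if flags & EXTENDED_LENGTH:
            # Two-octet length, so the header is four octets.
            if total - pos < 4:
                return False, found
            length = int.from_bytes(attrs[pos + 2:pos + 4], "big")
            header = 4
        else:
            length = attrs[pos + 2]
            header = 3
        end = pos + header + length
        if end > total:
            # The attribute claims more octets than are available.
            return False, found
        found.append((flags, type_code, attrs[pos + header:end]))
        pos = end
    # Every attribute ended exactly on a boundary within the field.
    return True, found
```

A True result says only that the lengths are self-consistent. It says
nothing about whether any individual attribute is well-formed, so it
is a necessary check rather than a sufficient one. On False, the list
holds whatever attributes were stepped over before the framing broke.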

However, if the sender is helpful, and places NLRI attributes ahead of
all others (as required by the draft) then the receiver can stop
worrying about trying to deal with attributes where "NOTHING is
certain", and can process the NLRI (effectively) separately --
provided the receiver has some means of knowing the sender is being
helpful.

However, it seems that identifying all the NLRI is not as important as
I had understood it to be, so the receiver need take no special steps
to extract the NLRI. So, if the sender is helpful, things will
generally work better, but otherwise it makes no difference to the
receiver. I don't agree, but that seems to be the approach.

> > I can quite believe that, in practice, few BGP implementations (if
> > any) send more than one of the above forms of NLRI in a single
> > UPDATE message.
> >
> > But that is not a requirement of the RFC or the draft -- so the
> > receiver is not (strictly speaking) entitled to assume it.

> It assumes the best it can to limit the damage.

What do you suggest that should be ? In a world where "NOTHING is
certain", does it matter if different implementations make different
assumptions ? Or should the specification require consistent
behaviour in the face of uncertainty, if nothing else, to avoid
increasing that uncertainty ?

> > What is the receiver supposed to do if it has not found any NLRI
> > at the point that it hits a malformed attribute ?

> Reset the session.

We have established that the error-handling is not required to find
all NLRI. That is, we are happy proceeding with a session where there
are some NLRI which we would prefer to have "treated-as-withdraw", but
could not. In effect, we are prepared to tolerate a measure of
"UPDATE-discard" (Appendix A of the draft, notwithstanding).

I suppose there is a difference between knowing that we have missed
some NLRI and not knowing whether we have found all NLRI.
However, given the pain associated with session-reset, is there a good
reason for accepting one degree of "UPDATE-discard" and not another ?

> > So, the receiver scans the attributes, and on the first malformed
> > one it stops. Yes ? Or, perhaps it ploughs on to the end
> > stepping past malformed attributes, and truncating the final
> > attribute if it overruns the 'Total Attributes Length'. Yes ?

> It doesn't stop.

So the parsing of attributes takes the Attribute Length of a malformed
attribute at face value. Since "NOTHING is certain", it doesn't have
much choice.

If the final attribute overruns the 'Total Attributes Length' (or
there is an incomplete attribute header at the end) one thing is
certain, the attributes are badly broken -- the sender has been unable
to complete the simple task of correctly "framing" the attributes. I
think the draft expects this case to be taken as a malformation of the
attribute. The draft delegates the definition of malformation to the
relevant documentation for each Type of attribute. If there is
intended to be a general or default way of dealing with this case,
then I think the draft needs to specify.

If the sum of the Attribute Lengths is exactly the 'Total Attributes
Length' (but some attribute is malformed) then "NOTHING is certain",
but it is possible/probable that all attributes have been identified.
[The degree of confidence may be increased if the Flags and Length of
known and well-known attributes are correct, and there are no repeated
attribute types, and perhaps other "semantic" information is taken
into account.]

This all matters rather more if one is trying to identify all NLRI
attributes, but not one jot or iota otherwise apart from the
diagnostics.

The draft has a binary approach to attributes, they are either
malformed or not malformed (well-formed). Is there room for a
"semantic error" ?
That is, an attribute which is well-formed as far as its Flags, Type,
Length and, perhaps, internal structure are concerned, but makes no
sense at all. The result may still be "treat-as-withdraw" (say) but
the error does not cast doubt on the attributes which follow. The
distinction would improve the diagnostics.

So, it appears that the draft is taking a pretty relaxed view of what
is acceptable -- since "NOTHING is certain" why sweat it ?

I am worried by the option to "attribute discard". If things are not
certain, is "treat-as-withdraw" not the safer option ?

An ATOMIC_AGGREGATE attribute, for example, is considered malformed if
it has any length other than 0. Since nobody gives a rodent's
posterior about this attribute, it seems to make perfect sense to
throw it on the floor if it is malformed. Except, except, the length
of an attribute also affects the attributes around it. An
ATOMIC_AGGREGATE attribute with a length of (say) 700 octets is such
obvious nonsense ! And that's to simply be discarded ? Surely either
the sender has departed the reservation in something of a hurry, or
some earlier (possibly undetected) attribute error has thrown the
parser off track ? The balance of probabilities has to be that this
is a symptom of some problem deeper than a meaningless value for a
meaningless attribute... surely ?

If the attribute length is indeed invalid, then the sum of all the
attribute lengths is probably going to be wrong, so this will decay
into "treat-as-withdraw". Nevertheless, I struggle to see the point
of applying "attribute discard" in any case of attribute malformation.

IMO "attribute discard" will be appropriate for some "semantic
errors", only.

....
> > Any NLRI that might be in the UPDATE, but are not visible
> > because of the malformed attributes, are simply ignored.
> > Yes ?

> yes.

> Again:
> Once you have a malformed update, NOTHING is certain.
> We limit the damage and call for human intervention.

Well, as above, this means it is OK to let (some, indeterminate) NLRI
fall into a state where they should have been withdrawn or are now out
of date.

So, why should session-reset *ever* be required ?
Damage-limitation-wise, avoiding session-reset is the big win... so
why not keep going no matter what, and let the human beans deal with
it... much better than falling into a cycle of
session-reset/restart/reset/... ?

>> ... Also, I kinda suspect that the human bean will be
>> greatly assisted if it is clear which NLRI have been affected
>> (and which definitely have not).

> The human bean will not rely on ANY routes or information
> from the broken session.

Well, even more reason to stop being picky at the protocol level, and
hence: "treat-as-withdraw" where you can, and "UPDATE-discard" the
rest ?

Chris