[trill] my comments on draft-ietf-trill-oam-framework

"Romascanu, Dan (Dan)" <dromasca@avaya.com> Thu, 04 April 2013 08:56 UTC



Here are my comments on draft-ietf-trill-oam-framework-01.txt

I have divided them into Technical and Editorial comments.


T1. The references are broken in many ways. Many references in the text are not included in the References sections - running idnits shows this immediately. Some listed references point to documents that do not exist: [TRILL-OAM], [TRILL-IP].

T2. I have mixed feelings about the usage of RFC 2119 capitalized keywords in this document. My preference would be to drop them completely, especially as the document is targeting Informational status. If they stay, we need to carefully review the usage; right now it's quite inconsistent.

T3. The document does not include a section on operational and manageability considerations. I believe that this is necessary, and this section should include as a minimum: 
- what parameters need to be configured on RBridges in order to activate TRILL OAM - activation of specific OAM functions, periodicity of OAM messages, activation of synthetic traffic (periodic or on demand)
- how are these parameters configured? Are there recommended default values? 
- how are results of Performance Management used or retrieved?
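To illustrate the kind of parameter set such a section could describe, here is a minimal Python sketch. Every name and value in it is hypothetical and does not come from the draft; in particular the period shown is a placeholder, not a recommended default.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SyntheticMode(Enum):
    """Hypothetical activation modes for synthetic traffic."""
    OFF = "off"
    PERIODIC = "periodic"
    ON_DEMAND = "on-demand"


@dataclass
class OamConfig:
    """Hypothetical per-RBridge OAM parameters (illustrative names only)."""
    oam_enabled: bool = False
    continuity_check_enabled: bool = False
    message_period_ms: int = 1000  # placeholder, not a recommended default
    synthetic_traffic: SyntheticMode = SyntheticMode.OFF

    def problems(self) -> List[str]:
        """Return human-readable descriptions of inconsistent settings."""
        issues = []
        if self.message_period_ms <= 0:
            issues.append("message_period_ms must be positive")
        if not self.oam_enabled and self.continuity_check_enabled:
            issues.append("continuity check requires oam_enabled")
        if not self.oam_enabled and self.synthetic_traffic is not SyntheticMode.OFF:
            issues.append("synthetic traffic requires oam_enabled")
        return issues
```

A section written along these lines would also need to say how each value is set and which combinations are invalid, which is what problems() stands in for here.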

T4. Several terms are used according to the terminology defined in RFC 6905 - Section, Flow, Connectivity, Continuity Indication, etc. I suggest mentioning this in Section 1.2.

T5. Are the references to L2VPN and MPLS-TP (and the respective [RFC6136] and [RFC6371]) in Section 1.2 needed at all? I do not see these technologies mentioned in any other part of the document, and if there is no relevant relationship with the OAM functions on these layers I would suggest to drop these. 

T6. In Section 2.4 we have an example of why using RFC 2119 capitalized keywords is problematic:

> When considering multi-domain scenarios, the following
   rules must be followed: TRILL OAM domains MUST NOT overlap, but MUST
   either be disjoint or nest to form a hierarchy (i.e. a higher
   Maintenance Domain MAY completely engulf a lower Domain).

If I were to choose, I would prefer not to use capitalization, or to capitalize only the first 'must', which is not capitalized in the text. 'MUST either ...' is a strange combination. One may argue that nesting is a form of overlapping, so 'MUST NOT overlap' should rather be 'MUST NOT partially intersect'.
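The 'partially intersect' reading can be made precise with a small sketch. It assumes, purely for illustration, that each Maintenance Domain is modeled as a set of RBridge names; the function name and the data model are mine, not the draft's.

```python
from itertools import combinations
from typing import AbstractSet, Dict, List, Tuple


def partial_intersections(
    domains: Dict[str, AbstractSet[str]]
) -> List[Tuple[str, str]]:
    """Return the pairs of domains that are neither disjoint nor nested."""
    offending = []
    for (name_a, members_a), (name_b, members_b) in combinations(domains.items(), 2):
        disjoint = members_a.isdisjoint(members_b)
        nested = members_a <= members_b or members_b <= members_a
        if not (disjoint or nested):
            offending.append((name_a, name_b))
    return offending


# Nested and disjoint domains are acceptable under this reading.
ok = {
    "outer": {"RB1", "RB2", "RB3"},
    "inner": {"RB1", "RB2"},
    "other": {"RB4"},
}
# Two domains sharing RB2 without one containing the other are not.
bad = {
    "A": {"RB1", "RB2"},
    "B": {"RB2", "RB3"},
}
```

With this definition a nested pair overlaps yet passes the check, which is why 'overlap' alone is ambiguous in the quoted text.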

T7. In Section 3.1:

> Additionally
   methods must be provided to prevent OAM packets from being
   transmitted out as native frames.

If capitalized RFC 2119 keywords are to be kept, this 'must' must be a 'MUST' (because of the critical impact on the bits on the wire).

T8. In Section 3.2:

>  When running OAM functions over Test Flows, the TRILL OAM should
   provide a mechanism for discovering the flow entropy parameters by
   querying the RBridges dynamically.

I am wondering why this is a 'should' and not a 'must'. Are there any exceptional cases?

T9. Section 3.4:

>  RBridges must be able to identify OAM messages that are destined to
   them, either individually or as a group, so as to properly process
   those messages. 

OK - so a method must be in place. 

>  It may be possible to use a combination of one of the unused fields
   or bits in the TRILL Header and the OAM EtherType to identify TRILL
   OAM messages. 

'one of the unused fields or bits in the TRILL Header' - which ones? Are we talking about an update to RFC 6325? Where is this defined?

Is the 'and' between the TRILL header combinations and the OAM EtherType really an 'and', or rather an 'or'? Should 'OAM EtherType' rather be 'TRILL OAM EtherType'?

>  [RFC6325] does not specify any method of identifying OAM messages.
   Hence, for backwards compatibility reasons, TRILL OAM solutions must
   provide methods to identify OAM messages through the use of well-
   known patterns in the Flow Entropy field; for e.g., by using a
   reserved MAC address as the inner MAC SA.

Is this an alternative to the methods described in the previous paragraph? If so, the 'must' should probably be a 'may' or 'should'. Which 'reserved MAC address' are we talking about? Is this something that we should request from the IEEE RAC? Where is this specified?
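The 'and' versus 'or' question matters because the two readings classify the same frame differently. The sketch below shows this with three hypothetical predicates. The header bit, the EtherType value (a local experimental EtherType used as a placeholder), and the MAC address are all assumptions of mine; the draft defines none of them.

```python
from dataclasses import dataclass

# Placeholders only: none of these values are assigned for TRILL OAM.
PLACEHOLDER_OAM_ETHERTYPE = 0x88B5
PLACEHOLDER_OAM_MAC_SA = "00:00:5e:00:53:01"


@dataclass(frozen=True)
class Frame:
    """Only the fields relevant to OAM identification (hypothetical model)."""
    header_oam_bit: bool   # stands in for 'one of the unused bits'
    inner_ethertype: int
    inner_mac_sa: str


def matches_and(frame: Frame) -> bool:
    """Reading 1: header bit AND OAM EtherType are both required."""
    return frame.header_oam_bit and frame.inner_ethertype == PLACEHOLDER_OAM_ETHERTYPE


def matches_or(frame: Frame) -> bool:
    """Reading 2: either the header bit OR the OAM EtherType suffices."""
    return frame.header_oam_bit or frame.inner_ethertype == PLACEHOLDER_OAM_ETHERTYPE


def matches_reserved_sa(frame: Frame) -> bool:
    """Backwards-compatible method: well-known inner MAC SA."""
    return frame.inner_mac_sa.lower() == PLACEHOLDER_OAM_MAC_SA


# A frame with the header bit set but an ordinary IPv4 payload.
ambiguous = Frame(header_oam_bit=True, inner_ethertype=0x0800,
                  inner_mac_sa="00:11:22:33:44:55")
```

The frame named ambiguous is an OAM message under the 'or' reading but not under the 'and' reading, so the text needs to state which one is intended and how it relates to the reserved-address method.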

T10. Section 4.1.1 - What does 'Period mis-configuration' mean? Is this about the periodicity of the OAM messages? If so, this defect is about OAM configuration, not about continuity check. If it's something different, please explain or provide a reference.

T11. Section 4.2:

>  On-demand fault management functions are initiated manually by the
   network operator and continue for a time bound period.

The last part of the statement is not always true, as some fault management functions may be one-time actions or tests.  

T12. Section 5 - Performance Management - I am not thrilled about this section at all. Note that RFC 6905 talks about Performance Monitoring (not Management), and it is not a mandatory function. Terminology aside, I dislike generating synthetic traffic in order to measure packet loss - how much traffic? Limiting this seems mandatory. Who compares what was sent with what was received? Section 5.2 talks about hardware-based time-stamping to measure Packet Delay. This implies some kind of clock synchronization between RBridges. What are the requirements?
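To make the synchronization concern concrete, here is a small sketch of the arithmetic involved. It is not taken from the draft: the function names are hypothetical, and the four-timestamp round-trip formula is the generic one, used here only to show where a clock offset does and does not cancel.

```python
def loss_ratio(sent: int, received: int) -> float:
    """Fraction of synthetic frames lost, given counters from both ends."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def one_way_delay(tx_sender_clock: int, rx_receiver_clock: int) -> int:
    """One-way delay; the result includes any offset between the two clocks."""
    return rx_receiver_clock - tx_sender_clock


def two_way_delay(t1: int, t2: int, t3: int, t4: int) -> int:
    """Round-trip delay excluding responder processing time.

    t1 and t4 are read on the sender's clock, t2 and t3 on the responder's,
    so a constant offset between the clocks cancels out.
    """
    return (t4 - t1) - (t3 - t2)


# Worked example in microseconds: true transit 2000 each way, responder
# clock running 5000 ahead, responder processing time 100.
t1 = 0
t2 = t1 + 2000 + 5000      # arrival, read on the responder's clock
t3 = t2 + 100              # reply sent, read on the responder's clock
t4 = (t1 + 2000 + 100) + 2000  # reply arrival, read on the sender's clock
```

In this example one_way_delay(t1, t2) yields 7000 although the true transit time is 2000, while two_way_delay(t1, t2, t3, t4) yields 4000 regardless of the offset. The loss computation has a similar gap: someone has to collect both counters in one place, and the document does not say who.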

T13. The Security Considerations section should be more explicit on what 'exploitation of the OAM message channel' means. Specifically, the threats related to overloading the network and the RBridges with OAM messages should be mentioned, as well as the impact of interrupting the OAM services.



E1. In Section 1: 

>  Some
   characteristics of a TRILL network that are different from Ethernet
   bridging ...

s/Ethernet bridging/IEEE 802.1 bridging/

E2. Check for completeness of the acronyms list in Section 1.1. For example, ECMP is used a few times in the document, but is not listed here and is never expanded.

E3. In several places (2.1.3 is the first) I found 'For e.g.,'. I believe that this is a pleonasm; use either 'For example' or just 'e.g.'.

E4. In section 2.2: s/TRILL OAM processing can be modeled as a layer/TRILL OAM processing can be represented as a layer/

E5. In Section 2.3: 

>  Network OAM mechanisms provide fault and performance management
   functions in the context of a representative 'test' VLAN or fine-
   grained label [TRILL-FGL].

What does 'representative' mean here?

E6. 'faults in connectivity or performance' - 'connectivity faults and performance degradation' is probably better.

E7. In Section 2.6: 

> MEPs are the active components of TRILL OAM: MEPs source TRILL OAM
   messages proactively or on-demand based on operator invocation.

The usage of the term 'proactively' seems to me to be incorrect. I would rather use 'periodically'. Also s/operator invocation/operator configuration actions/

E8. In Section 2.6: 

> - MUST support the UP MEP function on a TRILL virtual port (to 
     support OAM functions on Sections)

I would add for clarity inside the brackets '... as defined in [RFC6905]'.