Re: [forces] AD Review of draft-ietf-forces-ceha

Jamal Hadi Salim <hadi@mojatatu.com> Wed, 24 July 2013 12:58 UTC

From: Jamal Hadi Salim <hadi@mojatatu.com>
Date: Wed, 24 Jul 2013 08:57:34 -0400
Message-ID: <CAAFAkD-iZ7nPmt=LPuP_fepgR5iGh0gGu9ex1Q5M11L37bhrmg@mail.gmail.com>
To: Adrian Farrel <adrian@olddog.co.uk>
Cc: "forces@ietf.org" <forces@ietf.org>, draft-ietf-forces-ceha.all@tools.ietf.org
Subject: Re: [forces] AD Review of draft-ietf-forces-ceha

Hi Adrian,
Thanks for the review. Comments below.

On Wed, Jul 24, 2013 at 4:48 AM, Adrian Farrel <adrian@olddog.co.uk> wrote:

>
>
> Section 1
>
> Need to fix the 2119 boilerplate to include a reference to [RFC2119]
>
>

ok.


> ---
>
> Section 1
>
> OLD
>    The following definitions are taken from [RFC3654]and [RFC3746]:
> NEW
>    The following definitions are taken from [RFC3654] and [RFC3746].
>    They are repeated here for convenience, but the normative definitions
>    are found in the referenced RFCs.
> END
>
>
Ok.


> ---
>
> Section 1
>
>    o  ForCES Protocol -- The protocol used at the Fp reference point in
>       the ForCES Framework in [RFC3746].
>
> You need to explain what "Fp" is.
>
>
Figure 1 points to what Fp is, but if it didn't jump out at you right away
then we should explain it.

> ---
>
> Section 1
>
>    o  ForCES Protocol Layer (ForCES PL) -- A layer in the ForCES
>       architecture that embodies the ForCES protocol and the state
>       transfer mechanisms as defined in [RFC5810].
>
> How can this definition come from 3654 or 3746 when it has a reference to
> 5810?
>
>
3654 and 3746 talked about the protocol in generalities.
5810 specifies the actual protocol. It came later, and the reference is only
added there to clarify what we mean when we say PL.



> ---
>
> Figure 1
>
> Maybe more normal to use "CEn" rather than "CEN"
>
>

Ok.


> ---
>
> Section 2
>
>    The master CE controls the FEs via
>    the ForCES protocol operating in the Fp interface.
>
> s/in/on/
>
>
Ok.


> ---
>
> Section 3
>
>    To achieve CE High Availability (HA), FEs and CEs MUST inter-operate
>    per [RFC5810] definition which is repeated for contextual reasons in
>    Section 3.1.
>
> While you are repeating some of the material from 5810, you are also
> restating some of it in new words, and adding text.
>
> This gives a real problem with determining where the normative
> definition sits. We have to fix this!
>
> Can you determine which sections are informational for this document and
> which contain new text?
>
>
Ok, we'll review. Note, however, that part of the intent of this document
is to clarify what is prescribed in 5810.
This is stated in Section 2.1, to quote:

"
The problem scope addressed by this document falls into 2 areas:
   1.  To describe with more clarity (than [RFC5810]) how current cold-
       standby approach operates within the NE cluster.
   2.  To describe how to evolve the [RFC5810] cold-standby setup to a
       hot-standby redundancy setup to improve the failover time and NE
       availability.
"



> Possibly the whole of Section 3 is just informational. If this is the
> case you can replace the text quoted above with the following:
>
>    To achieve CE High Availability (HA), FEs and CEs MUST inter-operate
>    per [RFC5810].  The normative description of cold standby for CE HA
>    is provided in [RFC5810].  This section provides a more wordy
>    description of the procedures and is purely informational.  In the
>    event of any discrepancies between this text and that in RFC 5810,
>    the text in RFC 5810 takes precedence.
>
> ---
>
>

The goal is for someone who reads 5810 and does not get clarity on
cold-standby HA to read this document instead. I think the part where you
say "purely informational" may undermine that intent.


> Figure 2
>
> Will it be obvious to the reader of this document what is meant by
> "Asso Estb,Caps exchg"
>
>
Will fix.


> ---
>
> Section 3.1.1
>
> "CEID" is used without expansion. Although sometimes I find "CE ID" for
> example in 3.1.2.
>
> ---
>
>

Will fix for consistency.


> Section 3.1.1 has
>
>    The FE connects to the CE specified on FEPO CEID component.  If it
>    fails to connect to the defined CE, it moves it to the bottom of
>    table BackupCEs and sets its CEID component to be the first CE
>    retrieved from table BackupCEs.
>
> This is not a problem, but is unusual. In many redundancy cases, the
> primary object remains the favorite even when it has failed so that
> when there is a restoration opportunity (such as a failure of the new
> primary) it will resume its position as primary.
>
>
Depends. I have actually seen requests for the sticky prioritization scheme
you describe, but that desire subsides once we explain that we provide for
the master CE to change the mastership if the older CE shows up again, i.e.
whatever the FE does can be overridden by the CE.



>
> The question here is perhaps whether there is any distinction between
> CEs except their role as primary or backup.
>
>

There is an implicit prioritization: the ordering of the CEs in the table
implies their priorities.
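
To make the ordering concrete, here is a rough Python sketch of the rotation described in the quoted Section 3.1.1 text: a CE that cannot be reached goes to the bottom of BackupCEs and the first entry of the table becomes the new CEID. The names fail_over and can_connect are made up for illustration and appear in neither the draft nor 5810. The stop-after-one-full-pass behaviour is also an assumption of the sketch.

```python
from collections import deque


def fail_over(ceid, backup_ces, can_connect):
    """Pick a CE to associate with, rotating through BackupCEs.

    ceid        -- the CE currently named in the FEPO CEID component
    backup_ces  -- ordered list standing in for the BackupCEs table
    can_connect -- hypothetical callable: returns True if the CE is reachable

    Returns (ceid, backup_ces, connected).
    """
    backups = deque(backup_ces)
    # Assumption: give up after every known CE has been tried once.
    for _ in range(len(backups) + 1):
        if can_connect(ceid):
            return ceid, list(backups), True
        backups.append(ceid)        # failed CE moves to the bottom of the table
        ceid = backups.popleft()    # first entry of the table becomes the CEID
    return ceid, list(backups), False


if __name__ == "__main__":
    # CE1 is down, CE2 is up: CE2 takes over and CE1 drops to the bottom.
    print(fail_over("CE1", ["CE2", "CE3"], lambda ce: ce == "CE2"))
```

Note that nothing here makes the original CE "sticky": once it has moved to the bottom of the table it only regains priority if the CE side changes the mastership.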



> ---
>
> Figure 3 has "CEFTI" without explanation.
>
> ---
>
>

It comes from 5810 - will fix.


> Section 3.1.1 has
>
>    If the FE's FEPO CE Failover Policy is configured to mode 1, it
>    indicates that the FE will run in HA restart recovery.  In such a
>    case, the FE transitions to the Not Associated state and the CEFTI
>    timer [RFC5810] is started.  The FE MAY continue to forward packets
>    during this state.
>
> This use of "MAY" implies to me that it is at least as common that the
> FE does not forward packets in this state. Is that the intention?
>
> ---
>
>

Abuse of MAY on our part. We'll fix.
[There is another knob which says whether to forward packets or not.
If that knob is on we continue to forward packets, otherwise we don't.]
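
A small Python sketch of that behaviour, assuming only what is said above: with failover policy mode 1 the FE goes to Not Associated, the CEFTI timer starts, and a separate knob decides whether forwarding continues. The class and field names (FEState, forward_when_unassociated, and so on) are invented for the example and are not the component names used in the draft or in 5810.

```python
from dataclasses import dataclass


@dataclass
class FEState:
    failover_policy: int              # FEPO CE Failover Policy value
    forward_when_unassociated: bool   # hypothetical name for the forwarding knob
    associated: bool = True
    cefti_running: bool = False
    forwarding: bool = True


def on_ce_loss(fe: FEState) -> FEState:
    """Apply the mode 1 (HA restart recovery) reaction to losing the CE."""
    if fe.failover_policy != 1:
        # Other policy modes are outside the scope of this sketch.
        raise NotImplementedError("only mode 1 is sketched here")
    fe.associated = False                          # move to Not Associated
    fe.cefti_running = True                        # start the CEFTI timer
    fe.forwarding = fe.forward_when_unassociated   # the knob decides forwarding
    return fe


if __name__ == "__main__":
    print(on_ce_loss(FEState(failover_policy=1, forward_when_unassociated=True)))
```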



> Section 3.1.1
>
> "HB" is used without expansion.
>
>

Will fix.


> ---
>
> Section 4.2
>
> "CEHB" and "FEHB" are used without expansion
>
>
Will fix.


> ---
>
> Need to fix the line-length problems in Figure 4
>
> ---
>
>
We'll fix.


> Figure 5 implies that the association with the primary CE is the first
> association formed. Is this a requirement?
>
>

It is suggested in 5810, i.e. it is part of the implicit prioritization
from above.
The administrator's listing of the controllers in a specific order
essentially states which one needs to be connected to first.
But if it is unavailable, then the next one in the list is tried, and so on.



> ---
>
> Section 5 needs to give clear and unambiguous instructions to the IANA.
> It seems that the text in Section 5 is currently a placeholder for the
> correct text.
>
>

Will fix.


> The shepherd write-up notes the need for this section to be reviewed by
> "ForCES IANA experts". If these people are not to be found in the ForCES
> working group, then they exist nowhere. So the WG needs to look at this
> section and work out what it should really say.
>
>
There are allocated ForCES IANA experts ;-> I will let the shepherd speak
for himself - my gut feel is he meant that those people (some speaking in
the third person) should review the requested changes.


> ---
>
> Section 6 is too sparse.
>
> The Security Section in RFC 5810 is sound, so that is not the issue.
> However, in considering HA you are considering a more complex scenario
> where each CE must have its communications secured with the FE, and each
> CE must be authenticated to the FE. That needs discussion.
>
> Additionally, can the system be disrupted by simulating CE failure or by
> disrupting CE-FE communications?
>
>
I can't think off the top of my head what new security issue
could be introduced merely because we are running HA (but I have not been
sleeping well either, so I am likely missing something). We'll get back to you.



> ---
>
> Section 7.2
>
> I think 5812 is used in a normative way.
>
>
Will fix.

Thanks for the review Adrian.

cheers,
jamal