[bess] John Scudder's Discuss on draft-ietf-bess-evpn-fast-df-recovery-10: (with DISCUSS and COMMENT)

John Scudder via Datatracker <noreply@ietf.org> Tue, 20 August 2024 00:12 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: bess@ietf.org
Delivered-To: bess@ietfa.amsl.com
Received: from [10.244.2.52] (unknown [104.131.183.230]) by ietfa.amsl.com (Postfix) with ESMTP id 0B120C1CAF32; Mon, 19 Aug 2024 17:12:31 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: John Scudder via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
X-Test-IDTracker: no
X-IETF-IDTracker: 12.22.0
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <172411275169.1985200.11309811863338610562@dt-datatracker-6df4c9dcf5-t2x2k>
Date: Mon, 19 Aug 2024 17:12:31 -0700
Message-ID-Hash: 3LIXLXXUTBOFWFQETOCLJMRFRQSWLI6E
X-Message-ID-Hash: 3LIXLXXUTBOFWFQETOCLJMRFRQSWLI6E
X-MailFrom: noreply@ietf.org
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; header-match-bess.ietf.org-0; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header
CC: draft-ietf-bess-evpn-fast-df-recovery@ietf.org, bess-chairs@ietf.org, bess@ietf.org, matthew.bocci@nokia.com
X-Mailman-Version: 3.3.9rc4
Reply-To: John Scudder <jgs@juniper.net>
Subject: [bess] John Scudder's Discuss on draft-ietf-bess-evpn-fast-df-recovery-10: (with DISCUSS and COMMENT)
List-Id: BGP-Enabled ServiceS working group discussion list <bess.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/h0is4D4nAsqKrcxC7cghj9O3Vk0>
List-Archive: <https://mailarchive.ietf.org/arch/browse/bess>
List-Help: <mailto:bess-request@ietf.org?subject=help>
List-Owner: <mailto:bess-owner@ietf.org>
List-Post: <mailto:bess@ietf.org>
List-Subscribe: <mailto:bess-join@ietf.org>
List-Unsubscribe: <mailto:bess-leave@ietf.org>

John Scudder has entered the following ballot position for
draft-ietf-bess-evpn-fast-df-recovery-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ 
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bess-evpn-fast-df-recovery/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

# John Scudder, RTG AD, comments for draft-ietf-bess-evpn-fast-df-recovery-10
CC @jgscudder

Thanks for this document. I appreciate the fact it was straightforward to read
and review, and of course, I appreciate the work on a useful protocol extension!

## DISCUSS

I have two concerns I'd like to discuss.

### "Unreasonable" SCT offsets

Apart from a few words in the Security Considerations section, you don't
provide any instructions to the implementer about what to do with unreasonable
SCT offsets, in particular unreasonable SCT future offsets. The language from
Section 5 is,

   It is left to implementations to
   decide what consists an "unreasonably large" SCT value.

The quotation marks imply that you think the implementor should do something
about unreasonably large SCT values, but the spec doesn't supply any clarity
about it. That's too bad, it should. I accept the argument in the preceding
Security Considerations sentences (not quoted) that the ability to signal
unreasonable SCT values doesn't introduce a new vulnerability vis-à-vis the
pre-existing specifications. Nonetheless, it does seem like this specification
should advise implementers to impose an upper limit on what SCT future offset
they will accept. The quoted sentence is a backhanded hint in that direction. I
think you should make it explicit, probably in Section 2 since that seems to be
where elements of procedure mostly go (see also my later remarks about this in
the COMMENTS section).

One way to address this would be to insert something into Section 2.2, along
the lines of,

   - Initialize SCT_delay to 0.
   - If an SCT timestamp is present (etc etc), then,
     - If the SCT timestamp is further in the future than max_SCT_delay,
       SCT_delay = max_SCT_delay.
     - If the SCT timestamp is less than SCT_skew time units in the future,
       SCT_delay = 0. (This includes the case where it is in the past.)
     - Else SCT_delay = (SCT timestamp value) - (present time) - SCT_skew

and then write 9.1 in terms of "wait SCT_delay before proceeding to 9.2".

This is crudely written, I don't expect you to use the text above, although
you're welcome to if you want. I'm just trying to demonstrate one way to close
the gap.

In addition to inventing SCT_delay and max_SCT_delay, I've incorporated skew
into the above (and renamed it SCT_skew). An underspecification of skew is
another, lesser, concern I had (see COMMENTS) and this seemed the natural place
to work it in.

### Assumptions about synchronized clocks

Although the document assumes synchronized clocks, it doesn't provide any
details about its assumptions, e.g. how close synchronization needs to be and
what the consequences are if the requirement isn't met. Normally,
specifications assuming synchronized clocks will provide at least some basic
analysis.

I think probably if you fix the max_SCT_delay thing, the analysis would be easy
because it would in effect bound the maximum effect of clock desynchronization,
and I bet the analysis would amount to "if you don't have badly desynchronized
clocks, it acts like RFC7432." (But this is just a guess, you are the experts.)

If you choose to add a short analysis, I could see it being a bullet in Section
1.4.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

## COMMENTS

### Section 1.4, this sentence no verb

   *  The fast DF recovery solution maintains backwards-compatibility
      (see Section 4) by ensuring that PEs any unrecognized new BGP
      Extended Community.

The second clause should have a verb, but it doesn't. I'd propose a rewrite if
I knew what you were trying to say.

### Section 1.4, "normalizes"

   *  The fast DF recovery solution is agnostic of the actual time
      synchronization mechanism used, and normalizes to NTP for EVPN
      signalling only.

I don't understand what you mean by this, in particular by "normalizes to NTP
for EVPN signalling only". Can you rephrase it? (Also, BTW, "signaling" has
only one ell.)

### Sections 2 and 3 aren't well-divided

Section 2 is called "DF Election Synchronization Solution". Based on that, I
would assume that the entire solution would be specified in this section and
its subsections. Section 3 is called "Synchronization Scenarios". Based on
that, I would assume it contains examples, intended to help the reader
understand the operation of the solution.

Unfortunately, that's not entirely the case. Section 3 contains instances of
normative, or should-be-normative, text. I think you should restructure these
two sections to put the entire solution in Section 2 and keep Section 3 as
examples. By the way, I did find the examples very useful, thank you for
including them. I've flagged suggestions for restructuring in other more
specific comments.

For that matter, there's also normative text in Section 5: "the receiving PE
SHALL treat the DF Election at the peer as having occurred already, and proceed
without starting any timer to futher delay service carving". That's not a
consideration, that's a requirement, and I think it should also go into Section
2. The approach suggested in my DISCUSS would fix this issue. (Also
s/futher/further/.)

### Section 2, several things about skew

   The receiving partner PEs add a skew (default = -10ms) to
   the Service Carving Time to enforce this mechanism.  The previously
   inserted PE(s) must perform service carving first, followed shortly
   by the newly insterted PE, after the specified skew delay.

The first sentence says that the default is -10 ms, so the receiving partner
adds -10 ms, or more simply put, the receiving partner subtracts 10 ms. I think
the whole quote holds together but it’s unnecessarily hard to follow. Perhaps
something as basic as,

NEW:
   The receiving partner PEs subtract a skew (default = 10ms) from
   the Service Carving Time to enforce this mechanism.  The previously
   inserted PE(s) must perform service carving first, followed shortly
   by the newly inserted PE, after the specified skew delay.

Note I also made the correction s/insterted/inserted/

Of somewhat greater concern, I don't think Section 2 is precise enough about
when and how the skew is applied. It's possible to work it out by referring to
the example in Section 3, but it shouldn't be necessary to do this to glean
what an implementation MUST do.

The approach suggested in my DISCUSS would probably make the problem go away.

### Section 2.1, that's not 8 octets

```
   A new BGP extended community is defined to communicate the Service
   Carving Timestamp for each Ethernet Segment.

   A new transitive extended community where the Type field is 0x06, and
   the Sub-Type is 0x0F is advertised along with the Ethernet Segment
   route.  The expected Service Carving Time is encoded as an 8-octet
   value as follows:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = 0x06   | Sub-Type(0x0F)|      Timestamp Seconds        ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~  Timestamp Seconds            | Timestamp Fractional Seconds  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

Did you mean six octets? Because that’s what the diagram shows... unless you're
calling 0x060F part of the service carving time, which doesn't make much sense.
I don't know, maybe you meant that it's an eight-octet value encoded in six
octets, but you explain that just fine in the text that follows. I think you'd
be better off simplifying the introductory text to something like this,

NEW:
   A BGP extended community, with Type 0x06 and Sub-Type 0x0F, is defined
   to communicate the Service Carving Timestamp for each Ethernet Segment:

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = 0x06   | Sub-Type(0x0F)|      Timestamp Seconds        ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~  Timestamp Seconds            | Timestamp Fractional Seconds  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Note I also elided "new" because fairly soon after publication as an RFC, the
community will no longer be "new". Then you let the following text supply the
additional detail that I've removed from my NEW text.

### Section 2.1, "outside of the scope of this document"

“A DF Election operation occurring exactly at the Era transition boundary some
time in 2036 is outside of the scope of this document.”

That’s fine as long as the consequences of such an incident wouldn’t be
serious. Have you analyzed this? A few words to this effect would be desirable;
the rollover is less than twelve years in the future and your specification is
likely to still be in use then.

I bet that, again, the change suggested in my DISCUSS would make this easy to
cover because even a naïve implementation would treat era rollover as nothing
worse than severe clock skew, and the window of vulnerability would be bounded
to max_SCT_delay. I'd still encourage you to say a few words about it, to
console any readers doing a Y2036 review sometime in the future.

### Section 3.1, concurrent elections

   In the case of multiple concurrent DF elections, each initiated by
   one of the recovering PEs, the SCTs must be ordered chronologically.
   All PEs shall execute only a single DF Election at the service
   carving time corresponding to the largest (latest) received timestamp
   value.  This DF Election will involve all active PEs in a unified DF
   Election update.

This looks and feels exactly like normative specification, which makes me think
it belongs in Section 2. I also wonder if you want SHALL instead of "shall".

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF], You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues.

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments