Re: [Idr] BGP autoconfiguration - draft-ymbk-idr-l3nd - 3rd message in response

Susan Hares <shares@ndzh.com> Sat, 05 March 2022 21:15 UTC

From: Susan Hares <shares@ndzh.com>
To: 'Jeffrey Haas' <jhaas@pfrc.org>, idr@ietf.org
Date: Sat, 05 Mar 2022 16:14:59 -0500

Jeff:

This message is the 3rd message responding to your review of L3ND.
There will be two more responses (1 on ULPC and 1 on your observations).

This message focuses on the optional PDU for encapsulations.  
My high level comments are here, and the detailed comments are inline. 

Nonce - identifies a PDU
Serial number - identifies the database state the sending L3ND peer has

Currently: Only Hello PDUs have a nonce to detect duplications. 

Ack messages have: 
Acked-PDU - type of message the ack responds to
Etype - error type (hint on the receiver's view of the error)
Error code - protocol error codes (see section 16.5)
Error hints - 4 additional bytes.
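
As a rough illustration of the Ack fields listed above, here is a minimal in-memory sketch in Python. It is not the draft's wire format: the class and field names are hypothetical, the Etype values follow the four severity levels listed later in this message, and the rule that Etype 0 must not carry a non-zero error code is an assumption drawn from Jeff's remark that Etype 0 means No Error.

```python
from dataclasses import dataclass
from enum import IntEnum


class EType(IntEnum):
    """Severity levels as summarized in this thread (assumed numbering)."""
    NO_ERROR = 0
    WARNING = 1
    RESTART = 2
    CALL_OPERATOR = 3


@dataclass(frozen=True)
class Ack:
    acked_pdu: int                    # type of the PDU being acknowledged (4-7 for encapsulations)
    etype: EType                      # receiver's view of the severity
    error_code: int = 0               # protocol error code (draft section 16.5)
    error_hint: bytes = b"\x00" * 4   # 4 bytes per this message; contents are not standardized here

    def __post_init__(self):
        if len(self.error_hint) != 4:
            raise ValueError("error hint must be exactly 4 bytes")
        # Assumption: an error code only makes sense together with a non-zero Etype.
        if self.etype == EType.NO_ERROR and self.error_code != 0:
            raise ValueError("Etype 0 (no error) cannot carry a non-zero error code")

    def is_error(self) -> bool:
        return self.etype != EType.NO_ERROR
```

Validating the Etype and Error Code as a pair at construction time is one way to enforce that they are always specified together.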

If we wished to add bytes, we could include nonces on
Encapsulation PDUs, Vendor PDUs, and ULPC PDUs.

The tradeoff is state vs. error handling for 
a protocol running on top of a reliable transport (TCP). 

Sue 

-----Original Message-----
From: Idr [mailto:idr-bounces@ietf.org] On Behalf Of Jeffrey Haas
Sent: Tuesday, March 1, 2022 4:22 PM
To: idr@ietf.org
Subject: [Idr] BGP autoconfiguration - draft-ymbk-idr-l3nd

Working Group,

One of the topics for our last IDR interim was an attempt to conclude
discussions on BGP autoconfiguration proposals prior to sending adoption
requests to the Working Group.  Just prior to that call going out, we were
requested to add one additional proposal for consideration that hadn't yet
been published: l3nd[1] and l3nd-ulpc[2].

The authors have requested a presentation slot at the upcoming IETF 113 for
this proposal.  Since this is a very late entrant, I'd like to try to get
discussion of the properties of this proposal started on the mailing list
using the same discussion points we've had from prior interim meetings.

The Working Group adopted requirements and some of the prior proposal
discussions are part of the design team document.[3]

Below, find my observations on the draft with questions and commentary
intermixed.  I finish this response with my high level analysis and personal
opinion on what these properties accomplish.

[Snip] 
Encapsulations:
[Point 1] 
- The text for addressing conflicts needs to specify the Etype along with
  the Error Code. 
  + It can't be 0, because that says No Error.
  + It's probably 1 as a warning?
  + In general, normative text needs to specify Etype + Error Code together.

Sue: 
Ack does give you the type of PDU (4-7) and the Etype:
Ack Etype 0 - OK, I received it
Ack Etype 1 - Error in reception from peer, warning
Ack Etype 2 - Error in reception from peer, recommend a restart
Ack Etype 3 - Error in reception, hopelessly messed up. Call an operator.

Are you implying Error Hints needs to be specified on Ack? 
Example: 
Ack [Acked-PDU Encapsulation, Etype [0-3], Hints(4 bytes)]
Do you think 4 bytes of hints are important to standardize? 
[Point-1-end]

[Point-2 - Acks in transmission] 
[Point-2a - multiple Acks]
- 10.1, the procedure for expecting an ACK seems conflicting with the
  retransmission text in 9.1.  
   In that, it says if you're expecting an ack  and don't get one, 
   the session should be closed.  However, it's implied
  (but unclearly so) that you should have one pending ack at a time. 
[Sue]
Thank you for the comment.  
The message diagram (pages 6-7) shows that a box transmitting an 
Encapsulation PDU (types 4-7) expects an ACK for that type of PDU. 
Only one ACK per Encapsulation PDU sent.
[end-sue-2a ]
[Point-2b - multiple Acks]
  If retransmits of an encapsulation PDU are possible, this means multiple
  outstanding acks may be needed.  In such a case, serial numbers being
  acknowledged are more desirable.


[Sue]
You are correct that this case could occur: 
Box-1: Encapsulation PDU (PDU type=4 (IPv4), serial-id: 10) --->
                             Box-2: (missing time window)
Box-1: Encapsulation PDU (PDU type=4 (IPv4), serial-id: 10) --->
   <--- Box-2: Sends ACK (Type: 3, Acked PDU: 4, Etype=0)
   <--- Box-2: Sends ACK (Type: 3, Acked PDU: 4, Etype=0)

With reliable transport, this case would mean 
the application processing the encapsulation PDU was delayed.

Including the Nonce field on the ACK would uniquely identify the
PDU being acknowledged, as well as the DB state.
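
To make the ambiguity concrete, here is a small Python sketch of a sender tracking outstanding Encapsulation PDUs. Both tracker classes and their method names are hypothetical. The first matches ACKs by acked PDU type only, as in the exchange above; the second assumes a nonce is echoed in the ACK, which the draft does not currently provide for Encapsulation PDUs.

```python
class TypeOnlyTracker:
    """Matches ACKs by acked PDU type alone."""

    def __init__(self):
        self.pending = []  # (pdu_type, serial) tuples in send order

    def sent(self, pdu_type, serial):
        self.pending.append((pdu_type, serial))

    def acked(self, pdu_type):
        # The ACK names only a PDU type, so the best available guess is
        # to retire the oldest pending transmission of that type.
        for index, entry in enumerate(self.pending):
            if entry[0] == pdu_type:
                return self.pending.pop(index)
        return None


class NonceTracker:
    """Matches ACKs by a per-PDU nonce (assumed to be echoed in the ACK)."""

    def __init__(self):
        self.pending = {}  # nonce -> (pdu_type, serial)

    def sent(self, nonce, pdu_type, serial):
        self.pending[nonce] = (pdu_type, serial)

    def acked(self, nonce):
        # Returns None for an unknown or already-retired nonce,
        # which is how a duplicate ACK shows up.
        return self.pending.pop(nonce, None)
```

With the type-only tracker, two transmissions of serial 10 followed by two ACKs cannot be told apart; with the nonce tracker, each ACK retires exactly one transmission and a repeated ACK is detectable. The cost is the extra per-PDU state noted in the tradeoff above.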
 
[End--2b]
[Point-2c - Acks and retransmission]
  + In general, the text here for retransmit vs. ack seems to be a mismatch
    between the expectations of working on a reliable stream protocol like
    TCP.
[Sue: The delay is likely to be caused by the application processing of the 
PDU with encapsulation.  The stream sends the data, but the management 
plane delays replying to the data stream. ]
[End point-2c] 

[Point-2d] 
  + Similarly, the "link is broken below layer 3" comment seems a mismatch
    since TCP shouldn't pay direct attention to that.
[Sue: Actually this point considers implementation issues for
massive data centers with multiple L2 links under L3.
TCP/TLS can re-route itself to another pathway to connect
to the host port, even though the original Hello link is now down.]
[End Point-2d]

[Point 2e]
- 24 bits is a peculiar length for a counter.  Why this size?
[Sue] Alignment with the L3DL specification, but it could change
to 4 octets if the WG wishes.  Alternatively, common mechanisms
for L3DL and L3ND could be helpful.
[End Point-2e]
[End Point-2] 

[Point-3 Encapsulation flags] 
[Point-3a Encapsulation flags]
Encapsulation Flags:
- 10.2 typo on Encapsulation
- Primary is interesting operational state.  What should an implementation
  do if it gets conflicting primary entries?
[Sue on point-3a]
Consider: 
Box-1: Send Encaps PDU (type =4 IPv4)  [DB-serial number: 10]
                        [Encaps=Announce, Encaps flag=primary] 
                       Prefix: 192.2.2.5 /32 --> 
                        Box-2: ACK PDU (3) (acked pdu=4, DB: 10) 
Box-1: Send Encaps PDU (type =4 IPv4)  [DB-serial number: 11]
                        [Encaps=Announce, Encaps flag=primary] 
                       Prefix: 192.2.3.5 /32 --> 
                        Box-2: ACK PDU (3) (acked pdu=4, DB: 11)

In this case Box-1 is changing the primary interface.

Getting a conflicting entry for the primary interface for
peering between Box-1 and Box-2 for BGP auto-configuration
means that Box-1 sent a conflicting PDU.

Sequence: Reject new encapsulation with primary
Box-1: Send Encaps PDU (type =4 IPv4)  [DB-serial number: 10]
                        [Encaps=Announce, Encaps flag=primary]
                       Prefix: 192.2.2.5 /32 -->
                        Box-2: ACK PDU (3)
                             (Acked PDU: 4, Error Type=0, DB: 10)
Box-1: Send Encaps PDU (type =4 IPv4)  [DB-serial number: 10]
                        [Encaps=Announce, Encaps flag=primary]
                       Prefix: 192.2.3.5 /32 -->
                        Box-2: NAK PDU (3)
                             (Acked PDU: 4, Error Type=1, Error code=1, DB: 10)
Here it appears that the announcement is broken.
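
One possible reading of the two sequences above, sketched in Python: a primary announcement under a new serial number replaces the old primary, while a different primary under the serial number already held is rejected with a warning. The function name, the state dictionary, and the choice to key the decision on serial number equality are all assumptions made for illustration; the draft text does not spell out this rule.

```python
def handle_primary_announce(state, serial, prefix):
    """Process an Announce with the primary flag set.

    state: dict with keys "serial" and "primary" (both None before the
    first announcement). Returns an (etype, error_code) pair.
    """
    same_serial = state["serial"] is not None and serial == state["serial"]
    if same_serial and prefix != state["primary"]:
        # Same database state, different primary: conflicting PDU.
        # Etype 1 / error code 1 mirror the NAK in the example above.
        return (1, 1)
    # New serial number (or an identical repeat): accept and record.
    state["serial"] = serial
    state["primary"] = prefix
    return (0, 0)
```

Serial numbers lower than the one held are not treated specially here, since the message does not say how they should be handled.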

Note - You can pull this from Section 10.2:
   "Each Encapsulation interface address in an Encapsulation PDU is
   either a new encapsulation be announced (Ann/With == 1)
   (yes, a la BGP) or requests one be withdrawn (Ann/With == 0).  Adding an
   encapsulation which already exists SHOULD raise an Announce/Withdraw
   Error (see Section 16.5); the EType SHOULD be 2, suggesting a session
   restart (see Section 9) so all encapsulations will be resent."
[End response on Point-3a]

[Point-3b - Broken Prefix in pile of Encapsulation prefixes]
Encapsulations (varying):
- You can pack more than one prefix in a PDU for a given serial number.  In
  the event that a single prefix is an error (see question about Etype
  for such cases), are all prefixes in the bundle intended to be rejected? 
  + How would the ack receiver know which prefix is the bad one if the whole
    bundle isn't rejected?
[Sue: 3-b] Thank you for mentioning this point. 
We do need to toss the whole bundle if it is malformed. 
[Sue-3b end] 
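
A minimal Python sketch of the all-or-nothing handling described above, assuming prefixes arrive as text such as "192.2.2.5/32" and that "malformed" means the standard library cannot parse the address or prefix length. The function name is hypothetical, and real PDUs carry binary fields rather than strings.

```python
import ipaddress


def accept_bundle(prefixes):
    """Return the parsed prefixes, or None if any single one is malformed.

    Nothing from a bad bundle is applied, so the sender does not need to
    learn which individual prefix caused the rejection.
    """
    parsed = []
    for text in prefixes:
        try:
            parsed.append(ipaddress.ip_interface(text))
        except ValueError:
            # One bad entry rejects the whole bundle.
            return None
    return parsed
```

Parsing everything before applying anything keeps the receiver's database consistent with a single serial number.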

[Point-4] 
MPLS Label List/Encapsulation:
- After our "fun" dealing with label issues covering RFC 3107 and others,
  please make sure to tell people the obvious: the label stack needs to be
  well formed and bottom of stack bit should only be set on last entry.
Sue: Thank you for mentioning this fact.
I think the tossing of malformed Encapsulation lists can be
handled in section 10.1.
The definition of a malformed label list will be updated in section 10.5.
[Point-4 end] 
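
As an illustration of the label stack check Jeff asks for, here is a short Python sketch. It assumes each label is carried as 3 octets (20-bit label, 3 further bits, then the bottom-of-stack bit as the lowest bit), which is the common BGP labeled-prefix layout and may not match the draft's encoding. It also assumes the bottom-of-stack bit must be set on the last entry as well as clear on all others. Both function names are hypothetical.

```python
def encode_entry(label, bottom):
    """Pack a 20-bit label and the bottom-of-stack bit into 3 octets."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    value = (label << 4) | (1 if bottom else 0)
    return value.to_bytes(3, "big")


def label_stack_is_well_formed(entries):
    """True if the bottom-of-stack bit is set on the last entry and only there."""
    if not entries:
        return False
    last = len(entries) - 1
    for index, entry in enumerate(entries):
        if len(entry) != 3:
            return False
        bottom_bit_set = bool(entry[2] & 0x01)
        if bottom_bit_set != (index == last):
            return False
    return True
```

A receiver could run this check before accepting an MPLS Encapsulation PDU and treat a failure like any other malformed bundle.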

[Point -5] 
Hello discussion:
- The English of the first paragraph is in need of significant work for
  clarity.
- I think this is trying to say that in the presence of a LAG that
  transmission over more than one LAG member might be okay.
- I also think that it's trying to say that once you have a session up you
  can stop hellos on the individual member links.
- I think the intent of the second paragraph is that if a link is on a
  multi-access network and non-p2p behavior is desired, keep sending hellos.
[Sue:  Thank you for letting me know the English is confusing. 
I have rewritten the text for the next revision.] 

ULPC draft:

3.1:
:   A peer receiving BGP ULPC PDUs has only one active BGP ULPC PDU for
:   an particular address family on a specific link at any point in time;
:   receipt of a new BGP ULPC PDU for a particular address family
:   replaces any previous one.

Okay, implicit replace.  That's fine.

:   If there are one or more open BGP sessions, receipt of a new BGP ULPC
:   PDU does not affect these sessions 

Do not disturb existing sessions.  That's fine.

:   and the PDU SHOULD be discarded.

Probably not fine.  If the session bounces, you likely intend for it to
connect to the newly advertised entry.

:   If a peer wishes to replace an open BGP session, they MUST first
:   close the running BGP session and then send a new BGP ULPC PDU.

This seems to assume a tight coupling between the implementations for BGP
and for the discovery mechanism.  A peer bounce without having first
advertised the new address can mean a race condition where the bounced peer
continues trying (and possibly failing) to establish the connection with the
old peer.

It'd be cleaner to remove the prior binding first.

However, there's no withdraw procedure currently specified for these TLVs to
remove peer addresses, only to advertise them.

:   For each BGP peering on a link here MUST be one agreed encapsulation,
:   and the addresses used MUST be in the corresponding L3DP IPv4/IPv6
:   Announcement PDUs.  If the choice is ambiguous, an Attribute may be
:   used to signal preferences.

How?  Among other things, the attributes are bound to a session and not any
individual address.

3.1.2/3.1.3 Prefix-Len is probably not appropriate for the addresses.  If
there's an intent to permit peering from more than one address in the
subnet, how should the implementation pick?  From state learned in l3nd
encapsulation bindings for an overlapping subnet?

3.1.4 Authentication: You're referring to multiple possible ways security
could work.  This likely means multiple authentication types and should be
its own code point.

3.1.5 "No flags are currently defined"... and then defines GTSM.


:   As the ULPC PDU may contain keying material, see Section 3.1.4, it
:   SHOULD BE signed.

What does signed mean in this instance?  Do you mean that the transport
session should be given secrecy properties by TLS?

:   Any keying material in the PDU SHOULD BE salted and hashed.

I'm very confused here.

The material that a BGP peer needs for bringing up a session is the key to
pass to the relevant algorithm like TCP-MD5 or TCP-AO.  Typically when we
discuss salt and hashing, we're discussing the ciphertext from those
operations.  

Are you suggesting you send the ciphertext with the intent that an
implementation can consult its own tables to find what input password for
that salt should be used?  If so, this is a flavor of pre-shared key.  It'd
also require coordination of salt values between the implementations.

If this is an obvious technique that I'm not familiar with, a pointer to the
procedure would be useful.
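
To illustrate the reading questioned above (not something the draft specifies), this Python sketch shows why a salted hash behaves like a pre-shared key lookup. The hash is one-way, so a receiver can only discover which key was meant by hashing the keys it already holds with the advertised salt. The function names, the use of PBKDF2-HMAC-SHA256, and the iteration count are all assumptions chosen for the example.

```python
import hashlib
import hmac


def salted_hash(key, salt):
    """One-way digest of a key under a given salt (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", key, salt, 10_000)


def find_matching_key(advertised, salt, local_keys):
    """Return the name of the locally configured key whose salted hash
    equals the advertised value, or None if no local key matches."""
    for name, key in local_keys.items():
        if hmac.compare_digest(salted_hash(key, salt), advertised):
            return name
    return None
```

The receiver never learns a key it did not already have, and both sides must agree on the salt, which matches the coordination concern raised above.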

-----

Misc:
- The embedded attempts at humor decrease the clarity of the document for
  readers that are less familiar with English and should be excised.
- In the PDU diagrams, explicitly document what the field lengths are.
  Don't make people interpret ASCII diagrams.
- The HELLO packet is itself without security.  In circumstances in which a
  man-in-the-middle attack is possible and raw TCP is offered as an option,
  active interception and replacement of the HELLO is sufficient to cause
  the less secure mechanism to be used.
- Section 7, the text about GTSM as a SHOULD on SYN should simply be GTSM on
  the TCP session.  GTSM applies to all of the packets in a TCP session.
- Section numbering under encapsulations could be more consistent.  For
  example, each section for a PDU type.  MPLS Label List probably shouldn't
  be a section at the same level as the mpls encapsulation pdus that use it.
- The session is long-lived but not protected similarly to BGP; i.e. tcp-md5
  or tcp-ao.  Given that the intent of the mechanism is to set up BGP which
  may use those properties, but uses TLS for protection, this isn't
  surprising.  But it means that the sessions are vulnerable to reset
  attacks those TCP mechanisms are intended to protect against.

-----

Jeff's read of the proposal:

1. The proposal is session based.  While the prior discussion was primarily
in the context of draft-xu, sessions were seen as potentially problematic
for a number of reasons.  They also had some possibly useful properties.
1a. Session based makes large amounts of state easy to deal with.  That
said, the requirements show we need very little state for a given session.
That's true even in the ULPC portion of this proposal.
1b. These sessions are long-lived.  You'll have one for each interface you
want to run auto-discovery on.  This minimally doubles the number of active
sockets your stack will need because for each BGP session you get, you'll
have a lingering discovery session.  Perhaps they don't need to be
long-lived if the only thing you're using this for is BGP
auto-configuration?
1c. If you're going to use some flavor of TLS (including DTLS) as we
discussed during the requirements phase as a possible mechanism, you need a
session.
1d. If you're going to use something that rides on top of TCP, you have all
of the TCP vulnerabilities that TCP-MD5 and TCP-AO are intended to mitigate.
Using those mechanisms to mitigate the mechanism means that you're already
solving the keying issue... and perhaps don't need TLS at all.

2. The hello is completely unprotected.  This means there's an edge case
man-in-the-middle condition.  Protecting it means you've already solved the
keying discussions and wouldn't need TLS.

3. TLS crypto infrastructure is operationally weighty if you're covering the
usual things:
- What certificate authority are you using?
- How is revocation handled, if at all?
- If you want to use certificates for authentication, you now have to get
  those distributed to all of your participating routers.
- Certificates have lifetimes - although potentially very long lived ones.
  You have to now worry not only about usual key rollover considerations
  that all of the likely mechanisms need, but also making sure you don't
  lose your auto-discovered sessions on a bounce because a certificate
  expired.

I'm loosely aware of the ACME work in IETF and perhaps the intent is that
these things are no longer quite as operationally cumbersome.  However,
that's an entire ecosystem that needs to be integrated into your router
operations if that's the case.

4. TLS provides integrity to the transport session and its contents, even if
it can't protect against attacks on TCP itself.  Privacy is optional,
although given the somewhat odd ULPC security text it seems that a scenario
under consideration is some form of distribution of the pre-shared keys or
something similar.  However, the majority of the state for ULPC and l3nd
isn't exactly secret; it's stuff you'll see in ARP and ND if you're on link.

I don't think most models consider keychain information to be something
that needs privacy.

5. The proposal likely shouldn't call itself BGP-like.  The explicit ack
model minimally makes that not the case.  The state machine isn't terribly
comparable.

In particular, the Serial Number restart infrastructure seems to be very
heavy-weight for something running over a reliable transport like TCP.  But
this perception is tied to the idea that the BGP piece of this is small.
And even if there's a modestly large number of interfaces on the link, the
protocol should be able to send the entire bulk of them in a small number of
TCP frames.  (Compare vs. BGP send rates in even low end implementations.)
So, why complicate the mechanism with a highly stateful restart mechanism?
I suspect this makes more sense for the layer 2 proposal.

6. The sheer weight of this thing would make more sense if it's a more
general mechanism that is already deployed and the small BGP piece is along
for the ride.  I understand the motivation for a feature like this at layer
2, especially in the lsvr context.

I don't understand the use case for something like this in its current form
at layer 3.  What's that use case?

7. The crypto itself makes the autoconfiguration mechanism a point of attack
on the system rather than something to mitigate the attack.  The overhead to
deal with an on-link attack on TCP-MD5/-AO (symmetrical ciphers) is lower
than the overhead of TLS negotiation.  Rate limiting the creation of
incoming l3nd sessions might help things to some extent, but then you're
worrying similarly about TCP session exhaustion as well.


-- Jeff

[1] https://datatracker.ietf.org/doc/html/draft-ymbk-idr-l3nd-00
[2] https://www.ietf.org/rfcdiff?url2=draft-ymbk-idr-l3nd-ulpc-02.txt
[3] https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-autoconf-considerations-02
