February 2, 2021

Attendees:
Jie Dong
Linda Dunbar
Jeff Haas
Susan Hares
Wim Hendrickx
Warren Kumari
Acee Lindem
Kausik Majumdar
Robert Raszuk
Jeff Tantsura

Missing:
Keyur Patel
John Scudder

Jeff's plan for the day:
1. Walk through the added sections
2. Walk through peer

[Unless otherwise noted, Jeff = Jeff Haas]

Acee: IPsec using transport mode makes sense. TP use obvious.

Jeff: For TCP/QUIC, how do you secure it with IPsec? We glue in RFC 4301 as a top-level pointer (https://tools.ietf.org/html/rfc4301). If there is a better RFC, let us know. We do have customers at Juniper that use transport mode for IPsec. You've got a transport endpoint. Securing the thing means that, once you determine the end point, can you get there.

Warren: It is sometimes unclear to me how likely it is that IPsec is going to be used for this in the Internet.

Jeff: We do have customers.

Warren: In data centers?

Jeff: We see IPsec in the Exchange.

Jeff Tantsura: I do see TCP-AO at customers.

Jeff: How do you put into the bootstrap what lets other nodes know what security you need? Most people use key chains.

Warren: The authentication in BGP itself is funky. We see that the BGP Open

Jeff: To send an OPEN you need to set up a TCP session; you do not have a TCP session.

Kausik: Do you think the underlying connection is encrypted?

Jeff: The BGP transport is a non-encrypted session. You need BGP-sec for that portion.

Kausik: [missed + no recording]

Jeff: Why do we need some of this information regarding the session? We need to have enough information in discovery so the peer can decide if it wants to open a session and, if it does, whether the peer has enough information to succeed. This moves us into 7.2. The discovery protocol needs to have security. Properties are: confidentiality, authentication, and what do we need to do on top of the information. I do not think we need confidentiality. Why am I considering the bgp-autoconf as a transport protocol? When you do the exact same thing for L2 PDU?
Linda: You could call it L2 PDUs...

Jeff: What happens if you are doing something as a fragmentation layer? L3DL has a fragmentation layer. It has a discovery mechanism to discover peers. At the point L3DL has peers, it provides for a fragmentation layer.

Linda: We are looking for the BGP session end points. [editor: missed + no recording. Please fill in the information.]

Jeff: You have to decide about the peer's end points, authentication/security, and then general information [families, AS, device roles]. The only things passed are the things necessary prior to an OPEN.

[disconnect due to Sue's web-ex error?]

Robert: You stopped talking when you were trying to answer me.

Sue: Were you considering the LLDPv2 protocol?

Warren: The delivery of the LLDPv2 protocol was later down in the protocol.

Jeff: Transport considerations involve the fragmentation for the protocol. (See section 4.4.1 in the document.) Transports at each layer have the following scope:
1) L2 unicast - same L2 topology, direct transmit
2) L2 multicast - across multiple switches has some challenges
3) L3 point-to-point - link set-up with 2 addresses (try to peer)
4) L3 multicast
   a) "all routers" (link-local multicast) generally works
   b) L3 multicast for a group - may not pass through L2 multicast
5) L7 multicast - discovery information in BGP (NLRI for auto-configuration), BGP LSVR (OSPF), LISP-type discovery

This scope is the transport scope. For L2, this is your network in the DC. If you trust it, then you are good. For L3, if you trust your L3 network then you are good. After you go beyond your data center, less trust exists.

Jeff: Hitting the high points from the security AD:
1) integrity of the data
2) authentication of the data (is the information what you sent?)
3) confidentiality - can other people look at it?
At this point, we are going toward what Warren discusses as trust domains.
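[Editor's sketch, not discussed in the meeting: the transport scopes listed above can be written down as a small table and filtered by how far a deployment is willing to extend trust. The Reach labels and the reach assigned to each scope are assumptions made for illustration only; the draft does not define them.]

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Reach(Enum):
    """How far a discovery message can travel (illustrative labels only)."""
    LINK = 1        # a single link or directly attached segment
    L2_DOMAIN = 2   # several switches in one L2 topology
    L3_NETWORK = 3  # a routed network, for example inside the data center
    BEYOND = 4      # carried by an application-level protocol


@dataclass(frozen=True)
class TransportScope:
    name: str
    reach: Reach    # assumption: assigned by the editor, not by the draft
    note: str       # wording taken from the minutes


SCOPES = [
    TransportScope("L2 unicast", Reach.LINK, "same L2 topology, direct transmit"),
    TransportScope("L2 multicast", Reach.L2_DOMAIN, "across multiple switches has some challenges"),
    TransportScope("L3 point-to-point", Reach.LINK, "link set-up with 2 addresses"),
    TransportScope("L3 all-routers multicast", Reach.LINK, "link-local, generally works"),
    TransportScope("L3 group multicast", Reach.L3_NETWORK, "may not pass through L2 multicast"),
    TransportScope("L7", Reach.BEYOND, "discovery information carried in BGP, LSVR, LISP-type discovery"),
]


def scopes_within(trusted: Reach) -> List[str]:
    """Return the scopes whose reach stays inside the trusted domain."""
    return [s.name for s in SCOPES if s.reach.value <= trusted.value]


if __name__ == "__main__":
    # A deployment that only trusts the directly attached link.
    print(scopes_within(Reach.LINK))
```

[A deployment that trusts more than the link would pass a wider Reach; anything outside the returned list is where the integrity and authentication questions above start to matter.]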
Warren: If your security depends on knowing the peer and more, then ... If you look for security through obscurity, this is not the ...

Jeff: I tried to go through the protocols and determine if we have fragmentation or security issues.

Kausik: We did mention the issues you might want to discuss in the document.

Jeff: The protocol comparison is not to determine quality, but to provide a description of the proposals.

Acee: Do some of the data centers have a 1500 [byte] limit, or are these larger?

Warren: Some still have a limit of 1500 bytes in the L2 frame.

Recording starting at 10:33

Starting to discuss at section 3.3:

Jeff: Section 3.3 looks at the requirements for the session selection for auto-discovery protocols. The section now includes what happens if you have more than one mechanism. If you receive information from an L2 protocol and an L3 protocol, the implementation needs to make a sane design choice on what to bring up. One of the design choices we need to discuss is whether the BGP auto-configuration protocol is a "hello" protocol [what happens when you lose your hello].

Jeff: For example, if you lose your Hello in [draft-xu-idr-neighbor-autodiscovery] you tear down your session.

[Note: Discussion point 1: Disconnect of bgp-autoconf protocol (hellos)]

Jeff: Starting with section 4.0. LLDP is rather boring.

Section 3.1 information: LLDP carries most of the required information.
a) LLDP does not do security or have security on top of it.

Section 3.2: Relationship with BGP
a) [separate auto-discovery mechanism]
b) It has some amount of IP information. The IP information is typically limited to management addresses. It does not replace ARP, and it does not replace other discovery protocols. [Warren comment added here]
c) LLDP does not have fragmentation. Everything must fit within 1 Ethernet frame. We do mention that LLDPv2 is trying to address this limitation.
d) The LLDP proposal does not talk about having more than one discovery mechanism.
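[Editor's sketch, not discussed in the meeting: point c) above can be made concrete with a size check. An LLDP TLV has a 7-bit type and a 9-bit length, so a value is at most 511 bytes, and without fragmentation all TLVs together must fit in one frame. The 1500-byte limit is the figure Warren mentions; the OUI, subtype, and payload sizes below are placeholders, not values from any proposal.]

```python
import struct

MAX_LLDPDU = 1500     # assumed frame payload limit, per the 1500-byte figure above
MAX_TLV_VALUE = 511   # largest value a 9-bit TLV length field can describe
ORG_SPECIFIC = 127    # TLV type for organizationally specific TLVs


def tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one LLDP TLV: 7-bit type, 9-bit length, then the value."""
    if not 0 <= tlv_type <= 127:
        raise ValueError("TLV type must fit in 7 bits")
    if len(value) > MAX_TLV_VALUE:
        raise ValueError("value does not fit in a single TLV")
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


def org_tlv(oui: bytes, subtype: int, info: bytes) -> bytes:
    """Encode an organizationally specific TLV (3-byte OUI + 1-byte subtype + info)."""
    if len(oui) != 3:
        raise ValueError("OUI must be 3 bytes")
    return tlv(ORG_SPECIFIC, oui + bytes([subtype]) + info)


def fits_in_one_frame(tlvs) -> bool:
    """True if the concatenated TLVs fit in one LLDPDU (no fragmentation)."""
    return sum(len(t) for t in tlvs) <= MAX_LLDPDU


# Mandatory TLVs: Chassis ID (1), Port ID (2), TTL (3), End of LLDPDU (0).
# Subtype 7 means "locally assigned" for both Chassis ID and Port ID.
mandatory = [
    tlv(1, bytes([7]) + b"leaf1"),
    tlv(2, bytes([7]) + b"eth0"),
    tlv(3, struct.pack("!H", 120)),
]
end = tlv(0, b"")

# Hypothetical BGP discovery payload; OUI and subtype are placeholders.
bgp_info = org_tlv(b"\x00\x00\x00", 1, b"\x00" * 500)

if __name__ == "__main__":
    for count in (1, 2, 3):
        frame = mandatory + [bgp_info] * count + [end]
        print(count, fits_in_one_frame(frame))
```

[The check only counts bytes; it says nothing about how a proposal would split information that does not fit, which is the open fragmentation question.]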
Discussion on Jeff's proposal:

Warren's comment on LLDP and this proposal: My issue was "does not do L3". It is possible for one of the TLVs to send the [IP] information. It is a personal gripe that LLDP should include this information.

Jeff's response: We'll clean up the information in the text.

[Continuing with Section 4.1.2 on L3DL]

Jeff: L3DL is a bit of glue on top of things.
a) Purpose: The primary purpose for L3DL is the bootstrapping of an LSVR deployment. The key property is that it has a lot of the [necessary] properties to bring up an L3 session. This is a generally interesting property on its own, but not specifically for BGP.
b) A proposal [I-D.ietf-lsvr-l3dl-ulpc] does allow for carrying additional information. This proposal is incomplete. It has the AS number, IP addresses, and "stuff" about authentication. The information regarding authentication ...
c) L3DL does explicitly cover authentication of the BGP session. (Underspecified, but it is in the document.)
d) L3DL also has a security mechanism for itself.

Comments:

Kausik: What is the focus? Are we giving overview details or are we recommending specific information? Will we be suggesting one mechanism or multiple mechanisms? Are we going to suggest that certain deployments in the data center use a particular mechanism? What is our approach?

Jeff: The approach that I'm taking (and editing the document toward) is to (1) describe the properties we need in a BGP auto-discovery mechanism and then (2) describe whether these properties exist in the protocol mechanisms. The choice of which one to use is the next step after this one. I want to make sure that people know what the properties are in each proposal. At this point, the DT can go back and point out what was missing. [DT/IDR] can then go back and determine if something should be added to a specific protocol for [BGP auto-configuration discovery] or if there is something that the group dislikes about a protocol's mechanism.
The group [DT/IDR] can go back and change the mechanism, or abandon it and do something new.

Kausik: This is a fine approach. Do we go back and look at the security mechanisms and find there is a concern? What if we have two different approaches? Will we be having two different approaches running in the data center? Or will we be recommending one approach for the data centers? If we use two different approaches for the data center, it may be confusing in an implementation.

Jeff: I am not trying to answer this point because I have an opinion. Juniper does more than one auto-discovery mechanism with a "tie-breaking approach". One of the possible outcomes of this work is to decide there is exactly one mechanism we should use. Or this DT will decide whether we work on one data center approach or multiple approaches. We could have an alternate approach for ISPs. Even with [that approach] there will be some point at which there is more than one mechanism. At that point, implementations will need to have clean interactions. We can do some of that work as part of our discussions.

Kausik: Sounds good. We can look at both types of L2 work for now and see how it goes.

Jeff: Let me offer my personal observations on the L2 candidates. LLDP was partially chosen because it is an existing protocol. LLDP will pass things unmolested in the switches. Juniper has done a prototype of this LLDP work and the switches pass the LLDP frames forward happily with the data. The fact that LLDP passes things through L2 switches is a good property. The people who have worked in switching environments know that this is not always true. Despite the fact that IEEE is usually very clear about what should happen, the switches do not follow the IEEE protocol. For example, the L2 multicast of frames does not work cleanly because different switches based on different [silicon] have issues. For example, even if the silicon works fine, the microcode may have to be changed.
Sometimes the silicon will simply not allow the function. These factors encourage protocol designers to overload existing features rather than design new features on switches. L3DL is completely new. It will encounter the incremental deployment problems with the L2 environment. In their case, LSVR has its own set of requirements. In this case, they just happen to say that the protocol might fit this usage. I do not know if the LSVR working group, and Randy in particular, has a strong opinion on its use for BGP auto-configuration. The LSVR L3DL protocol works for their function.

10:52

Warren: I admit my bias about L3DL. I assume the Juniper implementation on LLDP is private.

Jeff: Yes [it is]. We would be guilty of code point squatting if it were public.

Warren: What did you learn?

Jeff: 3 easy things:
1) The "group-id" or role is not well defined. The role is a bit abstract and needs refinement. Wide sharing of device roles is going to take some work in this text. The text is abstract. Right now we give the easy example of a Clos fabric, identifying a level 1 Clos fabric versus a level 2 Clos fabric. This is a clear example upon which people can write code.
2) The security stuff was underspecified. If the text added something as simple as support for MD5 key chains, and gave a key chain id, then this [information in the draft] would give implementers something to code toward.
3) Piggybacking of information on top of LLDP. The information piggybacked into LLDP [frames] would be injected from the side, from either the BGP process or from OSPF, into LLDP. This injection is where most of the interesting programming headaches come from. One thing I think Sue will drive us to: if we allow this state to be included in the IETF protocol, does the IETF get involved in creating an API for BGP?

Warren: Last question related to this implementation on LLDP: did you have any recursive [next-hop] loopback issue? If so, was it weird?
Jeff: We had a next-hop types mechanism that we use for all forwarding information. It is a little tricky how that interacts with the bootstrapping semantics. Your IGP or static routes can override the next hop. In our case, we made it an incredibly low-preference route and that was enough to make it happen. If you have inconsistency, you will be screwed.

Warren: There are many ways to shoot yourself in the foot.

Wim: There is an implementation of the LLDP [proposal] that is commercial and shipping. I can send you a link.

Jeff: Shipping code is good. What were your ideas on code point allocation?

Wim: We took what was available. I agree that all the grouping and policy was underspecified. We had some interest from customers, so we just decided to implement it. If the WG decides different things, we are not biased toward it. If people want to play or get experience with it, there is some implementation in the field. The data centers which have deployed it are rather small, but it is an implementation. I just wanted to say that there is some experience.

Jeff: Excellent! Even as a proof of concept. How was the complexity?

Wim: To operate it is easy, except for all these group ids. I'm not sure it serves all the use cases or all the needs of all the people. Some improvements [are possible] on that level. There is not a large operational population; a few customers have it and use it. We did not have to do additional items. It is not widely used otherwise. It is shipping. We did not implement the keychain information. We had a trust model. MD5 is there. We tie to a BGP peer list group attribute. This is how we solved the AS path numbers (and other stuff). Lots of what you see in [the protocol specification] is there.

Jeff: That last point was made very specifically when Linda asked about 2-host subnets. We will need to look at the configuration semantics for this case.
Saying that BGP auto-configuration is turned on is necessary, but you will typically need to apply some level of template, whether you use a "bgp group" or something else.

Wim: This is what we did. We do support loopback in this implementation. I hope the image will be public and you can download it from GitHub, so that it is completely open and everyone can use it.

Acee: Did you support any authentication like MACsec LLDP?

Wim: No, we did not. It is a completely trust-based model from an LLDP point of view. It is LLDP semantics without any of the security. Whether that is good enough, please let me know.

Jeff: If you did LLDP, I am willing to trust the other side.

Wim: This is what we did so far. Whether that's good enough, ...

Jeff: We'll save that conversation for later.

Jeff: To touch on the remaining items in the last 10 minutes of the call:

5.1) draft-xu-idr-neighbor-autodiscovery:
a) This draft has most of the items we need. It does not quite handle fragmentation. It does involve a piece of bi-directional state (as most of the others do). It is the LLDP hello machinery where you form adjacencies with neighbors. It has the semantic (unlike other proposals) that if you lose your hello session, you are required to tear down the session. One of Robert's proposals does cover that a little, but I will get to that in a moment.
b) The document is fairly well specified. The document is heavyweight in terms of format (state machines and other descriptions). However, the protocol is well specified.

===

5.2 draft-raszuk-idr-bgp-auto-session-setup

Jeff:
a) The multicast of the BGP OPEN would work.
b) The document has "hand waving" in portions, which comes from these initial protocols.
c) Issues - While you can derive the peering address from the source address of the packet, the BGP identifier and the AS number from the BGP OPEN packet, plus the AFI/SAFI set from the capabilities, the remaining information is mixed up on what could be present.
The GTSM support is not going to be in this proposal. Support for BFD is not going to be there, but the capability for BFD could be useful. There are additional things that would have to be added to the current BGP OPEN message if it was going to be leveraged for the BGP auto-discovery process. Since BGP PDUs can be very large, there needs to be a discussion on whether a fragmentation mechanism is required for this protocol.

[Discussion point: Does BGP OPEN need a fragmentation mechanism?]

If you decide to have one OPEN message for discovery and another OPEN message for the normal open, these procedures need to be discussed in this document.

Jeff: Robert and Kausik, do you have any additional points? [on section 5.2]

Time 22 minutes:

Robert: You've compressed so much into so little time. I guess we can take it one point at a time when we get to the point of discussing this section in the document. This is nothing else except using the RR as a controller to discover your peers.

Jeff: This is your AS approach (Section 5.2) rather than section 5.3. I was talking about 5.2 rather than 5.3. Your draft-raszuk-idr-bgp-auto-discovery discovers your additional peers. 5.2 looks at your first peer.

Robert: I was discussing the tie-breaking one.

[Section 5.3-section 5.4]

Jeff: Section 5.3 (draft-raszuk-idr-bgp-auto-discovery) is sending new BGP peer information over a parallel BGP session. A few items are missing in this [section 5.3] proposal, but adding this information is not a "big deal". The overhead here is that you have already gone through some level of discovering your BGP peer. This proposal has merit, but it is solving a slightly different problem than the other proposals (sections 4.1.1, 4.1.2, 5.1 and 5.2), which discover your first session. Proposals described in section 5.3 [I-D.raszuk-idr-bgp-auto-discovery] and section 5.4 [I-D.acee-ospf-bgp-rr] can be added easily. For example, [I-D.acee-ospf-bgp-rr] can add information to the LSAs easily.
The distribution of the data, the fragmentation semantics, and [?authentication] are already there.

[Section 5.5 - Discovery at Level 7]

Section 5.5 is discovery at layer 7. These protocols include LISP [RFC6830] or bitcoin. You can bootstrap BGP by any type of API discovering the API state.

[Section 5.6 - Link Local Discovery]

Jeff: If you have a 2-peer subnet, you could simply bring up the node based on a configuration template. As Wim discussed, this type of solution requires a configuration template on your device. You are effectively doing promiscuous-style peering. These are the proposals to date.

[25:00 minutes]

Jeff: Summary: There is nothing specified as required that cannot be put into any one of the proposals. The fragmentation may be challenging for some cases, but it is possible. Beyond that point, most of the discussions are about the complexity of state machines, interactions with the security mechanisms, the transport layer you are carrying the information over, and trying to avoid inadvertent disclosure of information you want to keep private.

Sue: Would you repeat the set of open issues? After that I would like to hear if Robert feels his description in section 5.2 is correct.

Robert: Why are we jumping outside the DC for this document? For example, Section 5.3 is for IXPs, so I thought it was outside the scope of this document, which is focused on the DC (data center).

Jeff: Section 5.3 can be taken out if you wish. What I wanted to make sure of is that, even if it is a non-DC case, we discussed: "what properties do you need once you discover the session?"

Robert: OK, you mean discover the peer. I see the point.

Jeff: Even if it is for an IXP case or an ASBR for a PE-CE connection, I want to make sure of what happens after you discover the peer. If we have done the analysis, then even if we only build the protocol for the DC case, we know what the protocol [for the IXP, ASBR for PE-CE, or others] will need.

Warren: To me this information is useful background.
We might put in a section that states "Here is some other related work." Or maybe the addition of this [additional information] is just confusing the document.

Jeff: I feel the document is close to having all the necessary discussion points. Hopefully, we can go through a few rounds of clean-up and publish the document.

Jie: Perhaps we could put such sections in the appendix.

Jeff: This is a valid point.

Jeff: I think we have most of the information. Warren needs to put something in on the trust model. I think the open points can go through another round of clean-up. We can [review] what is missing from the current round of proposals. I'm not sure we should spend a great deal of time discussing the proposals in detail. The whole point is that we are identifying the properties we want and then taking this back to IDR for feedback and comments. At that point, IDR can have a beauty contest on what satisfies your need in which context. Potentially the answer could be more than one BGP auto-configuration protocol.

Kausik: I think it makes sense to find what is missing from the existing proposals. This [analysis] will solidify the requirements and bring [clarity] to the aspects of the protocols in [this draft].

Sue: Jeff, do you have time to continue? If you do, I am looking for which points we still need to discuss in this document.

Jeff:
1) Put in the missing security properties for the proposals,
2) discuss how fragmentation is still messy, and
3) talk about the trust models and how that impacts the blast radius for the [impact] of each protocol.
After these three, we need a serious round of clean-up to make sure we have a consistent narrative. I think we have the necessary pieces in here, but the story is a bit jumbled for a clean read-through.

Warren: I have one thing to add regarding "adding the security things from each protocol".
In many proposals it is hard to determine exactly what the security text is, or whether there is any security text. It would be helpful for people who have favorite protocols to help determine what the security features are in their proposal.

Jeff: I'm going to add to that request [a long-term issue on transport security]. Our AD Deborah Brungard has requested that I help the MPLS WG write down the transport security considerations. The IETF as a whole does not have clearly defined protocols for transport security. This is a point that Warren has seen discussed numerous times (over and over) in the IESG. As much as I have a good handle on it [the issue of transport security profiles], it is true that even the skilled developers of BGP like to hand-wave about the security of BGP. It resolves to a concept that:
a) If your stack supports turning it on, it is still TCP to you. If TCP is protected by something fancy, that is not your [the developer's] headache, unless you are running tcpdump (to figure out a problem).
b) What security people are looking for is one half operational and one half protocol. For example, on the operational aspect, do I put in a "key ID" for a keychain id? The IETF is progressing a key chain YANG model in WG LC in the RTG working group. Acee can tell us how the YANG models inter-operate. This is being done as a local YANG model. You are telling someone via the keychain id something about a security property without disclosing too much about it [the security details].

The text is difficult because every one of the authors finds that the security stuff [considerations for TCP] is hard to write. Honestly, IDR should have something that is boilerplate for [BGP transport]. Deborah Brungard set me up to do it, but my "getting around to it" has not happened for 3 years.

Jeff: The "group id"/"roles" is an easier thing for us to write.
Do we want to try to do a bit of the work up-front in this DT so we have something concrete to discuss? Or do we want to defer this work to IDR discussions?

Wim: One of the things of the draft is to put the discussion in perspective and give guidance. Then I want to take that into account and put it to IDR as a "next phase".

Jeff: We actually need it for what goes in there [the protocol (?)]. Randy was not fond of roles and felt roles could be peeled away from the protocol. We have an easy example for BGP Clos fabrics where it is very easy. One possible way to deal with it is to give the Clos roles with numbering [details] and suggest numbering for every other use case that is open to the user.

Sue: I will send you a problem report on the L3DL security issue when using multicast that goes through multiple groups.

Acee: Is Group ID the same as role?

Acee: Thank you.

Sue: Thank you, Jeff, for your hard work.

Adjourn until next week.