Re: [dhcwg] draft-stenberg-pd-route-maintenance

"David W. Hankins" <David_Hankins@isc.org> writes:
> As threatened in the WG meeting, some comments to work on section
> 2.1 of this document, offered in a playful spirit even if my
> writing style doesn't convey that.

Thanks for the comments! I have been fairly busy with random stuff that
actually has deadlines, so I have been sifting through my backlog
gradually. This one I've postponed because I wanted to evaluate your
comments thoroughly :)

Disclaimer: I apparently come from a different background, as I have some
idea about how some big ISPs work, but very, very little idea about what
happens in cable-land. Therefore, you may have read something into my draft
that wasn't there in the first place.

> In general, I'd like to see this document try and imagine the "best"
> deployment of the described architecture, and then point out flaws
> in it, rather than to imagine all the worst ideas and point out
> just how bad it could get if you really tried.
>
> I get the idea also that while writing this section, the author
> switched between the idea of, say, perl scripts writing DR
> configuration syntax down an SSH channel, and a routing protocol
> approach.  I think in that case you should separate this into two
> different sections (as the first few bulletpoints apply to the
> former but not the latter).

Well, I'm not sure those two are substantially different. Currently, there
isn't a way of doing it, either within a routing protocol or with an
arbitrary configuration-pushing protocol, whether that is some super-ugly
r/w SNMP MIB, BEEP/SSH+XML, or something else.

Third-party route injection lacks some of the local state you'd need to
have re-instated within the DR, and obviously synchronizing that
configuration via some other transport is not specified either.

>    Considering the backend provisioning system is the only component in
>    the system that actually requires significant amount of nonvolatile
>    storage, from data system point of view it would be ideal to have the
>    backend provisioning system responsible for maintaining the routing
>    state as well.
>
> Consider also: the provisioning system is usually colocated or in
> direct communication (providing configuration state for) the DHCP
> server.  This puts it in a unique position: it is authoritative
> for what prefix is assigned to what purpose.  By definition it is
> the one thing that is never 'confused' due to out-of-order delivery
> or similar, there are also fewer "protocol components" involved
> (system->routing vs system->dhcpv6->routing).

Depends on the deployment model, I think. Typically, you always want to
have just one definitive, replicated provisioning system, but you may have
a number of servers using it, and a number of relays per server. (This
happens, for example, in v4 space within some ISPs I know of. And I doubt
it'll change with v6.)

> Beyond that I think the summary of this approach's advantages is well
> stated.
>
> I basically don't agree with, or think they are too broadly stated, with
> any of the bulletpoints.  I can do a blow-by-blow if you think that's
> helpful, but I suspect it would be easier to go the other route and say
> how I would have written this section and you can point out what that
> misses (proposed draft text in quotes):

I'm curious about your comments regarding the points, especially regarding
the scalability angle (which I find depressing); if you require the
provisioning system to effectively monitor all DRs (at least), you decrease
the scalability of the overall system, no?
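
To put rough numbers on that worry, here is a back-of-the-envelope sketch.
It assumes, purely for illustration, that every relay is a DR and that each
monitored peer costs the backend one liveness session; the function name and
the sample figures are made up and not taken from any real deployment.

```python
def backend_peers(servers: int, relays_per_server: int, monitor_drs: bool) -> int:
    """Number of peers the provisioning system has to keep track of.

    Assumes (for illustration only) that every relay is a DR and that each
    monitored peer costs the backend one liveness session.
    """
    peers = servers                      # the backend always talks to its servers
    if monitor_drs:
        peers += servers * relays_per_server   # plus every DR behind them
    return peers


# Illustrative numbers only, not measurements.
for servers, relays in [(2, 10), (10, 100), (50, 500)]:
    print(servers, relays,
          backend_peers(servers, relays, monitor_drs=False),
          backend_peers(servers, relays, monitor_drs=True))
```

Without DR monitoring the backend's load grows with the number of servers;
with it, the load grows with the number of DRs, which is the larger figure
in the deployment model described above.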

> 1) A DR might be a more 'physical' layered device that does its best
>    work ("core competency") at the physical layer, and really only
>    manages marginally to do the minimum required at any subsequent
>    layers.  To ask such a device to implement everything required
>    from a full IP/TCP/UDP stack on up to BGP (or "some routing
>    protocol") is really asking quite a lot.  It is amazing enough as
>    it is that they manage 'static routing', and might even be 'pinged'
>    without crashing.  Let's not talk about SNMP.
>
>    Simply stated, it is a bad idea to ask CMTS manufacturers to
>    implement BGP.  The resulting abomination would be a horror.

I did not really say CMTS, and intentionally so :-) I'm all for a stupid DR,
though, but that is mostly because I prefer the RR-based solutions in
section 2.3, so I am possibly biased.

> 	"Routing may not be the DR manufacturer's core competency,
> 	 so using a full routing protocol may exceed reasonable
> 	 expectations in software development investment."

I think that as a general statement, this is too broad. Nowadays the 'edge'
devices are gaining more and more strange functionality, not less, I'm
afraid. Most IP networks seem to be moving towards smart edges and
dumb fat pipes in the core, from my point of view.

> 2) The groups of people responsible for DHCP network deployments
>    are generally not the same group of people for the ISP's core
>    network.
>
>    Simply stated, Sysadmins very rarely know the lay of the BGP
>    land, and probably would lose their fingers if they touched its
>    state in any way.
>
> 	"The existence of multiple administrative sub-domains
> 	 within a network that are responsible for the design
> 	 and operation of core versus edge components complicates
> 	 integration of DR devices with the core network's routing
> 	 protocol.  DR operators rarely have either direct access
> 	 to their network's routing protocols, or experience with
> 	 using or troubleshooting them, as these are roles usually
> 	 relegated to another office."

I doubt BGP's called for anyway, as within the same AS (such as a CMTS
deployment, or a within-ISP DR+server+backend setup), some IGP would be more
appropriate. That is, as long as outside-AS routing doesn't change, and I
doubt it would in most cases.

> 3) A DR may be expected to implement a kind of extra sensory perception
>    for the RR.  For example, let us imagine that the 'Network' and 'RR'
>    are attached via VLANs, transported by DR's.  This ensures that every
>    RR (which might be a customer-premise device) is in a broadcast
>    domain with no other-customer RR's, and only shares the broadcast
>    domain with a 'router' in the "Network" box in your ASCII art.  This
>    may even be the most desirable scenario, but it necessitates that the
>    DR construct the VLAN connection, and again we have all kinds of issues
>    entering into detecting remote reboots and remote configuration state
>    among fault-tolerant endpoints.  DHCPv6 is a good idea for such
>    configuration state, and some mechanism for the DR to get its brains
>    back if it shifts roles or reboots is necessary.
>
>    So the point here is that even with such a mechanism covering the
>    routing state, we still need a mechanism to recover DR configuration
>    state.
>
> 	"Even if the routing state is carried via a technique such as
> 	 this, it may not remove the need for a means for DR devices
> 	 to re-acquire their RR-specific configuration state.  It also
> 	 presents a form of de-centralization to convey this information
> 	 in two different places, which may not be desirable."

_All_ RR-related DR state needs to be recovered somehow, that is clear;
however, that state could be recovered using the same mechanism as the
routing state, no? I'm not sure I see the value in separating the
maintenance of the routing and configuration state, especially in the 2.1
case, where the endpoints of the recovery transaction (backend/DR) are the
same.

What's the added value of separating this data? The whole 'routing state'
<> 'configuration state' <> 'routability' wording was a subject of
significant discussion when we were originally writing this draft :-> The
original intent was that _all_ related network configuration for the DR in
the 2.1 case would be remotely restored by the backend provisioning system,
but apparently the end result didn't quite describe the initial vision.
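
A rough sketch of that intent, in case it helps: the backend keeps one record
per RR that carries both the 'routing' and the 'configuration' part, and a
rebooted DR gets everything back in a single recovery transaction. All class
names, field names, and the example VLAN setting are hypothetical; nothing
here comes from the draft itself.

```python
from dataclasses import dataclass, field
from ipaddress import IPv6Network


@dataclass
class RRState:
    """All state a DR holds for one RR (hypothetical record layout)."""
    rr_id: str
    delegated_prefix: IPv6Network                # the 'routing' part
    next_hop: str                                # RR address as seen from the DR
    config: dict = field(default_factory=dict)   # the 'configuration' part


@dataclass
class Backend:
    """Stand-in for the backend provisioning system's nonvolatile store."""
    by_dr: dict = field(default_factory=dict)    # DR name -> {rr_id: RRState}

    def record(self, dr_name: str, state: RRState) -> None:
        self.by_dr.setdefault(dr_name, {})[state.rr_id] = state

    def restore(self, dr_name: str) -> list:
        """One recovery transaction: every RR record for the given DR."""
        return list(self.by_dr.get(dr_name, {}).values())


@dataclass
class DR:
    """A DR that keeps RR state in volatile memory only."""
    name: str
    routes: dict = field(default_factory=dict)     # prefix -> next hop
    rr_config: dict = field(default_factory=dict)  # rr_id -> config

    def reboot(self) -> None:
        self.routes.clear()
        self.rr_config.clear()

    def apply(self, states: list) -> None:
        # Routing and configuration state are restored in the same pass.
        for s in states:
            self.routes[s.delegated_prefix] = s.next_hop
            self.rr_config[s.rr_id] = dict(s.config)


backend = Backend()
backend.record("dr1", RRState("rr-a", IPv6Network("2001:db8:100::/48"),
                              "fe80::a", {"vlan": 101}))
backend.record("dr1", RRState("rr-b", IPv6Network("2001:db8:200::/48"),
                              "fe80::b", {"vlan": 102}))

dr1 = DR("dr1")
dr1.reboot()                          # all volatile state is gone
dr1.apply(backend.restore("dr1"))     # routes and config come back together
```

Since the endpoints (backend and DR) are the same for both kinds of state,
splitting the record in two would mean two transactions to keep consistent
instead of one.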

> 4? In the 3) scenario it's probably also wasteful to apply a globally
>    routable prefix to every Network<->RR connection (using only two
>    addresses), so the RR is only going to have a global address if it
>    chooses one from the delegated prefix.  I think this bridge is usually
>    gapped in IPv6 networks using link state routing protocols (so BGP
>    points a prefix to an address that is found to have a route over a
>    locally-addressed link).
>
>    Conceding to point 1 above, we have to require a much more simple
>    means to allow a DR to be the bridge in this locally-addressed gap.
>
>    I don't know a lot about this scenario, so I'm hesitant to try and
>    say more.  I'm already out on the proverbial limb in terms of my own
>    experiences.

I don't have much idea about practical v6 deployments either, which is
unfortunate. I asked for feedback from some people here who do, but it
didn't arrive before the -00 publication deadline, so I figured I'd chat
about the topic at the IETF anyway. There I got assorted comments, which
were incidentally quite amusing; I believe 3 different interest groups
pointed out something along the lines of "this is the way to go, and those
others are clearly not realistic", each, of course, supporting a different
approach.

> I find it strangely curious that for the second time today on two
> different topics, I am making a distinction between 'configuration'
> state and 'routing' state being carried by DHCPv6.

Arguably, the current 'routing' solution for DHCPv6 is a bit incomplete. The
'configuration' part seems good enough, but the question is whether DHCPv6
should be a complete 'routing' solution (god, I hope not).

> At any rate, with those three points alone, and the introductory
> paragraph you already have, the idea is generally well documented that
> although the technique might work well in certain deployments (you
> cover the main 'good' points in that introduction paragraph), it is
> not a universal solution, and there's a documented need to fulfill
> requirements in other use cases.

I'll probably incorporate those 3 paragraphs (at least the content, if not
the exact wording) into -01, if/when it shows up :)

> The 4th thing is probably a quagmire getting far too close to the
> details.

Indeed. It sounds like a nasty quagmire to me too, and from my point of view
(i.e. protocol specification), it doesn't seem critical to elaborate on.

-Markus
