Re: grow: anycast ops draft

Geoff Huston <gih@apnic.net> Sun, 07 November 2004 15:56 UTC

Message-Id: <6.0.1.1.2.20041107073145.0218a6d8@localhost>
Date: Sun, 07 Nov 2004 07:32:02 +1100
To: grow@lists.uoregon.edu
From: Geoff Huston <gih@apnic.net>
Subject: Re: grow: anycast ops draft
Sender: owner-grow@lists.uoregon.edu
Precedence: bulk

[note - bounced on non-member address - delayed while list owner
 was in transit]

Hi,

some comments after reading through the doc....

2.  Terminology


    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
    document are to be interpreted as described in RFC 2119.

I see no use of these words in the document - in which case it's unnecessary.

3.2  Goals


    4.  Triangulation of traffic sources, in the case of attack (or
        query) traffic which incorporates spoofed source addresses;

Are you sure you mean 'triangulation'? I read this as being able to
determine a precise location from three sets of relative distance
measurements in a two-dimensional plane.

    5.  Improvement of query response time, by reducing the network RTT
        between client and server with the provision of a local Anycast
        Node.

You are assuming that routing systems converge on the lowest RTT in
selecting a best path. This is not universally true at either the local or
the global level.
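
A minimal sketch of this point, assuming a heavily simplified BGP-like
selection that compares AS path length only (real best-path selection has
more steps, and none of them measures RTT). The node names, path lengths
and RTT figures are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    node: str         # anycast node this path leads to (hypothetical label)
    as_path_len: int  # number of ASes on the path, visible to routing
    rtt_ms: float     # round-trip time, NOT visible to the routing system


def best_by_as_path(routes):
    """Simplified stand-in for a BGP-like choice: shortest AS path wins."""
    return min(routes, key=lambda r: r.as_path_len)


def best_by_rtt(routes):
    """What the draft's goal implicitly assumes: lowest RTT wins."""
    return min(routes, key=lambda r: r.rtt_ms)


routes = [
    Route("node-a", as_path_len=2, rtt_ms=180.0),  # short path, distant node
    Route("node-b", as_path_len=4, rtt_ms=20.0),   # longer path, nearby node
]

print("routing selects:", best_by_as_path(routes).node)
print("lowest RTT is:  ", best_by_rtt(routes).node)
```

With these made-up inputs the two selections differ, which is the case the
draft's goal does not account for.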

4.4.1  Signalling Service Availability


    When a routing system is provided with reachability information for a
    Service Address from an individual node, packets addressed to that
    Service Address will start to arrive at the node.  Since it is
    desirable for the node to be ready to accept requests before they
    start to arrive, a coupling between the routing information and the
    availability of the service at a particular node is desirable.

Some would say "essential" rather than just "desirable".

    This can be achieved using
    routing protocol implementations on the same servers which provide
    the service being distributed.

I disagree - if the service hangs but the box remains up then this is not
being achieved. Perhaps you could explain this technique in further detail.
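
One way to picture the missing coupling is a check that tests the service
itself rather than the host, and drives the route announcement from the
result. The sketch below is only an illustration: the function names, the
"announce"/"hold"/"withdraw" actions and the three-failure threshold are
assumptions, not anything taken from the draft.

```python
def next_action(service_ok: bool, failures: int, max_failures: int = 3):
    """Decide what to do with the Service Address route after one probe.

    service_ok is the result of a service-level probe (for example an
    actual query answered correctly), not merely "the box is up".
    Returns (action, updated consecutive-failure count).
    """
    if service_ok:
        return "announce", 0
    failures += 1
    # Withdraw only after several consecutive failures, to avoid flapping.
    action = "withdraw" if failures >= max_failures else "hold"
    return action, failures


def run(probe_results, max_failures: int = 3):
    """Replay a sequence of probe results and collect the actions taken."""
    failures = 0
    actions = []
    for ok in probe_results:
        action, failures = next_action(ok, failures, max_failures)
        actions.append(action)
    return actions


# The host stays up throughout; only the service hangs for three probes.
print(run([True, False, False, False, True]))
```

A routing daemon that merely runs on the same server has no equivalent of
the service_ok input, which is the gap being pointed out.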

    Another approach is to tunnel requests from nodes that cannot handle
    individual services to other nodes that can, perhaps using an IGP
    which extends over tunnels between nodes, in which servers
    participate.

You appear to be assuming, without actually saying so, that an anycast
service instance has an associated unicast address, and that the service
needs to be available over this unicast address as well as over the
anycast address for this technique to work. You should note that this may
be a sub-optimal solution when compared to withdrawal of the instance of
the anycast address from the routing system.


4.4.3  Equal-Cost Paths


    Some routing systems support equal-cost paths to the same
    destination.  Where multiple, equal-cost paths exist and lead to
    different anycast nodes, there is a risk that request packets
    associated with a single transaction might be delivered to more than
    one node.

This confuses multicast with anycast. I suspect you mean that in the case
where a transaction requires multiple packets (or even where there is
fragmentation of a single packet), the packets may not all be
delivered to the same anycast instance. Also, strictly, it is forwarding
systems, rather than routing systems, that support load balancing of
traffic across multiple paths to the same destination.
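
The difference shows up in how a forwarding system spreads packets over
equal-cost next hops. A rough sketch, assuming two next hops that lead to
different anycast instances; the hash choice, function names and addresses
are illustrative only and do not describe any particular router:

```python
import hashlib
from itertools import cycle


def per_flow_next_hop(flow, next_hops):
    """Hash the flow identifier so every packet of a flow takes one path."""
    key = "|".join(map(str, flow)).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]


def per_packet_next_hops(n_packets, next_hops):
    """Round-robin each packet, ignoring which flow it belongs to."""
    rr = cycle(next_hops)
    return [next(rr) for _ in range(n_packets)]


next_hops = ["to-instance-1", "to-instance-2"]  # hypothetical labels
flow = ("192.0.2.10", 53000, "198.51.100.1", 53, "tcp")

# Four packets of the same transaction under each scheme.
flow_based = [per_flow_next_hop(flow, next_hops) for _ in range(4)]
packet_based = per_packet_next_hops(4, next_hops)

print("per-flow:  ", flow_based)
print("per-packet:", packet_based)
```

Under per-flow hashing the four packets stay together; under per-packet
balancing they are split across both instances, which is the failure mode
for multi-packet transactions.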

4.4.4  Route Dampening

"especially stable Global Nodes"

Whatever they are, it sounds like I want one. I'm not sure these exist, in
that the stability of a route depends on the originator, the transits and
the receiver. In fact I'm really not sure what point is being made in this
para.

4.4.5  Reverse Path Forwarding Checks

I know what you are trying to say here, but the words are certainly
unclear. What you are saying is that when a router receives a response from
an anycast instance, it may have learned a 'best' route to a different
instance of this anycast address, reached through a different interface.
Where strict RPF is in use, the incoming response appears at the router
on the 'incorrect' interface and RPF will cause the packet to be discarded.
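
The strict-mode check can be sketched as a longest-prefix lookup on the
packet's source address followed by an interface comparison. This is an
illustrative model only; the forwarding table, prefixes and interface
names are made up:

```python
import ipaddress


def strict_rpf_accepts(src, in_iface, fib):
    """Return True if a packet from src arriving on in_iface passes strict RPF.

    fib maps prefix strings to the interface used to forward TOWARDS that
    prefix. Strict RPF accepts the packet only if it arrived on the same
    interface the router would use to reach its source address.
    """
    addr = ipaddress.ip_address(src)
    matches = [
        (net, iface)
        for net, iface in ((ipaddress.ip_network(p), i) for p, i in fib.items())
        if addr in net
    ]
    if not matches:
        return False  # no route back to the source at all
    _, best_iface = max(matches, key=lambda m: m[0].prefixlen)
    return best_iface == in_iface


# The router's best route to the anycast prefix points out eth1 ...
fib = {"192.0.2.0/24": "eth1", "0.0.0.0/0": "eth0"}

# ... but a response from another instance of 192.0.2.1 arrives on eth2.
print(strict_rpf_accepts("192.0.2.1", "eth1", fib))
print(strict_rpf_accepts("192.0.2.1", "eth2", fib))
```

The second call models the response that arrives from the 'incorrect'
interface and is dropped.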

    Care should be taken to ensure that strict-mode RPF is not enabled in
    peer networks connecting to anycast nodes.

I disagree with the equivocation here - what you are saying is that RPF and
anycast are incompatible with asymmetrical paths, and as the Internet is
today highly asymmetrical the conclusion you are driving to is that RPF and
anycast are mutually incompatible. In the interests of calling a dirt
manipulation device a bloody spade, I would recommend you make this
conclusion overt (!).

4.4.6  Propagation Scope

    Local Nodes advertise covering routes for Service Addresses in such a
    way that their propagation is restricted.  This might be done using
    well-known community string attributes such as NO_EXPORT [6] or
    NOPEER [11], or by arranging with peers to apply a conventional
    "peering" import policy instead of a "transit" import policy, or some
    suitable combination of measures.

If you want to talk about Bad Practices, another mechanism is to simply
advertise the service on a /32 and rely on other ASes using ingress prefix
length filters to stop the anycast route from propagating. (What's this use
of the term "AS:es" for the plural of AS, by the way?)
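
For illustration, the kind of ingress prefix length filter this practice
leans on can be modelled as below. The /24 cut-off is an assumption chosen
to match the /24 figure discussed later in this message; actual filter
policies vary between networks.

```python
import ipaddress


def accepted_by_ingress_filter(prefix: str, max_len: int = 24) -> bool:
    """Accept an announcement only if it is no more specific than max_len."""
    return ipaddress.ip_network(prefix).prefixlen <= max_len


# A /32 anycast announcement is dropped by a neighbour applying the
# filter, while a covering /24 is accepted and propagates further.
for announced in ("192.0.2.1/32", "192.0.2.0/24"):
    print(announced, accepted_by_ingress_filter(announced))
```

The scope of the /32 then depends entirely on filters configured by other
parties, which is why it belongs under Bad Practices.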



5.2  Self-Healing Nodes


    As is described in  having the Anycast Node avoid black-holing
    traffic in the event of a failure on the software or subsystem
    providing the service should be avoided.

As described in  --- where? (missing ref)

6.  Security Considerations

I was expecting to see a discussion about the potential mitigations in the
event that one instance of an anycast service is compromised without
touching the other instances. This form of localized attack at the service
level is often undetected in an anycast instance, as a single operational
service monitor cannot simultaneously query all anycast instances of the
service. (You mention this in the intro under the term of
monitoring the service, but there are security implications in terms of
detectability of security breaches of the service.)

Anycast also makes hijacking a service much easier, and this, too, should
be noted and analysed in this section. How do I know that I am talking to
an authentic instance of the anycast service? Is this a service-specific
problem? In which case are some services wisely never placed on anycast?

You should also talk about the cascading attack. If it is possible to
remove an anycast routing instance from the routing system through an
attack, then a sustained attack can work. As each anycast instance is
removed, the routing system automatically redirects the attack to the next
anycast instance without the attacker having to do anything at all. This is
probably not desirable.
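
A toy model of the cascade, assuming each instance withdraws its route
once the attack load exceeds its capacity and that the routing system then
shifts the whole load to the next-preferred instance. The instance names,
capacities and load units are invented for illustration:

```python
def cascade(nodes_by_preference, capacity, attack_load):
    """Return the instances knocked out, in order, by a constant attack.

    nodes_by_preference: instance names in the order routing selects them
    from the attacker's vantage point.
    capacity: mapping of instance name to the load it can absorb.
    """
    withdrawn = []
    for node in nodes_by_preference:
        if attack_load > capacity[node]:
            # Instance fails and its route is withdrawn; routing
            # redirects the same attack traffic to the next instance.
            withdrawn.append(node)
        else:
            break  # this instance absorbs the attack; the cascade stops
    return withdrawn


capacity = {"node-a": 10, "node-b": 20, "node-c": 50}
order = ["node-a", "node-b", "node-c"]

print(cascade(order, capacity, attack_load=30))
```

The attacker's behaviour is identical at every step; only the routing
system's automatic failover moves the attack along.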

Also:

I was looking for some comment on tradeoffs in the interdomain case between
route aggregation and the need to inject a new routing
entry for the anycast service - i.e. interdomain anycast is not something
that naturally scales, as it appears to inject a routing entry per anycast
service, which, arguably, is at a finer granularity than per-host routes.
You touch upon this in 4.4.2, but it appears to be saying that in the
interdomain case you can't get away with a /32 advertisement, so use a /24
and things will work just fine. I suspect that you need much stronger
caveats than that, and should explicitly note that an anycast service in
the inter-domain space burns up a routing slot for the service, and the
interdomain routing system does not expand infinitely to accommodate all
possible services being implemented in such a manner. This implies that
anycast is not a general technique for service resilience and all the other
Good Things you note in section 3.2, but in the inter-domain case is
actually a specialized solution technique used predominantly for common
infrastructure services, such as, surprise, surprise, instances of the DNS
roots.




At 07:03 AM 2/11/2004, Joe Abley wrote:
>Kurtis Lindqvist and I squeezed this draft through before the cut-off:
>
>   http://www.ietf.org/internet-drafts/draft-kurtis-anycast-bcp-00.txt
>
>>Abstract
>>
>>    As the Internet has grown, many services with high availability
>>    requirements have emerged.  The requirements of these services have
>>    increased the demands on the reliability of the infrastructure on
>>    which those services rely.
>>
>>    Many techniques have been employed to increase the availability of
>>    services deployed on the Internet.  This document presents
>>    operational experience of wide-scale service distribution using
>>    anycast, and proposes a series of recommendations for others using
>>    this approach.
>
>Anycast distribution of services (the practice of injecting reachability
>information for an address into a routing system in multiple places, from
>autonomous nodes) is not at all a new idea; however, there are an
>increasing number of people using anycast approaches to distribute
>high-profile services, and few published guidelines for doing so. The aim
>of this document is to provide some guidelines.
>
>There are many, many people with experience deploying anycast services
>whose thoughts and opinions are not in that -00 draft. Hopefully some of
>them are on this list, and they can help us make the draft better.
>
>We have a brief slot in the grow meeting next Tuesday to introduce this
>draft, and we plan to propose that it becomes a working group document. If
>people on this list have a chance to review the draft before then, that
>would be great.
>
>
>Joe
>
>_________________________________________________________________
>web user interface: http://darkwing.uoregon.edu/~llynch/grow.html
>web archive:        http://darkwing.uoregon.edu/~llynch/grow/
