Re: [Anima-signaling] GRASP issue 51: Flooded objectives

Toerless Eckert <eckert@cisco.com> Tue, 09 August 2016 06:39 UTC

Date: Mon, 8 Aug 2016 23:39:09 -0700
From: Toerless Eckert <eckert@cisco.com>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
Message-ID: <20160809063909.GZ21039@cisco.com>
References: <47c158b3-92f4-8a36-1bc1-c5c19afdc6b3@gmail.com> <20160801085529.GT21039@cisco.com> <97815aec-a36f-8dda-6798-eb0c1c5eda17@gmail.com> <20160805083457.GH21039@cisco.com> <f541b926-cde7-db52-a0fa-226761d0c3c8@gmail.com> <20160808140326.GA6295@cisco.com> <a09050d0-ce14-d628-13b6-28e73a79d238@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <a09050d0-ce14-d628-13b6-28e73a79d238@gmail.com>
User-Agent: Mutt/1.4.2.2i
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima-signaling/7t-idVYhD2eTp8ECFRqzUW8zF10>
Cc: Anima signaling DT <anima-signaling@ietf.org>
Subject: Re: [Anima-signaling] GRASP issue 51: Flooded objectives

Inline

On Tue, Aug 09, 2016 at 01:22:48PM +1200, Brian E Carpenter wrote:
> Thanks Toerless, this is a very useful discussion in terms of detecting
> unspoken assumptions, so I have to continue with more questions:
> 
> Now here's another hidden assumption. The GRASP spec currently says (in the
> security considerations):
> 
>    - Authorization and Roles
> 
>       The GRASP protocol is agnostic about the role of individual ASAs
>       and about which objectives a particular ASA is authorized to
>       support.  It SHOULD apply obvious precautions such as allowing
>       only one ASA in a given node to modify a given objective, but
>       otherwise authorization is out of scope.
> 
> I have been working on the assumption that only one ASA is allowed to
> register itself to manage a given objective in a single GRASP instance.
> 'Manage' means being a source of synchronization or flooding or being
> a negotiation responder. Why? Because otherwise, other nodes will see
> inconsistent behaviour by the node in question.
> 
> I don't understand an application scenario where it's useful to have two
> ASAs managing the same objective in the same node. Why would you allow
> two Registrars in the same node, for example?

I gave that two-registrars example in an earlier mail: seamless software
upgrade/downgrade:
  - v1 of registrar running
  - start v2 of registrar. It sends message (eg: GRASP) to v1 instance
  - v1 withdraws its GRASP objective announcements but continues
    to accept EST connections
  - v2 instance starts sync flood of objective.
  - After a while v1 has no ongoing EST connections, and hopefully
    we also will have expiry of flood objective, so eg: after it has
    no active EST connection AND 3 * timeout of flood-objective, v1 can
    declare itself done. Maybe signal to v2 instance for purpose of
    diagnostics - I am done - and quit.
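
The exit condition for v1 in the last step could be sketched roughly as
below. This is only an illustration: the function name, its parameters and
the use of seconds are assumptions, not part of GRASP or of any registrar
implementation.

```python
def v1_may_quit(active_est_connections: int,
                seconds_since_withdrawal: float,
                flood_timeout: float) -> bool:
    """Return True once the old (v1) registrar can safely exit.

    Hypothetical helper: v1 waits until it has no EST connections left
    AND 3 * the flood objective timeout has passed since it withdrew its
    announcement, so cached copies of its flood should have expired.
    """
    flood_expired = seconds_since_withdrawal >= 3 * flood_timeout
    return active_est_connections == 0 and flood_expired
```

With a 30 second flood timeout, v1 would wait at least 90 seconds after
withdrawing its announcement, and longer if EST connections are still open.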

Maybe i will also have two versions of registrar run in parallel when
we come up with a different protocol beside EST for example. Those of
course would run in parallel. And in another mail i did have the pseudo-code
for the client-code searching for best objective. And on that side it 
doesn't make a difference whether different version/type(eg: protocol)
of an objective are running on the same autonomic node or different ones.
But for deployment, it is a lot easier if you have to bother about
variety of ASA only on a single node.

Another example: ASA drivers for some type of service with different
models, eg: printer drivers. Each printer model has a different printer
driver, but maybe the objective is just "printer".

Parallelization. Ok yes, you should write high performance ASA
multi-threaded. But maybe it's easier not to and just have a cluster of
these ASA on a node. That's even how a lot of services are scaled today.

What you are trying to do is "exclusivity". Which is a nice option.
But it shouldn't be the mandatory base for all objectives. If we have
a good enough example for exclusivity, it should be added to the API,
aka: it would have to be something GRASP library tries to guarantee.

> But in fact, as far as I can see, GRASP discovery could work in such a
> case, returning two different discovery responses from the same node, with
> different GRASP-managed ports. (I suspect my code would already support that.)
> That would allow synchronization or negotiation independently with either of
> the ASAs. Hence the SHOULD in the above extract.

Right. I would rephrase this like: "ASAs supporting specific objectives
may want exclusivity for that objective, eg: no other ASA on the same
autonomic node offering that objective. In that case, the GRASP library needs
to support an appropriate API for the ASA to offer this option. Such API
options are outside the scope of this spec".
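
To make the idea of an optional exclusivity flag concrete, here is a minimal
sketch of what a per-node registration call might look like. The class name
ObjectiveRegistry, the register() method and the exclusive parameter are all
made up for illustration; no existing GRASP library API is implied.

```python
class ObjectiveRegistry:
    """Hypothetical per-node registry of ASAs offering objectives."""

    def __init__(self):
        # objective name -> list of (asa_name, exclusive) registrations
        self._offers = {}

    def register(self, objective: str, asa: str, exclusive: bool = False) -> bool:
        """Register an ASA for an objective; return False if refused."""
        offers = self._offers.setdefault(objective, [])
        # Refuse if an existing holder asked for exclusivity, or if the
        # newcomer wants exclusivity but others already offer the objective.
        if any(excl for _, excl in offers) or (exclusive and offers):
            return False
        offers.append((asa, exclusive))
        return True
```

Without the flag, several ASAs (eg: two printer drivers) can offer the same
objective on one node; with it, the second registration is refused.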

> However, if we also allow two different ASAs in the same node to flood the
> same objective, that does require some extra tagging, as you say below.
> So I want to be very clear about it. Why do we want to do that?
> 
> Since it would be a protocol change, we would need to present it to the whole WG.

Sure.

> > Therefore the
> > "source" needs to have another distinguisher beside the IPv6 address of the
> > autonomic node. This is not an issue that the individual ASA can solve because
> > the cache is kept by GRASP.
> > 
> > The distinguisher can be any ID, but i proposed to use the ASA's TCP port number
> > through which it runs GRASP because this ID does already exist and is unique.
> > So no new ID-allocation code to write.
> > 
> > This TCP port number will NOT be used to construct the transport port of any
> > service connection!
> > 
> > Example:
> > 
> >     Registrar ASA objective announcement:
> >         source = { ACP-ULA-IPv6-address, proto=TCP, Registrar-GRASP-TCP-Port }
> >         ....
> >         Payload:
> >         EST-address: { proto=TCP, port=<Registrar-EST-port> }
> > 
> > Registrar ASA starts, it gets a TCP port for its GRASP unicast messages
> > GRASP puts that <Registrar-GRASP-TCP-Port> into the source of all Objectives from this ASA.
> > 
> > Then the Registrar ASA opens a TCP server port for the EST protocol. Then it
> > announces the objective.
> > 
> > The Bootstrap-Proxy constructs the transport address of the EST connection
> > by combining the TCP port from the payload EST-address with the IPv6 address
> > of the source. The TCP port of the source is only used to send/receive GRASP
> > messages from/to that ASA - which happens by the GRASP library. 
> > 
> > And of course, this could be a TCP or UDP port number. I personally think TCP
> > is easier even if there is more overhead because i don't need to bother about
> > fragmentation (for GRASP).
> 
> Fully agree about TCP. I haven't even coded the UDP option ;-)
> 
> > Normally, U) means "used",
> 
> Right. It needs to be deleted by a Least Recently Updated algorithm!
> 
> >> We can do that. It's a choice: extra bytes in every flood message versus
> >> no default expiry mechanism, just LRU cache expiry.
> 
> The alternative is a fixed lifetime but I don't like that for flooded
> objectives, because some of them (e.g. Intent) might have infinite validity.

My concern is to get dead instances of objectives quickly out of the world and
to have a simple and known stable update mechanism that scales well. Which
is periodic broadcast (aka: multicast) for these.

When we announce the lifetime in the GRASP flood objective, then it could be
very long. The issue with making it shorter is primarily reliability. When we
use link-local-multicast, there is no guarantee that it will be received
by all nodes. I would actually prefer to use the ACP TCP connections also
for the flood-sync inside the ACP to make them reliable. One for each ACP
neighboring autonomic node. No UDP multicast.

And of course we can change easily from TCP to CoAP if need be (that's also reliable
AFAIK). But as Carsten reminded me: Even in a TCP connection, the sender has
no guarantee that a receiver has received a message unless it also sends some
payload acknowledgement ("500 OK" ;-). Which i think you have not defined for
GRASP messages, have you ? 

If we'd be using CoAP, then CoAP itself provides the ACK of messages ;-)

So, if we assume hop-by-hop reliable flood-sync, we could use very long
time-to-live. Could even be 0 = never expires.
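
A rough sketch of such a cache, combining the announced time-to-live with the
source distinguisher discussed earlier (IPv6 address plus the ASA's GRASP TCP
port). The class and method names are hypothetical, the addresses and ports
in any usage are placeholders, and this is not the GRASP wire format.

```python
import time


class FloodCache:
    """Hypothetical flood-sync cache kept by the GRASP instance."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # (objective, source address, source GRASP port) -> (value, expiry or None)
        self._entries = {}

    def store(self, objective, addr, port, value, ttl):
        # Assumption taken from the text: ttl == 0 means "never expires".
        expiry = None if ttl == 0 else self._clock() + ttl
        self._entries[(objective, addr, port)] = (value, expiry)

    def lookup(self, objective):
        """Return live values for an objective, dropping expired entries."""
        now = self._clock()
        live = {}
        for key, (value, expiry) in list(self._entries.items()):
            if expiry is not None and expiry <= now:
                del self._entries[key]
            elif key[0] == objective:
                live[key[1:]] = value
        return live
```

Because the port is part of the key, two ASAs on the same node flooding the
same objective get separate entries and do not overwrite each other.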

BUT: This does not solve all problems:

What happens with a node newly rebooting. How does it get those existing
flood objectives ? 

     backbone/registrars - AN1 - AN2 - AN3 - AN4 - AN5
                                       |-------------|

Power outage in building with AN3, AN4, AN5. AN4 and AN5 recover immediately
after power returns, AN3 power supply blows, takes 8 hours longer to be
brought back online.

How will AN3, AN4, AN5 get intent after AN3 comes up ?

AN3 i guess could explicitly query objectives from neighbors and will eg:
get sync of objectives from AN2. But would AN4, AN5 continue for 8 hours
to do the same ?

And let's imagine an objective that only edge-AN-nodes need, eg: AN5. For
those objectives not needed on every node, it gets even trickier.

Aka: periodic, low-frequency broadcast is a proven stable solution. It's
not the nicest. It may not be the only one we want in GRASP, but i'd
feel a lot safer if we could make that option work.

One solution for above problems:
  When AN3 comes up, it does send some GRASP query with a hop-count of 1
  that's specifically asking for content of all neighbors' flood-sync
  caches. You have to figure out how to craft that message (eg: IMHO
  it's a special query because it's for all cached objectives).

  When AN3 receives replies from AN2, those replies must have the same
  loopcount/remaining-loopcount as when AN2 first resent the flood-sync
  from eg: a registrar in the backbone.

  So, AN3 then simply continues to flood-sync those replies it receives
  as if they were "normal" flood syncs it receives.
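
The reboot procedure above could look roughly like the following sketch.
Everything here is an assumption for illustration: the function name, the
dict-based cache layout, and the rule that a flood is forwarded with the
loop count decremented and dropped once the count is exhausted.

```python
def resync_after_reboot(neighbor_caches, downstream):
    """Hypothetical sketch of the reboot procedure described above.

    neighbor_caches: cache dicts returned by each neighbor in reply to a
        hop-count-1 query, mapping objective -> (value, remaining_loop_count),
        with the loop count the neighbor used when it first resent the flood.
    downstream: names of the neighbors to re-flood to.
    Returns (local_cache, messages_to_send).
    """
    local_cache = {}
    outgoing = []
    for cache in neighbor_caches:
        for objective, (value, loop_count) in cache.items():
            if objective in local_cache:
                continue  # already learned from another neighbor
            local_cache[objective] = (value, loop_count)
            # Treat the reply like a normal flood: forward it with the
            # loop count decremented, unless the count is exhausted.
            if loop_count > 1:
                for peer in downstream:
                    outgoing.append((peer, objective, value, loop_count - 1))
    return local_cache, outgoing
```

In the topology above, AN3 would call this with the reply from AN2 and with
AN4 as downstream, so AN4 and AN5 learn the objectives without having to
keep querying on their own.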

Once this is defined and in our review we think it works reliably,
we can reduce frequency of retransmissions for those objectives
that we think are very long-lived. But we will still be left with
the issue of withdrawal...

Example: Registrar announces lifetime of LONG. Whole network learns it.
AN3 goes down, AN4, AN5 stay up. Registrar is replaced by other registrar.
AN3 comes up. AN4, AN5 will still have the old registrar flood sync cache
entry - which will not time out for a LONG time.

This is something like state of the art problem. I have ideas how to
fix this, but not sure we should go to that trouble initially, because
it gets quite complex.

Maybe rather look into what we specifically would need for intent to
vet how far we want to drive GRASP:

Do you have any thoughts on what Intent would be for grasp ? A
specific objective ? Announced by which autonomic node ? ...
> > 
> > I'll vote for extra bytes.
> 
> If we're happy that flooding is relatively rare, that seems OK. Of course that's
> another protocol change to be discussed with the whole WG.

Sure. Maybe we put together an update of the discussion on the list ?

Cheers
    toerless
> 
> Rgds
>    Brian
> 
> > 
> > Cheers
> >     Toerless
> > 
> >> I'll review all your points after we clear these ones up.
> >>
> >> Thanks
> >>     Brian
> > 

-- 
---
Toerless Eckert, eckert@cisco.com