Re: [Anima] GRASP DULL, IPv6 LL scope and multicast and BSD sockets API

Brian E Carpenter <> Wed, 07 April 2021 05:09 UTC

To: Michael Richardson <>
References: <7643.1617727789@localhost>
From: Brian E Carpenter <>
Date: Wed, 7 Apr 2021 17:09:37 +1200
List-Id: Autonomic Networking Integrated Model and Approach <>

Hi Michael,

TL;DR: only for people who care about running code.

I'll tell you what I did in my implementation for sending and
receiving multicasts, which of course ran into the same issues as yours.
This is the "simple" version; we can take it off-list if you want even
more detail (and don't want to figure out my code).

One overall remark: coding a demo implementation in Python, I didn't
worry much about efficiency, and in particular, whenever I wanted a new
thread, I made one. (Python threads are real operating-system threads,
but the global interpreter lock means only one of them executes Python
code at a time within a single process, so they are a convenience
rather than a performance tool. From what you wrote, I think that's
quite comparable to Rust coroutines.)

1. The first thing my code does is build a list of all the interfaces
it has, each with its interface index (as an integer, not 'eth0' etc.)
and its link-local address. [A defect in my code is that it doesn't
dynamically track interface up/down events.] The result is a list
(Pythonese for array) called _ll_zone_ids. For example, on this
(Windows) machine that list currently consists of
[[7, IPv6Address('fe80::80b2:5c79:2266:e431')],[23, IPv6Address('fe80::db7:d041:a2d:ce65')]]
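How you enumerate interfaces is platform-specific (my real code is messier), but the filtering step boils down to keeping only the link-local address per interface. A minimal sketch, assuming you already have (ifindex, address-string) pairs from some enumeration source such as psutil or netifaces:

```python
import ipaddress

def build_ll_zone_ids(iface_addrs):
    """Build a [[ifindex, link-local IPv6Address], ...] list.

    iface_addrs: iterable of (ifindex, address-string) tuples; the
    enumeration itself is platform-specific and not shown here.
    """
    zone_ids = []
    for ifindex, addr in iface_addrs:
        # strip any '%zone' suffix before parsing the address
        ip = ipaddress.IPv6Address(addr.split('%')[0])
        if ip.is_link_local:
            zone_ids.append([ifindex, ip])
    return zone_ids
```

For example, build_ll_zone_ids([(7, 'fe80::1%eth0'), (3, '2001:db8::1')]) keeps only the fe80 entry.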

2. At startup, I create a socket for multicast sending for each interface in that list.
Details are in the function '_try_mcsock', including setting SO_REUSEADDR to 1.

3. To send M_FLOOD and M_DISCOVER, I loop over all those sockets, doing
sendto(msg_bytes,0,(str(ALL_GRASP_NEIGHBORS_6), GRASP_LISTEN_PORT))
on each socket.
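The send side is roughly the following sketch (not my actual code; the constants ff02::13 and 7017 are the GRASP values mentioned elsewhere in this thread, and the option choices are illustrative):

```python
import socket

GRASP_LISTEN_PORT = 7017
ALL_GRASP_NEIGHBORS_6 = 'ff02::13'

def make_mcast_send_socket(ifindex):
    # rough equivalent of _try_mcsock: one send socket per interface
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # force outgoing multicasts onto this specific interface
    s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_MULTICAST_IF, ifindex)
    # link-local scope: one hop only
    s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_MULTICAST_HOPS, 1)
    return s

def flood(sockets, msg_bytes):
    # step 3: loop over the per-interface sockets
    for sock in sockets:
        sock.sendto(msg_bytes, 0, (ALL_GRASP_NEIGHBORS_6, GRASP_LISTEN_PORT))
```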

4. To listen for multicasts, I have a thread that listens to a single socket
with SO_REUSEADDR set to 1, bound to GRASP_LISTEN_PORT, which explicitly joins
ALL_GRASP_NEIGHBORS_6 on each interface. I don't fully understand my own code (magic
borrowed from Stack Overflow) but it's just after 'class _mclisten(threading.Thread)'.

When a multicast arrives I queue it for a separate thread, so I can get
back to listening. Here's what goes in that queue for each new multicast:

[source_address, source_port, interface_index, message_body]
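The listener-to-worker hand-off is just a thread-safe queue; a minimal sketch (names are illustrative, not from the real code):

```python
import queue

# the listening thread puts items here and goes straight back to recvfrom()
mcast_queue = queue.Queue()

def on_multicast(saddr, sport, ifindex, body):
    # called by the listening thread as soon as a multicast arrives
    mcast_queue.put([saddr, sport, ifindex, body])

def next_multicast():
    # the processing thread blocks here until something arrives
    return mcast_queue.get()
```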

However, there are complications... one is that as far as I can tell, setting
IPV6_MULTICAST_LOOP to zero is not reliable. In any case, when testing a GRASP
instance against another in the same node, you *want* to receive your own
multicasts.  You can check if a multicast is from your own node by testing
whether its (link-local) source address is your own. So the crucial test
in my code is

    if _listen_self or (not [ifn, saddr] in _ll_zone_ids):

A multicast that fails that test is simply discarded. So the only messages
that get queued for processing are the ones from other nodes (unless the
user sets _listen_self).
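Pulled out as a standalone function, that test looks like this (a sketch; my real code does it inline):

```python
import ipaddress

_listen_self = False  # set True to hear your own multicasts for testing

def accept_multicast(ll_zone_ids, ifn, saddr):
    # discard our own echoes unless _listen_self is set;
    # ll_zone_ids is the [[ifindex, link-local address], ...] list from step 1
    return _listen_self or [ifn, saddr] not in ll_zone_ids
```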

I hope that helps. This is code I wrote years ago now, and I'm not proud
of it; in fact I'm frightened to touch it. A few remarks in line below:

On 07-Apr-21 04:49, Michael Richardson wrote:
> Brian, I think that I am making a mistake in how I am binding my IPv6
> multicast sockets on which I expect to hear GRASP DULL messages.
> 1) I create a socket with an unspecified address, and the GRASP_PORT (7017):
>         let rsin6 = SocketAddrV6::new(Ipv6Addr::UNSPECIFIED,
>                                      grasp::GRASP_PORT as u16, 0, ifindex);
>         let recv_try = UdpSocket::bind(rsin6).await;
>      Note that I had originally expected that I should bind it to the IPv6-LL of
>      the interface, but that meant that the socket does not match the
>      multicast destinations.
>      I mark SO_REUSEPORT, and SO_REUSEADDR on this socket.
> 2) I join the multicast address for that socket:
>                 let grasp_mcast = "FF02:0:0:0:0:0:0:13".parse::<Ipv6Addr>().unwrap();
>                 recv.join_multicast_v6(&grasp_mcast, ifindex).unwrap();
>    Note that "ifindex" is the scope of the interface that I want to bind.

Right. That's all the same as what I did, repeating the join_multicast step
for each valid ifindex. (In my code you'll see IPV6_JOIN_GROUP.)
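In Python that per-interface join boils down to packing an ipv6_mreq struct by hand (a sketch, assuming the standard 16-byte group address followed by a native unsigned int interface index):

```python
import socket
import struct

ALL_GRASP_NEIGHBORS_6 = 'ff02::13'

def grasp_mreq(ifindex):
    # ipv6_mreq: 16-byte multicast group address, then the interface index
    return (socket.inet_pton(socket.AF_INET6, ALL_GRASP_NEIGHBORS_6)
            + struct.pack('@I', ifindex))

def join_on_interface(sock, ifindex):
    # repeat this for each valid ifindex
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP,
                    grasp_mreq(ifindex))
```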

> 3) If that works, then, in order to send, I create a new socket, which I do
>    not bind to multicast, but I do bind it to the interface:
>                 let ssin6 = SocketAddrV6::new(Ipv6Addr::UNSPECIFIED,
>                                               0 as u16, 0, ifindex);
>                 let send = UdpSocket::bind(ssin6).await.unwrap();
>    maybe I could use the above socket, but I think it is easier to have two
> sockets as it simplifies threads.

Assuming you mean to send *multicast* yes, I couldn't see a reasonable way
to avoid one socket per ifindex.

> I have been binding a multicast listening socket for each "physical" interface.

And that's what I did differently - one socket, bound to the address but not to an
interface.

> In testing with more than one interface I realized two things:
> a) I hear myself on the same interface.  That is, M_FLOODs about address
>    fe80::1234, are heard on the multicast socket bound to fe80::1234.
>    Okay, I think, just filter those out since they come from "me"


> b) Oops, I hear myself on the other interfaces.
>    So, M_FLOODs about fe80::1234 on ifindex 2 are heard on ifindex 3,
>    (call it "fe80::abcd").  They aren't "me", so I actually have to walk my
>    list of interfaces and filter out all the "me".

Right. And for this reason you might as well get all your multicasts
on the same socket, because you have to check for your own echoes anyway.

>    I realized that the originating sockaddr provided in recvfrom() also
>    has an sin6_scope which is filled in, so the multicasts which loop back
>    internally can be filtered out by listening only for things which
>    are on the ifindex I wanted to listen to.
>    But, I still have to filter for the list of "me", because it could be
>    that fe80::1234 and fe80::abcd are actually two ports in the same L2 domain.

Exactly, and only you can know. In the end, that's why I invented my 
_ll_zone_ids[] array.

> I actually ran into the last bit when looking at the IPsec policies that were
> being created.  My IKEv2 daemon gets cross when you ask it to initiate to
> a peer which it is convinced is also itself.

That must be a bit annoying in some test scenarios. I was delighted when I
realised I could test GRASP against itself within one machine. Vast saving
in debug time and bother.

> I may have made some mistakes setting the IP TTL of my packets.
> I think they are set to 1.

Did we even specify it?

> (The GRASP TTL was incorrectly set to that as well, which I fixed already.)
> I have been creating a GraspDaemon thread per interface.

I had to do that for the unicast traffic, not for multicast. But as noted
above, I split the multicast reception and initial checking thread from the
actual message processing thread, to avoid missing multicasts while processing
floods and discoveries.

> Since this is really just a Rust Async co-routine (a "green thread"), and not
> an actual system thread, I feel that the simplicity of just having the
> simplest of loops running outweighs the potential scaling issue of having
> hundreds of these running.  The co-routine mechanism means that this all just
> leads down to an event loop and a call to select(2)/poll(2)/epoll(2),
> etc. all handled by tokio and the compiler and not me.
> But now, I'm thinking that I should have just done a single Grasp(DULL)
> daemon receive thread, listening on all interfaces.  I can't really remember
> why I didn't do that.  Maybe because I thought I would need a multicast
> socket per physical interface.

Understood, but if you think about a full-fledged Rust version of GRASP,
I think you'd see what I was worried about: while the daemon is processing
a message, it isn't listening for the next one. Putting a queue between
the listener and the processor mitigates that issue for bursty traffic.

> On the transmit side, I have multiple loops sending, but that is easily
> merged into a single loop, and it would have the advantage that I could more
> easily stagger DULL announcements across different interfaces.

Yes, exactly.

> I'm not actually sure how often M_FLOODs are supposed to be sent.
> I scanned through ACP (section 6.4) and through GRASP-15, and I didn't see anything.

GRASP certainly doesn't specify a frequency, but it does say:
" An ASA that initiates a flood SHOULD repeat the flood
  at a suitable frequency, which MUST be consistent with the
  recommendations in [RFC8085] for low data-volume multicast."

Oh, you missed this under the AN_ACP objective:

"In the above example the RECOMMENDED period of sending of the 
objective is 60 seconds. The indicated ttl of 210000 msec means 
that the objective would be cached by ACP nodes even when two 
out of three messages are dropped in transit."

ACP says this for the SRV.est objective, which IMHO is better:

"The M_FLOOD message MUST be sent periodically. The default SHOULD be
60 seconds; the value SHOULD be operator configurable but SHOULD be
not smaller than 60 seconds."

But a bit of random jitter is a fine idea.
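A minimal sketch of such a jittered period, taking the 60-second base from the recommendation quoted above (the 10-second jitter bound is just an illustration):

```python
import random

BASE_PERIOD = 60.0  # seconds, per the SRV.est recommendation quoted above

def next_flood_delay(jitter=10.0):
    # 60 s plus or minus up to `jitter` seconds of random stagger
    return BASE_PERIOD + random.uniform(-jitter, jitter)
```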


> That is:
>      loop {
>           sleep(60s +- rand(10));
>           send-M_FLOOD-on-next-interface;
>      }
> rather than:
>      loop {
>           sleep(60s);
>           for if in interfaces {
>               send-M_FLOOD-on(if)
>           }
>      }
> --
> Michael Richardson <>   . o O ( IPv6 IøT consulting )
>            Sandelman Software Works Inc, Ottawa and Worldwide