Re: [manet] RFC7181 (OLSRv2) trouble with ANSN and router restart

Christopher Dearlove <> Fri, 23 July 2021 12:41 UTC

Hi to those who remember me, I’m still around, although retired now. (Unless a bit of consultancy pops up.)

Not so interested in DLEP, but keep an eye on OLSRv2 etc.

I’m failing to understand the specific problem here. There are restart issues that we never found the bandwidth to formally address, but they are different to this one.

Comments below.

> On 22 Jul 2021, at 08:12, Rogge, Henning <> wrote:
> Hi,
> I think I have potentially identified an issue with OLSRv2 that can lead to a stable desynchronization of the OLSRv2 TC database after a router restart.
> The trouble happens because the ANSN (Advertised Neighbor Set Number) can (and should) become stable when the locally reachable neighbors and their metrics don't change anymore.
> The sequence that leads to the issue is:
> 1) router A restarts
> 2) router A (randomly?) selects a new Message Sequence Number which is HIGHER (in terms of cyclical comparison) than the last one it used
> 3) router A selects a new ANSN which is LOWER (or the same) than the last one it used

This is unfortunate, and part of the problem: router A will find it hard to get back into the network. The easiest way is simply to wait out expiry times, but that might not be practical if those have become long (which is a bandwidth-saving approach in a stable network, but unfortunately nothing is free).

But the suggestion here appears to be different.

> 4) router B sees the new message sequence number/ANSN in TCs from router A
>   => router B does not allow the old TC data to timeout (message sequence number is higher!)
>   => router B does NOT overwrite the old TC data (ANSN is lower)

Yes, the old data won’t be overwritten. But nothing affects the timeout: it isn’t brought forward, but it isn’t extended either. Per point 1, the TC message will be discarded, so Topology Tuples etc. aren’t modified and will in due course time out. Maybe that’s too slow, but that’s a different issue.
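To make the receiver behaviour concrete, here is a minimal Python sketch of the discard rule as I describe it above. It is not taken from any implementation: the names `seq_greater`, `process_tc` and the dictionary layout are hypothetical, the sequence space is assumed to be 16 bits, and the wrap-around comparison is written in the style used by RFC 7181 rather than copied from it.

```python
MAXVALUE = 1 << 16  # assumed: 16-bit sequence number space


def seq_greater(s1, s2):
    """True if s1 counts as 'greater than' s2 under wrap-around comparison."""
    return (s1 > s2 and s1 - s2 < MAXVALUE // 2) or \
           (s2 > s1 and s2 - s1 > MAXVALUE // 2)


def process_tc(stored, ansn, now, validity):
    """Apply one received TC to the state held for its originator.

    stored is None or a dict {"ansn": int, "expires": float} (hypothetical
    layout). Returns (state, accepted).
    """
    if stored is not None and seq_greater(stored["ansn"], ansn):
        # Older ANSN than the one on record: discard the message.
        # Nothing is overwritten and the expiry time is left untouched.
        return stored, False
    # Otherwise the message is processed and the lifetime refreshed.
    return {"ansn": ansn, "expires": now + validity}, True


before = {"ansn": 500, "expires": 100.0}
after, accepted = process_tc(before, ansn=20, now=50.0, validity=300.0)
print(accepted, after["expires"])
```

With a stored ANSN of 500 and a post-restart ANSN of 20 the message is rejected and the old entry still expires at its original time, which is the point being made: slow, perhaps, but not a stable desynchronization.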

> this situation will continue as long as the ANSN of router A (which can be stable for an arbitrary time) stays below the ANSN used by the router before its restart
> There are two parts to this problem, one of them easy to fix.
> a) the ANSN after the restart is lower than the ANSN before. We could just demand that a router does NOT increase the validity time of the TC entry in this case... or that it overwrites the TC entry (the combination of "new message seqno" and "old ANSN" should only happen after a restart)

(a) already happens, paragraph as quoted. No change needed.

> b) the ANSN after the restart is the SAME as before... this is tricky, I have no idea how to resolve this at the receiver without comparing the TC data with the database, which is not reliable when we deal with incomplete TCs.

This one might be a problem. With 2^16 numbers to pick from, it takes bad luck to hit the same ANSN, but not enough bad luck to ignore. We are, though, into the region of how to restart a router, and what advice to give it. Using more than one ANSN is one way to solve that. However, the composite of MSN and ANSN is a tricky one. I can see (I think) a horribly inefficient way to make it work, but an efficient way, other than just waiting out timeouts, needs more thought.
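To put numbers on "unlucky", the following Python sketch enumerates every possible post-restart ANSN against a fixed pre-restart value. It assumes a 16-bit ANSN chosen uniformly at random on restart, which is my assumption and not something the protocol requires; `classify_restart` is a hypothetical helper name.

```python
MAXVALUE = 1 << 16  # assumed: 16-bit ANSN space


def seq_greater(s1, s2):
    """True if s1 counts as 'greater than' s2 under wrap-around comparison."""
    return (s1 > s2 and s1 - s2 < MAXVALUE // 2) or \
           (s2 > s1 and s2 - s1 > MAXVALUE // 2)


def classify_restart(old_ansn):
    """Count how each possible new ANSN compares with the pre-restart one."""
    counts = {"same": 0, "lower": 0, "higher": 0, "ambiguous": 0}
    for new_ansn in range(MAXVALUE):
        if new_ansn == old_ansn:
            counts["same"] += 1        # case (b): indistinguishable by ANSN
        elif seq_greater(old_ansn, new_ansn):
            counts["lower"] += 1       # case (a): TCs discarded until expiry
        elif seq_greater(new_ansn, old_ansn):
            counts["higher"] += 1      # harmless: old data is overwritten
        else:
            counts["ambiguous"] += 1   # exactly half the space apart
    return counts


print(classify_restart(12345))
```

Under those assumptions exactly one value in 65536 gives case (b), while just under half of all restarts land in case (a). So (b) is rare per restart, but across many routers and many restarts it will eventually turn up.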

A restarting router using incomplete TCs, though, is not good behaviour. That would be part of the advice to restarting routers.

(I assume we are considering routers that restart with no memory of their last MSN and ANSN. Those with memory can simply continue.)
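For the routers with memory, "simply continue" could look something like the Python sketch below. Everything here is illustrative: the file format, the path handling and the names `save_counters`, `load_counters` and `resume` are hypothetical, and a real router would need to think about how often it can afford to write to storage.

```python
import json
import os

MAXVALUE = 1 << 16  # assumed: 16-bit MSN and ANSN


def save_counters(path, msn, ansn):
    """Persist the last used MSN and ANSN, replacing the old file atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"msn": msn, "ansn": ansn}, f)
    os.replace(tmp, path)


def load_counters(path):
    """Return (msn, ansn) from storage, or None if nothing usable is there."""
    try:
        with open(path) as f:
            data = json.load(f)
        return data["msn"], data["ansn"]
    except (OSError, ValueError, KeyError):
        return None


def resume(path):
    """Return the next (msn, ansn) to use after a restart, or None."""
    last = load_counters(path)
    if last is None:
        return None  # no memory: the restart problem discussed above applies
    msn, ansn = last
    return (msn + 1) % MAXVALUE, (ansn + 1) % MAXVALUE
```

A router for which `resume` returns None is the memoryless case that the rest of this discussion is about.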

> What do you think?
> Henning Rogge