[manet] RFC7181 (OLSRv2) trouble with ANSN and router restart

"Rogge, Henning" <henning.rogge@fkie.fraunhofer.de> Thu, 22 July 2021 07:12 UTC

From: "Rogge, Henning" <henning.rogge@fkie.fraunhofer.de>
To: "manet@ietf.org" <manet@ietf.org>
Date: Thu, 22 Jul 2021 07:12:22 +0000
Message-ID: <1626937943164.99401@fkie.fraunhofer.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/manet/I23DP2oUVQA5VL_nJwGjMCMmFvU>

Hi,

I think I have identified an issue with OLSRv2 that can lead to a persistent desynchronization of the OLSRv2 TC database after a router restart.

The trouble happens because the ANSN (Advertised Neighbor Set Number) can (and should) become stable once the locally reachable neighbors and their metrics no longer change.

The sequence that leads to the issue is:

1) router A restarts
2) router A (randomly?) selects a new Message Sequence Number which is HIGHER (in terms of cyclic comparison) than the last one it used
3) router A selects a new ANSN which is LOWER than (or the same as) the last one it used
4) router B sees the new Message Sequence Number/ANSN in TCs from router A
   => router B does not allow the old TC data to time out (the Message Sequence Number is higher!)
   => router B does NOT overwrite the old TC data (the ANSN is lower)
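
The sequence above can be reproduced with a small toy model of router B. This is only a sketch of the behaviour described in this mail, not the RFC 7181 processing rules verbatim. The names `TcEntry`, `receive_tc` and `seq_greater`, the 16-bit sequence number width and the 15-second validity time are assumptions made for illustration.

```python
from dataclasses import dataclass

SEQ_MODULUS = 1 << 16  # assumption: 16-bit sequence numbers


def seq_greater(s1: int, s2: int) -> bool:
    """Cyclic 'greater than': s1 is ahead of s2 by less than half the range."""
    half = SEQ_MODULUS // 2
    return (s1 > s2 and s1 - s2 < half) or (s2 > s1 and s2 - s1 > half)


@dataclass
class TcEntry:
    """What router B remembers about router A (hypothetical layout)."""
    msg_seqno: int
    ansn: int
    neighbors: frozenset
    expires_at: float


def receive_tc(entry, msg_seqno, ansn, neighbors, now, validity=15.0):
    """Simplified receiver logic as described in steps 1) to 4)."""
    if not seq_greater(msg_seqno, entry.msg_seqno):
        return "duplicate"  # old or repeated message, ignored
    # A newer message always refreshes the timeout of the entry ...
    entry.msg_seqno = msg_seqno
    entry.expires_at = now + validity
    # ... but the content is only replaced when the ANSN is newer.
    if seq_greater(ansn, entry.ansn):
        entry.ansn = ansn
        entry.neighbors = frozenset(neighbors)
        return "updated"
    return "stale-kept"  # old data stays AND its lifetime was extended


# State at router B before router A restarts.
entry = TcEntry(msg_seqno=1000, ansn=500,
                neighbors=frozenset({"C", "D"}), expires_at=15.0)
# After the restart: higher message seqno, lower ANSN, new neighbor set.
result = receive_tc(entry, msg_seqno=1200, ansn=3, neighbors={"E"}, now=10.0)
print(result, sorted(entry.neighbors), entry.expires_at)
```

In this model every further TC from router A keeps pushing `expires_at` forward while `neighbors` still holds the pre-restart set.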

This situation will continue as long as the ANSN of router A (which can be stable for an arbitrary time) stays below the ANSN used by the router before its restart.

There are two parts to this problem, one of them easy to fix.

a) The ANSN after the restart is lower than the ANSN before. We could simply require that a router does NOT extend the validity time of the TC entry in this case... or that it overwrites the TC entry (the combination of "new message seqno" and "old ANSN" should only happen after a restart).
b) The ANSN after the restart is the SAME as before... this is tricky. I have no idea how to resolve this at the receiver without comparing the TC data with the database, which is not reliable when we deal with incomplete TCs.
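
A minimal sketch of the "overwrite" variant of option a), to make the proposal concrete. It is not taken from RFC 7181 or from any existing implementation. The names `TcEntry`, `receive_tc_restart_aware` and `seq_greater`, the 16-bit sequence number width and the 15-second validity time are assumptions for illustration.

```python
from dataclasses import dataclass

HALF_RANGE = 1 << 15  # assumption: 16-bit sequence numbers


def seq_greater(s1: int, s2: int) -> bool:
    """Cyclic 'greater than' for sequence numbers."""
    return ((s1 > s2 and s1 - s2 < HALF_RANGE)
            or (s2 > s1 and s2 - s1 > HALF_RANGE))


@dataclass
class TcEntry:
    """What the receiver remembers about one originator (hypothetical)."""
    msg_seqno: int
    ansn: int
    neighbors: frozenset
    expires_at: float


def receive_tc_restart_aware(entry, msg_seqno, ansn, neighbors, now,
                             validity=15.0):
    if not seq_greater(msg_seqno, entry.msg_seqno):
        return "ignored"  # old or repeated message
    entry.msg_seqno = msg_seqno
    if seq_greater(entry.ansn, ansn):
        # Newer message but older ANSN: treat this as a restart of the
        # originator and overwrite the entry instead of keeping old data.
        entry.ansn = ansn
        entry.neighbors = frozenset(neighbors)
        entry.expires_at = now + validity
        return "overwritten-after-restart"
    if seq_greater(ansn, entry.ansn):
        # Normal case: the advertised neighbor set changed.
        entry.ansn = ansn
        entry.neighbors = frozenset(neighbors)
        entry.expires_at = now + validity
        return "updated"
    # Same ANSN: this is case b). The receiver cannot tell a restart from
    # an unchanged neighbor set, so this sketch leaves the data untouched.
    entry.expires_at = now + validity
    return "unchanged"


entry = TcEntry(msg_seqno=1000, ansn=500,
                neighbors=frozenset({"C", "D"}), expires_at=15.0)
print(receive_tc_restart_aware(entry, 1200, 3, {"E"}, now=10.0))
print(sorted(entry.neighbors))
```

The last branch shows why b) stays open: with an identical ANSN the check has nothing to trigger on.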

What do you think?

Henning Rogge