Re: [Rift] AD Review of draft-ietf-rift-rift-12 (Part 2a)

"Pascal Thubert (pthubert)" <pthubert@cisco.com> Wed, 07 April 2021 13:22 UTC

From: "Pascal Thubert (pthubert)" <pthubert@cisco.com>
To: Alvaro Retana <aretana.ietf@gmail.com>, "draft-ietf-rift-rift@ietf.org" <draft-ietf-rift-rift@ietf.org>
CC: "zhang.zheng@zte.com.cn" <zhang.zheng@zte.com.cn>, "rift-chairs@ietf.org" <rift-chairs@ietf.org>, "rift@ietf.org" <rift@ietf.org>
Thread-Topic: [Rift] AD Review of draft-ietf-rift-rift-12 (Part 2a)
Thread-Index: AQHXEdVVJZXc044GCU2P7+ux6/HvCaqnjXGQ
Date: Wed, 07 Apr 2021 13:20:51 +0000
Deferred-Delivery: Wed, 7 Apr 2021 13:18:38 +0000
Message-ID: <CO1PR11MB48817A1D56902CAB54013A95D8759@CO1PR11MB4881.namprd11.prod.outlook.com>
References: <CAMMESsxBr0+UriSaTDVZMrFzU6DSiuC3-wO4+7HgX4nX7SLHmg@mail.gmail.com>
In-Reply-To: <CAMMESsxBr0+UriSaTDVZMrFzU6DSiuC3-wO4+7HgX4nX7SLHmg@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rift/OkHRTuIxhNLT5ah43I0S67LGE4I>
Subject: Re: [Rift] AD Review of draft-ietf-rift-rift-12 (Part 2a)
List-Id: Discussion of Routing in Fat Trees <rift.ietf.org>

Dear all;

Continuing changes starting at line 1014 of review part 1. I attached the result for Alvaro’s convenience.

The main commits are https://github.com/przygienda/rift-draft/commit/8d1d382c9ae2321fa7d488a4ba1881361c87e542
and https://github.com/przygienda/rift-draft/commit/237655f8d04cd33f1e5b151e3c5f9ae190b34594

Please see below:

1013      with this, positive disaggregation can heal all failures and still
1014      allow all the ToF nodes to see each other via south reflection.
1015      Disaggregation will be explained in further detail in Section 4.2.5.

[nit] s/deployment is it introduces/deployment is, it introduces

Done

1017      In order to scale beyond the "single plane limit", the Top-of-Fabric
1018      can be partitioned by a N number of identically wired planes where N
1019      is an integer divider of K_LEAF.  The 1:1 ratio and the desired
1020      symmetry are still served, this time with (K_TOP * N) ToF nodes, each
1021      of (P * K_LEAF / N) ports.  N=1 represents a non-partitioned Spine
1022      and N=K_LEAF is a maximally partitioned Spine.  Further, if R is any
1023      integer divisor of K_LEAF, then N=K_LEAF/R is a feasible number of
1024      planes and R a redundancy factor.  If proves convenient for
1025      deployments to use a radix for the leaf nodes that is a power of 2 so
1026      they can pick a number of planes that is a lower power of 2.  The
1027      example in Figure 11 splits the Spine in 2 planes with a redundancy
1028      factor R=3, meaning that there are 3 non-intersecting paths between
1029      any leaf node and any ToF node.  A ToF node must have, in this case,
1030      at least 3*P ports, and be directly connected to 3 of the 6 PoD-ToP
1031      nodes (spines) in each PoD.

[nit] s/by a N number/by an N number

Done


[minor] "(K_TOP * N) ToF nodes, each of (P * K_LEAF / N) ports"
Again, the use of the terminology without a reference assumes a
specific interpretation by the reader.

Yes, the definition was screwed up somehow. The best I can do is probably to fix the definition in the terminology section:
“



   K: Denotes half of the radix of a symmetrical switch, meaning that
      the switch has K ports pointing north and K ports pointing south.
      K_LEAF (K of a leaf) thus represents both the number of access
      ports in a leaf Node and the maximum number of planes in the
      fabric, whereas K_TOP (K of a ToP) represents the number of leaves
      in the PoD and the number of ports pointing north in a ToP Node
      towards a higher spine level, thus the number of ToF nodes in a
      plane.  To simplify the visual aids, notations and further
      considerations, we assume that the switches are symmetrical, so K
      is set to Radix/2.
“

[minor] "if R is any integer divisor of K_LEAF, then N=K_LEAF/R is a
feasible number of planes and R a redundancy factor."  Please expand
on the meaning of the redundancy factor.

Added:
“
                                       R a redundancy factor that denotes the
    number of independent paths between 2 leaves within a plane.
“
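
As a rough illustration of the plane arithmetic discussed above, the following Python sketch enumerates the feasible (N, R) pairs for a given K_LEAF and derives the ToF dimensions from the formulas quoted from the draft. The function names and the PoD count used in the usage note are hypothetical and not part of the draft.

```python
def plane_options(k_leaf):
    """List (N, R) pairs where R divides K_LEAF and N = K_LEAF / R."""
    return [(k_leaf // r, r) for r in range(1, k_leaf + 1) if k_leaf % r == 0]


def tof_dimensions(k_leaf, k_top, pods, planes):
    """Derive ToF counts for a fabric split into `planes` planes.

    Follows the formulas quoted from the draft: (K_TOP * N) ToF nodes,
    each with (P * K_LEAF / N) ports.
    """
    if planes < 1 or k_leaf % planes != 0:
        raise ValueError("the number of planes must divide K_LEAF")
    redundancy = k_leaf // planes  # R = K_LEAF / N
    return {
        "redundancy": redundancy,
        "tof_nodes": k_top * planes,
        "ports_per_tof": pods * redundancy,
    }
```

With the Figure 11 values (K_LEAF=6, K_TOP=8, N=2) and an assumed P=4 PoDs, `tof_dimensions(6, 8, 4, 2)` gives R=3, 16 ToF nodes, and 3*P = 12 ports per ToF node.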


[minor] "6 PoD-ToP nodes"  I count 8.

The leaves are vertical, with half of their ports in each plane. They have 6 ports north, so there are 6 ToP nodes.
The ToP nodes are horizontal, 6 of them. What you counted as 8 is K_TOP, the number of ports in a ToP Node.
My biggest regret is that the actual drawing is not built into a real switch. I patented the general structure of that 3-D box: https://pdfpiw.uspto.gov/.piw?docid=10973148&SectionNum=1&IDKey=CAD3F825B319&HomeUrl=http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2%2526Sect2=HITOFF%2526p=1%2526u=%25252Fnetahtml%25252FPTO%25252Fsearch-bool.html%2526r=1%2526f=G%2526l=50%2526co1=AND%2526d=PTXT%2526s1=%252522thubert%252Bpascal%252522.INNM.%2526OS=IN/%252522thubert%252Bpascal%252522%2526RS=IN/%252522thubert%252Bpascal%252522

Back to the text, I added:
“
    The ToP nodes are represented horizontally with K_TOP=8 ports
    northwards each.
“



1033           +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+
1034         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1035         | | o |  | o |  | o |  | o |  | o |  | o |  | o |  | o | |
1036         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1037         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1038         | | o |  | o |  | o |  | o |  | o |  | o |  | o |  | o | |
1039         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1040         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1041         | | o |  | o |  | o |  | o |  | o |  | o |  | o |  | o | |
1042         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1043           +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+

1045         Plane 1
1046        ----------- . ------------ . ------------ . ------------ . --------
1047         Plane 2

1049           +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+
1050         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1051         | | o |  | o |  | o |  | o |  | o |  | o |  | o |  | o | |
1052         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1053         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1054         | | o |  | o |  | o |  | o |  | o |  | o |  | o |  | o | |
1055         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1056         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1057         | | o |  | o |  | o |  | o |  | o |  | o |  | o |  | o | |
1058         +-|   |--|   |--|   |--|   |--|   |--|   |--|   |--|   |-+
1059           +---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+
1060                    ^
1061                    |
1062                    |      ----------------
1063                    +----- Top-of-Fabric node
1064                          "across" depth
1065                           ----------------

1067       Figure 11: Northern View of a Multi-Plane ToF Level, K_LEAF=6, N=2

1069      At the extreme end of the spectrum it is even possible to fully
1070      partition the spine with N = K_LEAF and R=1, while maintaining
1071      connectivity between each leaf node and each Top-of-Fabric node.  In
1072      that case the ToF node connects to a single Port per PoD, so it
1073      appears as a single port in the projected view represented in
1074      Figure 12.  The number of ports required on the Spine Node is more or
1075      equal to P, the number of PoDs.

[minor] "more or equal to P"  ??

Changed to:
“
more than or equal to P
“

...
1121    4.1.3.  Fallen Leaf Problem
...
1140      In a maximally partitioned fabric, the redundancy factor is R= 1, so
1141      any breakage in the fabric may cause one or more fallen leaves.
1142      However, not all cases require disaggregation.  The following cases
1143      do not require particular action in such scenario:

[major] A quick look at §4.2.5.1 doesn't explicitly mention how a node
considers the redundancy factor...but that may be included in the "DAG
computation" mentioned in the first step.  I'm putting this comment
here so I don't forget later...


There is no operation in RIFT that is based on the value of R.
It’s just that after R breakages in a plane, it becomes possible that a leaf falls in that plane. If R=1, it’s guaranteed that a breakage will cause fallen leaves. R=2 guarantees that a single breakage will not cause fallen leaves.

Changed to:
“

   In a maximally partitioned fabric, the redundancy factor is R= 1, so
   any breakage in the fabric will cause one or more fallen leaves in
   the affected plane.  R=2 guarantees that a single breakage will not
   cause a fallen leaf.
“
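
To make the role of R concrete, here is a minimal Python sketch under a simplifying assumption: only a leaf's own northbound links are modeled, and the leaf is treated as fallen in a plane once all R of its links into that plane are down. RIFT itself performs no computation on R, and the names below are hypothetical.

```python
def fallen_planes(uplinks, failed):
    """Return the planes in which a leaf has lost all of its northbound links.

    uplinks: dict mapping plane name -> set of link identifiers (R per plane)
    failed:  set of link identifiers that are currently down
    """
    # A leaf falls in a plane when every one of its links there has failed.
    return sorted(plane for plane, links in uplinks.items() if links <= failed)
```

For example, with R=2 and `uplinks = {"A": {"a1", "a2"}, "B": {"b1", "b2"}}`, a single failure `{"a1"}` yields no fallen plane, while `{"a1", "a2"}` makes the leaf fall in plane A. With R=1, any single uplink failure makes the leaf fall in the affected plane.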


1145         If a southern link on a node goes down, then connectivity through
1146         that node is lost for all nodes south of it.  There is no need to
1147         disaggregate since the connectivity to this node is lost for all
1148         spine nodes in a same fashion.

1150         If a ToF Node goes down, then northern traffic towards it is
1151         routed via alternate ToF nodes in the same plane and there is no
1152         need to disaggregate routes.
...
1159         If the breakage is the last northern link from a ToP node to a ToF
1160         node going down, then the fallen leaf problem affects only The ToF
1161         node, and the connectivity to all the nodes in the PoD is lost
1162         from that ToF node.  This can be observed by other ToF nodes
1163         within the plane where the ToP node is located and positively
1164         disaggregated within that plane.

[nit] s/only The ToF/only the ToF

done

1166      On the other hand, there is a need to disaggregate the routes to
1167      Fallen Leaves in a transitive fashion, all the way to the other
1168      leaves in the following cases:

[] Without having seen the specific mechanism, this overview is hard to digest.

Yes, I’d expect the reader to come back here later; this is for advanced rifters ;) Changed to:
“
    On the other hand, there is a need to disaggregate the routes to
    Fallen Leaves within the plane in a transitive fashion, that is,
    all the way to the other leaves, in the following cases
“


1170      o  If the breakage is the last northern link from a leaf node within
1171         a plane (there is only one such link in a maximally partitioned
1172         fabric) that goes down, then connectivity to all unicast prefixes
1173         attached to the leaf node is lost within the plane where the link
1174         is located.  Southern Reflection by a leaf node, e.g., between ToP
1175         nodes, if the PoD has only 2 levels, happens in between planes,
1176         allowing the ToP nodes to detect the problem within the PoD where
1177         it occurs and positively disaggregate.  The breakage can be
1178         observed by the ToF nodes in the same plane through the North
1179         flooding of TIEs from the ToP nodes.  The ToF nodes however need
1180         to be aware of all the affected prefixes for the negative,
1181         possibly transitive disaggregation to be fully effective (i.e.  a
1182         node advertising in control plane that it cannot reach a certain
1183         more specific prefix than default whereas such disaggregation must
1184         in extreme condition propagate further down southbound).  The
1185         problem can also be observed by the ToF nodes in the other planes
1186         through the flooding of North TIEs from the affected leaf nodes,
1187         together with non-node North TIEs which indicate the affected
1188         prefixes.  To be effective in that case, the positive
1189         disaggregation must reach down to the nodes that make the plane
1190         selection, which are typically the ingress leaf nodes.  The
1191         information is not useful for routing in the intermediate levels.

[nit] s/in control plane/in the control plane
done


[nit] s/in extreme condition/in the extreme condition
done


1193      o  If the breakage is a ToP node in a maximally partitioned fabric -
1194         in which case it is the only ToP node serving the plane in that
1195         PoD - goes down, then the connectivity to all the nodes in the PoD
1196         is lost within the plane where the ToP node is located.
1197         Consequently, all leaves of the PoD fall in this plane.  Since the
1198         Southern Reflection between the ToF nodes happens only within a
1199         plane, ToF nodes in other planes cannot discover fallen leaves in
1200         a different plane.  They also cannot determine beyond their local
1201         plane whether a leaf node that was initially reachable has become
1202         unreachable.  As the breakage can be observed by the ToF nodes in
1203         the plane where the breakage happened, the ToF nodes in the plane
1204         need to be aware of all the affected prefixes for the negative
1205         disaggregation to be fully effective.  The problem can also be
1206         observed by the ToF nodes in the other planes through the flooding
1207         of North TIEs from the affected leaf nodes, if there are only 3
1208         levels and the ToP nodes are directly connected to the leaf nodes,
1209         and then again it can only be effective it is propagated
1210         transitively to the leaf, and useless above that level.

[nit] s/fabric -...- goes down,/fabric -...-,

Actually I meant
“

               If the breakage is a ToP node in a maximally
               partitioned fabric - in which case it is the only ToP
               node serving the plane in that PoD that goes down -
“



1212      For the sake of easy comprehension let us roll the abstractions back
1213      into a simple example and observe that in Figure 3 the loss of link
1214      Spine 122 to Leaf 122 will make Leaf 122 a fallen leaf for Top-of-
1215      Fabric plane B.  Worse, if the cabling was never present in first
1216      place, plane B will not even be able to know that such a fallen leaf
1217      exists.  Hence partitioning without further treatment results in two
1218      grave problems:

[] "For the sake of easy comprehension...Figure 3..."  Finally!
Hmmm...sorry...I mean, it is a little ironic that after all the new
terminology, detailed descriptions and figures, the clearer
explanation uses the simplest drawing.

I leave that to Tony


[nit] s/in first place/in the first place
done


1220      o  Leaf 111 trying to route to Leaf 122 MUST choose Spine 111 in
1221         plane A as its next hop since plane B will inevitably blackhole
1222         the packet when forwarding using default routes or do excessive
1223         bow tying.  This information must be in its routing table.

[major] s/MUST/must   This is not a Normative statement, just a
statement of fact (inside an example).
done


1225      o  Any kind of "flooding" or distance vector trying to deal with the
1226         problem by distributing host routes will be able to converge only
1227         using paths through leaves.  The flooding of information on Leaf
1228         122 would have to go up to Top-of-Fabric A and then "loopback"
1229         over other leaves to ToF B leading in extreme cases to traffic for
1230         Leaf 122 when presented to plane B taking an "inverted fabric"
1231         path where leaves start to serve as TOFs, at least for the
1232         duration of a protocol's convergence.

[] "Any kind of "flooding" or distance vector..."  I can guess the
meaning, but it would be better that I don't have to.   Maybe
something like: "Any advertisement..."


Changed to:
“
  o  A path computation trying to deal with the problem by distributing
      host routes may only form paths through leaves.

“


[minor] "information on Leaf 122"  s/on/ about (?), or maybe from. ??

Used “about”
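
To illustrate the Figure 3 example discussed above, here is a small Python sketch of why Leaf 111 needs a more specific route in its table: with only the default route, traffic for Leaf 122 may be sent into plane B and blackholed, whereas a longest-prefix match on a disaggregated prefix pins it to plane A. The prefixes, the plane B next-hop label, and the function name are assumptions for illustration only.

```python
import ipaddress


def next_hops(rib, destination):
    """Longest-prefix-match lookup returning the next-hop list."""
    addr = ipaddress.ip_address(destination)
    candidates = [net for net in rib if addr in net]
    if not candidates:
        return []
    best = max(candidates, key=lambda net: net.prefixlen)
    return rib[best]


# Hypothetical routing table of Leaf 111. The prefixes and the plane B
# next-hop label are assumptions, not taken from the draft.
rib_leaf_111 = {
    # Default route: both planes are eligible.
    ipaddress.ip_network("0.0.0.0/0"): ["Spine 111 (plane A)", "plane B spine"],
    # Disaggregated prefix behind Leaf 122: plane A only.
    ipaddress.ip_network("10.1.22.0/24"): ["Spine 111 (plane A)"],
}
```

A lookup for an address behind Leaf 122, such as `next_hops(rib_leaf_111, "10.1.22.7")`, returns only the plane A next hop, while any other destination still uses both planes through the default route.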


1234    4.1.4.  Discovering Fallen Leaves

1236      As illustrated later, and without further proof, the way to deal with
1237      fallen leaves in multi-plane designs, when aggregation is used, is
1238      that RIFT requires all the ToF nodes to share the same north topology
1239      database.  This happens naturally in single plane design by the means
1240      of northbound flooding and south reflection but needs additional
1241      considerations in multi-plane fabrics.  To satisfy this RIFT, in
1242      multi-plane designs, relies at the ToF level on ring interconnection
1243      of switches in multiple planes.  Other solutions are possible but
1244      they either need more cabling or end up having much longer flooding
1245      paths and/or single points of failure.

[minor] "As illustrated later..."  Where?

See next

[] "and without further proof"  I hope this is at least specified at
that later point.


No need to prove anything. We made a design decision. We’re not routing through leaves (which are the only place where the planes meet in classical designs), so we need to create a wormhole between planes. The natural place for that is the superspine. See next.


[nit] s/To satisfy this RIFT, in multi-plane designs, relies/To
satisfy this need in multi-plane designs, RIFT relies

I changed the text a little bit more:
“
   When aggregation is used, RIFT deals with fallen leaves by ensuring
   that all the ToF nodes share the same north topology database.  This
   happens naturally in single plane design by the means of northbound
   flooding and south reflection but needs additional considerations in
   multi-plane fabrics.  To enable routing to fallen leaves in multi-
   plane designs, RIFT requires additional interconnection across planes
   between the ToF nodes, e.g., using rings as illustrated in Figure 13.
   Other solutions are possible but they either need more cabling or end
   up having much longer flooding paths and/or single points of failure.

“
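
The ring interconnection mentioned in the text above can be sketched as follows. This is a minimal Python model, assuming that ring i joins the ToF node at position i of every plane (as the ring numbering in Figure 13 suggests) and that there are at least three planes so that each node uses exactly two ring ports. The function name and node identifiers are hypothetical.

```python
def build_rings(planes, tofs_per_plane):
    """Return one ring per ToF position; each ring links that position across planes.

    A ToF node is identified as (plane, index). Each ring is a list of
    bidirectional links, given as pairs of ToF nodes.
    """
    rings = []
    for index in range(tofs_per_plane):
        members = [(plane, index) for plane in planes]
        # Connect each member to the next one and close the ring.
        links = [
            (members[i], members[(i + 1) % len(members)])
            for i in range(len(members))
        ]
        rings.append(links)
    return rings
```

For instance, `build_rings(["A", "B", "X"], 7)` yields 7 rings of 3 links each, and every ToF node appears in exactly two links, which matches the two reserved ports per ToF node.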




1247      In detail, by reserving two ports on each Top-of-Fabric node it is
1248      possible to connect them together by interplane bi-directional rings
1249      as illustrated in Figure 13.  The rings will be used to exchange full
1250      north topology information between planes.  All ToFs having same
1251      north topology allows by the means of transitive, negative
1252      disaggregation described in Section 4.2.5.2 to efficiently fix any
1253      possible fallen leaf scenario.  Somewhat as a side-effect, the
1254      exchange of information fulfills the ask to present full view of the
1255      fabric topology at the Top-of-Fabric level, without the need to
1256      collate it from multiple points by additional complexity of
1257      technologies like [RFC7752].

[nit] s/fulfills the ask to present full view/fulfills the requirement
to have a full view
done



[] "..., without the need to collate it from multiple points by
additional complexity of technologies like [RFC7752]."  This last
phrase is unnecessary: because carrying RIFT information in BGP-LS is
not defined, and more importantly, there's no need to criticize other
technology to make RIFT look better.
done


1259              +---+  +---+  +---+  +---+  +---+  +---+  +--------+
1260           |   |  |   |  |   |  |   |  |   |  |   |  |        |
1261           |      |      |      |      |      |      |        |
1262            +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+      |
1263          +-|   |--|   |--|   |--|   |--|   |--|   |--|   |-+    |
1264          | | o |  | o |  | o |  | o |  | o |  | o |  | o | |    | Plane A
1265          +-|   |--|   |--|   |--|   |--|   |--|   |--|   |-+    |
1266            +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+      |
1267          |      |      |      |      |      |      |         |
1268            +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+      |
1269          +-|   |--|   |--|   |--|   |--|   |--|   |--|   |-+    |
1270          | | o |  | o |  | o |  | o |  | o |  | o |  | o | |    | Plane B
1271          +-|   |--|   |--|   |--|   |--|   |--|   |--|   |-+    |
1272            +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+      |
1273           |      |      |      |      |      |      |        |
1274                               ...                            |
1275           |      |      |      |      |      |      |        |
1276            +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+      |
1277          +-|   |--|   |--|   |--|   |--|   |--|   |--|   |-+    |
1278          | | o |  | o |  | o |  | o |  | o |  | o |  | o | |    | Plane X
1279          +-|   |--|   |--|   |--|   |--|   |--|   |--|   |-+    |
1280            +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+  +-o-+      |
1281           |      |      |      |      |      |      |        |
1282           |   |  |   |  |   |  |   |  |   |  |   |  |        |
1283              +---+  +---+  +---+  +---+  +---+  +---+  +--------+
1284    Rings    1      2      3      4      5      6      7

1286        Figure 13: Connecting Top-of-Fabric Nodes Across Planes by Rings

[minor] Is that one ring per plane, multiple rings per plane or a big
ring for all the planes?  The drawing is not clear to me. :-(

Maybe you missed that the Rings are numbered at the bottom?
I tweaked the image as follows, I hope this helps:
“


Ring   1       2       3       4       5       6        7
      /   \   /   \   /   \   /   \   /   \   /   \   /     \
      |   .   |   .   |   .   |   .   |   .   |   .   |     .
      |   .   |   .   |   .   |   .   |   .   |   .   |     .
    +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+   .
  +-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-+ .
  | | O | . | O | . | O | . | O | . | O | . | O | . | O | | .
  +-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-+ .
    +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+   .
      |   .   |   .   |   .   |   .   |   .   |   .   |     .   Plane A
   ---|---.---|---.---|---.---|---.---|---.---|---.---|-----.-----------
      |   .   |   .   |   .   |   .   |   .   |   .   |     .   Plane B
    +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+   .
  +-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-+ .
  | | O | . | O | . | O | . | O | . | O | . | O | . | O | | .
  +-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-+ .
    +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+   .
      |   .   |   .   |   .   |   .   |   .   |   .   |     .   Plane B
   ---|---.---|---.---|---.---|---.---|---.---|---.---|-----.-----------
      |   .   |   .   |   .   |   .   |   .   |   .   |     .

                               ...

      |   .   |   .   |   .   |   .   |   .   |   .   |     .
   ---|---.---|---.---|---.---|---.---|---.---|---.---|-----.-----------
      |   .   |   .   |   .   |   .   |   .   |   .   |     .   Plane X
    +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+   .
  +-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-+ .
  | | O | . | O | . | O | . | O | . | O | . | O | . | O | | .
  +-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-.-|   |-+ .
    +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+ . +-o-+   .
      |   .   |   .   |   .   |   .   |   .   |   .   |     .
      |   .   |   .   |   .   |   .   |   .   |   .   |     .
      \   /   \   /   \   /   \   /   \   /   \   /   \     /
Ring   1       2       3       4       5       6        7

“
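
To make the ring layout concrete, here is a minimal Python sketch. It assumes, as the tweaked figure suggests, one ring per ToF position, where each ToF node uses its two reserved ports to reach the node at the same position in the previous and the next plane. The function name and the plane labels are illustrative only and do not come from the draft.

```python
def ring_neighbors(planes, plane):
    """Return the (previous, next) planes reached by the two ring ports
    of a ToF node located in `plane`, for a ring spanning all `planes`."""
    i = planes.index(plane)
    n = len(planes)
    # The ring wraps around: the last plane connects back to the first.
    return planes[(i - 1) % n], planes[(i + 1) % n]


planes = ["A", "B", "C", "X"]  # illustrative plane labels
for p in planes:
    prev_plane, next_plane = ring_neighbors(planes, p)
    print(f"ToF in plane {p}: ring ports lead to planes {prev_plane} and {next_plane}")
```

The same computation applies independently to each of the numbered rings, since every ring only links nodes that share a position across planes.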




1288    4.1.5.  Addressing the Fallen Leaves Problem

1290      One consequence of the "Fallen Leaf" problem is that some prefixes
1291      attached to the fallen leaf become unreachable from some of the ToF
1292      nodes.  RIFT proposes two methods to address this issue, the positive
1293      and the negative disaggregation.  Both methods flood South TIEs to
1294      advertise the impacted prefix(es).

[nit] s/RIFT proposes two methods/RIFT defines two methods
Done

Many thanks again for the careful reading. Sorry that the 3-D representation takes time and effort to get used to.

Please let me know how you feel about the changes above.

Keep safe

Pascal

[End of Review - Part 1]


From: RIFT <rift-bounces@ietf.org> On Behalf Of Alvaro Retana
Sent: Friday, 5 March 2021 16:36
To: draft-ietf-rift-rift@ietf.org
Cc: zhang.zheng@zte.com.cn; rift-chairs@ietf.org; rift@ietf.org
Subject: [Rift] AD Review of draft-ietf-rift-rift-12 (Part 2a)


Dear authors:

This is Part 2a of my review of this document.  I wanted this second installment to be longer, but given the IETF meeting next week, I am sending this fragment out now.

Part 2a starts with §4.2 (Specification) and goes just through §4.2.2 -- I just started reading the LIE FSM/§4.2.2.1.  I included a couple of comments on the schema itself.


While reading this part of the document I've been learning Thrift and applying my programming and modeling experience (*).  In general, the model seems straightforward (so far).  It is important to highlight somewhere at the start of §4.2 that Appendix B is Normative.  The existing references don't explicitly say so.


In this first part of the specification there are a number of assumptions made that should be explained explicitly and not by reference.  For example, it seems to be assumed that the reader knows what a three-way handshake is -- while most readers of this specification will be familiar with other routing protocols, this document is *the* RIFT spec and should not rely on other protocol specifications, much less when the specification is not (and can't be!) the same -- this is what §4.2.2 says:

   It uses a three-way handshake mechanism which is a cleaned up version of
   [RFC5303].  Observe that for easier comprehension the terminology of
   one/two and three-way states does NOT align with OSPF or ISIS FSMs albeit
   they use roughly same mechanisms.

IOW, "similar to IS-IS, but not the same."   There are more specific comments inline.


"Traditional" specifications usually present packet formats and describe the fields before explaining their use.  §4.2.2 jumped right into talking about a specific flag (v4_forwarding_capable) without introducing it or the place where it is carried (LIEPacket).  It is important to describe the different structures, and provide a way to make them simpler to find in Appendix B.

In my comments I'm asking for a lot of pointers to things that have not been discussed.  I don't know (at this point) whether the overall organization of the document makes sense as is or if it might be worth rethinking (i.e. shuffle things around).  Food for thought.


These comments/questions are for the Chairs/Shepherd:

- It looks like an early TSV review was requested, but I didn't see the review come in.  Did I miss it?

- §8.1 includes a set of suggested UDP ports, but they are not reflected in the registry.  Early allocation has not been requested -- you should consider doing so.  See rfc7120.


Thanks!

Alvaro.

(*) I don't have any. ;-)


[Line numbers from idnits.]

...
1341   4.2.  Specification
...
1350     "On Entry" actions on FSM state are performed every time and right
1351     before the according state is entered, i.e. after any transitions
1352     from previous state.

[nit] Suggestion>
   "On Entry" actions at an FSM state are performed right before the
   corresponding state is entered, i.e. after any transitions from a
   previous state.


1354     "On Exit" actions are performed every time and immediately when a
1355     state is exited, i.e. before any transitions towards target state are
1356     performed.

[nit] Suggestion>
   "On Exit" actions are performed immediately when a state is exited,
   i.e. before a transition towards a target state is performed.


1358     Any attempt to transition from a state towards another on reception
1359     of an event where no action is specified MUST be considered an
1360     unrecoverable error.

[major] Any type of action, or are you referring to an "on exit" one?


[major] "MUST be considered an unrecoverable error"  What is the result of that?  Should the adjacencies be reset, the rift process restarted, the interfaces shut down, etc.??   There's no other place in the document that mentions "unrecoverable".
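
The ordering described in the three paragraphs above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and exception names are invented here, the event name `NewNeighbor` and the action name `send_lie` are placeholders, and modelling the "unrecoverable error" as an exception is a guess, since the text does not say what an implementation should do.

```python
class UnspecifiedTransition(Exception):
    """No transition is specified for this (state, event) pair."""


class TinyFSM:
    def __init__(self, initial, transitions):
        # transitions maps (state, event) -> (action_name, target_state)
        self.state = initial
        self.transitions = transitions
        self.trace = []  # records the order in which the steps run

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            # Assumption: the "unrecoverable error" is modelled as an
            # exception; the quoted text does not define the outcome.
            raise UnspecifiedTransition(f"event {event} in state {self.state}")
        action, target = self.transitions[key]
        self.trace.append(f"on_exit {self.state}")  # 1. before any transition
        self.trace.append(f"action {action}")       # 2. the transition itself
        self.trace.append(f"on_entry {target}")     # 3. right before entering
        self.state = target                         # 4. the state is entered


fsm = TinyFSM("OneWay", {("OneWay", "NewNeighbor"): ("send_lie", "TwoWay")})
fsm.fire("NewNeighbor")
print(fsm.trace)
```

Any implementation that produces the same externally visible behavior would be equivalent to this sketch, which is the point of the rfc4271-style wording suggested further down.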


1362     The FSMs and procedures are normative in the sense that an
1363     implementation MUST implement them either literally or an
1364     implementation MUST exhibit externally observable behavior that is
1365     identical to the execution of the specified FSMs.

[major] How can you tell the difference?  rfc2119 keywords "MUST only be used where it is actually required for interoperation".  In this case there are options, and there is no way to verify one or the other as long as the implementations "exhibit externally observable behavior that is identical".  IOW, I am not comfortable with using MUST if it is not required.  Also, I believe the statement to generally be true for any specification and not needed.

Suggestion (borrowing from rfc4271):
   The data structures and FSMs described in this document are conceptual
   and do not have to be implemented precisely as described here, as long
   as the implementations support the described functionality and exhibit
   the same externally visible behavior.


1367     Where a FSM representation is inconvenient, i.e. the amount of
1368     procedures and kept state exceeds the amount of transitions, we defer
1369     to a more procedural description on data structures.

[] This paragraph can also be eliminated.


1371   4.2.1.  Transport

1373     All packet formats are defined in Thrift [thrift] models in
1374     Appendix B.

[] Can we get an index/TOC or some type of guide in the appendix to make it easier to find specific parts?  For example, finding where a LIE is defined is not straightforward -- I happened to stumble on it in B.2.  Another option would be to point in §4.2.2 to look for LIEPacket in B.2.


[major] It must be stated somewhere that the contents of Appendix B are Normative.


1376     The serialized model is carried in an envelope within a UDP frame
1377     that provides security and allows validation/modification of several
1378     important fields without de-serialization for performance and
1379     security reasons.

[minor] This is a very short section -- please put references to other places where these topics are explained more, if any.


1381   4.2.2.  Link (Neighbor) Discovery (LIE Exchange)

1383     RIFT LIE exchange auto-discovers neighbors, negotiates ZTP parameters
1384     and discovers miscablings.  It uses a three-way handshake mechanism
1385     which is a cleaned up version of [RFC5303].  Observe that for easier
1386     comprehension the terminology of one/two and three-way states does
1387     NOT align with OSPF or ISIS FSMs albeit they use roughly same
1388     mechanisms.  The formation progresses under normal conditions from
1389     one-way to two-way and then three-way state at which point it is
1390     ready to exchange TIEs per Section 4.2.3.

[minor] "...which is a cleaned up version of [RFC5303]."  There's no need to hint at the fact that other implementations/protocols may not be as good/clean as rift.  Please take this part of the sentence out.


[minor] "Observe that for easier comprehension the terminology of one/two and three-way states does NOT align with OSPF or ISIS FSMs albeit they use roughly same mechanisms."   I don't know how changing everything I know makes comprehension easier. ;-)   Seriously: we don't need to tell the reader what is not used.


1392     LIE exchange happens over well-known administratively locally scoped
1393     and configured or otherwise well-known IPv4 multicast address
1394     [RFC2365] and/or link-local multicast scope [RFC4291] for IPv6
1395     [RFC8200] using a configured or otherwise a well-known destination
1396     UDP port defined in Appendix C.1.  LIEs SHOULD be sent with an IPv4
1397     Time to Live (TTL) / IPv6 Hop Limit (HL) of 1 to prevent RIFT
1398     information reaching beyond a single L3 next-hop in the topology.
1399     LIEs SHOULD be sent with network control precedence.

[] The first sentence is long and convoluted.  I think I understood that the assigned destination IP address and UDP port are used, but that they also can be configured.  Is that right?

Suggestion>

   LIEs are exchanged over the well-known multicast addresses and UDP
   port, as summarized in Appendix C.1.  An implementation MAY allow
   for the local configuration of these parameters.


[] If configured, do both sides need to be configured beforehand or is there a discovery mechanism?


[major] "LIEs SHOULD be sent with..."  When is it ok to not use these values?  IOW, why is this action recommended and not required?  (The next comment is related.)


[minor] "IPv4 Time to Live (TTL) / IPv6 Hop Limit (HL) of 1"  Was GTSM (rfc5082) considered?  Several documents (for example, rfc8085 and draft-ietf-opsec-v6) suggest its use.  You may be asked about it later.  It would be good to preempt those questions by adding some text in the Security Considerations about any risks...or using it.


[major] "LIEs SHOULD be sent with network control precedence."  When is it ok to not do so?  IOW, why is this behavior recommended and not required?  BTW, please add a reference.
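
For reference, limiting LIEs to a single hop is a one-line socket option on the sender. The Python sketch below shows it for both address families using standard socket options; the function names are invented for illustration, no addresses or ports are filled in because the values from Appendix C.1 are not quoted here, and the precedence marking is deliberately left out since the text gives no reference for it.

```python
import socket
import struct


def make_lie_sender_v4(ttl=1):
    """UDP socket whose IPv4 multicast datagrams leave with the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("B", ttl))
    return sock


def make_lie_sender_v6(hop_limit=1):
    """UDP socket whose IPv6 multicast datagrams leave with the given Hop Limit."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_MULTICAST_HOPS, hop_limit)
    return sock


# Only the IPv4 variant is exercised here, since IPv6 may be unavailable.
sock = make_lie_sender_v4()
print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL))
sock.close()
```

A GTSM-style design (rfc5082) would instead send with 255 and have the receiver check for 255, which is why the choice deserves a sentence in the Security Considerations.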


1401     Originating port of the LIE has no further significance other than
1402     identifying the origination point.  LIEs are exchanged over all links
1403     running RIFT.

[nit] s/Originating port/The originating port


1405     An implementation MAY listen and send LIEs on IPv4 and/or IPv6
1406     multicast addresses.  A node MUST NOT originate LIEs on an address
1407     family if it does not process received LIEs on that family.  LIEs on
1408     same link are considered part of the same negotiation independent of
1409     the address family they arrive on.  Observe further that the LIE
1410     source address may not identify the peer uniquely in unnumbered or
1411     link-local address cases so the response transmission MUST occur over
1412     the same interface the LIEs have been received on.  A node MAY use
1413     any of the adjacency's source addresses it saw in LIEs on the
1414     specific interface during adjacency formation to send TIEs.  That
1415     implies that an implementation MUST be ready to accept TIEs on all
1416     addresses it used as source of LIE frames.

[major] "An implementation MAY listen and send LIEs on IPv4 and/or IPv6 multicast addresses.  A node MUST NOT originate LIEs on an address family if it does not process received LIEs on that family."

The first sentence says that it is optional to listen and send LIEs over either IPv4/IPv6 (IOW, a router doesn't have to do either).  The second one gives the impression that originating a LIE on a specific AF means that the router is willing to only receive/process LIEs on it (e.g. if an IPv6 LIE is not generated then there's no positive indication that the router can receive/process IPv6 LIEs).  The combination results in a potential lock at startup:

For example, if a router chooses to only listen on IPv4 (MAY = optional) and its neighbor on IPv6, then they will be sending LIEs and never talk to each other.

I'm assuming that the intent is to allow for maximum flexibility: allow sending/receiving on either (or both).  But, what if only one AF is supported?  Should an implementation be able to control which AF is used?  If I have an IPv6-only deployment, should the router be expected to receive LIEs over IPv4?
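
The potential lock can be shown with a few lines of Python. This is a sketch under assumptions: `LieConfig` and `can_exchange_lies` are names invented here, each node is reduced to the set of families it listens on and the set it sends on, and the only rule enforced is the quoted one that a node must not originate LIEs on a family it does not process.

```python
from dataclasses import dataclass

V4, V6 = "ipv4", "ipv6"


@dataclass(frozen=True)
class LieConfig:
    listen: frozenset  # families on which received LIEs are processed
    send: frozenset    # families on which LIEs are originated

    def __post_init__(self):
        # Quoted rule: MUST NOT originate LIEs on a family it does not process.
        if not self.send <= self.listen:
            raise ValueError("node originates LIEs on a family it does not process")


def can_exchange_lies(a, b):
    """True if each node sends on at least one family the other listens on."""
    return bool(a.send & b.listen) and bool(b.send & a.listen)


v4_only = LieConfig(frozenset({V4}), frozenset({V4}))
v6_only = LieConfig(frozenset({V6}), frozenset({V6}))
dual = LieConfig(frozenset({V4, V6}), frozenset({V4, V6}))

print(can_exchange_lies(v4_only, v6_only))  # the two nodes never hear each other
print(can_exchange_lies(dual, v6_only))
```

Both configurations of the first pair are individually compliant with the quoted text, yet no adjacency can form, which is the case the specification needs to address.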


[nit] s/LIEs on an address family/LIEs using an address family


[nit] s/LIEs on same link/All LIEs on a link


[minor] "part of the same negotiation"  Negotiation of what?  The text at the start of this section mentions that LIEs negotiate ZTP parameters -- is that it?  Other actions don't seem to be negotiations: discovery, for example.   I'm not sure if something like "session" would capture the intended meaning.


[major] "MAY use any of the adjacency's source addresses it saw in LIEs...to send TIEs."  This use of "MAY" makes using the source address from a LIE optional.  §4.2.3.3 says that "TIEs...using the destination address on which the LIE adjacency has been formed", which doesn't make the use optional.

OLD>
   A node MAY use any of the adjacency's source addresses it saw in
   LIEs on the specific interface during adjacency formation to send
   TIEs.

NEW>
   A node may use any of the adjacency's source addresses from the
   LIEs received on the specific interface during adjacency formation
   to send TIEs (Section 4.2.3.3).

[See related question in 4.2.3.3.]


1418     A three-way adjacency over any address family implies support for
1419     IPv4 forwarding if the `v4_forwarding_capable` flag is set to true
1420     and a node can use [RFC5549] type of forwarding in such a situation.
1421     It is expected that the whole fabric supports the same type of
1422     forwarding of address families on all the links.  Operation of a
1423     fabric where only some of the links are supporting forwarding on an
1424     address family and others do not is outside the scope of this
1425     specification.

[minor] "three-way adjacency"   I know there's a definition in the terminology section, but it is not a specification of what a three-way adjacency is.  Please add a forward reference to where it is specified.


[major] "`v4_forwarding_capable` flag"   (This comment is not specific to this flag.)

The specific LIE packet format has not been presented, introduced or even referenced up to now.  Pointing to a flag, which happens to be optional in an optional part of the LIEPacket, comes out of the blue and feels completely out of place without prior explanation.

I can see that some of the fields are described in Appendix B, but there should at least be a pointer to that.

I also expect a description somewhere of expectations and error conditions. For example, see my comments in the appendix related to LinkCapabilities.


[major] "a node can use [RFC5549] type of forwarding"

First of all, rfc5549 is a BGP-specific RFC.  How do the BGP mechanisms defined there (and in rfc8950) map to RIFT?  IOW, please specify how RIFT works.

Because you listed rfc5549 as a Normative reference, I assume you want this behavior to be normative.  s/can use/MUST/SHOULD/MAY ??


[major] "It is expected that the whole fabric supports the same type of forwarding of address families on all the links.  Operation of a fabric where only some of the links are supporting forwarding on an address family and others do not is outside the scope of this specification."

Please remind me, is IPv4 optional or required?  I'm assuming that IPv6 is required and IPv4 optional, but I can't find anything definitive in the document.

Even if IPv4 is optional, I'm assuming that most deployments will support both.  But as time goes on that may (slowly) change to IPv6-only.  Is that a fair assumption?

Because the ability to support IPv4 is signaled per link (v4_forwarding_capable), then the expectation that "the whole fabric supports the same type of forwarding of address families" may be "easily" broken if one link doesn't support IPv4 (including the case where a rogue node may not advertise v4_forwarding_capable to mess things up).  What should a node that detects an inconsistency (e.g. some links use dual-stack, but others only one AF) do?  I'm assuming that it SHOULD/MUST (?) go as far as disabling the adjacencies as described below, right?   Please be explicit and specific on the actions to be taken.

1427     The protocol does NOT support selective disabling of address
1428     families, disabling v4 forwarding capability or any local address
1429     changes in three-way state, i.e. if a link has entered three-way IPv4
1430     and/or IPv6 with a neighbor on an adjacency and it wants to stop
1431     supporting one of the families or change any of its local addresses
1432     or stop v4 forwarding, it has to tear down and rebuild the adjacency.
1433     It also has to remove any information it stored about the adjacency
1434     such as LIE source addresses seen.

1436     Unless ZTP as described in Section 4.2.7 is used, each node is
1437     provisioned with the level at which it is operating.  It MAY be also
1438     provisioned with its PoD.  If any of those values is undefined, then
1439     accordingly a default level and/or an "undefined" PoD are assumed.
1440     This means that leaves do not need to be configured at all if initial
1441     configuration values are all left at "undefined" value.  Nodes above
1442     ToP MUST remain at "any" PoD value which has the same value as
1443     "undefined" PoD.  This information is propagated in the LIEs
1444     exchanged.

[major] There are references made in this paragraph to things that are defined elsewhere (""any" PoD"..."propagated in the LIEs").  Please put pointers to where these things are defined.  As mentioned above, the LIE format hasn't been presented yet.

The text above gives the impression that the type of the PoD is somehow related to the level, so I went in search of that and found this in the LIEPacket:

   struct LIEPacket {
  ...
       /** Node's PoD. */
     7: optional common.PodType            pod =
             common.default_pod;

...but then I can't find anything that corresponds to PodType matching the values above: undefined, any...

...but I did find this:

   /** Common RIFT packet header. */
   struct PacketHeader {
...
     /** Level of the node sending the packet, required on everything
         except LIEs. Lack of presence on LIEs indicates UNDEFINED_LEVEL
         and is used in ZTP procedures.
      */
     4: optional common.LevelType            level;
   }

...and a couple of constants:

  const LevelType   top_of_fabric_level         = 24
  const LevelType   leaf_level                  = 0
  const LevelType   default_level               = leaf_level

If ZTP is not used, are there operational/deployment considerations for an operator to provision the levels?  Should they simply start counting from 0, is incrementing by one good enough or should it be by 2?  Or maybe there are considerations about the use of ZTP?  Please take a look at rfc5706.


[minor] "It MAY be also provisioned with its PoD."  What does this mean?  Does it mean that all the nodes in the PoD have the same level?  ...


[major] BTW, this reminds me, what are the manageability considerations related to RIFT?  Among other things, it would be nice to have a summary of parameters that are expected to be configured and their defaults.  Please take a look at rfc5706.


1446     Further definitions of leaf flags are found in Section 4.2.7 given
1447     they have implications in terms of level and adjacency forming here.

[] "leaf flags"   What are those?  I think this is the first mention.


[] Suggestion>
   Further definitions of leaf flags are found in Section 4.2.7.


1449     A node tries to form a three-way adjacency if and only if

[nit] s/tries to form/starts to form


[major] "if and only if"   Can this be translated into a Normative statement?  Are all these required (MUST) before the formation of a 3-way adjacency started?


1451     1.  the node is in the same PoD or either the node or the neighbor
1452         advertises "undefined/any" PoD membership (PoD# = 0) AND

[minor] "the node is in the same PoD"  As what?   I guess you mean in the same PoD as what a neighbor indicated in its LIE, right?


1454     2.  the neighboring node is running the same MAJOR schema version AND

[] The formats have not been introduced yet.

I peeked into Appendix B and found the PacketHeader.  As with #1, you're assuming a LIE packet has been received from the neighbor, but that is not mentioned.


[] The versioning system has not been introduced.  Please put a reference to it.


[nit] This is the only place where "MAJOR" (all caps) is used.  Please be consistent.


[minor] Are there any requirements related to the minor version?


1456     3.  the neighbor is not member of some PoD while the node has a
1457         northbound adjacency already joining another PoD AND

[nit] s/not member of some PoD/not a member of the same PoD


[major] "not [a] member of [the] some PoD"  #1 says that the "node is in the same PoD" -- it looks like there's some context missing -- I guess about the ability to join other PoDs.  Is this only a northbound characteristic?


1459     4.  the neighboring node uses a valid System ID AND

[minor] "System ID" and "SystemID" are both used.  Please pick one.


[major] What is a "valid System ID"?  The only clue that I found is in §4.2.7.2, which makes me wonder whether it is specific to ZTP or not (??):

   RIFT nodes require a 64 bit SystemID which SHOULD be derived as
   EUI-64 MA-L derive according to [EUI64].  The organizationally
   governed portion of this ID (24 bits) can be used to generate
   multiple IDs if required to indicate more than one RIFT instance."

If this is a recommendation (SHOULD), and assuming that it applies beyond ZTP, what is a "valid System ID"?

I also only found one place that talks about an IllegalSystemID, but that is probably not the same as an invalid one:

   /** 0 is illegal for SystemID */
   const SystemIDType IllegalSystemID        = 0


1461     5.  the neighboring node uses a different System ID than the node
1462         itself

[minor] Is there a way to resolve/address duplicate System IDs?  It seems to me that a rogue node may reflect the sender's System ID and prevent an adjacency from forming (along with a number of other values).


[minor] Also, I hope that the selection/assignment of the System ID is discussed elsewhere.  A pointer might be nice.


1464     6.  the advertised MTUs match on both sides AND

[] Ahhh...this is in the LIEPacket too.

[major]
   struct LIEPacket {
...
     /** Layer 3 MTU, used to discover to mismatch. */
     4: optional common.MTUSizeType        link_mtu_size =
             common.default_mtu_size;

If MTUSizeType is not included, where is it specified that the default_mtu_size must be considered?  Without that specification it is not possible to satisfy this condition.


1466     7.  both nodes advertise defined level values AND

[major] The advertisement of LevelType in PacketHeader is optional.  That means that even if there is a default value a node may not "*advertise* defined level values".  Also, there are only 2 level values defined:

   const LevelType   top_of_fabric_level         = 24
   const LevelType   leaf_level                  = 0

What am I missing?


1468     8.  [

1470            i) the node is at level 0 and has no three way adjacencies
1471            already to nodes at Highest Adjacency Three-Way level (HAT as
1472            defined later in Section 4.2.7.1) with level different than
1473            the adjacent node OR

[nit] s/three way/three-way/g


...
1478            iii) both nodes are at level 0 AND both indicate support for
1479            Section 4.3.8 OR

[minor] "support for Section 4.3.8"  Is there a name for that?  How is it indicated?  §4.3.8 says that a leaf can "advertise the LEAF_2_LEAF flag in its node capabilities"; there is nothing called "LEAF_2_LEAF" in the schema, but "leaf_only_and_leaf_2_leaf_procedures" does show up.  Please be consistent in the terminology.  [Also, there are places where "leaf-2-leaf" is used.]
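
To show how the conditions might translate into checks on a received LIE, here is a hedged Python sketch covering only conditions 1, 2, 4, 5, 6 and 7. Conditions 3 and 8 are left out because their text is either elided above or questioned in the comments. All names are invented for illustration; "valid System ID" is assumed to mean non-zero and at most 64 bits, the absent MTU is assumed to fall back to a default, and the `DEFAULT_MTU` value is a placeholder rather than the schema's actual `default_mtu_size`.

```python
from dataclasses import dataclass
from typing import Optional

UNDEFINED_POD = 0      # "undefined/any" PoD (PoD# = 0), per condition 1
ILLEGAL_SYSTEM_ID = 0  # from the quoted schema: 0 is illegal for SystemID
DEFAULT_MTU = 1400     # placeholder for common.default_mtu_size (value assumed)


@dataclass
class LieView:
    system_id: int
    major_version: int
    pod: int = UNDEFINED_POD
    mtu: Optional[int] = None    # optional in LIEPacket
    level: Optional[int] = None  # optional in PacketHeader


def adjacency_blockers(node, neigh):
    """List the numbered conditions that prevent a three-way adjacency."""
    blockers = []
    if UNDEFINED_POD not in (node.pod, neigh.pod) and node.pod != neigh.pod:
        blockers.append("1: PoD mismatch")
    if node.major_version != neigh.major_version:
        blockers.append("2: major version mismatch")
    if not (ILLEGAL_SYSTEM_ID < neigh.system_id < 2**64):
        blockers.append("4: invalid System ID")  # assumed meaning of "valid"
    if node.system_id == neigh.system_id:
        blockers.append("5: duplicate System ID")
    node_mtu = DEFAULT_MTU if node.mtu is None else node.mtu
    neigh_mtu = DEFAULT_MTU if neigh.mtu is None else neigh.mtu
    if node_mtu != neigh_mtu:
        blockers.append("6: MTU mismatch")
    if node.level is None or neigh.level is None:
        blockers.append("7: undefined level")
    return blockers


leaf = LieView(system_id=1, major_version=1, level=0)
spine = LieView(system_id=2, major_version=1, pod=1, level=1)
rogue = LieView(system_id=1, major_version=1, level=1)  # reflects our System ID

print(adjacency_blockers(leaf, spine))
print(adjacency_blockers(leaf, rogue))
```

Returning the list of failed conditions rather than a bare boolean makes the rogue-reflection case from the comment on condition 5 easy to spot in a log.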


...
1486     The rules checking PoD numbering MAY be optionally disregarded by a
1487     node if PoD detection is undesirable or has to be ignored.  This will
1488     not affect the correctness of the protocol except preventing
1489     detection of certain miscabling cases.

[nit] s/rules checking/rules for checking


[nit] s/MAY be optionally/MAY be    (redundant)


[minor] What are the "rules [for] checking PoD numbering"?  Where are they specified?


[major] "MAY be optionally disregarded by a node if PoD detection is undesirable or has to be ignored."

This is the only place where "PoD detection" is used.  What is it?  Where is it specified?  Why would it be undesirable?  When would it have to be ignored?  Are there operational considerations for making that decision?


[major] "...except preventing detection of certain miscabling cases."  Where is miscabling detection described?  Which cases may not be covered?  §4.2.7.6 says that "internal ruleset flags a possible miscabling", but that is the closest that I came to finding a description (and rulesets are not mentioned anywhere else).


1491     A node configured with "undefined" PoD membership MUST, after
1492     building first northbound three way adjacencies to a node being in a
1493     defined PoD, advertise that PoD as part of its LIEs.  In case that
1494     adjacency is lost, from all available northbound three way
1495     adjacencies the node with the highest System ID and defined PoD is
1496     chosen.  That way the northmost defined PoD value (normally the ToP
1497     nodes) can diffuse southbound towards the leaves "forcing" the PoD
1498     value on any node with "undefined" PoD.

[nit] s/building first...adjacencies/building its first...adjacency


[nit] s/to a node being in a defined PoD/to a node in a defined PoD


[minor] "...defined PoD is chosen."  ...chosen as the PoD value to advertise.  (or something along those lines)


[nit] s/northmost/northernmost


[] I'm missing how using the System ID to select will always result in the north direction...


[minor] "...diffuse southbound towards the leaves "forcing" the PoD value on any node with "undefined" PoD."  An example of this would be nice.
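
As a starting point for such an example, the re-selection step alone can be sketched in Python. This models only what happens after the original adjacency is lost, as the quoted paragraph describes it: among the remaining northbound three-way adjacencies, the one with the highest System ID and a defined PoD supplies the PoD value. The function name and the tuple layout are invented here, and PoD 0 is taken as "undefined" per condition 1 above.

```python
UNDEFINED_POD = 0  # "undefined/any" PoD (PoD# = 0)


def reselect_pod(northbound):
    """Pick the PoD to advertise from northbound three-way adjacencies.

    `northbound` is an iterable of (system_id, pod) pairs, one per
    remaining northbound adjacency.
    """
    defined = [(sid, pod) for sid, pod in northbound if pod != UNDEFINED_POD]
    if not defined:
        return UNDEFINED_POD  # nothing to inherit, stay undefined
    # Highest System ID among the neighbors that have a defined PoD.
    _, pod = max(defined, key=lambda adj: adj[0])
    return pod


print(reselect_pod([(10, 1), (30, UNDEFINED_POD), (20, 2)]))
```

Note that the neighbor with System ID 30 is skipped because its PoD is undefined, so the tie-break is purely on System ID among defined-PoD neighbors and says nothing about which of them is further north.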


1500     LIEs arriving with IPv4 Time to Live (TTL) / IPv6 Hop Limit (HL)
1501     larger than 1 MUST be ignored.

[major] "MUST be ignored"  The specification of the sender (§4.2.2) says that "LIEs SHOULD be sent with an IPv4 Time to Live (TTL) / IPv6 Hop Limit (HL) of 1".  This is a Normative conflict because it is only recommended to the sender, but required by the receiver.


1503     A node SHOULD NOT send out LIEs without defined level in the header
1504     but in certain scenarios it may be beneficial for trouble-shooting
1505     purposes.

[] See the related comments in #7 above.


[nit] s/trouble-shooting/troubleshooting


[major] #7 above makes advertising defined values a requirement to forming a three-way adjacency, but it is only recommended here.  Unless there are specific troubleshooting cases described, we should retain the requirement.  I believe that a packet can be crafted in any way for troubleshooting/testing, but that doesn't have to be indicated in a specification which should deal with the correct operation of the protocol.  IOW, I think we can do without this paragraph.


1507   4.2.2.1.  LIE FSM

1509     This section specifies the precise, normative LIE FSM and can be
1510     omitted unless the reader is pursuing an implementation of the
1511     protocol.

[nit] s/and can be omitted..../and can be omitted.


[] It would be very nice if there was an upfront description (at least a list) of the different elements: states, events, etc., to make reading the diagrams easier.  The terminology (PUSH, etc.) should also be upfront.


1513     Initial state is `OneWay`.

1515     Event `MultipleNeighbors` occurs normally when more than two nodes
1516     see each other on the same link or a remote node is quickly
1517     reconfigured or rebooted without regressing to `OneWay` first.  Each
1518     occurrence of the event SHOULD generate a clear, according
1519     notification to help operational deployments.

[] This event is the only one described.  Is there anything special about it?  Is this the only one for which generating a notification is recommended?  Is that why it is described here?


[minor] "occurs normally when..."   Are there other conditions (which are not considered "normal")?


[major] "more than two nodes see each other on the same link"  The description below is different (and I think more accurate): "more than one neighbor seen on interface".  Also, it doesn't include the second part about a "reconfigured or rebooted" node.  What is the characteristic of that behavior?


[major] "SHOULD generate a clear, according notification to help operational deployments."   Normatively, what is "clear"?  Is there information that you have in mind that should be included to make it "clear"?  I assume that a notification is a message logged on the node; is that true?

Suggestion>
   ...SHOULD log a message including the System IDs of each node.

   (Anything else?)
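
A minimal sketch of what such a log message could look like.  The logger name, the function names, the interface argument, and the hexadecimal rendering of the System IDs are all assumptions for illustration; the draft does not define a message format.

```python
import logging
from typing import Iterable

logger = logging.getLogger("rift.lie_fsm")  # hypothetical logger name


def format_multiple_neighbors(interface: str, system_ids: Iterable[int]) -> str:
    """Build the notification text, listing every System ID seen on the link."""
    ids = ", ".join(f"{sid:016x}" for sid in sorted(system_ids))
    return f"MultipleNeighbors on {interface}: system IDs {ids}"


def notify_multiple_neighbors(interface: str, system_ids: Iterable[int]) -> None:
    """Log one warning per occurrence of the event."""
    logger.warning(format_multiple_neighbors(interface, system_ids))


notify_multiple_neighbors("eth0", [2, 1])
```

Naming the interface and every System ID gives an operator enough to locate the miscabled or rebooted node without further probing.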


[] This seems to be the only occurrence of a log resulting from a message or other action.  I think I'm missing the significance of this event.


[Ok.  This is where I am in my reading.  There are more comments below.]


...
2069   4.2.3.3.  Flooding
...
2082     As described before, TIEs themselves are transported over UDP with
2083     the ports indicated in the LIE exchanges and using the destination
2084     address on which the LIE adjacency has been formed.  For unnumbered
2085     IPv4 interfaces same considerations apply as in equivalent OSPF case.

[major] (Related to a comment in 4.2.2.) "using the destination address on which the LIE adjacency has been formed"  If multiple addresses are used (from different AFs), which one is selected?  Is there a "primary" address that is considered when the adjacency is formed?


...
6165   10.2.  Informative References
...
6175     [DOT]      Ellson, J. and L. Koutsofios, "Graphviz: open source graph
6176                drawing tools", Springer-Verlag , 2001.

[major] Is there a URI that can be used as a stable pointer?


...
6362   B.1.  common.thrift
...
6730   /** Link capabilities. */
6731   struct LinkCapabilities {
6732       /** Indicates that the link is supporting BFD. */
6733       1: optional bool                           bfd =
6734               common.bfd_default;
6735       /** Indicates whether the interface will support v4 forwarding.

6737           @note: This MUST be set to true when LIEs from a v4 address are
6738                  sent and MAY be set to true in LIEs on v6 address. If v4
6739                  and v6 LIEs indicate contradicting information the
6740                  behavior is unspecified. */

6742       2: optional bool                           v4_forwarding_capable =
6743               true;
6744   }

[major] Not knowing thrift, and assuming that this structure is some sort of enum/list, what should a receiver do if the undefined/unknown value 3 is received?
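
For context: in Thrift the leading numbers (1:, 2:) are field identifiers of a struct, not enum values, and Apache Thrift decoders generally skip field identifiers they do not recognize.  The sketch below mimics that behavior; it is not generated Thrift code, and the dictionary-based wire representation and the function name are illustrative assumptions.  The defaults are the ones shown in the schema (bfd_default and v4_forwarding_capable are both true).

```python
# Field identifiers and defaults as they appear in LinkCapabilities.
KNOWN_FIELDS = {1: "bfd", 2: "v4_forwarding_capable"}
DEFAULTS = {"bfd": True, "v4_forwarding_capable": True}


def decode_link_capabilities(wire_fields: dict) -> dict:
    """Decode {field_id: value}, skipping identifiers this schema lacks."""
    decoded = dict(DEFAULTS)  # optional fields fall back to their defaults
    for field_id, value in wire_fields.items():
        name = KNOWN_FIELDS.get(field_id)
        if name is None:
            continue  # unknown field identifier (e.g. 3): skipped, not an error
        decoded[name] = value
    return decoded


print(decode_link_capabilities({1: False, 3: "from a newer schema"}))
```

If that is the intended behavior for RIFT, it would help to state it explicitly in the text rather than leave it to knowledge of Thrift.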


[major] "contradicting information the behavior is unspecified."  How can the behavior be unspecified if this *is* the specification?   More to the point: if the indication is required in IPv4 LIEs, but only optional in IPv6 LIEs, why wouldn't the information from the IPv4 LIEs be considered as the truth?
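
One possible resolution rule, sketched only to illustrate the question above: prefer the value carried in IPv4 LIEs, fall back to IPv6 LIEs, then to the schema default.  This is not what the draft specifies; the function name and the use of None for "no LIE received on that address family" are assumptions.

```python
from typing import Optional


def effective_v4_forwarding(v4_lie: Optional[bool],
                            v6_lie: Optional[bool]) -> bool:
    """Resolve v4_forwarding_capable from the per-address-family LIEs.

    None means no LIE was received on that address family.
    """
    if v4_lie is not None:
        return v4_lie   # the indication is mandatory in IPv4 LIEs
    if v6_lie is not None:
        return v6_lie   # optional indication carried in IPv6 LIEs
    return True         # default value from the schema


# Contradicting LIEs: the IPv4 LIE wins under this rule.
print(effective_v4_forwarding(True, False))
```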


[nit] s/interface will support v4 forwarding/interface supports IPv4 forwarding/g


[nit] s/v4/IPv4/g  s/v6/IPv6/g


[major] "Indicates that the link is supporting BFD."  I was going to ask if setting bfd to TRUE means that the link can support BFD (is capable) or if BFD is actively enabled (supporting), but I found this somewhere else:

   /** default link being BFD capable */
   const bool            bfd_default                = true

...so I'm assuming that it just means that the link is capable, which makes sense given that it is in LinkCapabilities.

s/Indicates that the link is supporting BFD./Indicates that the link is BFD-capable./g

Also, note that §4.3.5 says that "RIFT MAY incorporate BFD", which is not completely aligned with the default indication of the link being BFD-capable.  Using BFD is different from being BFD-capable, so it is ok for its use to be optional (§4.3.5).  However, there is no stated requirement for a link to be BFD-capable, as the default setting implies.

[End of Review 2a]