[109all] Network close and update
Sean Croghan <sean@xagsolutions.com> Fri, 20 November 2020 09:56 UTC
From: Sean Croghan <sean@xagsolutions.com>
To: "109all@ietf.org" <109all@ietf.org>
Thread-Topic: Network close and update
Thread-Index: AQHWvyNa09bZGz+TVUqZrFPCxCQ3Ww==
Date: Fri, 20 Nov 2020 09:56:01 +0000
Message-ID: <CD61BFF5-CECD-43A0-B41B-CAAF533B2EF1@xagsolutions.com>
References: <1A8D2B8D-CC33-4430-B4FB-61995B3CF5EF@xagsolutions.com> <F2645AEB-4CFD-4629-9463-AF6DA019DFB7@xagsolutions.com>
In-Reply-To: <F2645AEB-4CFD-4629-9463-AF6DA019DFB7@xagsolutions.com>
Reply-To: "109attendees@ietf.org" <109attendees@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/109all/aJgwSC7X9ZhPJaDLM5ReBCQ9jmg>
Subject: [109all] Network close and update
X-BeenThere: 109all@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Official communication about IETF 109 <109all.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/109all>, <mailto:109all-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/109all/>
List-Post: <mailto:109all@ietf.org>
List-Help: <mailto:109all-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/109all>, <mailto:109all-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 20 Nov 2020 09:56:08 -0000
Everyone,

First, an update on the network interruptions during iabopen and the plenary. We have confirmed that backend maintenance events occurred during the outages. Azure Direct Support provided the following detail for the events:

    We identified the occurrence was caused by an Azure initiated memory-preserving update action. Note that RDP and SSH connections to the VM, or requests to any other services running inside the VM, could have failed during this time. This update is part of routine maintenance performed on the underlying hosts for this VM. During these updates, the VM is frozen for up to 30 seconds and then resumed. We apologize for any inconvenience this may have caused you. We are continuously working to improve the platform to reduce incidences of virtual machine unavailability.

For 110 we will have processes in place to mitigate or eliminate this issue.

So, as we close this meeting, I wish you all a safe walk to your bed and a quick adjustment back to your normal time.

Also, a big thank you to the NOC Team of Volunteers and Staff, a truly amazing group of people.

• Rick Alfvin (Linespeed)
• Alessandro Amirante (Meetecho)
• Hirochika Asai (Preferred Networks/WIDE)
• Rob Austein (Arrcus/DRL)
• Tobia Castaldi (Meetecho)
• Joe Clarke (Cisco)
• Bill Fenner (Arista)
• Bill Jensen (University of Wisconsin–Madison)
• Hans Kuhn (NSRC)
• Nick Kukich (Linespeed)
• Warren Kumari (Google)
• Lucy Lynch
• Lorenzo Miniero (Meetecho)
• Karen O'Donoghue (ISOC)
• Con Reilly (Linespeed)
• Simon Pietro Romano (Meetecho)
• Paolo Saviano (Meetecho)
• Clemens Schrimpe

I hope you all have a safe and wonderful New Year!

Sean

On Nov 18, 2020, at 5:56 PM, Sean Croghan <sean@xagsolutions.com> wrote:

As previously reported, we traced the interruption of the iabopen session to an unexpected Azure network interface removal event affecting network interfaces provisioned with SR-IOV.
To prevent this from happening again, we intended to remove SR-IOV networking entirely. Unfortunately, it now transpires that this change was not applied to 2 of the 16 VMs, including the application VM for the Plenary. So, to add to the list of reasons to want 2020 to be over, towards the end of the Plenary the same network interface removal event occurred and triggered an outage long enough to affect everyone.

I can confirm that the SR-IOV provisioning has now been removed from all VMs, which we believe eliminates the risk of the same thing happening again. We continue to work with Azure Direct Support to determine the underlying cause of the removal events.

Please let me know if you have any questions.

Sean

On Nov 17, 2020, at 4:56 PM, Sean Croghan wrote:

I have an update for those of you affected by the outage in yesterday's IABOPEN session. We have isolated this to an interruption of the virtual machine's network interface. We currently have no explanation for this outage. We have engaged the hardware and network teams at Azure to determine the cause of this event but do not have an explanation at this time. I will provide an update when we have received more information.

For those interested in details:

At 07:56:36 UTC the network interface (eth0) went link down and the interface was removed from the VM
At 08:00:28 UTC a new interface was added to the VM
At 08:00:29 UTC the new interface (eth1) went link up

Yes, the VM added a new interface. The servers were provisioned with SR-IOV, and we suspect that a migration event moved the VM to different hardware, causing the NIC driver to be reloaded. We have found some evidence supporting our theory that a migration or unscheduled maintenance event occurred and are working to verify whether that happened during this event. We have removed SR-IOV from the network interfaces on all servers.
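For anyone who wants to check the length of the outage from the three timestamps in the details above, here is a minimal Python sketch. The calendar date is an assumption (16 November 2020, inferred from "yesterday's" in a message dated 17 November), and the names `EVENTS`, `parse_utc` and `outage_duration` are illustrative only; none of this comes from the NOC tooling.

```python
from datetime import datetime, timedelta

# Assumption: the outage date is inferred from the 17 November message
# referring to "yesterday's" session; only the times come from the report.
ASSUMED_DATE = "2020-11-16"

# (UTC time, description) pairs copied from the reported timeline.
EVENTS = [
    ("07:56:36", "eth0 link down, interface removed from the VM"),
    ("08:00:28", "new interface added to the VM"),
    ("08:00:29", "eth1 link up"),
]


def parse_utc(time_str: str) -> datetime:
    """Combine the assumed date with a reported HH:MM:SS time."""
    return datetime.strptime(f"{ASSUMED_DATE} {time_str}", "%Y-%m-%d %H:%M:%S")


def outage_duration(events) -> timedelta:
    """Time between the first and the last event in the list."""
    return parse_utc(events[-1][0]) - parse_utc(events[0][0])


if __name__ == "__main__":
    gap = outage_duration(EVENTS)
    print(f"Interface unavailable for about {int(gap.total_seconds())} seconds")
```

On the reported times this gives a gap of a little under four minutes between eth0 going down and eth1 coming up.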
I hope you are having a good and productive week.

The IETF NOC Team

--
109all mailing list
109all@ietf.org
https://www.ietf.org/mailman/listinfo/109all
- [109all] NOC update Sean Croghan
- [109all] NOC update #2 Sean Croghan
- [109all] Network close and update Sean Croghan