Re: [109attendees] [109all] NOC update

Lucy Lynch <llynch@civil-tongue.net> Wed, 18 November 2020 01:10 UTC

From: Lucy Lynch <llynch@civil-tongue.net>
To: "109attendees@ietf.org" <109attendees@ietf.org>
Date: Tue, 17 Nov 2020 17:10:44 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/109attendees/SRvZKKpYHpqiIuqI6wf_ns-PMXw>
Subject: Re: [109attendees] [109all] NOC update


> On Nov 17, 2020, at 4:57 PM, Sean Croghan <sean@xagsolutions.com> wrote:
> 
> 
> 
> 
> I have an update for those of you affected by the outage in yesterday's IABOPEN session. We have isolated this to an interruption of the virtual machine's network interface. We have engaged the hardware and network team at Azure to determine the cause of this event, but we do not have an explanation at this time.
> 

So I’ve noticed an odd thing which may or may not be relevant - 

I joined all the relevant Jabber streams for my favorite WGs early using my iPad client.
When I join the Meetecho stream, I’m also spliced into the chat / Jabber.

Both seem to work, but I wonder if distributing the joins across VMs would cause some kind of OAuth race condition?

I will note that people seem to complain about losing services piece by piece, so there is some kind of fracture or mismatch in play.

Don’t know enough to be smart, so feel free to write this off as a stupid remark, but I will note that token flows are sucking hard.


> I will provide an update when we have received more information. 
> 
> 
> For those interested in details:
> 
> At 07:56:36 UTC the network interface (eth0) went link down and the interface was removed from the VM
> At 08:00:28 UTC a new interface was added to the VM
> At 08:00:29 UTC (eth1) went link up 
> 
> Yes, the VM added a new interface. The servers were provisioned with SR-IOV, and we suspect that a migration event moved the VM to different hardware, causing the NIC driver to be reloaded. We have found some evidence supporting our theory that a migration or unscheduled maintenance event occurred, and we are working to verify whether that happened during this event. We have removed SR-IOV from the network interfaces on all servers.
> 
> I hope you are having a good and productive week 
> 
> 
> - The IETF NOC Team
> 
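
For anyone who wants the quoted NOC timeline as a duration, here is a minimal sketch of the arithmetic. It assumes all three timestamps fall on the same UTC day and treats the eth1 link-up as the end of the interruption; actual service recovery may have come later. The names EVENTS, elapsed_since_first and link_gap_seconds are made up for this illustration.

```python
from datetime import datetime, timedelta

# Timestamps copied from the NOC timeline quoted in the message (UTC).
# Assumption: all three events happened on the same day.
EVENTS = [
    ("07:56:36", "eth0 link down, interface removed from the VM"),
    ("08:00:28", "new interface added to the VM"),
    ("08:00:29", "eth1 link up"),
]


def parse(ts: str) -> datetime:
    """Parse an HH:MM:SS timestamp (the date part is irrelevant here)."""
    return datetime.strptime(ts, "%H:%M:%S")


def elapsed_since_first(events):
    """Return (description, time elapsed since the first event) pairs."""
    start = parse(events[0][0])
    return [(desc, parse(ts) - start) for ts, desc in events]


def link_gap_seconds(events) -> int:
    """Seconds between the first and the last event in the list."""
    delta = parse(events[-1][0]) - parse(events[0][0])
    return int(delta.total_seconds())


if __name__ == "__main__":
    for desc, elapsed in elapsed_since_first(EVENTS):
        print(f"+{elapsed}  {desc}")
    print(f"link down to link up: {link_gap_seconds(EVENTS)} s")
```

By this reckoning the VM had no network interface for a little under four minutes, and the replacement interface came up one second after it was added.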